Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Speller meeting

What is needed for a new release:

Installer

Variablar for installeraren:

Verkty for debugging:

Tested divvunsme.msi, Kukkuniiaat.msi and divvun-exe package on Windows 7 32-bit/MS Office: exe-package works for both the user installing the speller and another user that hasn’t installed the speller, the .msi packages do not work for the second user.

Setting

[HKEY_CURRENT_USER\Software\Microsoft\Shared Tools\Proofing Tools\1.0\Override\se-NO]
"DLL"="C:\\Program Files\\Divvun\\mssp3samiNorthern-NO.dll"
"LEX"="C:\\Program Files\\Divvun\\mssp3samiNorthern.lex"

manually when logged in as the other user makes spelling available to the other user.

DONE:

TODO:

POSTPONED:

PLX conversion

One ommission - that can’t explain the reduced size - is that particles like almma, bat, ges, han have been lost in the latest spellers. — FIXED in the latest svn, but no speller yet, and thus no speller test results.

Remaining PLX bugs:

First PRI

Words are accepted and suggested in their compounding-form

| cmp-form-suggestions | váigas, suolo | 581,909,803,912 Tomi: I haven’t found the reason for this

REGRESSION - ALPH + NOUN NOT ACCEPTED  
a-muorra 785,818
REGRESSION - DOESN’T RECOGNIZE SOME ACRO/PROP+NOUN, seems random    
PROP+NOUN Koskivuori-plánenreaiddut 611
ACRO+NOUN NOUN+ACRO AP-rávvagat SF-muorra muorra-NRK 611,633,805,931
PREF+PROP NOUN+PROP ovda-Lot psykiatriija-Álaš 595
REGRESSION - HYPHENED SUGGESTION W CLITIC  
birgen-nai 637 FIXED

SUB accepted

| NRK-Finnmarku | 805

This is difficult. Clitic is N+ which applies to all N* wordclasses.

bir-get--	WIX	// birget+V+Inf-

bir-get--	NIX,NtPABX	// birget+V+Inf-
bir-get	VI	// birget+V+Inf
bir-gen	VIBOE	// birget+V+Actio+Nom
bir-gen	VIBOE	// birget+V+Actio+Gen
bir-gen	VI	// birget+V+PrfPrc
bir-gen	NAIBE	// birget+V+PrfPrc

nai	V+W,J+W,U+W,W+W,Ja+W,Jp+W,Jn+W,N+W,Na+W,Np+W,Nn+W

Prompt: Check returns 0[0:0]/1, @0:10 for 'birgen-nai'
Getting suggestions for birgen-nai...
Suggestions:
230	birget-nai
Prompt: Terminate returns 0

We can’t have both laibi-sátni and avoid birget-nai

REGRESSION - CLITIC SHOW UP AS MIDDLE PART OF COMPOUND    
muorrastávrrágeádnu 786 FIXED

Second Last PRI

Compounds - num/noun+noun hyph suggestions masks for right sugg

| hyphened suggestions | guollibusse>guolli-busse (right:guollebusse) guokte-dássásaš (right:guovttedássásaš) | 397,426,489,535,536,540,619 | FIXED

Section for Tags-not-working

| doesn’t follow cmp-tags | sámedikkepresideanta | 489 | +CmpN/None in comp-sugg | 1883-as | 508 | +CmpN/None in comp-sugg | Juovla-CD-as | 717 | FIXED | +Use/SpellNoSugg | alhpabet gets suggested: a,đ etc | 461 | FIXED

Bugs below this line can be left out of the next release if we are short on time.

Compounds

| imposs” cmps along w num.| 0-geažideapmigárvu (geažideapmigárvu is impossible) | 536,1145 | num cmp:s on 0- | 051-nummarat | 631 | name/noun+adv cmps | Kuorak-ain | 642 | name/noun+adv cmps | NRK-an | 913 | FIXED | hyphened suggestions | deahtta-samus +A+Attr +Noun | 940 | non-ex. word accepted | saame | 658

Last PRI

| doesn’t understand caps | 1700-LOHKU | 647 | recognized, but not suggested | biilarievttijođiheaddjái | 819 | láibi-sánis not recognized |   | 380,452

DONE:

TODO:

Release plan