Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Page Content

Speller meeting

What is needed for a new release:

Installer

There’s a separate document containing the test results for different combinations of MS Office and Windows versions.

DONE:

TODO:

POSTPONED:

PLX conversion

Nothing new since last meeting 14.12. Tomi has made a new speller, but the test results were identical to the previous version.

Remaining PLX bugs:

Second PRI

Strange PLX behaviour  
Mihkalmas-beaivi 593

Thomas has added some more test words to try to help find the issue with this bug. The result seems to be consistent: a lexicalised noun that is just a hyphen away from a dynamic compound, will block that dynamic compound. Thomas will add even more, and more different test words to try to confirm if this behaviour is consistent over more classes of compounds.

Here is the test result:

Mihkalmas-beaivi (253) Mihkalmasbeaivi Missing words in beta 2 WONT FIX IN PLX
Omma-goahti (253) Ommagoahti Missing words in beta 2
Pieski-lávvu (253) Pieskilávvu, (251) Pieskilávu, (249) Pieskilávvo, (247) Pieskilávo Missing words in beta 2
Gaup-áiti (253) Gaupáiti Missing words in beta 2
Moshagen-buvda (253) Moshagenbuvda Missing words in beta 2

We conclude that it is ok to block dynamic hyphen compounds when we have lexicalised non-hyphen variants. It could be considered a feature. The fact that it seems to behave consistently indicates that we can leave it as it is. A nice, undocumented PLX feature :)

REGRESSIONS    
doesn’t follow cmp-tags OR vowel-shortening searvipresideanta > searvepresideanta, sámediggepresideanta Sámediggeáirrasin 489
alphabet as non-first compound part & suggested CV-s 913
noun+Pro Num+Noun/Prop wo hyph máliSoussiid, guovttiolbmo 397,461,642,721,804,805
doesn’t follow cmp-tags ránubiellu > rátnobiellu beavddeguorra > beavdeguorra 489,535,539,604
prop+noun not rec Koskivuori-plánenreaiddut 611,633
non-ex word accepted loahpet, duvnnii, njealjat 909,962,1143

Second Last PRI

Bugs from here on can be left out of the next release if we are short on time.

Compounds    
num cmp:s on 0- 051-nummarat 631
non-ex. word accepted saame 658

Last PRI

  | Capitals | — | doesn’t understand caps | 1700-LOHKU | 647

 !REGRESSION

Compounds

| imposs” cmps along w num. | 0-geažideapmigárvu (geažideapmigárvu is impossible) | 536,1145 | NO SUGGESTIONS - GOOD - BUT: | imposs” cmps sákkasteapmifierbmi> | (225) aseákkasteapmifierbmi ase- | 536 | –”– | (225) asiákkasteapmifierbmi asi- | –”– | (225) ásaákkasteapmifierbmi ása- | –”– | (225) áseákkasteapmifierbmi áse- | –”– | (225) ásoákkasteapmifierbmi áso- | –”– | (225) ásuákkasteapmifierbmi ásu- | –”– | (221) ášoákkasteapmifierbmi ášo- | –”– | (221) ášuákkasteapmifierbmi ášu-

DONE:

TODO:

Release plan

The December 1 goal has passed, without meeting the targets. On the plus side is that the number of open PLX bugs have been greatly reduced, and Tomi are squashinhg PLX bugs all the time. It just takes more time than anticipated.

The installer was not easily solved by the WiX alternative - it turned out that the Greenlandic proofing tools installer has the same problems as we have.

Re-scheduling the release plan: