Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Page Content

Speller meeting

What is needed for a new release:

Installer

DONE:

TODO:

POSTPONED:

PLX conversion

The latest PLX file is substantially smaller than earlier, only 6,5 Gb vs 25+ Gb earlier. The change has happened during the last day or two.

Remaining PLX bugs:

First PRI

Words are accepted and suggested in their compounding-form

| cmp-form-suggestions | guollegukse, ovda, duoppemuorr | 397,461 | FIXED | cmp-form-suggestions | váigas, suolo | 581,909,803,912

Discovered one case of compounds without hyphen:

| non-hyph noun+acro cmps | muoreNRK | 805

First and a half PRI

compounds

| generated cmps not rec. | some not recognized: guollebusse | 397 | FIXED | prefix-compounds | not recognized:julevbargodepartemeanta | 408 | noun+clit not rec. | vástidanproseantage | 451

Second Last PRI

Section for Tags-not-working

| doesn’t follow cmp-tags | niibbispiidni, sámedikkepresideanta, beavddeguorra | 489,539,709 | +CmpN/None in comp-sugg | 1883-as, guovttenuppelotsahán, ovdamearkadihti | 508,619,717,908 | +Use/LexSub accepted | ovttatládje, duoppemuorr | 593,646 | FIXED | +Use/-Spell accepted | muorrajogat(muorra) | 840 | FIXED | +Use/SpellNoSugg | alhpabet gets suggested: a,đ etc | 461

Bugs below this line can be left out of the next release if we are short on time.

Compounds

| imposs” cmps along w num.| 0-geažideapmigárvu (geažideapmigárvu is impossible) | 536,1145 | num cmp:s on 0- | 051-nummarat | 631 | name/noun+adv cmps | Kuorak-ain, NRK-an | 642,913 | hyphened suggestions | deahtta-samus +A+Attr +Noun | 940 | non-ex. word accepted | saame | 658

Last PRI

| doesn’t understand caps | 1700-LOHKU | 647 | SgNomadj-cmps not work | láikiriehpu (suggestions:0-láikiriehpu…) | 421

DONE:

TODO:

Release plan