Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages


Speller meeting

What is needed for a new release:

Installer

Variables for the installer:

Debugging tools:

Tested divvunsme.msi, Kukkuniiaat.msi and the divvun-exe package on Windows 7 32-bit with MS Office. The exe package works both for the user who installed the speller and for another user who did not install it. The .msi packages do not work for the second user.

The Greenlandic installer behaves the same as our Sámi installer: it works only for the installing user and is non-functional for other local users unless the registry entries are manually updated for each user.

DONE:

TODO:

POSTPONED:

PLX conversion

Remaining PLX bugs:

First PRI

Words are accepted and suggested in their compounding-form

| cmp-form-suggestions | váigas, suolo | 581,909,803,912

Tomi: I haven’t found the reason for this.

REGRESSION - ALPH + NOUN NOT ACCEPTED

| a-muorra | 785,818

REGRESSION - DOESN’T RECOGNIZE SOME ACRO/PROP+NOUN, seems random

| PROP+NOUN | Koskivuori-plánenreaiddut | 611
| ACRO+NOUN NOUN+ACRO | AP-rávvagat SF-muorra muorra-NRK | 611,633,805,931
| PREF+PROP NOUN+PROP | ovda-Lot psykiatriija-Álaš | 595

Unsolvable. If we fix these, the clitic-within-compound bug comes back. The question is which is worse, and we have decided to accept some of the bugs above.

The middle ground we settle on is to accept hyphen-final words for word types that always require hyphens, like proper nouns, abbreviations and acronyms, but NOT for other words.
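The middle-ground rule can be written down as a small check. This is only a sketch of the decision as stated here, not the actual speller or PLX code: the function name `accept_hyphen_final` is hypothetical, the labels `PROP` and `ACRO` are taken from the bug tables above, and `ABBR` is an assumed label for abbreviations.

```python
# Word classes that always require a hyphen when compounding.
# PROP and ACRO appear in the notes; ABBR is an assumed label for abbreviations.
HYPHEN_REQUIRED_CLASSES = frozenset({"PROP", "ABBR", "ACRO"})


def accept_hyphen_final(form: str, word_class: str) -> bool:
    """Return True only for hyphen-final forms of classes that always need a hyphen."""
    if not form.endswith("-"):
        # The rule is only about hyphen-final forms such as "NRK-".
        return False
    return word_class in HYPHEN_REQUIRED_CLASSES


if __name__ == "__main__":
    examples = [("NRK-", "ACRO"), ("Koskivuori-", "PROP"), ("muorra-", "NOUN")]
    for form, word_class in examples:
        print(form, word_class, accept_hyphen_final(form, word_class))
```

Under this rule a hyphen-final acronym or proper noun passes, while a hyphen-final ordinary noun does not, which is the trade-off accepted above.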

SUB accepted

| NRK-Finnmarku | 805

Second Last PRI

Section for Tags-not-working

| doesn’t follow cmp-tags | sámedikkepresideanta | 489
| +CmpN/None in comp-sugg | 1883-as | 508

Bugs below this line can be left out of the next release if we are short on time.

Compounds

| impossible cmps along with num. | 0-geažideapmigárvu (geažideapmigárvu is impossible) | 536,1145
| num cmp:s on 0- | 051-nummarat | 631
| name/noun+adv cmps | Kuorak-ain | 642
| hyphened suggestions | deahtta-samus +A+Attr +Noun | 940
| non-ex. word accepted | saame | 658

Last PRI

| doesn’t understand caps | 1700-LOHKU | 647
| recognized, but not suggested | biilarievttijođiheaddjái | 819
| láibi-sánis not recognized | - | 380,452

DONE:

TODO:

Release plan