Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Page Content

Speller meeting

What is needed for a new release:

Installer

Tino has mailed us about his latest finding: his WiX installer works if the user has never opened MS Office before, but after the first invocation of any Office application, it doesn’t work anymore using HKLM. Instead the tools must be registered for each user. He is working on a solution.

DONE:

TODO:

POSTPONED:

Remaining PLX bugs:

First PRI

REGRESSIONS    
pron+clit not acc dutnjege, monnai, datnai 524,637,655

Second PRI

| prop-cmp-suggestions | Anár-julggaštusa, Máhtte-nammasaš | 609,649 | because of Sirpmá-skuvla-fix | cmp-tags | muorrajogamuorra not accepted | 840 | suggestions | maŋá1-D-soahteáigái | 1544

REGRESSIONS - compounds      
not accepted filbmačeahppi gets sugg *filbma-čeahppi 1544 FIXED
alph+clitic *sbat - we don’t want clitics here 1544 FIXED
alph+noun/adj not rec a-muorra, d-beakkálmasat 785,818,1544 FIXED
Gen-name+noun *Sirpmá-skuvla 1544 FIXED (second part gets Both Nom and Acc/Gen suggestions, but I think we can live with it)

Second Last PRI

Bugs from here on can be left out of the next release if we are short on time.

Compounds    
num cmp:s on 0- 051-nummarat 631
multipart/long cmps Left stobustávrrádilli 786
multipart/long cmps Left Ássanrievttijođiheaddjái 819

Last PRI

  | Capitals | — | doesn’t understand caps | 1700-LOHKU | 647


Low priority regressions      
single letter suggestions đ 461 not a big deal

DONE:

TODO:

Release status

We are very close to release quality, there is good hope that the remaining bugs will be fixed this week.

DONE:

TODO:

  1. build a new speller with the remaining bug fixes (Tomi)
  2. test against gold standard corpus (Sjur)
  3. release RC2 January 21 (Sjur)
  4. if no new serious bugs are found, release as Divvun 2.3 next week (all)
    1. update list of known bugs
    2. write a short press release emphasising the linguistic updates for North Sámi, and noting that we still have certain problems with Win7/Office2010

Updated release plan

Next meeting

Monday 15.1 at 13.30