Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Page Content

Speller meeting

What is needed for a new release:

Installer

Tino has mailed us about his latest finding: his WiX installer works if the user has never opened MS Office before, but after the first invocation of any Office application, it doesn’t work anymore using HKLM. Instead the tools must be registered for each user. He is working on a solution.

DONE:

TODO:

POSTPONED:

Remaining PLX bugs:

First PRI

| REGRESSIONS | — | cases not recognized | čikčiin, bahkkejeaddjit, geavaheddjiide, buvttadeddjiide, láidesteaddjin | 376,412,426,452,932 | FIXED already

REGRESSIONS    
doesn’t follow cmp-tags sámedikkepresideanta, ránubiellu, eaŋkiladoaibma etc etc etc 489,535,359,1544

Second PRI

| prop-cmp-suggestions | Anar-julggaštusa>ASA-julggaštusa (Anár-)| 609 | right cmp-tags | muorrajogamuorra not accepted | 840 | FIXED | suggestions | dábálaš0:in | 642

REGRESSIONS - compounds      
alph+clitic *sbat - we don’t want clitics here 1544 FIXED
alph+clitic *gčhan - we don’t want clitics here 1544  

Second Last PRI

Bugs from here on can be left out of the next release if we are short on time.

Compounds      
num cmp:s on 0- 051-nummarat 631  
multipart/long cmps Left stobustávrrádilli 786 FIXED
multipart/long cmps Left Ássanrievttijođiheaddjái 819 FIXED

Last PRI

  | Capitals | — | doesn’t understand caps | 1700-LOHKU | 647


Low priority regressions      
single letter suggestions đ 461 FIXED

Why sámedikkepresideanta is so problematic - it conflicts with other patterns for nouns in the same class/position:

stobustávrrádilli - *NB GaBO NaAE
stobustávrádilli  - *NB NABO NaAE

DONE:

TODO:

Release status

We are very close to release quality, there is good hope that the remaining bugs will be fixed this week.

DONE:

TODO:

  1. build a new speller with the remaining bug fixes (Tomi)
  2. test against gold standard corpus (Sjur)
  3. release RC2 January 21 (Sjur)
  4. if no new serious bugs are found, release as Divvun 2.3 next week (all)
    1. update list of known bugs
    2. write a short press release emphasising the linguistic updates for North Sámi, and noting that we still have certain problems with Win7/Office2010

Updated release plan

Next meeting

Monday 15.1 at 13.30