Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages


Speller meeting

What is needed for a new release:

Installer

Nothing new since the previous meeting.

There’s a separate document containing the test results for different combinations of MS Office and Windows versions.

DONE:


TODO:

POSTPONED:

PLX conversion

Nothing new since the last meeting on 14.12. Tomi has made a new speller, but the test results were identical to those of the previous version.
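As a rough illustration of how two speller builds could be compared test by test, here is a minimal sketch. It assumes each test run can be reduced to a mapping from a test item to pass/fail; the function name `compare_runs` and the data layout are hypothetical and not part of the actual test setup.

```python
def compare_runs(old, new):
    """Compare two test runs.

    Both arguments map a test item (e.g. a test word) to True if the
    speller handled it correctly in that run. Only items present in
    both runs are compared. (Hypothetical data layout.)
    """
    shared = old.keys() & new.keys()
    # Passed before, fails now.
    regressions = sorted(w for w in shared if old[w] and not new[w])
    # Failed before, passes now.
    fixed = sorted(w for w in shared if not old[w] and new[w])
    return regressions, fixed


if __name__ == "__main__":
    # Placeholder items, not real test data.
    old_run = {"item-a": True, "item-b": False, "item-c": True}
    new_run = {"item-a": True, "item-b": False, "item-c": True}
    regressions, fixed = compare_runs(old_run, new_run)
    if not regressions and not fixed:
        print("Test results identical to the previous version")
```

Two empty lists correspond to the situation described above: a new build whose test results are identical to the previous one.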

Remaining PLX bugs:

Second PRI

REGRESSIONS
  | doesn’t follow cmp-tags OR vowel-shortening | sámediggepresideanta Sámediggeáirrasin | 489 | THIS ONE!
Fixed regressions
  | doesn’t follow cmp-tags OR vowel-shortening | searvvepresideanta > searvepresideanta | 489 | FIXED
  | doesn’t follow cmp-tags OR vowel-shortening | rájirastema | 535,604,639 | FIXED
  | alph+nouns not rec | a-muorra | 785,818 | FIXED
  | double hyph-sugg | SF–muorra | 818 | FIXED

TODO - ALT.1:

TODO - ALT.2:

We decided to follow Alternative 1, but regard Alternative 2 as an option after a release based on Alternative 1.

With the latest speller, samediggepresideantta is the single outstanding bug. Thomas and I identified the change that fixed it last time, and redid that change. Now we need a new speller and a new test run to see whether the change had the intended effect.

Second Last PRI

Bugs from here on can be left out of the next release if we are short on time.

Compounds
  | num cmp:s on 0- | 051-nummarat | 631
  | non-ex. word accepted | saame | 658

Last PRI

  | Capitals | — | doesn’t understand caps | 1700-LOHKU | 647


Low priority regressions
  | “imposs” cmps along w num. | 0-geažideapmigárvu (geažideapmigárvu is impossible) | 536,1145 | NO SUGGESTIONS - GOOD - BUT:
  | “imposs” cmps | sákkasteapmifierbmi > aseákkasteapmifierbmi etc | 536 | not a big deal
  | single letter suggestions | đ | 461 | not a big deal

DONE:

TODO:

Release plan

The December 1 goal has passed without the targets being met. On the plus side, the number of open PLX bugs has been greatly reduced, and Tomi is squashing PLX bugs all the time. It just takes more time than anticipated.

Switching to WiX did not easily solve the installer problems: it turned out that the Greenlandic proofing tools installer has the same problems as ours.

Release status

We now have a speller with only a few known bugs, and most of them have fixes. By tomorrow we should have a speller with only one disturbing bug (the sámediggepresideantta bug). This is our release candidate! It is good enough, and contains a number of important linguistic updates for our users.

TODO:

  1. build a new speller with the remaining bug fixes (Tomi)
  2. tag the present state with the string Divvun2.3RC1 (Tomi)
  3. update the adjectives with the noun changes (Tomi)
  4. make new installers with latest SME speller (Sjur)
  5. test newest speller in Word (Thomas)
  6. test against gold standard corpus (Sjur)
  7. release a public release candidate tomorrow morning (Sjur)
  8. if no new serious bugs are found, release as Divvun 2.3 tomorrow afternoon (all)
    1. update list of known bugs
    2. write a short press release emphasising the linguistic updates for North Sámi, and noting that we still have certain problems with Win7/Office2010
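
The gold standard test in step 6 could be scored along the following lines. This is only a sketch under stated assumptions: the gold standard is taken to be a list of (word, is_error) pairs, and `accepts` stands in for whatever interface the speller actually exposes. The names `Score` and `score_speller` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Score:
    true_pos: int = 0   # real errors that the speller flagged
    false_pos: int = 0  # correct words that the speller flagged
    false_neg: int = 0  # real errors that the speller accepted

    @property
    def precision(self):
        flagged = self.true_pos + self.false_pos
        return self.true_pos / flagged if flagged else 0.0

    @property
    def recall(self):
        errors = self.true_pos + self.false_neg
        return self.true_pos / errors if errors else 0.0


def score_speller(gold, accepts):
    """Score error detection against a gold standard.

    gold: iterable of (word, is_error) pairs (assumed layout).
    accepts: callable returning True if the speller accepts the word
             (a stand-in for the real speller interface).
    """
    score = Score()
    for word, is_error in gold:
        flagged = not accepts(word)
        if flagged and is_error:
            score.true_pos += 1
        elif flagged:
            score.false_pos += 1
        elif is_error:
            score.false_neg += 1
    return score
```

Running the same gold standard through the previous and the new speller and comparing precision and recall would give a quick check that the release candidate has not become worse overall, in addition to the individual bug tests.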

Re-scheduling the release plan:

Next meeting

Friday 21.12 at 10.00