Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages.


Speller meeting

What is needed for a new release:

Installer

Nothing new since the previous meeting.

There’s a separate document containing the test results for different combinations of MS Office and Windows versions.

DONE:

TODO:

POSTPONED:

PLX conversion

Remaining PLX bugs:

Second PRI

| REGRESSIONS - compounds |                                         |      |
| ----------------------- | --------------------------------------- | ---- |
| cmp-tags                | *Bu-ollibusse Bu +CmpN/None             | 397  |
| name+clit               | Máretgo not accepted, only w hyph       | 415  |
| multipart/long cmps     | *humanisttalašearutantihkalaš           | 1544 |
| not accepted            | filbmačeahppi gets sugg *filbma-čeahppi | 1544 |
| alph+clitic             | *sbat                                   | 1544 |
| hyphen suggestion       | *njunuš-ulbmiliin                       | 1544 |
| should not compound     | *maŋá-soahteáigái                       | 1544 |

| REGRESSIONS               |                                                 |      |
| ------------------------- | ----------------------------------------------- | ---- |
| Koskivuori-plánenreaiddut | not accepted                                    | 611  |
| Mihkalmas-beaivi !        | allowed w hyph, ok as long as wo hyph is ok too | 593  |
| word not recognized       | čuohte                                          | 1544 |

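| REGRESSIONS                                 |                                        |     |                                |       |
| ------------------------------------------- | -------------------------------------- | --- | ------------------------------ | ----- |
| doesn't follow cmp-tags OR vowel-shortening | sámediggepresideanta Sámediggeáirrasin | 489 | THIS ONE! THIS ONE! THIS ONE! | FIXED |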

Second Last PRI

Bugs from here on can be left out of the next release if we are short on time.

| Compounds             |              |     |                    |
| --------------------- | ------------ | --- | ------------------ |
| num cmp:s on 0-       | 051-nummarat | 631 |                    |
| non-ex. word accepted | saame        | 658 | FIXED FIXED FIXED! |

Last PRI

| Capitals                |            |     |
| ----------------------- | ---------- | --- |
| doesn’t understand caps | 1700-LOHKU | 647 |

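| Low priority regressions    |                                                       |          |                             |
| --------------------------- | ----------------------------------------------------- | -------- | --------------------------- |
| "imposs" cmps along w num.  | 0-geažideapmigárvu (geažideapmigárvu is impossible)   | 536,1145 | NO SUGGESTIONS - GOOD - BUT: |
| "imposs" cmps               | sákkasteapmifierbmi > aseákkasteapmifierbmi etc       | 536      | not a big deal              |
| single letter suggestions   | đ                                                     | 461      | not a big deal              |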

DONE:

TODO:

Release plan

The December 1 goal has passed without the targets being met. On the plus side, the number of open PLX bugs has been greatly reduced, and Tomi is squashing PLX bugs all the time. It just takes more time than anticipated.

Release status

DONE:

  1. build a new speller with the remaining bug fixes (Tomi)
  2. tag the present state with the string Divvun2.3RC1 (Tomi)
  3. update the adjectives with the noun changes (Tomi)
  4. make new installers with latest SME speller (Sjur)
  5. test newest speller in Word (Thomas)
  6. release a public release candidate tomorrow morning (Sjur)

TODO:

  1. build a new speller with the remaining bug fixes (Tomi)
  2. test against gold standard corpus (Sjur)
  3. release RC2 January 15 (Sjur)
  4. if no new serious bugs are found, release as Divvun 2.3 tomorrow afternoon (all)
    1. update list of known bugs
    2. write a short press release emphasising the linguistic updates for North Sámi, and noting that we still have certain problems with Win7/Office2010

Updated release plan:

Next meeting

Monday 7.1 at 14.00 (without Sjur)