Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Hunspell

Status

áigegárživuohta --≥ áige-gárživuohta/áige gárživuohta

áige-gárživuohta
áige-gárživuohta	áigi+Time+N+SgNomCmp+Hyph+Cmp#gárži+A+Der/vuohta+N+Sg+Nom

albmaládje --≥ albma-ládje

albma
albma	albma+Adv
albma	albma+A+Sg+Nom
albma	albma+A+Attr

ládje
ládje	ládje+Po
ládje	ládje+Adv

albma-ládje
albma-ládje	albma-ládje	+?

ávčču-han

Single-letter entries that should be removed - use a Use/-Spell filter

Æ+Use/-Spell+N+ABBR+RCmpnd
Ø+Use/-Spell+N+ABBR+Gen
Ŧ+Use/-Spell+N+ABBR+Ill

Debug-session on the compound-stem-without-hyphen bug

Using xfst inspect net on spellernonrec we find that:

Inspect: 1
b e a i v:0 0:^ i:v +CmpN/SgN:e +CmpN/SgG:0 +CmpN/PlG:0 +Time:0 +N:> +SgNomCmp:0
@U.NeedsVowRed.ON@ +RCmpnd:0 @U.NeedsVowRed.ON@ @C.NeedsVowRed@ @D.NeedNoun.ON@
--> Level 19 (final)

b e a i v 0 i +CmpN/SgN +CmpN/SgG +CmpN/PlG +Time +N +SgNomCmp +RCmpnd
b e a i 0 ^ v e         0         0         0     >  0         0

beai^ve>

That is, the RCmpnd form has no hyphen! It turns out that word-final hyphens are removed when building spellernonrec. We change the Makefile and the result is:

Inspect: 1
b e a i v:0 0:^ i:v +CmpN/SgN:e +CmpN/SgG:0 +CmpN/PlG:0 +Time:0 +N:> +SgNomCmp:0
@U.NeedsVowRed.ON@ +RCmpnd:- @U.NeedsVowRed.ON@ @C.NeedsVowRed@ @D.NeedNoun.ON@
--> Level 19 (final)

b e a i v 0 i +CmpN/SgN +CmpN/SgG +CmpN/PlG +Time +N +SgNomCmp +RCmpnd
b e a i 0 ^ v e         0         0         0     >  0         -

beai^ve>-

The RCmpnd form now has a hyphen, and the Hunspell conversion should be good to go again:)

TILTAK:

Tekniske problem

Hyphen

Verkar bra i dei fleste tilfella. Det er berre ein stor miss:

Problem:

Allment

Vi treng installeringspakke, fordi det ikkje er opplagt for folk kor filene skal liggja, og fordi det er ei plist-fil som skal endrast.