The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages
Apertium sme-sma 28.8.
Present: Francis, Lene, Trond.
bidix                 fst                              result
---------------------------------------------------------------
skuvla<n> .INTERSECT. skuvla<sem_org><sem_plc><n>    = ()
skuvla<n> .INTERSECT. skuvla<n><sem_org><sem_plc>    = (skuvla<n>)
skuvla<n> .INTERSECT. skuvla<n>                      = (skuvla<n>)
---------------------------------------------------------------
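The three rows above can be reproduced with a small sketch. It assumes, as a simplification, that a bidix entry matches an fst analysis when the entry's tag sequence is a prefix of the analysis' tag sequence; the real intersection is done on compiled transducers. The helper names split_analysis and intersect are hypothetical.

```python
import re

TAG = re.compile(r"<([^<>]+)>")

def split_analysis(analysis):
    """Split 'skuvla<n><sg>' into ('skuvla', ['n', 'sg'])."""
    lemma = analysis.split("<", 1)[0]
    return lemma, TAG.findall(analysis)

def intersect(bidix_entry, fst_analysis):
    """Return the bidix entry if it is a prefix of the fst analysis, else None.

    Assumption: prefix matching stands in for real transducer intersection.
    """
    b_lemma, b_tags = split_analysis(bidix_entry)
    f_lemma, f_tags = split_analysis(fst_analysis)
    if b_lemma == f_lemma and f_tags[:len(b_tags)] == b_tags:
        return bidix_entry
    return None

for fst in ("skuvla<sem_org><sem_plc><n>",
            "skuvla<n><sem_org><sem_plc>",
            "skuvla<n>"):
    print("skuvla<n> .INTERSECT.", fst, "=", intersect("skuvla<n>", fst))
```

Under this assumption, semantic tags placed before <n> block the match, while semantic tags placed after <n> do not.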
But:
Solutions:
So:
We do the following:
Here is the explicit description of what we want:
---------------------------------------------------------------
fst:
<n><sem_org><sem_plc><sg>... after reshuffle
<n><prop><sem_plc><sg>... from the fst
<adv><sem_time>
cg: semtags in, nosemtags out
bidix: no sem-tags
---------------------------------------------------------------
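A minimal sketch of the two string transformations described above: moving semantic tags behind the part-of-speech tag (the reshuffle), and dropping them again before the bidix. It only illustrates the tag orders; it says nothing about where in the pipeline each step is actually implemented. The function names are hypothetical, and the lemma odne in the usage line is an illustrative assumption.

```python
import re

ANY_TAG = re.compile(r"<[^<>]+>")
SEM_TAG = re.compile(r"<sem_[^<>]+>")

def reshuffle(analysis):
    """Move semantic tags that precede the first other tag to just after it."""
    lemma = analysis.split("<", 1)[0]
    tags = ANY_TAG.findall(analysis)
    n_leading = 0
    while n_leading < len(tags) and SEM_TAG.fullmatch(tags[n_leading]):
        n_leading += 1
    leading, rest = tags[:n_leading], tags[n_leading:]
    if not rest:
        # Only semantic tags present: nothing to move them behind.
        return analysis
    return lemma + rest[0] + "".join(leading) + "".join(rest[1:])

def strip_sem_tags(analysis):
    """Remove every <sem_...> tag, giving the shape the bidix expects."""
    return SEM_TAG.sub("", analysis)

print(reshuffle("skuvla<sem_org><sem_plc><n><sg>"))
print(strip_sem_tags("skuvla<n><sem_org><sem_plc><sg>"))
print(strip_sem_tags("odne<adv><sem_time>"))
```

Analyses that already have the semantic tags after the part of speech, such as <n><prop><sem_plc><sg>, pass through reshuffle unchanged.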
Son lea riegádan "jagi 1974" ja bajásšaddan "Romssas".
Dihte "jaepien 1974" reakedi jih "Tromsene" byjjesovvi.
^Son/Son<pron><pers><sg3><nom><@SUBJ→>$ ^lea/leat<v><iv><ind><prs><sg3><@+FAUXV>$ ^riegádan/riegádit<v><iv><prfprc><@-FMAINV>$ ^jagi/jahki<n><sg><gen><@←ADVL>$ ^1974/1974<num><sg><nom><@N←>$ ^ja/ja<cc><@CNP>$ ^bajásšaddan/bajásšaddat<v><iv><prfprc><@-FMAINV>$ ^Romssas/Romsa<n><prop><sg><loc><@←ADVL>$^../..<clb>$
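For inspecting output like the line above, a small sketch that splits Apertium stream format into (surface, lemma, tags) triples. It assumes one reading per lexical unit, as in this disambiguated example; ambiguous units with several readings are not handled. parse_stream is a hypothetical helper.

```python
import re

# ^surface/lemma<tag><tag>...$ with exactly one reading per unit
UNIT = re.compile(r"\^([^/^$]+)/([^<$]+)((?:<[^<>]+>)*)\$")

def parse_stream(text):
    """Return a list of (surface, lemma, [tags]) for each lexical unit."""
    units = []
    for surface, lemma, tags in UNIT.findall(text):
        units.append((surface, lemma, re.findall(r"<([^<>]+)>", tags)))
    return units

sample = ("^Son/Son<pron><pers><sg3><nom><@SUBJ→>$ "
          "^lea/leat<v><iv><ind><prs><sg3><@+FAUXV>$")
for surface, lemma, tags in parse_stream(sample):
    print(surface, lemma, tags)
```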
$ echo "Son lea riegádan jagi 1974 ja bajásšaddan Romssas." | apertium -d . sme-sma
Dïhte jaepien 1974 reakadamme jïh Tromsøesne byjjenamme.
^pron-pers<SN><@SUBJ→><PX>{^Dïhte<pron><pers><sg3><nom>$}$ ^v<SV><@+FMAINV>{^reakadidh<v><iv><prfprc>$}$ ^n-num<SN><@←ADVL><PX>{^jaepie<n><sg><gen>$ ^1974<num><sg><nom>$}$ ^cc<CC><@CNP>{^jïh<cc>$}$ ^prfprc<SV><@-FMAINV>{^byjjenidh<v><iv><prfprc>$}$ ^n<SN><@←ADVL><PX>{^Tromsø<n><prop><sg><ine>$}$^sent<SENT><@X>{^..<clb>$}$
"<Son>"
"son" Pron Pers Sg3 Nom @SUBJ>
"<lea>"
"leat" V IV Ind Prs Sg3 @+FAUXV
"<riegádan>"
"riegádit" V IV PrfPrc @-FMAINV
"<jagi>"
"jahki" N Sg Gen @<ADVL
"<1974>"
"1974" Num Sg Nom @N<
"<ja>"
"ja" CC @CNP <==================== @CVP would signal that a new finite verb is coming
"<bajásšaddan>"
"bajásšaddat" V IV PrfPrc @-FMAINV
"<Romssas>"
"Romsa" N Prop Sg Loc @<ADVL
"<.>"
"." CLB
Before:
$ echo "Nieiddat leat čeahpit. " | apertium -d . sme-sma
Nïejth leah væjkelh.
After:
$ echo "Nieiddat leat čeahpit. " | apertium -d . sme-sma
Nïejth leah væjkele.
^buot/buot<pron><indef><attr><@→N>$ ^gielaid/giella<n><pl><acc><@←OBJ>$
^attr-n<SN><@←OBJ><PX>{^gaajhke<pron><indef><pl><acc>$ ^gïele<n><pl><acc>$}$
gaajhkide gïelide
$ echo "Mun boađán boahtte jagi." | apertium -d . sme-sma
Manne båetije jaepien båatam.
båetije båetedh+V+IV+PrsPrc
båetije båetedh+V+IV+Der/NomAg+N+Sg+Nom
Here is a solution:
<e><p><l>boahtte<s n="a"/></l><r>båetije<s n="a"/></r></p></e>
Lexicon entry form:
<e r="LR"><p><l>boahtte<s n="a"/><s n="attr"/></l><r>båetije<s n="a"/><s n="ND"/><s n="CD"/></r></p></e>
<e r="RL"><p><l>boahtte<s n="a"/><s n="attr"/></l><r>båetije<s n="a"/></r></p></e>
Now:
^pron-pers<SN><@SUBJ→><PX>{^Manne<pron><pers><sg1><nom>$}$ ^v<SV><@+FMAINV>{^båetedh<v><iv><ind><prs><sg1>$}$ ^a-n<SN><@←ADVL><PX>{^båetije<a><sg><gen>$ ^jaepie<n><sg><gen>$}$^sent<SENT><@X>{^..<clb>$}$
$ echo "Mun boađán boahtte jagi." | apertium -d . sme-sma
Manne #båetije jaepien båatam.
So, what is the role of båetijen?
boahtte  jagis  ->  båetijen  jaepesne
attr     loc        gen.attr  ine
båetije båetije+A+Attr <=== remove Attr and give cases
båetijen båetije+A+Gen+Attr
båetije båetije+A+Sg+Nom
båetijen båetije+A+Sg+Acc Use/MT
båetijen båetije+A+Sg+Gen
båetijen båetije+A+Sg+Ine
båetijen båetije+A+Sg+Ela
båetijen båetije+A+Sg+...
sme:
buori buorre+A+Sg+Gen
buori buorre+A+Sg+Acc
buorre
buorre buorre+A+Sg+Nom
boahtte boahtte+A+Attr => båetije/båetijen
Pron+Dem @→N = +Det+Dem (?)
Pron+Dem +Attr @→N = +Det+Dem (?)
We then have four types:
<e r="LR"><p><l>boahtte<s n="a"/><s n="attr"/></l><r>båetije<s n="a"/><s n="ND"/><s n="CD"/></r></p></e>
<e r="RL"><p><l>boahtte<s n="a"/><s n="attr"/></l><r>båetije<s n="a"/></r></p></e>
-
-
<e r="LR"><p><l>dehálaš<s n="a"/></l><r>vihkeles<s n="a"/><s n="indecl"/></r></p></e>
<e r="RL"><p><l>dehálaš<s n="a"/></l><r>vihkeles<s n="a"/></r></p></e>
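To check which entry is used in which direction, a sketch that reads entries like the ones above with the Python standard library. It relies on the Apertium convention that r="LR" restricts an entry to left-to-right translation and r="RL" to right-to-left, and that an entry without r applies both ways. The <section> wrapper and the helper names are assumptions made so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

ENTRIES = """<section>
<e r="LR"><p><l>dehálaš<s n="a"/></l><r>vihkeles<s n="a"/><s n="indecl"/></r></p></e>
<e r="RL"><p><l>dehálaš<s n="a"/></l><r>vihkeles<s n="a"/></r></p></e>
</section>"""

def side_to_string(side):
    """Render an <l> or <r> element as lemma<tag><tag>..."""
    lemma = side.text or ""
    return lemma + "".join("<%s>" % s.get("n") for s in side.findall("s"))

def entries_for(xml_text, direction):
    """Return (left, right) pairs usable in the given direction ('LR' or 'RL')."""
    pairs = []
    for e in ET.fromstring(xml_text).iter("e"):
        if e.get("r") in (None, direction):
            p = e.find("p")
            pairs.append((side_to_string(p.find("l")),
                          side_to_string(p.find("r"))))
    return pairs

print(entries_for(ENTRIES, "LR"))
print(entries_for(ENTRIES, "RL"))
```

The sketch ignores multiword lemmas (with <b/>) and entries that use <i> or <par>.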
He sees the X cat.
In langs/sma:
./configure --with-hfst --enable-apertium --enable-oahpa
The file is:
sma/src/morphology/*/adjectives-oahpa.lexc
boahtit oidnosii -> våajnoes sjïdtedh
The default translation for boahtit is båetedh; the alternative translation is sjïdtedh.
<e slr="1"><p><l>boahtit<s n="v"/><s n="iv"/></l><r>båetedh<s n="v"/><s n="iv"/></r></p></e>
<e slr="2"><p><l>boahtit<s n="v"/><s n="iv"/></l><r>sjïdtedh<s n="v"/><s n="iv"/></r></p></e>
apertium-sme-sma.sme-sma.lrx
<rule>
  <match lemma="boahtit" tags="v.*"><select lemma="båetedh" tags="v.*"/></match>
</rule>
<rule>
  <match lemma="boahtit" tags="v.*"><select lemma="sjïdtedh" tags="v.*"/></match>
  <match lemma="oidnosii"/>
</rule>
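The intended effect of the two rules can be sketched as follows. This is not how the lexical selection module works internally: it assumes that a rule with a matching context word beats the context-free default, and it only looks at the lemma of the immediately following word. RULES and select are hypothetical names.

```python
RULES = [
    # (source lemma, required following lemma or None, selected target lemma)
    ("boahtit", None, "båetedh"),
    ("boahtit", "oidnosii", "sjïdtedh"),
]

def select(lemmas, index, rules=RULES):
    """Pick a translation for lemmas[index]; a rule with context wins.

    Assumption: context-specific rules override the context-free default.
    """
    nxt = lemmas[index + 1] if index + 1 < len(lemmas) else None
    default = None
    for source, context, target in rules:
        if source != lemmas[index]:
            continue
        if context is None:
            if default is None:
                default = target
        elif context == nxt:
            return target
    return default

print(select(["boahtit", "oidnosii"], 0))
print(select(["boahtit", "jahki"], 0))
```

With these rules, boahtit followed by oidnosii is translated as sjïdtedh, and boahtit in any other context falls back to båetedh.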
^boahtit
^Dat/dat
#Artihkele 7:m *geatnegahttá nasjonaalestaatide tjïrrehtidh konkreetide råajvarimmide vaarjeleminie åvteste unnebelåhkoengïeli nimhtie ahte dah våajnoes sjidtieh dovne politihkesne, laakine jïh åtnosne.
^pron-dem