The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages
sme-sma-mt meeting 15.8.2013
Present: Francis, Lene, Trond.
Now also on [http://wiki.apertium.org]
[http://pastebin.com/raw.php?i=kZDbYK1z]
1 ^vïjhte<num><sg><ill><attr>$ #vïjhte\<num\>\<sg\>\<ill\>\<attr\> ^vïjhte/vïjhte<num><sg><acc>/vïjhte<num><sg><nom>$^./.<clb>$
1 ^göökte<num><sg><ill><attr>$ #göökte\<num\>\<sg\>\<ill\>\<attr\> ^göökte/göökte<num><sg><acc>/göökte<num><sg><nom>$^./.<clb>$
1 ^gellie<num><sg><ill><attr>$ #gellie\<num\>\<sg\>\<ill\>\<attr\> ^gellie/gellie<num><sg><nom>/gellie<pron><indef><sg><nom>$^./.<clb>$
1 ^akte<num><sg><ill><attr>$ #akte\<num\>\<sg\>\<ill\>\<attr\> ^akte/akte<num><sg><nom>$^./.<clb>$
1 ^akte<num><sg><gen><der_lágan><a><attr>$ #akte\<num\>\<sg\>\<gen\>\<der_lágan\>\<a\>\<attr\> ^akte/akte<num><sg><nom>$^./.<clb>$
1 ^golme<num><sg><gen><cmp>$ #golme\<num\>\<sg\>\<gen\>\<cmp\> ^golme/golme<num><sg><acc>/golme<num><sg><nom>$^./.<clb>$
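To make rows like the ones above easier to read, here is a minimal Python sketch that splits an Apertium stream-format unit into lemma and tags and checks whether the requested tag string is among the readings in the last column. The function names are hypothetical helpers, not part of Apertium, and the sketch assumes that lemmas contain no escaped "/" or "<" characters.

```python
import re

# One lexical unit in Apertium stream format is delimited by ^ and $.
LU_RE = re.compile(r"\^(.*?)\$")
# A tag is anything between angle brackets.
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_reading(reading):
    """Split 'lemma<tag1><tag2>' into (lemma, [tag1, tag2])."""
    lemma = reading.split("<", 1)[0]
    return lemma, TAG_RE.findall(reading)


def parse_analysed_unit(unit):
    """Split 'surface/reading1/reading2' into the surface form and parsed readings.

    Assumption: no escaped slashes inside the unit.
    """
    surface, *readings = unit.split("/")
    return surface, [parse_reading(r) for r in readings]


# First row of the report above: requested form and analysed lemma.
requested = "^vïjhte<num><sg><ill><attr>$"
analysed = "^vïjhte/vïjhte<num><sg><acc>/vïjhte<num><sg><nom>$^./.<clb>$"

req_lemma, req_tags = parse_reading(LU_RE.findall(requested)[0])
surface, readings = parse_analysed_unit(LU_RE.findall(analysed)[0])
available = [tags for _, tags in readings]

print(req_lemma, req_tags)
print(available)
print("requested tags among the readings:", req_tags in available)
```

For this row the requested tags are not among the listed readings, which is consistent with the generation failure marked by "#".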
<e><p><l>viiddidit<s n="v"/><s n="tv"/><s n="der_passl"/><s n="v"/><s n="iv"/></l><r>væjranidh<s n="v"/><s n="iv"/></r></p></e>
Sámegiela dilli olggobealde hálddašanguovllu gal ii leat buorre, ja evalueren konkludere dainna ahte hálddašanguovlu ferte viiddiduvvot.
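The bidix entry above can be inspected programmatically. The following is a small sketch using Python's standard xml.etree.ElementTree; the helper names are hypothetical, and it assumes simple entries without blanks or other elements inside the lemma.

```python
import xml.etree.ElementTree as ET

# The entry from the notes, split over several string literals for readability.
ENTRY = (
    '<e><p><l>viiddidit<s n="v"/><s n="tv"/><s n="der_passl"/>'
    '<s n="v"/><s n="iv"/></l>'
    '<r>væjranidh<s n="v"/><s n="iv"/></r></p></e>'
)


def side(element):
    """Return (lemma, tags) for an <l> or <r> element."""
    lemma = element.text or ""
    tags = [s.get("n") for s in element.findall("s")]
    return lemma, tags


def parse_entry(xml_text):
    """Return ((left lemma, left tags), (right lemma, right tags))."""
    entry = ET.fromstring(xml_text)
    pair = entry.find("p")
    return side(pair.find("l")), side(pair.find("r"))


left, right = parse_entry(ENTRY)
print(left)
print(right)
```

The left side shows the full tag sequence, including the derivation tag, that the sme analysis has to deliver for this entry to match.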
Variants get the same lemma; the lemma is what gets looked up in the bidix.
Fran: make the analysis use the descriptive FST.
Declaring symbols
We get SourceForge mail, but not svn mail. Fran will have a look.
https://sourceforge.net/auth/subscriptions/
“Apertium: machine translation toolbox svn direct day All artifacts”
echo "Mun oasttán láibbi buvddas." | preprocess | usme | lookup2cg | vislcg3 -g src/sme-dis.rle | vislcg3 -g src/smi-syn.rle
"<Mun>"
	"mun" Pron Pers Sg1 Nom @SUBJ>
"<oasttán>"
	"oastit" V TV Ind Prs Sg1 @+FMAINV
"<láibbi>"
	"láibi" Food N Sg Acc @<OBJ
"<buvddas>"
	"buvda" Org Build N Sg Loc @<ADVL
"<.>"
	"." CLB
"<Mus>"
	"mun" Pron Pers Sg1 Loc @HAB> <====== Gen
"<lea>"
	"leat" V IV Ind Prs Sg3 @+FMAINV
"<láibi>"
	"láibi" Food N Sg Nom @<SPRED
"<.>"
	"." CLB
Results:
echo "Mus lea láibi." | apertium -d . sme-sma
Mannesne lea laejpie.
echo "Mus lea láibi" | apertium -d . sme-sma
Mov (lea) laejpie.
echo "Mus leat guokte láibbi." | apertium -d . sme-sma
Mov (leah) göökte laejpien. <=== laejpieh
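When comparing translations like these with the CG analysis of the source sentence, it can help to pull the syntactic function tags out of the vislcg3 output. The following Python sketch parses CG-style output (a wordform line followed by its reading lines) into a simple structure. It is only an illustration under the assumption that each reading is a quoted lemma followed by space-separated tags; the function name is hypothetical.

```python
import re

# A cohort starts with the wordform written as "<...>".
WORDFORM_RE = re.compile(r'^"<(.*)>"$')
# A reading is a quoted lemma followed by space-separated tags.
READING_RE = re.compile(r'^"([^"]*)"\s*(.*)$')

SAMPLE = '''
"<Mun>"
    "mun" Pron Pers Sg1 Nom @SUBJ>
"<oasttán>"
    "oastit" V TV Ind Prs Sg1 @+FMAINV
"<láibbi>"
    "láibi" Food N Sg Acc @<OBJ
"<buvddas>"
    "buvda" Org Build N Sg Loc @<ADVL
"<.>"
    "." CLB
'''


def parse_cg(text):
    """Return a list of cohorts, each with its wordform and readings."""
    cohorts = []
    for raw in text.splitlines():
        line = raw.strip()  # tolerate indented and unindented readings
        if not line:
            continue
        wordform = WORDFORM_RE.match(line)
        if wordform:
            cohorts.append({"wordform": wordform.group(1), "readings": []})
            continue
        reading = READING_RE.match(line)
        if reading and cohorts:
            tags = reading.group(2).split()
            cohorts[-1]["readings"].append({
                "lemma": reading.group(1),
                # Tags starting with @ are syntactic function tags.
                "functions": [t for t in tags if t.startswith("@")],
                "tags": [t for t in tags if not t.startswith("@")],
            })
    return cohorts


cohorts = parse_cg(SAMPLE)
for cohort in cohorts:
    for reading in cohort["readings"]:
        print(cohort["wordform"], reading["lemma"], reading["functions"])
```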
[http://wiki.apertium.org/wiki/North_S%C3%A1mi_and_South_S%C3%A1mi/Pending_tests]
fran@eki:~/source/apertium/nursery/apertium-sme-sma$ bash pending-tests.sh
Running Pending-tests with mode “sme-sma” with updated tests……
We copied the old gt/sme/src/smi-syn.rle to functions.cg3. It is now here:
langs/sme/src/syntax/functions.cg3
cat generation-report.sme-sma.txt | grep "#"
Filtering on the hash (#) gives:
freq postchunk form generated lemma analysed
-------------------------------------------------------------------------------------------
9 ^jienebe<a><comp><sg><nom>$ #jienebe\<a\>\<comp\>\<sg\>\<nom\> ^jienebe/jienebe<pron><indef><sg><nom>$^./.<clb>$
7 ^rolle<n><sg><acc>$ #rolle\<n\>\<sg\>\<acc\> ^rolle/*rolle$^./.<clb>$
6 ^njoetseldh<a><attr><cmp>+almetje<n><pl><ill>$ #njoetseldh\<a\>\<attr\>\<cmp\>almetjidie ^njoetseldh/njoetsedh<v><iv><der_ldh><n><sg><nom>/njoetseldh<a><sg><nom>$^./.<clb>$
<e><p><l>eambbo<s n="a"/></l><r>jienebe<s n="a"/></r></p></e>
<e><p><l>eanet<s n="a"/></l><r>jienebe<s n="a"/></r></p></e>
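The report rows above can be triaged automatically. The sketch below is a hypothetical helper, not part of the Apertium tools. It reads whitespace-separated rows (count, postchunk form, generated form, analysed lemma) and separates lemmas that the analyser does not know, marked with "*" in the last column, from lemmas that do have analyses. Only the first two rows of the report are used as sample data.

```python
from collections import namedtuple

Row = namedtuple("Row", "freq postchunk generated analysed")

# Raw string so that the backslashes in the report are kept as they are.
REPORT = r"""
9 ^jienebe<a><comp><sg><nom>$ #jienebe\<a\>\<comp\>\<sg\>\<nom\> ^jienebe/jienebe<pron><indef><sg><nom>$^./.<clb>$
7 ^rolle<n><sg><acc>$ #rolle\<n\>\<sg\>\<acc\> ^rolle/*rolle$^./.<clb>$
"""


def parse_rows(text):
    """Parse report lines into Row tuples, skipping header and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 3)
        # Data rows have four fields and start with a frequency count.
        if len(fields) == 4 and fields[0].isdigit():
            rows.append(Row(int(fields[0]), *fields[1:]))
    return rows


def lemma_is_unknown(row):
    """True if the first analysed unit carries the unknown-word mark '*'."""
    first_unit = row.analysed.split("$", 1)[0]
    return "/*" in first_unit


rows = parse_rows(REPORT)
for row in sorted(rows, key=lambda r: r.freq, reverse=True):
    kind = "unknown lemma" if lemma_is_unknown(row) else "lemma has analyses"
    print(row.freq, row.postchunk, kind)
```

Rows of the second kind, such as jienebe, are the ones where the bidix tags and the analyser tags need to be compared by hand.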
Find the line that produces a given output:
cat texts/samediggidiedadus_samegiela_birra_2012.sme.txt | apertium -d . sme-sma-postchunk | cat -n | grep "jïjtje<pron><refl><sg><gen><pxpl1>"
Find the corresponding input:
head -n 17 texts/samediggidiedadus_samegiela_birra_2012.sme.txt | tail -1 | apertium -d . sme-sma-tagger
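The two lookup steps above (number the output lines and search them, then pick the input line with the same number) can be sketched in Python as follows. This assumes that the pipeline produces one output line per input line. The data below are placeholders for illustration, not real corpus text or real pipeline output, and the function names are hypothetical.

```python
def matching_line_numbers(lines, needle):
    """Return 1-based numbers of lines containing needle (like cat -n | grep)."""
    return [n for n, line in enumerate(lines, start=1) if needle in line]


def line_at(lines, number):
    """Return line number `number`, 1-based (like head -n N | tail -1)."""
    return lines[number - 1]


# Placeholder input and output, one line each per sentence.
source_lines = ["input sentence one", "input sentence two", "input sentence three"]
output_lines = [
    "^x<n><sg><nom>$",
    "^jïjtje<pron><refl><sg><gen><pxpl1>$",
    "^y<n><sg><nom>$",
]

needle = "jïjtje<pron><refl><sg><gen><pxpl1>"
hits = matching_line_numbers(output_lines, needle)
inputs = [line_at(source_lines, n) for n in hits]
print(hits, inputs)
```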