Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Møte T&S 22.9.2015

python på stallo

No news are not good news.

make check (sma):

  File "/home/sjur/langtech/main/gtcore/scripts/morph-test.py", line 419, in run_test
    raise LookupError('`%s` had an error:\n%s' % (self.program, self.results['err']))
**main**.LookupError: `['/home/sjur/bin/lookup', '-flags', 'mbTT']` had an error:
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
(Error code: -11)
FAIL

Problemet kokar ned til:

$ echo gïeli | lookup -flags mbTT src/analyser-gt-desc.xfst
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
ERROR in 'LOOKUP' at input line 0 :
output buffer overflow

For trond:

[trond@stallo-2 sma]$ echo gïeli | lookup -flags mbTT src/analyser-gt-desc.xfst
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
gïeli	gïele+N+Pl+Gen
gïeli	gïeledh+V+TV+Ind+Prt+Sg3

Sjur med hfst:

$ echo gïeli | hfst-lookup -q ../hfst/src/analyser-gt-desc.hfst
gïeli	aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#gïele+N+Pl+Gen	0,000000
gïeli	aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#gïele+N+Pl+Gen	0,000000
gïeli	aarph+N+Cmp#aarph+N+Cmp#aarph+N+Cmp#gïele+N+Pl+Gen	0,000000
gïeli	aarph+N+Cmp#aarph+N+Cmp#gïele+N+Pl+Gen	0,000000
gïeli	aarph+N+Cmp#gïele+N+Pl+Gen	0,000000
gïeli	gïele+N+Pl+Gen	0,000000
gïeli	gïeledh+V+TV+Ind+Prt+Sg3	0,000000
gïeli	[...cyclic...]

Jf. src/morphology/stems/nouns.lexc:aarph+N+CmpNP/Pref+CmpN/SgN:aarph R ;

  2186  tsuepie+N+CmpNP/Pref+CmpN/SgN:tsuepie R ;
  2187  goelke+N+CmpNP/Pref+CmpN/SgN:goelke R ;
  2188  aarph+N+CmpNP/Pref+CmpN/SgN:aarph R ;
  2189  svaale+N+CmpNP/Pref+CmpN/SgN:svaale R ;
  2190  geesjele+N+CmpNP/Pref+CmpN/SgN:geesjele R ;

$ echo gïeli | hfst-lookup -q ../hfst/src/analyser-gt-desc.hfstol
^C
$ echo gïeli | hfst-optimized-lookup -w ../hfst/src/analyser-gt-desc.hfstol
        - glibc detected *** hfst-optimized-lookup: malloc(): memory corruption (fast): 0x0000000003ddff20 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3bc1475e66]
/lib64/libc.so.6[0x3bc1479cdf]
/lib64/libc.so.6(__libc_malloc+0x71)[0x3bc147a751]
/usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x3bc4cbd0bd]
hfst-optimized-lookup[0x4097ea]
hfst-optimized-lookup[0x4090ba]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403f85]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403f85]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403f85]
hfst-optimized-lookup[0x409121]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403e08]
hfst-optimized-lookup[0x409050]
hfst-optimized-lookup[0x403f85]
======= Memory map: ========
00400000-00425000 r-xp 00000000 00:13 3016317549                         /home/sjur/bin/hfst-optimized-lookup
00624000-00626000 rw-p 00024000 00:13 3016317549                         /home/sjur/bin/hfst-optimized-lookup
016de000-03e14000 rw-p 00000000 00:00 0                                  [heap]
3bc1000000-3bc1020000 r-xp 00000000 08:01 5076                           /lib64/ld-2.12.so
3bc121f000-3bc1220000 r--p 0001f000 08:01 5076                           /lib64/ld-2.12.so
3bc1220000-3bc1221000 rw-p 00020000 08:01 5076                           /lib64/ld-2.12.so
3bc1221000-3bc1222000 rw-p 00000000 00:00 0
3bc1400000-3bc158a000 r-xp 00000000 08:01 5100                           /lib64/libc-2.12.so
3bc158a000-3bc178a000 ---p 0018a000 08:01 5100                           /lib64/libc-2.12.so
3bc178a000-3bc178e000 r--p 0018a000 08:01 5100                           /lib64/libc-2.12.so
3bc178e000-3bc178f000 rw-p 0018e000 08:01 5100                           /lib64/libc-2.12.so
3bc178f000-3bc1794000 rw-p 00000000 00:00 0
3bc1800000-3bc1883000 r-xp 00000000 08:01 5486                           /lib64/libm-2.12.so
3bc1883000-3bc1a82000 ---p 00083000 08:01 5486                           /lib64/libm-2.12.so
3bc1a82000-3bc1a83000 r--p 00082000 08:01 5486                           /lib64/libm-2.12.so
3bc1a83000-3bc1a84000 rw-p 00083000 08:01 5486                           /lib64/libm-2.12.so
3bc1c00000-3bc1c02000 r-xp 00000000 08:01 5489                           /lib64/libdl-2.12.so
3bc1c02000-3bc1e02000 ---p 00002000 08:01 5489                           /lib64/libdl-2.12.so
3bc1e02000-3bc1e03000 r--p 00002000 08:01 5489                           /lib64/libdl-2.12.so
3bc1e03000-3bc1e04000 rw-p 00003000 08:01 5489                           /lib64/libdl-2.12.so
3bc3800000-3bc3816000 r-xp 00000000 08:01 5487                           /lib64/libgcc_s-4.4.7-20120601.so.1
3bc3816000-3bc3a15000 ---p 00016000 08:01 5487                           /lib64/libgcc_s-4.4.7-20120601.so.1
3bc3a15000-3bc3a16000 rw-p 00015000 08:01 5487                           /lib64/libgcc_s-4.4.7-20120601.so.1
3bc4c00000-3bc4ce8000 r-xp 00000000 08:01 282857                         /usr/lib64/libstdc++.so.6.0.13
3bc4ce8000-3bc4ee8000 ---p 000e8000 08:01 282857                         /usr/lib64/libstdc++.so.6.0.13
3bc4ee8000-3bc4eef000 r--p 000e8000 08:01 282857                         /usr/lib64/libstdc++.so.6.0.13
3bc4eef000-3bc4ef1000 rw-p 000ef000 08:01 282857                         /usr/lib64/libstdc++.so.6.0.13
3bc4ef1000-3bc4f06000 rw-p 00000000 00:00 0
2b99405b1000-2b99405ba000 rw-p 00000000 00:00 0
2b99405c6000-2b99408f7000 rw-p 00000000 00:00 0
2b99410dd000-2b99432da000 rw-p 00000000 00:00 0
7fff81db2000-7fff81e14000 rw-p 00000000 00:00 0                          [stack]
7fff81fc4000-7fff81fc5000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted
$

Hfst for Trond:

configure: error: You requested --with-hfst: hfst is too old. OR: no other fst tools were found (Xerox, Foma).
checking whether hfst is at least 3.8.2 and has the required tools... /home/sjur/bin/hfst-info:
error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

Konklusjon:

  1. Trond køyrer og testar med xerox (funkar for Trond sjølv om det ikkje funkar for Sjur)
  2. Trond testar hfst for aarph-feilen
  3. Deretter ser vi

gt-servarane

Problem den siste tida. Oppsummering for arbeidet vidare:

tal-fst-ane

Bug nr. 1538. Det er tre ulike fst-typar (tal, dato, klokke). Retninga høgre:venstre er spesifisert i filnamna, dvs. vi vil ha tal:tekst i lexc-koden:

Dette er berre delvis gjennomført, og oppsettet for oahpa har dessverre tatt omsyn for den varierande praksisen.

Konklusjon:

Vi må rydde opp i dette:

  1. Vi inviterer relevante personar til dugnad for sine respektive språk
  2. … og ber Heli om å rydde opp i Oahpa-konfigureringa

bugzilla

Aktivitet i det siste, bra. Vi følgjer opp.

stavekontroll og vektar

Bug i OpenFST med negative vektar i regulære uttrykk. Buggen hindrar oss i å laga ferdig stavekontrollar, Sjur ser på om det er mogleg å jobba rundt problemet.

Søknader

ML - enare-samisk

Søknadsfrist neste veke.

Jaska -

Jack Rueter:
	Hi, Sjur
	I just got a message from Ivan Ryabov in Saransk.
Sjur Nørstebø Moshagen:
	ok!
Jack Rueter:
	He would like  TWO pieces of information from each of us.
Sjur Nørstebø Moshagen:
	ok
Jack Rueter:
	(1) information of who each one of us is (cf. Doc1 point 1)
	(2) резюме CV (Continental English)
Sjur Nørstebø Moshagen:
	ok
	when does he need this?
Jack Rueter:
	As soon as possible.
Sjur Nørstebø Moshagen:
	day / time
Jack Rueter:
	Everything has to be turned in before the 25th to be safe
	the final time is some time on the 25th, I’ll have to look inthe pdf-s

Trond og Sjur sender CV, seinast i morgon.

see-modi

Utsett til neste gong.