Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

Meeting setup

Agenda

  1. Opening, agenda review
  2. Reviewing the task list from two weeks ago
  3. Documentation - divvun.no
  4. Corpus gathering
  5. Corpus infrastructure
  6. Infrastructure
  7. Linguistics
  8. Name lexicon infrastructure
  9. Spellers
  10. Other issues
  11. Summary, task lists
  12. Closing

1. Opening, agenda review, participants

Opened at 09:43.

Present: Børre, Thomas, Tomi, Trond

Absent: Maaren (sick leave), Sjur (paternity leave), Saara (not available on iChat)

Main secretary: Børre

Agenda accepted with additions under “Other”.

2. Reviewing the task list from the last meeting

Børre

Maaren

Saara

Sjur

Thomas

Tomi

Trond

3. Documentation

TODO:

4. Corpus gathering

Collecting

See a previous meeting memo for what’s to be done.

TODO: Send out the rest of the letters (Børre)

Odin

Sæth replied by e-mail; he hasn’t had time to follow up, but will try to include us in their plans.

KIO Grafisk and the Iđut books

TODO:

Bible texts

We will get text from Finland, but still haven’t received any. We have received the Swedish text from Sweden. As for the last HTML versions from Norway, Trond did not contact them last week.

The Swedish HTML has arrived, but no paratext. Norsk bibelselskap has not sent corrected New Testament versions for sme, nor paratext for nno/nob.

TODO:

Min Áigi

Everything ok here.

TODO:

Kåfjord

Promised to send us texts, but nothing has arrived yet.

Sámi Instituhtta

Audhild Schanche promised to sign the contract and send us texts they have available.

5. Corpus infrastructure

TODO:

Changes and updates because of the Divvun public tender

User account admin and infra: see [previous memo|/admin/weekly/2006/Meeting_2006-03-06.html].

TODO: see above under Documentation.

Automatic build of the content of our corpus repo: also see [previous memo|/admin/weekly/2006/Meeting_2006-03-06.html].

TODO:

Free and non-free texts

More info in a [previous meeting memo|/admin/weekly/2006/Meeting_2006-03-13.html].

TODO:

Linking parallel files

DECISION: We’ll keep the original filename, and store linking info in the header (has to be added manually).
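As an illustration of the decision above, here is a minimal sketch of how linking info could be stored in a file header while the original filenames stay untouched. It assumes the header is an XML element; the `parallel_text` element, its `location` attribute, and the example filenames are hypothetical names chosen for this sketch, not taken from the memo.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def add_parallel_link(header, lang, filename):
    """Record in `header` that `filename` is the `lang` version of this text.

    The element name `parallel_text` and the attribute `location` are
    assumptions made for this sketch.
    """
    link = ET.SubElement(header, "parallel_text")
    link.set(XML_LANG, lang)
    link.set("location", filename)
    return link


def parallel_links(header):
    """Return {language code: original filename} for all links in `header`."""
    return {
        link.get(XML_LANG): link.get("location")
        for link in header.findall("parallel_text")
    }


# Hypothetical example: one document with two parallel versions.
header = ET.Element("header")
add_parallel_link(header, "nob", "report_2005.doc")
add_parallel_link(header, "sme", "raporta_2005.doc")
links = parallel_links(header)
```

Because the link lives in the header rather than in the filename, a file can be linked to any number of parallel versions without being renamed; the cost, as noted in the decision, is that each link has to be entered by hand.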

TODO:

More texts to the graphical corpus interface:

TODO:

Top two priorities:

  1. Linda and Trond to find text to add.
  2. Lars to add them.

Language recognition

TODO:

There are currently gross errors in the material; it seems the texts must be reconverted.

6. Infrastructure

Aligner

Today, we have two anchor files in addition to the original one.

TODO:

Not much has happened; we must work more on this issue.

Hyphenator

TODO:

Trond and Sjur have had a look at the issue. We postpone it until Sjur is back.

7. Linguistics

General - hyphenation

See discussion, open questions and decision in the [previous meeting memo|/admin/weekly/2006/Meeting_2006-04-03.html].

TODO:

TODO:

Fix the rest of the proper noun file (Trond, Thomas)

North Sámi

Semantic feature system

Further discussion and details in this and this meeting memo.

Lule Sámi

TODO:

8. Name lexicon infrastructure

TODO:

  1. refactor and prepare risten.no for multiple collections:
    1. refactor the code into more and more specific components according to our folder hierarchy (Tomi, Sjur)
      1. things are moving forward
  2. develop the needed XQueries and interface (Sjur, Tomi)
    1. developing
  3. data synchronisation between risten.no and the cvs repo (Tomi)
    1. nothing this week
  4. test and review when ready

9. Spellers

Nothing until the new proper noun lexicon is in place. We don’t have enough people to do both.

10. Other

Bug fixing

47 open Divvun/Disamb bugs, and 25 risten.no bugs

After the corpus issues have been somewhat settled, we should do a bug barnraising, and then a new one after the name lexicon is fixed.

11. Summary, task list

Børre

Maaren

Saara

Sjur

Thomas

Tomi

Trond

12. Next meeting, closing

24.04.2006 09:30

Sjur is on paternity leave.

Closed at 11:47