Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages


Meeting setup

Agenda

  1. Opening, agenda review
  2. Reviewing the task list from a week ago
  3. Documentation - divvun.no
  4. Corpus gathering
  5. Corpus infrastructure
  6. Linguistics
  7. Term db
  8. Other issues
  9. Summary, task lists
  10. Closing

1. Opening, agenda review, participants

Opened at 10.00. Agenda accepted as is.

Present: Børre, Maaren, Sjur, Thomas, Tomi, Trond

Main secretary: Tomi

2. Reviewing the task list from the last meeting

Børre:

Maaren:

Sjur:

Thomas:

Tomi:

Trond:

All:

3. Documentation - divvun.no

Deadline 22.6.

Børre has the main responsibility for getting everything in place by the deadline.

Things required before the opening:

  1. Press release text, conforming to what we actually release (in 3 languages? More?)
  2. divvun.no front pages up and running, with a nice look and interlinking in place, as site-generated pages.
  3. Front page and general info in all languages
  4. General outline of the project, with deliveries and main timetable
  5. giellatekno.uit.no front pages up and running, with a nice look and interlinking in place, as site-generated pages.
  6. The actual content of the documentation, our common child, should be reviewed with the forthcoming public release in mind.

Requirements for the opening:

The separate pages:

Deadlines:

Authoring

Translations

The files will be written in Norwegian Bokmål (nob) and made available via CVS, in xtdocs/sd/…/xdocs/

Files

Børre will make document stubs in the correct places, with the correct filenames. Today! (right after the meeting:-).

4. Corpus gathering

The license contract is being translated. Maaren will check today whether it is finished.

5. Corpus infrastructure

catxml: there are some problems with section extraction/deletion. Tomi will continue working on it.

We need routines for handling formats other than Microsoft Word:

6. Linguistics

We should postpone the substantial issues until after the Divvun release. The bug list should still be followed up.

7. Term db / risten.no

Preparing the release, solving bugs.

8. Other issues

Bugzilla for sme and smj parsers.

When we set up Bugzilla, we did not take into account that this project would cover two languages, sme and smj, with more to come. smj bugs must now be included. The question is whether sme and smj should be top-level entries (“products” in Bugzilla terminology) or bottom-level options in the decision tree (“components” of a product, in Bugzilla terminology).

Below is an overview of the status quo, with the sme/smj relevance indicated.

Suggestion: bottom level (components). Agreed.

The sme/smj issue will be handled as indicated, by splitting products according to language when required, i.e. mainly in the linguistic products.
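
As a rough illustration of the agreed layout, the sketch below models languages as components under a product, splitting by language only where required. The product names and the `components_for` helper are hypothetical; the actual Bugzilla configuration is not recorded in these minutes.

```python
# Hypothetical sketch: languages as Bugzilla components, not as products.
# Product names below are invented for illustration only.
LANGUAGES = ["sme", "smj"]  # more languages may be added later


def components_for(product, per_language):
    """Return the component names for a product.

    Linguistic products get one component per language;
    other products keep a single, language-neutral component.
    """
    if per_language:
        return [f"{product}-{lang}" for lang in LANGUAGES]
    return [product]


products = {
    "Parser": components_for("parser", per_language=True),
    "Website": components_for("website", per_language=False),
}

for name, components in products.items():
    print(name, components)
```

With this layout, adding a new language means extending `LANGUAGES` rather than creating a new top-level product.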

Language(s) of RSS Newsfeed

Forrest can generate RSS news for us, which we will use to broadcast project events. What should the language be? What should the content be, i.e. which types of events should we publish this way?

To be decided by the next meeting at the latest. We (Børre/Sjur) need to find out what is possible in Forrest.

9. Summary, task list

All:

Børre:

Maaren:

Sjur:

Thomas:

Tomi:

Trond:

10. Next meeting, closing

20.06.2005 10.00

Closed at 12.08