Language Technology at UiT The Arctic University of Norway

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

Meeting setup

Agenda

  1. Opening, agenda review
  2. Reviewing the task list from a week ago
  3. Documentation - divvun.no
  4. Corpus gathering
  5. Corpus infrastructure
  6. Speller infrastructure
  7. risten.no
  8. Other issues
    1. CVS Mailing
    2. Helsinki gathering
  9. Summary, task lists
  10. Closing

1. Opening, agenda review, participants

Opened at 12:10 after some technical hurdles (we tried to run the whole meeting online, that is, without a phone). Agenda accepted as is.

Present: Sjur, Tomi, Børre

Absent: Maaren, Thomas, Trond

Main secretary: Børre

2. Reviewing the task list from the last meeting

Sjur

Tomi

Trond

On vacation last week

Børre

Maaren

Thomas

3. Documentation - divvun.no

crontab

Tomi has made a CVS update script, but it is not automated yet: the crontab entry is still missing. Børre will take over this task.
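As a rough illustration of what the missing automation could look like, here is a minimal sketch of a wrapper that cron could call. The function name `run_update`, the script path and the log path are all hypothetical placeholders; the actual update script and its location are not given in this memo.

```shell
#!/bin/sh
# Hypothetical helper: run an update script and append its output to a log.
# Both arguments are placeholders, not the real paths used on the server.
#   $1 = path to the CVS update script
#   $2 = path to the log file
run_update() {
    {
        date            # timestamp each run so failures can be traced
        sh "$1"         # run the update script itself
    } >> "$2" 2>&1
}

# Possible use from a small wrapper script (paths are assumptions):
#   run_update /path/to/cvs-update.sh /path/to/cvs-update.log
#
# A crontab entry that runs such a wrapper once an hour might look like:
#   0 * * * * /path/to/wrapper.sh
```

Logging to a file rather than relying on cron mail keeps the job independent of the mail setup on the host.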

wiki + UTF-8

There is no wiki support on the official site, so no one can read our weekly meeting memos there.

This works on Linux:

export LC_ALL=no_NO.UTF-8
forrest run

It is unclear whether we can do the same on Mac OS X.

It should work irrespective of the locale of the host OS, and the above is really a hack to work around a bug in Forrest/Cocoon or the JSPWiki reader (Chaperon).

status.xml & RSS feed

Sjur has added support for status.xml, which implies RSS news feeds of the changes we document.

For the time being we will not use the todo section, as our weekly memos and Bugzilla provide enough feedback and support for our tasks and for what needs to be done. Eventually it could be used for more long-term goals that do not fit within Bugzilla and the meeting memos.

Regarding the changes section: have a look, and start using it. The RSS news feed is an excellent tool for keeping interested persons and parties up to date with little or no effort on their part. Documentation can be found in the Status.xml how-to.

Forrest

Forrest 0.7 is officially released. The HEAD of the trunk seems to be unstable with respect to creating war files. Be careful!

4. Corpus gathering

Add the Oslo license model to xdocs/adm/legal/. Then we will start to compare, and work out our own license text.

5. Corpus infrastructure

Directory structure

The current structure is not optimal. Tomi will continue looking for good examples of internal directory structure.

catxml

Updated to use the language of the pwd as default when extracting corpus texts.
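The idea of taking a default language from the working directory can be sketched as below. This is not the catxml implementation; it is an illustration that assumes, hypothetically, that the name of the current directory is the language code (for example `sme`). The function name `default_lang` is invented for this sketch.

```shell
#!/bin/sh
# Hypothetical sketch: choose a language code.
# An explicit argument wins; otherwise fall back to the name of the
# current working directory (assumed here to be a language code).
default_lang() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        basename "$PWD"
    fi
}

# Possible use:
#   cd /path/to/corpus/sme && default_lang        # uses the directory name
#   default_lang smj                              # explicit override
```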

6. Speller infrastructure

Common Makefile

We should finalize a common and language-independent Makefile as a prerequisite for the speller infrastructure, such that what we do for building spellers for North Sámi will automatically be available to other languages.

Saara Huhmarniemi is a makefile expert, but she is not back at work yet. We may postpone the Makefile overhaul until she is back.

Aspell

We will soon have the unrestricted Xerox tools, and we’ll then be able to generate the word list for Aspell. What we need then is the infrastructure to go from a word list to an Aspell dictionary. This should be high priority ahead of the Helsinki demo.

First version: a simple word-list-based speller, with the word list generated by the Xerox tools. The Makefile target should be something like “make aspell”, with the final, ready-to-use files in a build directory.
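The step such a target would wrap might look roughly like the sketch below: take a raw word list, clean it, and place the result in a build directory. The function name `prepare_wordlist`, the `.wl` file suffix and the directory names are assumptions made for illustration; producing the word list with the Xerox tools and compiling the final Aspell dictionary are not shown.

```shell
#!/bin/sh
# Hypothetical sketch of the word list preparation step.
#   $1 = raw word list, one word per line
#   $2 = build directory for the ready-to-use files
#   $3 = language code used in the output file name
prepare_wordlist() {
    wl_src=$1
    wl_build=$2
    wl_lang=$3
    mkdir -p "$wl_build"
    # Drop empty lines, then sort and remove duplicate entries.
    # LC_ALL=C gives a byte-wise, locale-independent ordering.
    grep -v '^[[:space:]]*$' "$wl_src" | LC_ALL=C sort -u > "$wl_build/$wl_lang.wl"
}

# Possible use (file names are placeholders):
#   prepare_wordlist words.txt build sme
# The cleaned list could then be compiled with Aspell's "create master"
# command, assuming Aspell language data files exist for the language.
```

Keeping the language code as a parameter fits the goal of a language-independent build, since the same step can be reused unchanged for other languages.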

7. risten.no

Sjur will continue to spend some time on the following issues:

8. Other issues

CVS Mailing

CVS mailing works for the local accounts on cochise, but forwarding is not working for all of us. Tomi just received a mail daemon message reporting “no route to host”, so e-mail was not delivered to trond.trosterud@helsinki.fi, sjur.moshagen@kolumbus.fi and sjurnm@mac.com.

Sjur will follow up on this one.

Helsinki gathering

Dates

Thomas did not want to travel earlier. We extend the meeting with a technical day on Monday, with all of us taking part on Tuesday. Thomas can then travel on Monday, as originally planned.

Dates now also confirmed with Børre. Thus decided:

Demo

We will get the Xerox tools. See above under Speller infrastructure for more details.

9. Summary, task list

Børre

Sjur

Tomi

On vacation

Tasks are carried over to the following weeks until they return from vacation, to keep the tasks alive and updated. New tasks may be added :-)

Maaren

Thomas

Trond

10. Next meeting, closing

25.07.2005 10:00

Closed at 13:45