Meeting setup
- Date: 15.08.2005
- Time: 10.00 Norw. time
- Place: Wherever we are :-)
- Tools: Phone, iChat, SubEthaEdit
Agenda
- Opening, agenda review
- Reviewing the task list from a week ago
- Documentation - divvun.no
- Corpus gathering
- Corpus infrastructure
- Linguistics
- Speller infrastructure
- Other issues
- Summary, task lists
- Closing
1. Opening, agenda review, participants
Opened at 10:20.
Present: Tomi, Sjur, Børre, Maaren, Thomas, Trond
Main secretary: Sjur
2. Reviewing the task list from the last meeting
Børre
- Add crontab specification for the cvs update/export script Tomi made
- investigate Forrest war + Tomcat issue
- Working on it. Thor-Øyvind
- reopen the jspwiki + UTF-8 issue
- Contact Svenska bibelsällskapet
- Add issue to forrest issue tracker about utf-8 ihtml documents.
- Check for existing bug reports, it might already be in there
- Didn´t find any, but didn´t add an issue.
- Continue forrest i18n research
- Sometimes abides the =locale=se/smj/fi in the url (the document gets
correctly selected). The menus and tabs are translated to the language
that the browser asks for.
- Sometimes not. Sjur has added an issue to their issue tracker.
- discuss with Anders Kintel about whether we could get the Sámi-Norwegian dictionary
ahead of acceptance
- wait till Thomas is back from vacation
- add the Oslo contract to cvs, and begin writing a contract draft
- Added it. If we are to use the Helsinki model, we should probably use the
Helsinki contract as our draft.
- What is the best document format, considering that the lawyer is probably using MS Word?
- Follow up on CVS mailing:
- check that all received forwarded mail
- set up Thomas and Maaren when they’re back from vacation
- Now that they are back, I´ll have a look at it.
Sjur
- risten.no bugs and fixes
- complete the action summary after our half-year evaluation
- follow up on:
- voice group-chat not working to Sámediggi
- Maaren has problems with SubEthaEdit (can’t connect)
- To the board:
- write proposal for permanent maintenance organisation
- write draft specification for the outsourced tasks
- write half-yearly project report with progress and bugdet status
- write agenda
- Deadline for the board tasks: 3 weeks ahead of meeting (date not decided, but
in September)
- add Divvun to the UNESCO Hall-of-Fame list
- Other tasks done:
- monthly report
- meeting with Trond to prepare Helsinki Gathering
- some i18n issues with Forrest
Tomi
- Aspell: Create the affix file, work on automatic conversion from LEXC to affix file
- IMO, automatic conversion is not plausible, doing manual affix file
- investigate Aspell/MySpell/OOo issues
- Not really (only OOo (including NeoOffice?) & Mozilla products using MySpell)
- three-part compounding
- decapitalisation of proper nouns when (if?) compounded, and when derived
(the oslolaš issue)
- corpus infrastructure: dtd location (both public and internal)
- corpus infrastructure: file and dir organisation
- Follow up on CVS mailing:
- check that all received forwarded mail
- set up Thomas and Maaren when they’re back from vacation
On vacation last week
Trond
- Participated in the meeting with Sjur in Helsinki
- On vacation, issues from last week repeated at the end of the memo
Maaren
- On vacation, issues from last week repeated at the end of the memo
Thomas
- On vacation, issues from last week repeated at the end of the memo
3. Documentation - divvun.no
We are waiting for the Linux server (will solve our UTF-8 and automatic update issues).
The giellatekno site is converted to the Forrest format, and is in principle ready for
publication. Trond has to read through it and do some QA/editing before being released.
The standing invitation remains: Document what you do, when you do it.
4. Corpus gathering
The Oslo contract is added. We now need to work out our own suggestion for a contract. Børre,
Trond and Sjur will have a meeting Tuesday 16.8. at 10.00 Norwegian Time to work on that draft.
Sjur sends out an e-mail with details before the meeting.
5. Corpus infrastructure
Do documentation.
Naming conventions and directory structure
Principle:
incoming directory tree
- according to collector (having an incoming basket)
- according to..
permanent directory tree according to author institution,
Perhaps according to text genre, and thereafter author institution:
The discussion is moved to the discussion list, eventually to a separate meeting. But we’ll
start (or rather continue) in the newsgroup.
6. Linguistics
North Sámi
- The missing list should be reworked and cleaned.
- Sámi place names in Norway should updated, contact Jonny Andersen at Statens Kartverk.
Lule Sámi
When do we get the lexicon from Anders Kintel?
7. Speller infrastructure
aSpell
Write documentation here as well.
Is working, see above. Needs further optimisation before it is useful, cf the size
discussion. Size target: At least below 10 Mb, wish: 6 Mb or less. It started at 580Mb,
and now it is around 380 MB. Declining, that is.
Tomi is working on the affix file, and by that reducing the size of the speller file.
The affix file will so-to-speak recreate the inflection of our LEXC lexicon in the Aspell
format (similar to MySpell?), in a format that could be called one-level (there is no concept
of two levels with regular mappings between them in Aspell, or any of the *spell engines, for that
matter).
Useful links:
MS Office spellers
To be specified in the outsourcing tech-spec. What
about direct integration with MS Office? See
this page from MS.
Discussion left for the future, and when we have the offers to compare with.
OpenOffice.org
Questions:
1) Is it possible to use Aspell with OOo?
No. From the site:
Background
Lingucomponent was started by Kevin Hendricks to integrate various open source spell
checkers into the OpenOffice.org build. With a little prodding from Kevin Atkinson,
the author of Pspell, a new spell checker called MySpell was written in C++ that supported
affix compression, based on Ispell. This code has been given to Pspell and will eventually
make it into a future Pspell release after it has been integrated.
Later note: Pspell is dead and Aspell is its replacement. I think Aspell is found at
aspell.sourceforge.net. There is no OOo interface to Aspell (no way to use Aspell with OOo).
kh.
MySpell, which builds on almost any system, is used to support spell checking in
OpenOffice.org.
Links to possibly useful stuff:
2) Are the Aspell source files compatible with MySpell?
To be investigated. Goal: to simplify
maintenance as much as possible, ie to have as few source and target formats as possible.
The Debian Finnish aspell page
seems to point to source files that are common to both iSpell,
Aspell and Myspell. Follow the link near the bottom of the page named
developer information for aspell-fi.
[http://hunspell.sourceforge.net/] is supposed to become the new
OpenOffice.org spellchecker
8. Other
Helsinki gathering
Program draft available at [/admin/physical_meetings/helsinki-2005-08.html]
To bring from Kautokeino:
- headphones
- battery pack for Sjur
- registration money for FSMNLP
Are the hotel rooms reserved? Probably not. Maaren will reserve rooms
([Arthur|http://www.hotelarthur.fi/english/] or Helka).
Flights should be booked
individually, with the bill sent to Sámediggi.
Børre in Kautokeino
First week there about a week after the Helsinki gathering. Then every third week by default.
Technical issues
- The mac os / perl bug (at least Trond has it):
- utf8 “\xC4” does not map to Unicode at /Users/trond/gt/script/preprocess line 82.
This msg did not show up in 10.3 (perl 5.8.1), but does so in 10.4 (perl 5.8.6).
It is probably a perl - OS mismatch.
- 25 open bugs - too much! Have a look at what you can fix.
9. Summary, task list
Børre
- Add crontab specification for the cvs update/export script Tomi made
- Adjust it so it will work with the new server
- reopen the jspwiki + UTF-8 issue
- Add issue to forrest issue tracker about utf-8 ihtml documents.
- Contact Svenska bibelsällskapet
- discuss with Anders Kintel about whether we could get the Sámi-Norwegian dictionary
ahead of acceptance
- Begin writing a contract draft, together with Trond and Sjur. Coordinate with Kimmo Koskenniemi
- Follow up on CVS mailing:
- check that Thomas and Maaren receive forwarded mail from cochice
- set up Thomas and Maaren
- Investigate compatibility between MySpell and Aspell source files
- Book Helsinki flight via Via :-)
- Help Trond with Saara’s computer
- meeting about corpus contract
- Finish giellatekno forrest transition.
Maaren
- The missing list, both the overall missing list from our xml corpus, and a file-for-file
review, in order to get different terminology
- check whether work is checked in
- Go through the missing list from risten.no
- book rooms in Helsinki
- remember headphones, money and batteries for Helsinki
Sjur
- risten.no bugs and fixes
- complete the action summary after our half-year evaluation
- follow up on:
- voice group-chat not working to Sámediggi
- Maaren has problems with SubEthaEdit (can’t connect)
- To the board:
- write proposal for permanent maintenance organisation
- write draft specification for the outsourced tasks
- write half-yearly project report with progress and bugdet status
- write agenda
- Deadline for the board tasks: 3 weeks ahead of meeting (date not decided, but
in September)
- add Divvun to the UNESCO Hall-of-Fame list
- Plan the Helsinki meeting
- meeting about corpus contract
- project planning with Trond
Thomas
- work on Lule Sami verbs
- contact Jonny Andersen about place names
Tomi
- Aspell: Create the affix file, work on automatic conversion from LEXC to affix file
- investigate Aspell/MySpell/OOo issues
- three-part compounding
- decapitalisation of proper nouns when (if?) compounded, and when derived
(the oslolaš issue)
- corpus infrastructure: dtd location (both public and internal)
- corpus infrastructure: file and dir organisation
- Follow up on CVS mailing:
- check that all received forwarded mail
- set up Thomas and Maaren when they’re back from vacation
Trond
- Work on the bug list (Lule Sámi).
- Work on compounds (three-part + oslolaš, with Tomi)
- Work on the corpus interface (with Lars)
- Corpus infrastructure: dtd location
- Get the new version of the New Testament
- Check Hans-Ragnars names.
- Look at the Sámi names issue
- Plan the Helsinki meeting
- New coworker
- meeting about corpus contract
- check the new giellatekno site
- project planning with Sjur
10. Next meeting, closing
22.08.2005 10:00
Closed at 12:55