The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages
This document gives an overview of the topics and training given in Nuuk in June 2017, with links to the relevant parts of the existing documentation. The page is also meant to serve as an overview document in other settings.
Basic organisation: presentation of a topic, followed by exercises. Roughly one topic plus exercises before lunch and one topic after.
Topics (with a rough schedule that leaves time at the end, to be adjusted as needed):
Optional:
This varies from product to product. In most cases (spellers, keyboards and others), all these dependencies are precompiled by programmers and included automatically as library resources, so that linguists do not need to have everything installed and correctly configured to build the end-user products they want.
The exact details for each product are listed separately on individual pages.
See a separate document.
Tips for in-source documentation:
make
in the other
To run forrest:
forrest run -Dforrest.jvmargs="-Dfile.encoding=utf-8 -Djava.awt.headless=true"
To debug, edit the generated jspwiki file until the error is found, then make the correction in the lexc source.
Guidelines for clean code:
See this document
Short intro to the tools in the devtools/ directory:
check_analysis_regressions.sh
generate-*-wordforms.sh
test_ospell-office_suggestions.sh
When the YAML test files cover the relevant parts of the lexc code, one can rewrite that code and at the same time ensure that the analysis does not change. One of the devtools/ tools can also help in that process.
The workflow also runs the other way: when you know which area of the grammar you want to change, you write YAML tests that specify the intended output, then work on the grammar until the tests pass.
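The idea behind both workflows can be sketched in a few lines of Python. This is only an illustration of the principle, not the actual GiellaLT test runner or YAML test format: the name `check_analyses`, the two toy analysers, and all word forms and tags are hypothetical placeholders. Expected analyses are stored per word form, and any analyser is checked against them.

```python
# Hypothetical illustration: none of these names come from the GiellaLT tools.
from typing import Callable, Dict, Iterable, List, Set

# Expected analyses per word form, standing in for what a YAML test file
# would specify. Word forms and tags are placeholders, not real test data.
EXPECTED: Dict[str, Set[str]] = {
    "wordform1": {"lemma1+N+Sg+Nom"},
    "wordform2": {"lemma1+N+Pl+Nom", "lemma2+V+Inf"},
}


def check_analyses(
    analyse: Callable[[str], Iterable[str]],
    expected: Dict[str, Set[str]],
) -> List[str]:
    """Return one message per word form whose analyses differ from the expected set."""
    failures = []
    for wordform, wanted in sorted(expected.items()):
        got = set(analyse(wordform))
        if got != wanted:
            missing = sorted(wanted - got)
            extra = sorted(got - wanted)
            failures.append(f"{wordform}: missing={missing} extra={extra}")
    return failures


def analyser_before(wordform: str) -> List[str]:
    """Toy stand-in for the grammar before the change: one analysis is missing."""
    table = {
        "wordform1": ["lemma1+N+Sg+Nom"],
        "wordform2": ["lemma1+N+Pl+Nom"],
    }
    return table.get(wordform, [])


def analyser_after(wordform: str) -> List[str]:
    """Toy stand-in for the grammar after the change: all expected analyses present."""
    table = {
        "wordform1": ["lemma1+N+Sg+Nom"],
        "wordform2": ["lemma1+N+Pl+Nom", "lemma2+V+Inf"],
    }
    return table.get(wordform, [])


if __name__ == "__main__":
    for name, analyser in [("before", analyser_before), ("after", analyser_after)]:
        problems = check_analyses(analyser, EXPECTED)
        print(name, "OK" if not problems else problems)
```

When refactoring, the expected table stays fixed and an empty failure list shows that the analysis has not changed. When extending the grammar, the table is written first and the failure list shrinks as the work progresses.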