The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages
fao, which is reasonably big but still not too complex. Create the basic dir tree, and use svn copy to copy over the fao sources, so that the old fao dir remains intact and usable all the time (only when everything is working ok, the old dir will be removed).
fao
fao
has)
kal (which is using xfst instead of twolc and thus provides a slightly new use case). Make sure all build targets are working as they should, and extend the build system, template files, etc. as needed. kal probably has more requirements than fao.
A working new infrastructure using autotools is now in place. Tommi has experimented a bit with CMake, but nothing has been committed.
Faroese is working, but no other language is. This is too little to really test it all, so the next important goal is to add at least one other language. Tommi’s plan was to add Greenlandic.
There are many tools out there for solving these tasks. Our requirements are that the tool should be open source, available for as many platforms as possible (preferably including Windows), and have a large user base (which makes it easier to get support and find solutions). Our two candidates at the moment:
We probably do not have a full overview of all our dependencies, but here are at least some:
Not all tools are required for all parts of our source base. We probably need to split up some parts into separate subprojects, with their own requirements.
It is essential that we can support the following:
This can be supported through a combination of central and local makefiles. The idea is that local makefiles include central makefiles. The central makefiles carry the burden of the build machinery, whereas the local ones can be enhanced with additional machinery for language-specific processing and experiments.
The central makefiles should be part of the core module, and thus always available.
Implemented? It doesn’t look like it, based on the Faroese build files.
Another crucial feature of the new infrastructure is that it should be easy to add new languages, and easy to add new functionality to existing languages. This can be done through a templating system combined with the central build system.
There should exist, within the core, a fully functional template for all languages, functional in the sense that all files compile and constitute a working environment for linguistic development. The linguistic content is of course only placeholder text, to be replaced with real content for any new language.
The template structure should also be used to propagate new shared resources to all languages. The idea here is that when a new feature is developed enough for a single language, the files making up the new feature should be copied to the template tree (with boilerplate content), and the corresponding build instructions should be moved to the central makefiles. Then, the next time another language is built, the build system should notice the new files (that is, that the template files are missing from the language dir tree), copy the template files into the dir tree for that language, and build them. From there on the “only” thing needed is to populate the new files with real content for the language in question.
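The propagation step described above can be illustrated with a minimal sketch. It is written in Python purely for illustration; the real build system is autotools-based and would do this from the makefiles. The function name sync_templates and the parameters template_root and lang_root are hypothetical and do not exist in the source tree. The sketch assumes that a template file is copied only when no file with the same relative path exists in the language dir tree, so that real linguistic content is never overwritten.

```python
import shutil
from pathlib import Path


def sync_templates(template_root: Path, lang_root: Path) -> list:
    """Copy template files that are missing from a language dir tree.

    Hypothetical helper: files that already exist in the language tree
    are left untouched. Returns the relative paths that were copied.
    """
    copied = []
    for src in sorted(template_root.rglob("*")):
        if not src.is_file():
            continue  # directories are created on demand below
        rel = src.relative_to(template_root)
        dest = lang_root / rel
        if dest.exists():
            continue  # the language already has its own version
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(rel)
    return copied
```

Running such a step before every build would make it idempotent: a second run copies nothing, and the returned list tells the linguist which new placeholder files still need real content.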
Populating files with real content is of course the main job, but with a template structure as described above we ensure two things: