The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages.
This page is a part of the overall Getting started page. It describes what you need to install on the Mac to be ready to develop language tools for your language.
Note that this documentation is relevant when you want to participate in building and developing the grammatical tools yourself. If you only want to use the ready-made grammatical analysers, see the Linguistic analysis page.
macOS Catalina comes with Python 3.7 and Perl 5.18 built in. By default, that should be good enough, and the commands below work with the pre-installed versions of both. If you require newer versions of Perl or Python, you probably know what you need to do.
Then you need a number of tools for the build chain. On the Mac, you can get them with MacPorts by running the following commands:
sudo port install autoconf automake pkgconfig libtool python39 py39-pip py39-yaml wget \
bison cmake gawk saxon antiword wv libxslt poppler tidy subversion
sudo port select --set pip3 pip39
sudo port select --set python3 python39
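After the installation, it can be useful to confirm that the build tools are actually on your PATH. The following is a minimal sketch and not part of the Divvun/Giellatekno infrastructure. The function name check_tools is made up for this example, and the command names passed to it are assumptions about what the ports above install (for instance, the pkgconfig port is assumed to provide a pkg-config command).

```shell
#!/bin/sh
# check_tools (hypothetical helper): report which of the given commands
# are missing from PATH. Prints one line per missing command and
# returns 0 if all were found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call. The command names are assumptions; adjust them as needed.
check_tools autoconf automake pkg-config python3 pip3 wget bison cmake gawk \
    || echo "Some build tools were not found; check the port install output."
```

If nothing is printed, every listed command was found. A "missing" line usually means that the port was not installed or that the MacPorts bin directory is not on your PATH.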
You also need to ensure that the following is set in .profile or similar:
export LC_ALL=no_NO.UTF-8
export LANG=no_NO.UTF-8
export LOCALE=no_NO.UTF-8
Adapt the actual locale to whatever you want, but the variables MUST be set, and the locale MUST be UTF-8.
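To check that the requirement above is met in a new terminal session, something like the following sketch could be used. It is only an illustration: the function names is_utf8_locale and check_locale_vars are invented for this example, and the check assumes that a UTF-8 locale name ends in ".UTF-8" or ".utf8".

```shell
#!/bin/sh
# is_utf8_locale (hypothetical helper): succeed if the given locale name
# ends in .UTF-8 or .utf8 (case-insensitive), e.g. no_NO.UTF-8.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

# check_locale_vars (hypothetical helper): verify that LC_ALL, LANG and
# LOCALE are all set, and that each one names a UTF-8 locale.
# Prints one line per problem; returns 0 if there are none, 1 otherwise.
check_locale_vars() {
    status=0
    for name in LC_ALL LANG LOCALE; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif ! is_utf8_locale "$value"; then
            echo "$name=$value is not a UTF-8 locale"
            status=1
        fi
    done
    return "$status"
}
```

Running check_locale_vars after opening a new terminal prints nothing when all three variables are set to a UTF-8 locale. Any line it prints names the variable that still needs fixing in .profile.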
You need tools to convert your linguistic source code (lexicons, morphology, phonology, syntax, etc.) into useful tools like analysers, generators, hyphenators and spellers. Install the following linguistic programming tools:
Now go back to the Getting Started page for the next step towards building, using and developing the linguistic analysers.
There is also a page giving an overview of linguistic downloads, which explains how to download and compile the analysers. TODO (write these two together).
Developing special tools in addition to the core linguistic analysers can require additional software. Here is some software you might need, depending on what you want to do.
If you want to work with proofing tools, see the Proofing tools page for what to install.
If you want to have documentation pages locally on your own machine, you need Forrest:
sudo port install \
TeXShop3 \
texlive-basic \
texlive-bin-extra \
texlive-latex-extra
Some of the tools above require or use Java, notably Saxon and Forrest. Saxon is used to convert XML-based source files into lexc files, and Forrest is used to validate documentation extracted from the source files.
None of these functions are strictly required for developing language tools. The lexc files converted from XML are stored in svn, and if Saxon is not available, the lexc files will be used as is. If Forrest is not available, the step for building documentation out of source code comments will simply be skipped.
That is, Java is not required for development using the Divvun/Giellatekno infrastructure, unless you specifically work with XML-based lexicons.
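The fallback behaviour described above can be pictured with a small sketch. This is not how the Divvun/Giellatekno build system is actually implemented. The function optional_step is invented for illustration, and the command names saxon and forrest are assumptions about how the tools appear on your PATH.

```shell
#!/bin/sh
# optional_step (hypothetical helper): report whether an optional build
# step would run, depending on whether its tool is available.
# $1 = command name to look for (assumed), $2 = description of the step.
optional_step() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "run: $2"
    else
        echo "skip: $2"
    fi
}

# Command names below are assumptions, not confirmed by this page.
optional_step saxon "convert XML lexicons to lexc"
optional_step forrest "build documentation from source comments"
```

A "skip" line for either tool is not an error. It only means that the corresponding optional step would be left out, as described above.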