Language Technology at UiT

The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages

View GiellaLT on GitHub divvungiellatekno/giellalt.uit.no

Getting Started On The Mac

This page is a part of the overall Getting started page. It describes what you need to install on the Mac to be ready to develop language tools for your language.

Note that this documentation is relevant when you want to participate in ‘building and developing the grammatical tools’ yourself. If you only want to use the ready-made grammatical analysers, see the Linguistic analysis page.

System setup

Then you need a number of tools for the build chain. On the Mac, you can get them by running the following commands:

Catalina and newer (macOS 10.15+)

Catalina comes with Python 3.7 built-in, and also Perl 5.18. By default, that should be good enough, and the commands below work with the pre-installed versions of both. If you require newer versions of Perl or Python, you probably know what you need to do.

Minimal install

sudo port install autoconf automake pkgconfig libtool python39 py39-pip py39-yaml wget \
bison cmake gawk saxon antiword wv libxslt poppler tidy subversion

sudo port select --set pip3 pip39
sudo port select --set python3 python39

You also need to ensure that the following is set in .profile or similar:

export LC_ALL=no_NO.UTF-8
export LANG=no_NO.UTF-8
export LOCALE=no_NO.UTF-8

Adapt the actual locale to whatever you want, but the variables MUST be set, and the locale MUST be UTF-8.

Linguistic software

You need tools to convert your linguistic source code (lexicons, morphology, phonology, syntax, etc.) into usefull tools like analysers, generators, hyphenators and spellers. Install the following linguistic programming tools:

Now go back to to Getting Started page for the next step towards building, using and developing the linguistic analysers.

There is also a page giving the overview for linguistic download in order to download and compile the analysers. TODO (write these two together).

Additional software

Developing special tools in addition to the core linguistic analysers can require additional software. Here’s some additional software you might need depending on what you need to do.

Software for proofing development

If you want to work with proofing tools, see Proofing tools to install here

Documentation web server locally

If you want to have documentation pages locally on your own machine, you need Forrest:

Article authoring using LaTeX

sudo port install \ TeXShop3 \ texlive-basic \ texlive-bin-extra \ texlive-latex-extra

Note for Java avoiders

Some of the tools above require or use Java, notably Saxon and Forrest. Saxon is used to convert XML-based source files into Lexc files, and Forrest is used to validate documentation extracted from the source files.

None of these functions are strictly required for developing language tools. The lexc files converted from XML are stored in svn, and if Saxon is not available, the lexc files will be used as is. And if Forrest is not available, the step for building documentation out of source code comments will just be skipped.

That is, Java is not required to do development using the Divvun/Giellatekno infrastructure, unless you specifically work with xml-based lexicons.