The Divvun and Giellatekno teams build language technology aimed at minority and indigenous languages
As we continue to move to GitHub, we also need to update our documentation infrastructure. The basic ideas are as follows:
gawk -f $GIELLA_CORE/scripts/jspwiki2md.awk WhatIsThis.jspwiki > WhatIsThis.md
Must be done in two steps:
The commands are:
saxonXSL -s:docu-smj-lex.xml -xsl:$GIELLA_CORE/devtools/forrest_xml2plain_xml.xsl > test.html
pandoc -f html -t gfm test.html -o test.md
Information on pandoc is found at the bottom.
To process many files at a time, wrap the above commands in a for loop or similar:
for i in *.xml; do echo "$i"; saxonXSL -s:"$i" -xsl:"$GIELLA_CORE/devtools/forrest_xml2plain_xml.xsl" -o:"$i.html"; done
find . -name "*.ht*" | while IFS= read -r i; do pandoc -f html -t gfm "$i" -o "${i%.*}.md"; done
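Note that the first loop names its output `name.xml.html`, so `${i%.*}.md` in the second loop produces `name.xml.md`. If plain `name.md` is wanted instead, a small helper can strip both suffixes. The following is a minimal sketch; `md_name` is a hypothetical name, not part of the GiellaLT tooling.

```shell
# Hypothetical helper (not part of GiellaLT): map an intermediate HTML path
# to the Markdown path it should be written to.
md_name() {
    _base="${1%.*}"        # drop the final suffix (.html or .htm)
    _base="${_base%.xml}"  # drop a leftover .xml suffix, if any
    printf '%s.md\n' "$_base"
}

md_name ./docu-smj-lex.xml.html
md_name ./docs/test.html
```

In the conversion loop it could be used as `pandoc -f html -t gfm "$i" -o "$(md_name "$i")"`.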
And finally, to retain document history, it is best to do content changes and document renaming as two distinct operations, because git does not record renames explicitly and only detects them by content similarity. That is, do as follows:
This way git will be able to track the file history across the file renames.
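The effect can be tried out in a throwaway repository. The sketch below assumes one possible ordering (content change first under the old file name, then a pure rename); the file names, commit messages and the temporary repository are illustrative only.

```shell
# Throwaway repository so the sketch does not touch real documentation.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.org"
git config commit.gpgsign false

# Starting point: an old-format page already under version control.
printf '<h1>Test</h1>\n<p>Some text.</p>\n' > test.html
git add test.html
git commit -q -m "Add original page"

# Operation 1: change the content only, keeping the old file name.
printf '# Test\n\nSome text.\n' > test.html
git commit -q -am "Convert content to Markdown"

# Operation 2: rename only, with no content change.
git mv test.html test.md
git commit -q -m "Rename test.html to test.md"

# List the commits git can reach through the new file name.
git log --follow --oneline -- test.md
```

Because the rename commit leaves the content untouched, git sees two identical files and `git log --follow` can walk back through the earlier commits made under the old name.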
When all documents are converted, check and update the links. Documentation-internal links should point directly to the Markdown files (link to test.md, not to test.html), while external links should be complete URLs.
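Stale internal links can be located with `grep`. This is a rough sketch, assuming inline Markdown links of the form `[text](target)`; `find_html_links` is a hypothetical helper, and any target containing a colon (such as `https://...`) is treated as external and skipped.

```shell
# Hypothetical helper: list internal Markdown links that still end in
# .html or .htm. Targets containing a colon are treated as external.
find_html_links() {
    grep -rnE --include='*.md' '\]\([^):]*\.html?[)#]' "$1"
}

# Small sample tree to try the helper on.
docs=$(mktemp -d)
cat > "$docs/index.md" <<'EOF'
[stale internal link](test.html)
[converted internal link](test.md)
[external link](https://pandoc.org/installing.html)
EOF

find_html_links "$docs"
```

Only the first sample link is reported. Reference-style links and raw HTML anchors are not covered by this pattern and would need a separate check.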
Beware of html files that should NOT be converted, e.g. speller test result pages. Such pages are rendered as is, from the information given in the html source, including CSS and JS. If the automatic processing above has turned such pages into Markdown, the change must be reverted before committing; GitHub Pages can’t handle custom JS at all AFAIK.
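One heuristic for spotting such pages before a batch conversion is to look for `<script` tags. This is only a sketch under that assumption: `list_scripted_html` is a hypothetical helper, and pages that depend on custom CSS alone would not be caught.

```shell
# Hypothetical helper: list HTML files that contain a <script> tag.
list_scripted_html() {
    grep -rli --include='*.html' --include='*.htm' '<script' "$1"
}

# Small sample tree to try the helper on.
site=$(mktemp -d)
printf '<p>Plain documentation page.</p>\n' > "$site/plain.html"
printf '<script src="sort.js"></script>\n<table></table>\n' > "$site/results.html"

list_scripted_html "$site"
```

Files reported this way are candidates to leave out of the pandoc loop; each one still needs a manual look.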
Install pandoc using MacPorts, Homebrew, or by downloading a package:
sudo port install pandoc
brew install pandoc
More information is available on the pandoc home page.
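Before starting a conversion run, it can help to confirm that the tools used on this page are on the `PATH`. A minimal sketch, where `have_tool` is a hypothetical helper and the tool list simply mirrors the commands shown above:

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in pandoc gawk saxonXSL; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```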