PyTeX – Python plus TeX
updated 20 Mar 2005
At EuroTeX 2005 we presented QaTeX.
(La)TeX macro programming is hard. Python is a powerful and easy to use scripting language. QaTeX allows Python modules to be used instead of (La)TeX style files. With QaTeX (pronounced `kwa-tech') TeX asks Questions and Python provides Answers.
QaTeX on Sourceforge | EuroTeX 2003 paper - 'TeX forever!' (Added 20 March 2005)
We've given TeX a Python callable function interface, with the TeX daemon running behind the scenes. This is a proof of concept, but might already be useful. Download demo01.py on Sourceforge and follow the instructions there. (Added 17 Mar 2003.)
PyTeX is now on Sourceforge http://sourceforge.net/projects/pytex. (Added 10 Mar 2003).
On Sourceforge you'll find PyTeX source code, developer info, and mailing lists. (Added 10 Mar 2003).
Follow Documents for documents relating to PyTeX. (Added 3 Mar 2003).
PyTeX is Python programming plus TeX typesetting.
PyTeX is an open source project.
With PyTeX, Python programmers can write
    from tex import tex, plain
    document = 'My beautiful \\TeX\\ document.\n'
    (dvi, log) = tex(plain, document)
to use TeX from within Python.
We are now working on setting up Python as a scripting language, for generating input strings for TeX. This is an alternative to writing TeX macros.
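As a rough sketch of what generating TeX input from Python might look like, the functions below build a plain TeX source string from a list of paragraphs. The names `tex_escape` and `make_plain_document` are hypothetical and are not part of PyTeX, and the escaping table covers only a few of plain TeX's special characters.

```python
# Hypothetical helpers, not part of PyTeX: build plain TeX input from Python data.

# A few of plain TeX's special characters and a safe replacement for each.
# This table is deliberately incomplete (for example, ~ and ^ are not handled).
_SPECIALS = {
    '\\': r'$\backslash$',
    '%': r'\%',
    '&': r'\&',
    '#': r'\#',
    '$': r'\$',
    '_': r'\_',
    '{': r'$\{$',
    '}': r'$\}$',
}

def tex_escape(text):
    """Escape characters that plain TeX would otherwise treat specially."""
    # Work character by character so that replacements are never re-escaped.
    return ''.join(_SPECIALS.get(ch, ch) for ch in text)

def make_plain_document(paragraphs):
    """Return a complete plain TeX source string, one paragraph per item."""
    body = '\n\n'.join(tex_escape(p) for p in paragraphs)
    return body + '\n\\bye\n'

source = make_plain_document(['Profit rose 5% & costs fell.', 'Item #2 is next.'])
print(source)
```

The point of the sketch is that ordinary Python string handling does the work a TeX macro would otherwise have to do.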
Think of Tcl/Tk. Tcl is a scripting language and Tk is a toolkit for building GUI programs. Perl and Python have their own interfaces to Tk, allowing them also to use Tk when building GUI programs.
Now think of LaTeX as La/TeX. ‘La’ is a front end to Don Knuth's typesetting program TeX. ‘La’ is written in TeX's macro language.
PyTeX, or Py/TeX if you prefer, is to be a front end to TeX, written in Python. Classes, objects, methods and exceptions can be used instead of TeX macros.
A word about TeX macros. In 1996 Don Knuth was asked [1] if he had thought of giving TeX a better programming language, something that was easier to use. Don replied
It would be nice if there was a well-understood standard for interpretive programming languages inside of an arbitrary application. [...] Now, if there were a universal simple interpretive language that was common to other systems, naturally I would have latched onto that right away.
Things have changed since the 1980s, when TeX was written. For a start, there's Python. Python is a universal simple interpretive language. It even runs on PDAs. PyTeX will make TeX's timeless typesetting algorithms available to Python programmers.
PyTeX will convert suitable Python objects into TeX typesetting commands, which it will then pass to TeX. PyTeX will return to Python TeX's typeset output, in the form of dvi. In other words, PyTeX will make TeX available as a callable function.
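To make the idea of TeX as a callable function concrete, here is a minimal sketch that is not the PyTeX API. It assumes a `tex` executable on the PATH that accepts the Web2C option `-interaction=nonstopmode`. The name `run_tex` and its `command` parameter are our own inventions for illustration, and the function starts a fresh TeX process on every call.

```python
import subprocess
import tempfile
from pathlib import Path

def run_tex(source, command=('tex', '-interaction=nonstopmode')):
    """Typeset a plain TeX source string; return (dvi_bytes, log_text).

    Hypothetical helper, not the PyTeX API. It assumes a `tex` executable
    on the PATH and starts a new TeX process for every call.
    """
    with tempfile.TemporaryDirectory() as workdir:
        job = Path(workdir)
        (job / 'job.tex').write_text(source)
        # Run TeX inside the scratch directory so job.dvi and job.log land there.
        subprocess.run(
            list(command) + ['job.tex'],
            cwd=workdir,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            check=False,  # TeX may exit non-zero on errors; the log says why
        )
        dvi_path = job / 'job.dvi'
        log_path = job / 'job.log'
        # Missing files become empty results rather than exceptions.
        dvi = dvi_path.read_bytes() if dvi_path.exists() else b''
        log = log_path.read_text(errors='replace') if log_path.exists() else ''
        return dvi, log
```

With TeX installed, `dvi, log = run_tex('Hello, world.\n\\bye\n')` should give back the dvi bytes and the transcript.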
TeX is slow to start, but quick once it gets going. Here is an example.
    guest@host:~$ time tex \\end
    This is TeX, Version 3.14159 (Web2C 7.3.7)
    No pages of output.
    Transcript written on texput.log.
    real 0m0.110s
    guest@host:~$ time tex story \\end
    This is TeX, Version 3.14159 (Web2C 7.3.7)
    (/usr/share/texmf/tex/plain/base/story.tex [1])
    Output written on story.dvi (1 page, 668 bytes).
    Transcript written on story.log.
    real 0m0.114s
On this small file, startup is 96.5% of the running time!
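The percentage comes straight from the two `real` timings in the session above. The short calculation below reproduces it; the variable names are ours.

```python
startup = 0.110  # seconds for `tex \\end`, with no input file
total = 0.114    # seconds for `tex story \\end`

share = startup / total
print(f"startup share: {share:.1%}")                           # startup share: 96.5%
print(f"typesetting time: {(total - startup) * 1000:.0f} ms")  # typesetting time: 4 ms
```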
PyTeX will run TeX as a daemon, to avoid this startup cost. Daemon mode allows TeX to be used as the typesetting engine in interactive programs.
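The pattern behind daemon mode can be sketched without TeX: start a worker process once, then send it many jobs over a pipe, so the startup cost is paid only once. In the sketch below the worker is a stand-in Python process that merely reports the length of each job. The `Daemon` class and the one-line-per-job protocol are assumptions made for illustration, not how the TeX daemon actually talks.

```python
import subprocess
import sys

# Stand-in worker: reads one job per line and answers with one line.
# A real TeX daemon would typeset each job instead of measuring it.
WORKER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(len(line.rstrip('\\n')), flush=True)\n"
)

class Daemon:
    """Start a worker process once, then send it many jobs (hypothetical class)."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, '-u', '-c', WORKER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def ask(self, job):
        """Send one single-line job and wait for the one-line answer."""
        self.proc.stdin.write(job + '\n')
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip('\n')

    def close(self):
        """Close the pipes and wait for the worker to exit."""
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()

daemon = Daemon()
answers = [daemon.ask(job) for job in ('story', 'My beautiful document.')]
daemon.close()
print(answers)
```

Both jobs are answered by the same process, which is what makes the approach attractive for interactive programs.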
Plain TeX and LaTeX use a ‘backslash and braces’ input syntax. These macro packages define not only the typesetting outputs but also the user's input syntax.
PyTeX does not have an input syntax. It will take as input any suitable Python object, not a text file. Here's an example.
    from xml.dom import minidom
    from tex import tex
    from somewhere import mystyle
    dom = minidom.parse("myfile.xml")
    dvi = tex(mystyle, dom)
In plain and LaTeX the syntax and the processing are combined. In PyTeX they are separate components.
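To illustrate the separation, here is a sketch of how a style might turn a DOM into TeX input, with the XML parser handling the syntax and a small Python function handling the processing. The element names `doc`, `title` and `para` and the function names are hypothetical. TeX special characters are not escaped here.

```python
from xml.dom import minidom

def text_of(node):
    """Concatenate all text nodes found anywhere inside `node`."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        else:
            parts.append(text_of(child))
    return ''.join(parts)

def dom_to_tex(dom):
    """Translate <title> and <para> elements into plain TeX input."""
    lines = []
    for node in dom.documentElement.childNodes:
        if node.nodeType != node.ELEMENT_NODE:
            continue  # skip whitespace and comments between elements
        if node.tagName == 'title':
            lines.append('\\centerline{\\bf ' + text_of(node) + '}')
        elif node.tagName == 'para':
            lines.append(text_of(node))
    return '\n\n'.join(lines) + '\n\\bye\n'

dom = minidom.parseString(
    '<doc><title>A story</title><para>Once upon a time.</para></doc>'
)
print(dom_to_tex(dom))
```

Changing the input syntax means swapping the parser; changing the typeset result means changing `dom_to_tex`. Neither needs to know about the other.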
Often, typesetting is not enough. HTML is also required. There are LaTeX to HTML translators, but on complex documents they are somewhat delicate.
Here's why. The translator has to emulate the backslash and braces input syntax of TeX the program. It also has to generate section numbers, equation numbers and so on using the same rules as LaTeX. So there are two pieces of software implementing the same informally defined syntax.
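One way around the duplication, sketched here under our own assumptions, is to compute the numbering once in Python and hand the same numbered structure to both a TeX renderer and an HTML renderer. The function names and the heading markup are hypothetical, and TeX special characters in titles are not escaped.

```python
import html

def number_sections(titles):
    """Pair each section title with its number, counting from 1."""
    return list(enumerate(titles, start=1))

def to_tex(sections):
    """Render numbered sections as plain TeX headings."""
    return '\n'.join('\\noindent{\\bf %d. %s}\\par' % (n, t) for n, t in sections)

def to_html(sections):
    """Render the same numbered sections as HTML headings."""
    return '\n'.join('<h2>%d. %s</h2>' % (n, html.escape(t)) for n, t in sections)

sections = number_sections(['Introduction', 'Daemon mode'])
print(to_tex(sections))
print(to_html(sections))
```

Because both renderers read the numbers rather than recompute them, the two outputs cannot disagree.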
A wiki is a collection of web pages that can be edited by their readers. A wiki uses a small markup language for its pages, not full-blown HTML.
A wiki for PyTeX will be set up soon.
XML is hard work to key by hand. It lacks the mark-up minimization that SGML has. It is verbose. Easy to write a parser for, but verbose.
Wikis are easy to use. It's easy to put links in a wiki page, or to generate a list. Wikis can be used to generate XML as well as HTML. The Linux Documentation Project are setting up a Wiki for writing DocBook XML.
For complex documents, one wants both the rigor of XML and the usability of a wiki. These wiki developments will make TeX easier to use.
For more information, visit this page again in a week or two. Or email us at info@pytex.org.
[1] Donald E. Knuth, Digital Typography, CSLI (1999), pp. 648-9.