Archive for August, 2007

Programming packages

Saturday, August 11th, 2007 | LaTeX | 1 Comment

Now that we have seen how to write both commands and environments, it is time to look at how we can reuse this code in several different documents. Reusability in LaTeX is handled through packages, otherwise known as .sty files. A large number of packages exist for LaTeX at CTAN, and we have seen several of them already: tikz, graphicx, fontenc, mathdesign, and many others. Common to each package is that it provides a number of commands and environments that give you some form of functionality: TikZ allows us to create vector-based drawings directly from inside LaTeX, graphicx allows us to include graphics in a number of existing formats, and mathdesign provides new math fonts that match a number of free and non-free serif fonts.

So, if you have some functionality that you use repeatedly in different projects and would like to keep it in one place that all of those projects can load, a package is what you are looking for. The first step in creating a package is to pick a filename, say, mypackage.sty (on some systems you may have to keep to the 8.3 filename standard, so it may be prudent to name your packages to fit inside that, but it is not a requirement).

Every package starts with a \ProvidesPackage line, usually preceded by a \NeedsTeXFormat line:

\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mypackage}[2006/05/01 version 1.0 by Someone]

The name given to \ProvidesPackage should be the same as the file name of the package (without the .sty extension); LaTeX issues a warning if the two differ. \NeedsTeXFormat specifies which format is required in order to use the package. LaTeX2e is the format practically all of us use when we write documents today.

Packages can, of course, also depend on other packages. Unlike in the main document, we do not use \usepackage to load these; a separate command, \RequirePackage, exists instead and does (more or less) the same thing. So if we need some higher-level constructs for controlling the flow of our commands from the ifthen package, we could use the following line of code inside our package:

\RequirePackage{ifthen}

Some packages also accept options, for instance the babel package, which adapts hyphenation and fixed texts such as ‘Chapter’ to a specific language:

\usepackage[british,french]{babel}

These options dictate that we should load both the british and french modules and that we want to pre-select french (as that is the last option) as the language to write in. Likewise, it might be relevant for your package to provide a number of options that control the overall functionality of the package or pre-define certain environments or commands. Declaring a package option is done using the \DeclareOption command. So if we were to declare a theorem option, it might look like this:

\DeclareOption{theorem}{%
  % code goes here
}

And when we subsequently use the package in a document we can write:

\usepackage[theorem]{mypackage}

Options are not executed at the point of declaration, though; that requires a separate command, \ProcessOptions, which can occur anywhere in the package code after the \DeclareOption commands (and, naturally, before the end of the package). A reason to place it later in the package rather than immediately after your declared options could be, for instance, that some options redefine commands that are only introduced later on in the package.
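
As a minimal sketch of how declaration and processing fit together, assume the theorem option should do nothing more than decide whether a theorem environment gets defined. The switch \ifmypackage@theorem and the environment it guards are made up for this illustration and are not taken from any existing package:

\newif\ifmypackage@theorem                 % hypothetical switch, starts out false
\DeclareOption{theorem}{\mypackage@theoremtrue}
\DeclareOption*{%                          % catches options we never declared
  \PackageWarning{mypackage}{Unknown option `\CurrentOption'}}
\ProcessOptions\relax                      % only now is the option code run
\ifmypackage@theorem
  \newtheorem{theorem}{Theorem}            % assumes nothing else defined theorem
\fi

Keeping the option code down to flipping a switch means that \ProcessOptions can sit right after the declarations, while the code that acts on the switch can live wherever it fits best in the package.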

Most of the remaining code in a package is just a collection of new commands and environments. There is one thing worth noticing, though: command names that are internal to a package are typically given an @-sign, as such names are not directly available in normal documents, as we have seen before. Inside a package the @-sign behaves as a letter, so you do not need to surround such uses with \makeatletter and \makeatother. Also, remember that command names must be unique; there is no overloading, so try to pick names that don’t conflict with other packages. Of course, given the sheer number of packages in existence, this may prove rather difficult.
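
One common way of lowering the risk of a clash is to prefix internal names with the package name. A small sketch, where both \mypackage@emph and \keyterm are names invented for the example:

% internal helper, out of reach of normal document code because of the @
\newcommand*{\mypackage@emph}[1]{\textit{#1}}
% public command that documents are meant to use
\newcommand*{\keyterm}[1]{\mypackage@emph{#1}}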

This is basically all there is to creating packages. The rest depends on what kinds of behaviour you need to abstract. Looking back at the code we have seen in this blog, we might, for instance, pick out the layout of a book to package. That way we can reuse this layout at a later time, merely by including a package. And that is, in short, the basic idea of packages.


Programming LaTeX — writing environments

Saturday, August 4th, 2007 | LaTeX | No Comments

We have previously seen how to create commands in LaTeX. Today, we will be looking at how we can create environments, that is, the constructs that are started with a \begin and ended with an \end command. In the course of writing a document with a barebones LaTeX setup, we may see several of these environments: document, figure, table, itemize and enumerate. Each of these environments dictates some form of structure that helps separate the actual layout of your document from its contents and structure, an important tenet in the LaTeX world.

We can, for instance, imagine that we are writing a document where all figures are centered before they are drawn. In many cases this is accomplished by people using roughly the following code:

\begin{figure}
  \begin{center}
    % figure code goes here
  \end{center}
  \caption{...}
  \label{...}
\end{figure}

But with this, we have mixed together structure and layout, so let us drag the author out back and shoot him before it is too late. Rather, what we would want to do is somehow create a new environment, say myfigure, that takes care of centering the contents. In order to introduce a new environment, we may use the \newenvironment command, which functions very much like \newcommand except that it takes two arguments: what happens before the content, and what happens after the content. So using this knowledge, we can wrap up the centering code:

\newenvironment{myfigure}{%
  \begin{figure}\begin{center}%
}{%
  \end{center}\end{figure}%
}

Do note that the order of appearance of the figure and center environments matters. If you try to end the figure environment before the center environment inside it, LaTeX will throw errors your way. With our brand new environment, we can now write our figure like this:

\begin{myfigure}
  % figure code goes here
  \caption{...}
  \label{...}
\end{myfigure}

Thus, we nicely separate our layout and our structure. Normally it would seem prudent to find a better name than myfigure, one that better describes the purpose of the structural component, but sometimes no nifty names come to mind.

As with commands, environments can also be given a number of parameters. These parameters are, however, only available in the ‘before content’ code. If we often customise our itemize environments to use different kinds of itemisation symbols, we could create an environment that lets us specify these easily:

\makeatletter
\newenvironment{myitemize}[1]{%
  \begin{itemize}%
    % redefine \labelitemi, \labelitemii, ... matching the current nesting depth
    \expandafter\renewcommand\csname labelitem\romannumeral\the\@itemdepth\endcsname{#1}%
}{%
  \end{itemize}%
}
\makeatother

Now we can use this to change the item symbol like this:

\begin{myitemize}{$\star$}
  \item Test
\end{myitemize}

And there would be a nice little star to the left of ‘Test’.

This is basically what there is to environments on the surface. That leaves the technical details of how they are actually implemented (they are not a part of plain TeX). In fact, \newenvironment is just a cover around creating two different commands: \environmentname and \endenvironmentname. The parameters are passed to \environmentname exclusively, as that is typically where they are needed. Invoking \begin{environmentname} is not entirely the same as invoking \environmentname, though, as \begin also sets up some other variables and starts a new group (a scope, in regular programming terms).
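
To make the two-command picture concrete, here is a small sketch using a made-up environment called aside. It only approximates what \begin and \end do, since the bookkeeping they perform (such as checking that the names match) is left out:

\newenvironment{aside}{\itshape}{}   % defines \aside and \endaside

\begin{aside}Remember the milk.\end{aside}

% roughly the same thing, written out by hand:
{\aside Remember the milk.\endaside}

The braces in the last line stand in for the group that \begin opens and \end closes, which is why the italic shape does not leak into the text that follows.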

The primary goal of using environments, though, is to separate the content from the layout. If you keep this in mind when creating documents it will be extremely easy to tweak the layout without having to go through the entire document every time you wish to change all your figures to be centered, for instance.


Programming LaTeX — writing commands

Friday, August 3rd, 2007 | LaTeX | No Comments

Most typesetting software lets the user operate within its functionality, and thorough programming knowledge is required to write any form of extension to the system, typically in the form of a module written in C. With LaTeX this form of extensibility is built into the language, allowing you to program your own solutions, and to easily use other people’s solutions to problems. One of the key places to find these solutions is CTAN (the Comprehensive TeX Archive Network).

While many premade solutions exist, there will invariably be times where you need something custom made. Today we will look at the basics of creating new commands in LaTeX. Previous familiarity with programming will be a benefit, but, I hope, not an absolute requirement.

A command in LaTeX can accomplish pretty much anything (for the really interested, LaTeX is Turing complete), but for our purposes it is mostly used to abstract away formatting or change the values of counters or lengths. As such, familiar things in LaTeX such as \section, \usepackage and the like are all commands that abstract some behaviour. To familiarise ourselves with how to introduce commands, we will look at a fictional problem: We are writing a larger document containing a lot of acronyms, and we’d like to be able to standardise on the typography for printing acronyms, and ensure that the first time an acronym is used its full definition is written out.

Introducing a command in LaTeX is done using the \newcommand command. Its full definition has the following form:

\newcommand{command-name}[number-of-arguments]{body}

Inside the body, the arguments can be referenced by typing #n, where n is the number of the argument (#1 for the first, #2 for the second, and so on). With this in mind, let us create an \ac command for typesetting acronyms.

\newcommand{\ac}[1]{\textbf{#1}}

What this does is create a command, \ac, that takes one argument and expands to the argument in boldface. If you find using bold type for typesetting acronyms bad style, there is, fortunately, a single place to correct the formatting with this command, rather than having to go through the entire document and correct every single acronym.

While this lays the foundation for the command, what we would really like is to be able to define a number of acronyms, i.e. both their short forms and their expanded forms, and have LaTeX make sure that we always present the long form of an acronym the first time we use it. This is a good deal trickier, as we will have to introduce commands that are constructed dynamically. Basically we would like to be able to define an acronym like: \acronym{ETA}{Expected Time of Arrival}, and when we use it the first time have it print something like: ETA (Expected Time of Arrival) and just ETA at subsequent occurrences in the text.

So at the core we have something like:

\newcommand{\acronym}[2]{
}

And we need to fill something into the body. If we just do something like this:

\newcommand{\acronym}[2]{
  \newcommand{\acronym#1}{#2}
}

Then we’re told that the command \acronym already exists. So we need some way of splicing acronym and our command together to form a whole command. This can be done with the \csname and \endcsname constructs, like this:

\newcommand{\acronym}[2]{
  \newcommand{\csname acronym#1\endcsname}{#2}
}

However, now we are told that the \csname command is already defined. So we need some way to tell LaTeX that it should first let \csname build the complete command name from acronym#1 and only then call \newcommand. We can accomplish this using the fairly low-level command \expandafter. It sets aside the token that follows it, expands the token after that once, and then puts the first token back in front of the result. The \csname we want to expand sits behind two tokens, \newcommand and the opening brace, so we need two uses of \expandafter, like this:

\newcommand{\acronym}[2]{
  \expandafter\newcommand\expandafter{\csname acronym#1\endcsname}{#2}
}

Using this with our call to \acronym{ETA}{Expected Time of Arrival} creates a new command, \acronymETA, which we can call normally, and this command will print ‘Expected Time of Arrival’.
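
Spelled out step by step, and assuming the definition of \acronym given just above, the expansion can be traced in comments like this:

% \acronym{ETA}{Expected Time of Arrival} leaves us with
%   \expandafter\newcommand\expandafter{\csname acronymETA\endcsname}{...}
% 1. the first \expandafter sets \newcommand aside and expands the next
%    token, which is the second \expandafter
% 2. the second \expandafter sets { aside and expands \csname, which
%    turns everything up to \endcsname into the single token \acronymETA
% 3. what is left to execute is
%    \newcommand{\acronymETA}{Expected Time of Arrival}
\acronym{ETA}{Expected Time of Arrival}
\acronymETA   % now usable like any other command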

The next thing we need to accomplish is to only print this expanded text the first time we use the acronym. To aid us slightly, we use the ifthen package’s commands \newboolean and \setboolean as follows:

\newcommand{\acronym}[2]{%
  \newboolean{acronym#1}%
  \setboolean{acronym#1}{true}%
  \expandafter\newcommand\expandafter{\csname acronym#1\endcsname}{#2}%
}

Here \newboolean takes the name of the boolean as plain text and builds the underlying command from it on its own, so we don’t have to place manual \expandafter commands throughout the code.

This means that we have everything in place to create a new command that prints the actual acronym:

\newcommand{\ac}[1]{%
  \ifthenelse{\boolean{acronym#1}}%
  {#1 (\csname acronym#1\endcsname)%
    \setboolean{acronym#1}{false}}%
  {#1}%
}

Using this as follows:

The \ac{ETA} is now 5 minutes. The \ac{ETA} is now 4 minutes.

Would result in the following: ‘The ETA (Expected Time of Arrival) is now 5 minutes. The ETA is now 4 minutes.’ This is, of course, all well and good, but rewriting all of \ac can be a bit taxing if we merely want to change the formatting of parts of the acronym. So let us create two more commands to typeset the acronym with and without the expanded text:

\newcommand{\printlongacronym}[2]{#1 (#2)}
\newcommand{\printshortacronym}[1]{#1}

And we change the definition of \ac to use these as follows:

\newcommand{\ac}[1]{%
  \ifthenelse{\boolean{acronym#1}}%
  {\printlongacronym{#1}{\csname acronym#1\endcsname}%
    \setboolean{acronym#1}{false}}%
  {\printshortacronym{#1}}%
}

This now lets us change the formatting of acronyms merely by changing the definitions of \printlongacronym and \printshortacronym, thus hiding all the technical issues of changing boolean values and whatnot from the end user. The user just needs to focus on how to adapt these two commands to their formatting needs. And, if these things are wrapped away in a package, the user can use \renewcommand to change the meaning of the commands.
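
As a sketch of what such a customisation might look like in a document, assuming the two commands have already been defined as above, a user who wants sans-serif acronyms with an emphasised long form could write (the styling itself is just an example):

\renewcommand{\printlongacronym}[2]{\textsf{#1} (\emph{#2})}
\renewcommand{\printshortacronym}[1]{\textsf{#1}}

Neither \ac nor the booleans have to be touched for this to take effect.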

This is basically all there is to commands: give it a name, some arguments and do something in the body. Even a simple command such as typesetting some code inline in the text might be a boon if you have to go over the document at a later time to fix up the layout.

Advanced commands and TeXing

Now that we have our fancy \ac command, it is time to consider what happens if we write something like: ‘Some of the \ac{CIA}'s archives have recently been opened up.’ The outcome could be one of two things, depending on whether this is the first use of the CIA acronym or not, namely:

Some of the CIA’s archives have recently been opened up.

or

Some of the CIA (Central Intelligence Agency)’s archives have recently been opened up.

In the latter case, the possessive should still have been attached to CIA rather than to the parenthesis. One way to cope with this is to use TeX’s own mechanism for defining commands, \def, which lets us describe exactly what the parameters look like, including an optional-looking parameter in brackets. If we were to code a command with such a parameter (keeping in mind that optional parameters in LaTeX are typically given in []’s right after the command name), it would look something like this:

\def\acopt[#1]#2{%
  % #1 is the suffix (e.g. 's), #2 is the acronym itself
  \ifthenelse{\boolean{acronym#2}}%
  {\printlongacronym{#2#1}{\csname acronym#2\endcsname}%
    \setboolean{acronym#2}{false}}%
  {\printshortacronym{#2#1}}%
}

This basically means we can call it like this: \acopt['s]{CIA} and we’d get:

CIA’s (Central Intelligence Agency)

if this was the first use of the acronym. Now, we could just call \acopt whenever we need to add some fancy possessive or what have you to an acronym. However, we’d like to be able to extend \ac so that we only have to type one command, which then figures out sensibly whether to call \acopt. In order to do this, we must use a built-in command, \@ifnextchar. This command is primarily meant to be used from inside a package, but we can use it in normal document code by surrounding it with the two commands \makeatletter and \makeatother. These two commands are necessary, as @ isn’t treated as a normal letter in LaTeX documents, and thus we’re required to change @’s category code (if you’re getting confused now, don’t worry, you can just use the two commands around the location where you need to use a command with an @ in it and worry about the details at some later time). With this we can define \ac like this:

\makeatletter
\def\ac{\@ifnextchar[\acopt\acnoopt}
\makeatother

What this does is query the token stream (the next bunch of characters in the document, for a very loosely hand-waved definition): if the next character is a left bracket we call \acopt, and otherwise we call \acnoopt, the latter of which we can define very easily:

\def\acnoopt#1{\acopt[]{#1}}

With this we can now both write \ac['s]{CIA} and \ac{CIA} and everything will sort itself out nicely.
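
For comparison, \newcommand can also declare an optional first argument by itself, by giving a default value in a second pair of brackets. The following sketch assumes \ac, \printlongacronym and \printshortacronym were defined earlier as shown, and would take the place of \acopt, \acnoopt and the \def of \ac rather than sit alongside them:

% [2] = two arguments in total, [] = the first is optional and defaults to empty
\renewcommand{\ac}[2][]{%
  \ifthenelse{\boolean{acronym#2}}%
  {\printlongacronym{#2#1}{\csname acronym#2\endcsname}%
    \setboolean{acronym#2}{false}}%
  {\printshortacronym{#2#1}}%
}

Both \ac['s]{CIA} and \ac{CIA} are accepted by this version as well. The \def route shown above remains useful when the parameters need a shape that \newcommand cannot express.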

On the horizon

Apart from the few commands we’ve seen, it is also possible to define commands globally using \gdef (definitions are otherwise local to the group, for instance the environment, they’re made in), redefine existing commands using \renewcommand, create commands whose arguments cannot contain paragraph breaks using \newcommand*, and much more. However, the command construction you’ve seen above goes for pretty much all the other commands as well, so presuming you’ve gleaned some meaning from my ramblings, you should be able to get somewhere fast no matter the situation.
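
A brief sketch of two of these, with command names invented for the example:

% the starred form rejects a blank line (or \par) inside the argument
\newcommand*{\shortarg}[1]{\textbf{#1}}

% \gdef survives the end of the group it is used in
{\gdef\visibleoutside{still defined after the group}}
\visibleoutside

The starred form is mostly a safety net: a forgotten closing brace is reported at the end of the paragraph instead of swallowing the rest of the document.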
