% for `long' items that get a line to themselves:
\newcommand\lin[1]{\item[#1]\leavevmode\\}

The \mctex\ library is itself divided into various subsystems.
We will take these from lowest-level to highest,
and otherwise in no particular order.

\subsection{Types}
The most basic part of \mctex\ consists of just a header file,
{\tt h/types.h},
which contains all---or as much as possible---of the machine-dependent
type and macro definitions.
The most important of these are:
\begin{center}
\begin{tabular}{@{\coltt}rl}
i16 & a 16-bit signed integral type, usually {\tt short} \\
i32 & a 32-bit signed integral type, usually {\tt long} \\
ui32 & the unsigned variant of {\tt i32}, usually {\tt unsigned long} \\
Sign8 &	a macro to sign-extend an 8 bit value to 32 bits \\
Sign16 & a macro to sign-extend a 16 bit value to 32 bits \\
Sign24 & a macro to sign-extend a 24 bit value to 32 bits \\
UnSign8 & a macro to extend an unsigned 8 bit value to 32 bits \\
	& (without sign extension: e.g., $254$ does {\em not} become $-2$) \\
UnSign16 & a macro to extend an unsigned 16 bit value to 32 bits \\
UnSign24 & a macro to extend an unsigned 24 bit value to 32 bits
\end{tabular}
\end{center}
Each of the extension macros
takes a single argument giving the value to be extended
and yields a result of type {\tt i32}.
These arguments can be any integral C expression
that does not have side effects.
The unsigned-extension macros must truncate overlarge arguments;
for instance, {\tt UnSign16(-6)} must produce the value $65530$.
There are other macros in {\tt types.h}
which are used by other parts of \mctex\
to isolate system dependencies;
they are described in {\tt types.h} itself
and occasionally referred to here
in the descriptions of routines that use them.
More macros may be added as they become necessary.
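As a rough illustration of what the extension macros are required to do,
the following sketch gives one possible set of definitions.
It is not the text of {\tt types.h}:
the {\tt i32} typedef follows the `usually {\tt long}' remark above,
and the masking technique is an assumption,
chosen because it does not depend on how the machine represents
negative numbers.
The function {\tt sketch\_check\_types} is hypothetical
and merely replays the examples given in this section.

```c
#include <assert.h>

/* Illustrative definitions only; the real ones live in h/types.h. */
typedef long i32;   /* assumption: long holds at least 32 bits */

/* Keep the low-order bits of n, with no sign extension. */
#define UnSign8(n)  ((i32)((unsigned long)(n) & 0xffUL))
#define UnSign16(n) ((i32)((unsigned long)(n) & 0xffffUL))
#define UnSign24(n) ((i32)((unsigned long)(n) & 0xffffffUL))

/* Flip the sign bit, then subtract its weight: this maps the
   upper half of the unsigned range onto the negative numbers. */
#define Sign8(n)  ((UnSign8(n) ^ 0x80L) - 0x80L)
#define Sign16(n) ((UnSign16(n) ^ 0x8000L) - 0x8000L)
#define Sign24(n) ((UnSign24(n) ^ 0x800000L) - 0x800000L)

/* Porting check: the examples from the text. */
static void sketch_check_types(void)
{
    assert(UnSign8(254) == 254);     /* 254 does not become -2 */
    assert(Sign8(254) == -2);        /* but it does when sign-extended */
    assert(UnSign16(-6) == 65530L);  /* overlarge arguments are truncated */
}
```

Each macro mentions its argument exactly once,
but the argument should still be free of side effects,
as required above.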

\subsection{Conversions}\label{sec:conv}
The header {\tt h/conv.h} and the library file {\tt lib/conv.c}
together define types, macros, and routines for {\em conversions\/}
from \dvi\ units (scaled points)
to device-dependent units (pixels).
There is a single distinguished global conversion
based on the resolution of the device
and the magnifications being applied;
other conversions may also be created
by declaring variables of type {\tt Conv}.

Within {\tt conv.h} there are two `local' macros,
{\tt ROUND} and {\tt CEIL}.
{\tt ROUND} rounds a floating-point value to the nearest {\tt i32},
and {\tt CEIL} rounds one up to the next {\tt i32}.
For instance, {\tt ROUND(1.2)} produces~$1$,
{\tt ROUND(1.501)} produces~$2$,
{\tt ROUND(-1.2)} produces~$-1$,
and {\tt ROUND(-1.501)} produces~$-2$,
but {\tt CEIL(1.2)} produces $2$.
Rounding must work for all values,
while {\tt CEIL} is never handed negative numbers.
These macros may ultimately prove to be machine-dependent,
and when porting,
it would not be a bad idea to verify that they really work
on the target system.
It may also be possible to do the rounding
in a fast, machine-dependent manner
(e.g., by using special machine instructions);
this will improve performance,
but be sure to enclose any system-dependent changes to {\tt conv.h}
within {\tt \#ifdef}/{\tt \#endif} pairs.
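One portable way to write such macros is sketched below.
These definitions are an assumption, not a copy of {\tt conv.h};
they rely only on the fact that a C cast from floating point to an integer
type truncates toward zero.
The hypothetical function {\tt sketch\_check\_round}
is the kind of check one might run on a new target system.

```c
#include <assert.h>

typedef long i32;   /* assumption: long holds at least 32 bits */

/* Round to nearest: move half a unit away from zero, then truncate. */
#define ROUND(f) ((i32)((f) < 0.0 ? (f) - 0.5 : (f) + 0.5))

/* Round up: truncate, then add one if anything was cut off.
   Only meaningful for non-negative f. */
#define CEIL(f) ((double)(i32)(f) < (f) ? (i32)(f) + 1 : (i32)(f))

/* Porting check: the examples from the text. */
static void sketch_check_round(void)
{
    assert(ROUND(1.2) == 1);
    assert(ROUND(1.501) == 2);
    assert(ROUND(-1.2) == -1);
    assert(ROUND(-1.501) == -2);
    assert(CEIL(1.2) == 2);
}
```

Both macros evaluate their argument more than once,
so the argument must not have side effects.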

\begin{dtt}
\lin{CSetConversion(Conv *\var*{c}, int \var*{dpi}, int \var{usermag},
i32 \var{num}, i32 \var{denom}, i32 \var{dvimag})}
	sets up a conversion.
	The first argument is the address of the conversion to be set
	(and thus has type {\tt Conv *}).
	\var{Dpi} is the device resolution in dots per inch.
	\var{Usermag} is the user's magnification (1000 = unmagnified).
	\var*{Num}, \var*{denom}, and \var{dvimag} are the three values
	from a \dvi\ file preamble or postamble.
\lin{SetConversion(int \var*{dpi}, int \var{usermag},
i32 \var{num}, i32 \var{denom}, i32 \var{dvimag})}
	sets the global conversion.
	It is exactly like {\tt CSetConversion}
	but without the first argument.
	This is normally invoked indirectly via {\tt DVISetState}.
\item[i32 cfromSP(\var*c, \var v)]
	converts a value \var v from scaled points to pixels
	according to the conversion \var* c.
\item[i32 ctoSP(\var*{c}, \var{v})]
	converts a value \var v from pixels to scaled points
	according to the conversion \var* c.
\item[i32 fromSP(\var v)]
	converts its argument from scaled points to pixels
	according to the global conversion.
\item[i32 toSP(\var v)]
	converts its argument from pixels to scaled points
	according to the global conversion.
\item[i32 CConvRule(\var* c, \var v)]
	converts a value \var v
	from scaled points to pixels
	according to the conversion \var*{c}.
	Unlike {\tt cfromSP}, however, {\tt CConvRule} rounds up;
	this must be done by all \dvi\ interpreters
	for all rule dimensions.
\item[i32 ConvRule(\var v)]
	converts its argument from scaled points to pixels
	according to the global conversion.
	It rounds up like {\tt CConvRule}.
\end{dtt}
The conversions from and to scaled points
use the {\tt ROUND} macro.
They work with value arguments of any type (including floating point)
but are normally applied only to {\tt i32} values.
{\tt ConvRule} and {\tt CConvRule} use the {\tt CEIL} macro,
and must not be handed negative values.

\subsection{\dvi\ Codes and Classes}
The header file {\tt h/dvicodes.h}
defines all the \dvi\ opcodes.
All the definitions are of the form {\tt DVI_X}
for the opcode {\em x}.
For instance, {\tt DVI_SETRULE} is 132, the {\em setrule} opcode.
More interesting are the {\em classification tables\/}
defined by {\tt h/dviclass.h} and {\tt lib/dviclass.c}.
The header defines the values for each class code,
and also defines two boolean macros
{\tt DVI_IsChar} and {\tt DVI_IsFont},
which test whether their argument opcode represents
a parameterless set-character or font-changing opcode
respectively.
The classification tables
reduce the 256 possible opcode values
down to 8 {\em operand length\/} codes
(obtained by the macro {\tt DVI_OpLen})
and 28 {\em operator class\/} codes
(obtained with {\tt DVI_DT}).
The operand length codes are listed in Table~\ref{tab:paramlen},
and the class codes are in Table~\ref{tab:classcodes}.
The operand length code of any opcode
describes the length in bytes of the (usually first and only) operand
and whether it is signed;
the class describes what kind of command it is.
In most cases each will be obvious.
For instance, the {\tt DVI_RIGHT2} opcode
has a {\tt DVI_OpLen} of {\tt DPL_SGN2}
and a {\tt DVI_DT} of {\tt DT_RIGHT}.
\begin{table*}
\centering
\begin{tabular}{|@{\coltt}ll|}
\hline
DPL\_NONE & No operand, or complex operands \\
DPL\_SGN1 & One-byte operand, signed (range $-128 \le n \le 127$) \\
DPL\_SGN2 & Two-byte operand, signed (range $-32768 \le n \le 32767$) \\
DPL\_SGN3 & Three-byte operand, signed (range $-8388608 \le n\le8388607$) \\
DPL\_SGN4 & Four-byte operand, signed (range
		$-2147483648\le n\le2147483647$) \\
DPL\_UNS1 & One-byte operand, unsigned (range $0 \le n \le 255$) \\
DPL\_UNS2 & Two-byte operand, unsigned (range $0 \le n \le 65535$) \\
DPL\_UNS3 & Three-byte operand, unsigned (range $0 \le n \le 16777215$) \\
\hline
\end{tabular}
\caption{Operand parameter lengths from {\tt DVI\_OpLen}}\label{tab:paramlen}
\end{table*}
\begin{table*}
\centering
\begin{tabular}{|@{\coltt}ll|}
\hline
DT\_CHAR	& A character opcode ($0 \le \hbox{\var{code}} \le 127$) \\
DT\_SET		& One of the four `set' opcodes \\
DT\_SETRULE	& The {\tt DVI\_SETRULE} opcode \\
DT\_PUT		& One of the four `put' opcodes \\
DT\_PUTRULE	& The {\tt DVI\_PUTRULE} opcode \\
DT\_NOP		& The {\tt DVI\_NOP} opcode \\
DT\_BOP		& The {\tt DVI\_BOP} opcode \\
DT\_EOP		& The {\tt DVI\_EOP} opcode \\
DT\_PUSH	& The {\tt DVI\_PUSH} opcode \\
DT\_POP		& The {\tt DVI\_POP} opcode \\
DT\_RIGHT	& One of the four `move right' opcodes \\
DT\_W0		& The {\tt DVI\_W0} opcode \\
DT\_W		& One of the four `w' opcodes \\
DT\_X0		& The {\tt DVI\_X0} opcode \\
DT\_X		& One of the four `x' opcodes \\
DT\_DOWN	& One of the four `move down' opcodes \\
DT\_Y0		& The {\tt DVI\_Y0} opcode \\
DT\_Y		& One of the four `y' opcodes \\
DT\_Z0		& The {\tt DVI\_Z0} opcode \\
DT\_Z		& One of the four `z' opcodes \\
DT\_FNTNUM	& One of the sixty-four `fntnum' opcodes \\
DT\_FNT		& One of the four `fnt' opcodes \\
DT\_XXX		& One of the four `xxx' (\verb|\special|) opcodes \\
DT\_FNTDEF	& One of the four `fntdef' opcodes \\
DT\_PRE		& The {\tt DVI\_PRE} opcode \\
DT\_POST	& The {\tt DVI\_POST} opcode \\
DT\_POSTPOST	& The {\tt DVI\_POSTPOST} opcode \\
DT\_UNDEF	& One of the six undefined \dvi\ opcode values \\
\hline
\end{tabular}
\caption{Opcode classes from {\tt DVI\_DT}}
\label{tab:classcodes}
\end{table*}
There are only a few opcodes
that have more than one parameter,
specifically {\em bop},
{\em fntdef1}, {\em fntdef2},
{\em fntdef3}, and {\em fntdef4}.
Here {\tt DVI_OpLen(DVI_BOP)} is {\tt DPL_NONE},
as its parameters are fixed and can be gathered easily later,
while {\tt DVI_OpLen(DVI_FNTDEF3)},
for instance, is {\tt DPL_UNS3}.
The rest of the parameters of a `fntdef' opcode
are harder to describe,
but do not depend on the opcode itself,
so once the first parameter (the font number) has been read,
the rest can be done in the same way for all four `fntdef' opcodes.
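To show how an operand length code might drive a reader,
here is a minimal sketch that decodes one operand from a byte buffer.
The {\tt SK\_} names and their numbering are hypothetical
(the real {\tt DPL\_} values come from {\tt dviclass.h}),
and the function is not part of the library;
the only fact it relies on is that \dvi\ operands are stored
high-order byte first.

```c
#include <assert.h>
#include <stddef.h>

typedef long i32;   /* assumption: long holds at least 32 bits */

/* Hypothetical numbering of the eight operand length codes. */
enum sketch_oplen {
    SK_NONE, SK_SGN1, SK_SGN2, SK_SGN3, SK_SGN4, SK_UNS1, SK_UNS2, SK_UNS3
};

/* Decode the operand at p according to len.
   Stores the number of bytes consumed in *used. */
static i32 sketch_operand(const unsigned char *p, enum sketch_oplen len,
                          size_t *used)
{
    size_t n = 0, i;
    int is_signed = 0;
    unsigned long v = 0, top;

    assert(p != NULL && used != NULL);
    switch (len) {
    case SK_NONE: break;
    case SK_SGN1: n = 1; is_signed = 1; break;
    case SK_SGN2: n = 2; is_signed = 1; break;
    case SK_SGN3: n = 3; is_signed = 1; break;
    case SK_SGN4: n = 4; is_signed = 1; break;
    case SK_UNS1: n = 1; break;
    case SK_UNS2: n = 2; break;
    case SK_UNS3: n = 3; break;
    }
    for (i = 0; i < n; i++)
        v = (v << 8) | p[i];          /* high-order byte first */
    *used = n;
    if (n == 0)
        return 0;
    top = 1UL << (8 * n - 1);         /* the operand's sign bit */
    if (is_signed && (v & top) != 0)  /* negative: subtract 2^(8n) in safe steps */
        return (i32)(v - top) - (i32)(top - 1) - 1;
    return (i32)v;
}
```

The same two bytes {\tt 0xff 0xfa} decode to $-6$ under the signed
two-byte code and to $65530$ under the unsigned one.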

\subsection{\dvi\ File Input/Output}
The header {\tt h/fio.h} and its code file {\tt lib/fio.c}
declare and define functions and macros
to read and write one-, two-, three-, and four-byte \dvi\ values
from or to {\tt stdio} files.
The macros for reading defined here are intended for `inner loops'
where speed is important, and do not handle exceptions;
the function versions offer such end-of-file and error checking
at the cost of a function call.
The write macros---there are no write functions---also
do not check for errors.
The macros are listed in Table~\ref{tab:fio}.
\begin{table*}
\centering
\begin{tabular}{@{\coltt}ll}
\multicolumn{1}{c}{\bf Macro} &
	\multicolumn{1}{c}{\bf Action} \\
\hline
int fgetbyte(FILE *\var{fp}) &
	gets a byte just like {\tt getc} \\
\var?\ fGetWord(FILE *\var*{fp}, \var{var}) &
	sets \var{var} to a two byte value \\
\var?\ fGet3Byte(FILE *\var*{fp}, \var{var}) &
	sets \var{var} to a three byte value \\
\var?\ fGetLong(FILE *\var*{fp}, \var{var}) &
	sets \var{var} to a four byte value \\
void putbyte(FILE *\var*{fp}, \var{v}) &
	puts the one byte value \var{v} \\
void PutWord(FILE *\var*{fp}, \var{v}) &
	puts the two byte value \var{v} \\
void Put3Byte(FILE *\var*{fp}, \var{v}) &
	puts the three byte value \var{v} \\
void PutLong(FILE *\var*{fp}, \var{v}) &
	puts the four byte value \var{v} \\
%\hline
\noalign{\hrule height\arrayrulewidth width 4em}
\multicolumn2l{\quad Notes:} \\
\multicolumn2l{\qquad \var{v} may have any type, but {\tt i32} is usual} \\
\multicolumn2l{\qquad \var{var} should be an {\tt i32} lvalue} \\
\multicolumn2l{\qquad results marked \var? should be considered \void}
\end{tabular}
\caption{Fast I/O Macros}\label{tab:fio}
\end{table*}
The \var{v} and \var{var} arguments should have type {\tt i32},
but the result of {\tt fGetWord} will fit in an {\tt i16}.
Normally the results from {\tt fGetWord}, {\tt fGet3Byte},
and {\tt fGetLong} are ignored;
the macros are used only for the side effect of setting the variable.
Because of the way they are coded,
they wind up evaluating to the rvalue type
and final value of their \var{var} arguments,
but it is simpler to pretend that they return \void.
Except for {\tt fGetLong}, none of the results are signed;
they must be put through one of the sign extension macros
if they are ever to become negative.
The \var{var} arguments must be lvalues (variables),
since they will be modified by the read macros.
The {\tt putbyte} macro uses its \var{v} argument only once;
that is, it is function-like with respect to \var{v}
(but not to \var{fp}).
Note that {\tt fGetWord} and {\tt fGet3Byte}
may return values that are not in their normal ranges
($0$ through $65535$ and $0$ through $16777215$ respectively)
if they run into an end-of-file (or error).
The unsigned-extension macros will truncate such values,
limiting the potential for damage.
There are four functions that correspond to the `get' macros;
these check for, and abort on, end-of-file or error.
Their names are {\tt GetByte}, {\tt GetWord}, {\tt Get3Byte},
and {\tt GetLong},
and they take a single {\tt FILE *} parameter
and return an {\tt i32} result.
Unlike the macros,
the functions sign-extend the values before returning them.
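The trade-off between the macros and the functions can be seen
in a small sketch of a checked two-byte reader and an unchecked-looking
writer built on plain {\tt stdio}.
The names {\tt sketch\_get\_word} and {\tt sketch\_put\_word}
are hypothetical, and unlike the library's {\tt GetWord},
this reader reports end-of-file through its return value
instead of aborting, and does not sign-extend.

```c
#include <assert.h>
#include <stdio.h>

typedef long i32;   /* assumption: long holds at least 32 bits */

/* Read a two-byte value, high byte first, into *out (no sign extension).
   Returns 0 on success, -1 on end-of-file or error. */
static int sketch_get_word(FILE *fp, i32 *out)
{
    int hi, lo;

    assert(fp != NULL && out != NULL);
    if ((hi = getc(fp)) == EOF || (lo = getc(fp)) == EOF)
        return -1;
    *out = ((i32)hi << 8) | (i32)lo;
    return 0;
}

/* Write the low-order two bytes of v, high byte first.
   Returns 0 on success, -1 on a write error. */
static int sketch_put_word(FILE *fp, i32 v)
{
    unsigned long u = (unsigned long)v;   /* avoid shifting a negative value */

    assert(fp != NULL);
    if (putc((int)((u >> 8) & 0xff), fp) == EOF ||
        putc((int)(u & 0xff), fp) == EOF)
        return -1;
    return 0;
}
```

Writing $-6$ and reading it back gives $65530$;
the caller must apply a sign extension macro to recover $-6$.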

\subsection{Errors}
\mctex\ programs report external errors---that is,
errors due to bad input files or options,
rather than internal consistency errors---in a uniform fashion
using the {\tt error} routine.
This is a variadic function
after the style of {\tt printf},
but it has two additional required arguments.
The programs report internal errors
(and, on \Unix\ machines, crash with a core dump;
the action on other machines is system-dependent)
using the routine {\tt panic}.
Both of these are found in {\tt lib/error.c},
with declarations in {\tt h/error.h}.

Unusual interactive programs (such as {\tt texx})
sometimes need to trap error messages,
so as to put them in a window,
and to keep from stopping entirely on `normal' aborts
(those not due to internal errors).
Such programs can register an {\em error trap function\/}
with {\tt SetErrorTrap}.
The registered function is called with the formatted error text,
and with the abort or panic code (see below);
it can use {\tt longjmp} to avoid an abort.
Unless the program is trapping errors,
error output is preceded by the program's name
as stored in the global variable {\tt ProgName}.
Programs must therefore set {\tt ProgName} to {\tt argv[0]}
in {\tt main}.
All untrapped output, and all panic output,
goes to {\tt stderr} (and thus includes the program's name).
\begin{dtt}
\item[error(int \var*{quit}, int \var*{syserr}, char *\var*{fmt}, \dots)]
	prints the string \var{fmt}
	and any additional arguments contained in `\dots',
	just as {\tt printf} would.
	After the text from \var{fmt} and its arguments,
	if \var{syserr} is not zero,
	{\tt error} prints the associated system error string.
	If \var{syserr} is given as {\tt -1},
	{\tt error} uses the most recent error number
	contained in the global variable {\tt errno};
	the latter is set by all failed \Unix\ system calls
	and often by library routines.
	If \var{syserr} is zero,
	or if it is {\tt -1} but {\tt errno} is zero,
	no string is added.
	{\tt Error} then adds a newline,
	and if \var{quit} is nonzero,
	aborts (exits) with the status value in \var{quit}
	as an error code.\footnote
	{This is suitable for \Unix, but people using, e.g., VMS
	might want to change the code to call exit with {\tt 2*quit}.
	A better approach would be to fix the VMS C library
	to translate exit status 0
	to `success' and anything else to a non-specific `failure',
	but it is a bit late for that.}
	This sort of early exit is called `aborting' in this document,
	although it has nothing to do with the C library {\tt abort}.
	If \var{quit} is zero, {\tt error} returns to its caller.
\item[panic(char *\var*{fmt}, \dots)]
	prints the program's name from {\tt ProgName},
	then the word {\tt panic},
	then prints the string \var{fmt}
	(along with arguments in `\dots') as would {\tt printf}.
	All of this goes to {\tt stderr},
	regardless of any active error trap function.
	{\tt Panic} then calls the C library {\tt abort} routine
	to enable debugging (on \Unix, via core dumps).
	In this document, this is called `panicking'.
\item[SetErrorTrap(void (*\var{fn})(int \var*{quit}, char *\var{text}))]
	sets the error trap function to \var{fn},
	or, if \var{fn} is a nil pointer
	(\verb|(void (*)(int, char *))NULL|),
	clears the error trap.
	Later calls to {\tt error} (but not {\tt panic})
	will format the error text as usual
	(omitting the program's name, but including any system error,
	and including a final newline),
	but instead of printing the result to {\tt stderr},
	will collect the text into a buffer
	and pass that text to \var{fn}.
	The error trap function will also receive the \var{quit} flag.
	The trap function can use {\tt longjmp}
	to escape {\tt error}'s normal abort rule
	when \var{quit} is nonzero.

	If {\tt SetErrorTrap} is unable to create a temporary file
	to hold error texts,
	it will itself abort by calling {\tt error}
	after removing any current error trap.
	This can only happen if the user has set {\tt \$TMPDIR}
	or is not able to write in {\tt /tmp}
	(see \S\ref{sec:seek}, p.~\pageref{sec:seek}).
\end{dtt}
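The interplay between {\tt error}, the trap function, and {\tt longjmp}
may be easier to follow from a self-contained imitation.
The sketch below is not the code in {\tt lib/error.c}:
all the {\tt sketch\_} names are hypothetical,
the text is collected in a fixed-size buffer rather than a temporary file,
and the `{\tt :~}' placed before the system error string is a guess.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char *sketch_prog_name = "sketch";    /* plays the role of ProgName */
static void (*sketch_trap)(int quit, char *text);  /* current trap, or NULL */

static void sketch_set_error_trap(void (*fn)(int, char *))
{
    sketch_trap = fn;
}

static void sketch_error(int quit, int syserr, const char *fmt, ...)
{
    char buf[512];
    size_t len;
    va_list ap;

    assert(fmt != NULL);
    if (syserr == -1)
        syserr = errno;          /* capture errno before it can change */
    buf[0] = '\0';
    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf - 1, fmt, ap);   /* leave room for the newline */
    va_end(ap);
    len = strlen(buf);
    if (syserr != 0) {
        /* the ": " separator is a guess, not taken from the library */
        snprintf(buf + len, sizeof buf - 1 - len, ": %s", strerror(syserr));
        len = strlen(buf);
    }
    buf[len++] = '\n';
    buf[len] = '\0';
    if (sketch_trap != NULL)
        sketch_trap(quit, buf);  /* may longjmp and never return */
    else
        fprintf(stderr, "%s: %s", sketch_prog_name, buf);
    if (quit != 0)
        exit(quit);              /* the `abort' of this document */
}

/* A trap that keeps the text and escapes the abort. */
static jmp_buf sketch_env;
static char sketch_last[512];

static void sketch_demo_trap(int quit, char *text)
{
    snprintf(sketch_last, sizeof sketch_last, "%s", text);
    if (quit != 0)
        longjmp(sketch_env, 1);
}
```

A caller would register {\tt sketch\_demo\_trap},
call {\tt setjmp(sketch\_env)} before the risky operation,
and treat a nonzero return from {\tt setjmp} as `the operation aborted'.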

\subsection{Gripes}
There is a special class of errors called {\em gripes},
invented largely to reduce the amount of typing needed
to write a driver.
These are divided into two sub-classes:
`generic gripes'
and `\dvi\ gripes'.
The generic gripes are in {\tt lib/gripes0.c}
and the \dvi\ gripes are in {\tt lib/gripes1.c};
both kinds are declared by {\tt h/gripes.h}.
Each function has a name that begins with the word {\tt Gripe}.
They are described here
primarily by the format strings they pass to {\tt error}.
All the {\tt \%d} arguments come from an {\tt int} parameter;
the {\tt \%ld} arguments come from an {\tt i32} parameter
(cast to {\tt long} so that it may be printed portably),
and the {\tt \%s} arguments come from a {\tt char~*} parameter.
Where it is not obvious,
the function description will say which format arguments
correspond to which function arguments.
\begin{dtt}
\lin{GripeOutOfMemory(int \var* n, char *\var{why})}
	aborts with the message
	`{\tt ran out of memory allocating \%d bytes for \%s}'.
\lin{GripeCannotGetFont(char *\var{name},
i32 \var*{mag}, i32 \var*{dsz}, char *\var*{dev}, char *\var{fullname})}
	prints the message
	`{\tt cannot get font \%s scaled \%d}'
	(here the string is the \var{name} argument,
	and the number is the ratio $1000\var*{mag}/\var*{dsz}$,
	rounded;
	the word {\tt scaled} and the number are omitted
	if the scale is 1000).
	If \var{fullname} is not nil,
	it then prints `{\tt (wanted, e.g., "\%s")}'
	(where the string comes from \var{fullname}),
	and then returns to its caller.
	Otherwise it complains more strongly:
	if the \var{dev} parameter is not nil, it says
	`{\tt (there are no fonts for the \%s engine!)}'
	(where the string is from \var{dev});
	but if \var{dev} is nil, it says instead
	`{\tt (I cannot find any fonts!)}'
	and in either case it then aborts.
	This happens only if the fontdesc file
	does not list any fonts for the program's print engine
	(if \var{dev} is not nil) or for any print engine at all.
\lin{GripeDifferentChecksums(char *\var{font},
i32 \var*{tfmsum}, i32 \var{fontsum})}
	prints the multi-line message
	\begin{quote}
	{\tt WARNING: TeX and I have different checksums for font\\
	"\%s"\\
	Please notify your TeX maintainer\\
	(TFM checksum = 0\%lo, my checksum = 0\%lo)}.
	\end{quote}
\lin{GripeMissingFontsPreventOutput(int \var n)}
	prints the message
	`{\tt \%d missing fonts prevent output (sorry)}'
	and aborts.
	The routine gets the English grammar right
	(`1 font prevents', `2 fonts prevent').
\end{dtt}
The \dvi\ gripes print complaints as above,
but then add the question
`{\tt (are you sure \%s is a DVI file?)}'
where the string argument comes from the global variable
{\tt char *DVIFileName}.
If {\tt DVIFileName} is nil,
these gripes substitute the words `the input' instead.
(Here the \dvi\ file name is shown as `file'
rather than `{\tt \%s}' to avoid confusion.)
\begin{dtt}
\lin{GripeNoSuchFont(i32 \var{n})}
	aborts with the message
	`{\em file\/ \tt wants font \%ld, which it never defined}'.
\lin{GripeFontAlreadyDefined(i32 \var{n})}
	aborts with the message
	`{\em file\/ \tt redefines font \%ld}'.
\lin{GripeUnexpectedDVIEOF()}
	aborts with the message
	`{\tt unexpected end of file in \em file}'.
\lin{GripeUnexpectedOp(char *\var{s})}
	aborts with the message
	`{\tt unexpected \%s in \em file}'.
\lin{GripeMissingOp(char *\var{s})}
	aborts with the message
	`{\tt missing \%s in \em file}'.
\lin{GripeCannotFindPostamble()}
	aborts with the message
	`{\tt cannot find postamble}'.
\lin{GripeMismatchedValue(char *\var{s})}
	aborts with the message
	`{\tt mismatched \%s in \em file}'.
\lin{GripeUndefinedOp(int \var{n})}
	aborts with the message
	`{\tt undefined DVI opcode \%d}'.
\lin{GripeBadGlyph(i32 \var*{c}, struct font *\var{f})}
	normally prints
	`{\tt there is no character \%ld in \%s!}'
	and then returns.
	The string comes from \var{f}'s {\tt f_path}
	(see~\S\ref{sec:fonts}, p.~\pageref{sec:fonts}).
	If, however, {\tt f_path} is nil,
	{\tt GripeBadGlyph} assumes that \var{f} has not been set
	for the current \dvi\ page
	(points to a static {\tt NoFont}, as in most drivers)
	and it prints instead the two-line message
	\begin{quote}
	\tt bad DVI file:\ char without setfont\\
	\tt (try checking {\em file\/} with dvitype)
	\end{quote}
	and then aborts.
\end{dtt}
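Two of the generic gripes do a little computation before printing:
the font scale in {\tt GripeCannotGetFont}
and the grammar in {\tt GripeMissingFontsPreventOutput}.
The following sketch shows one way to produce the same text.
The function names are hypothetical,
and the messages are formatted into a caller-supplied buffer
instead of being passed to {\tt error}.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef long i32;   /* assumption: long holds at least 32 bits */

/* The font's scale: 1000 * mag / dsz, rounded to nearest. */
static int sketch_font_scale(i32 mag, i32 dsz)
{
    assert(dsz > 0);
    return (int)(1000.0 * (double)mag / (double)dsz + 0.5);
}

/* First line of the "cannot get font" complaint;
   "scaled N" is left out when the scale is 1000. */
static void sketch_font_message(char *buf, size_t size, const char *name,
                                i32 mag, i32 dsz)
{
    int scale = sketch_font_scale(mag, dsz);

    if (scale == 1000)
        snprintf(buf, size, "cannot get font %s", name);
    else
        snprintf(buf, size, "cannot get font %s scaled %d", name, scale);
}

/* "1 missing font prevents", "2 missing fonts prevent". */
static void sketch_missing_message(char *buf, size_t size, int n)
{
    snprintf(buf, size, "%d missing font%s prevent%s output (sorry)",
             n, n == 1 ? "" : "s", n == 1 ? "s" : "");
}
```

For a ten-point font used at twelve points
(a design size of 655360 and an `at size' of 786432 scaled points)
the scale is 1200.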

\subsection{Random Reading}\label{sec:seek}
In order to read a \dvi\ file's postamble first,
the \dvi\ input has to be from a \Unix\ disk file,
not a pipe or socket or a terminal.
The library routine {\tt SeekFile}
will copy such `sequential-only' files to disk (random) files.
If the input is already a random file,
it returns immediately, avoiding work.
The copy is done in a very \Unix ish manner,
and the code to do it attempts to run fast,
as the time spent here can be significant.\footnote
{I got as much as 15\% in some profiles.}
The copy is placed in {\tt \$TMPDIR},
if the environment variable {\tt TMPDIR} is set;
otherwise it goes in {\tt /tmp}.
The file is kept open but {\em unlinked},
so that it does not appear in the file system
(except for a brief moment during creation).
None of this is likely to work on machines that do not run \Unix,
but such machines tend not to have pipes in the first place.
If you are porting the library,
the simplest approach may be to have {\tt SeekFile}
always return its argument.

\begin{dtt}
\item[FILE *SeekFile(FILE *\var{fp})]
	copies the {\tt stdio} file \var{fp}
	into a temporary seekable file if necessary,
	then returns a possibly different {\tt FILE} pointer,
	indicating that the (perhaps new) file is now randomly readable
	(that {\tt fseek} will work).
	{\tt SeekFile} returns {\tt NULL}
	(and sets {\tt errno})
	if it cannot make the copy
	and the original is not randomly readable.
	Usually this is because the temporary file area became full;
	in that case, setting {\tt TMPDIR} will probably fix it.
	If {\tt SeekFile} returns {\tt NULL},
	the original file \var{fp} cannot be used.
\item[FILE *CopyFile(FILE *\var{fp})]
	copies the {\tt stdio} file \var{fp}
	into a temporary seekable file unconditionally.
	It is otherwise like {\tt SeekFile}.
\end{dtt}
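For a port that cannot use the \Unix-specific code,
a slower but portable substitute might look like the sketch below.
It swaps in the standard C {\tt tmpfile} routine
in place of the {\tt TMPDIR} and unlink technique described above,
tests for seekability with a do-nothing {\tt fseek},
and closes the original stream after a successful copy;
all three choices, and the {\tt sketch\_} names, are assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything left on fp into an anonymous temporary file.
   Returns the new stream, positioned at its start, or NULL on failure.
   On success the original stream is closed. */
static FILE *sketch_copy_file(FILE *fp)
{
    char buf[BUFSIZ];
    size_t n;
    FILE *tmp;

    assert(fp != NULL);
    if ((tmp = tmpfile()) == NULL)
        return NULL;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        if (fwrite(buf, 1, n, tmp) != n) {
            fclose(tmp);
            return NULL;
        }
    }
    if (ferror(fp) || fflush(tmp) == EOF) {
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);
    fclose(fp);
    return tmp;
}

/* Return fp itself if it can already seek, else a seekable copy. */
static FILE *sketch_seek_file(FILE *fp)
{
    assert(fp != NULL);
    if (fseek(fp, 0L, SEEK_CUR) == 0)   /* a seek that moves nowhere */
        return fp;
    return sketch_copy_file(fp);
}
```

On systems where {\tt fseek} does not fail for pipes and terminals,
the test in {\tt sketch\_seek\_file} would need to be replaced.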

\subsection{Reading the Postamble First}\label{sec:spa}
Many programs need to read the postamble of \dvi\ files first.
Once the file is randomly readable, this is easy.
There are two routines that help out.
The simpler is {\tt FindPostAmble},
which simply seeks about until it locates the postamble.
This is usually called indirectly via the more complex
{\tt ScanPostAmble}
(which is itself usually called indirectly).
The former is found in {\tt lib/findpost.c},
while the latter uses declarations in {\tt h/postamble.h}
and is in {\tt lib/scanpost.c}.

\begin{dtt}
\item[int FindPostAmble(FILE *\var{f})]
	reads the {\tt stdio} file \var{f} backwards
	to find the postamble.
	If it finds the postamble, it returns 0;
	if the file is not a \dvi\ file,
	or is malformed,
	{\tt FindPostAmble} returns $-1$.
	On success, the file's current seek pointer
	is such that the first {\tt getc}
	will return the {\tt DVI_POST} opcode.
\end{dtt}

{\tt ScanPostAmble} uses two auxiliary structures
to transmit information found in the postamble.
The first, {\tt struct PostAmbleInfo},
holds fixed information;
its fields are named using the prefix {\tt pai}.
The second, {\tt struct PostAmbleFont},
holds information for each font definition;
its fields are named with the prefix {\tt paf}.
\begin{dtt}
\item[i32 pai_PrevPagePointer] is the `previous page pointer',
	which in a postamble points to the first opcode
	for the last page of the \dvi\ file.
\item[i32 pai_Numerator] is a copy of the file's numerator.
	For \TeX\ this is always the value $25400000$.
\item[i32 pai_Denominator] is a copy of the file's denominator.
	For \TeX\ this is always the value $473628672$.
	Multiplying a scaled-points value
	by $\hbox{\it num}/\hbox{\it denom}$ converts it
	to units of $10^{-7}$ meters.
\item[i32 pai_DVIMag] is a copy of the file's magnification.
\item[i32 pai_TallestPageHeight] is a copy of the file's
	`tallest page height' statistic,
	which is not necessarily a realistic assessment
	(since objects can exist `off the page'
	when \TeX\ is asked to smash the sizes of boxes).
\item[i32 pai_WidestPageWidth] is a copy of the file's
	`widest page width' statistic,
	which is not necessarily a realistic assessment.
\item[int pai_DVIStackSize]
	holds a copy of the file's `minimum \dvi\ stack size',
	which is (in all well-formed \dvi\ files)
	at least as big as the maximum depth of {\em push\/} opcodes.
\item[int pai_NumberOfPages]
	holds a copy of the file's `number of pages' statistic.
\item[char *paf_name] is a pointer to the font name, e.g., {\tt "cmr10"}.
\item[int paf_n1] is the length of the first part of the name.
\item[int paf_n2] is the length of the second part of the name.
\item[i32 paf_DVIFontIndex] is the font's \dvi\ index.
\item[i32 paf_DVIChecksum] is the font's checksum
	as of when \TeX\ last read the font.
\item[i32 paf_DVIMag] is the `at size'.
	The ratio of this to the design size,
	times 1000,
	is the font's scale.
\item[i32 paf_DVIDesignSize] is the design size.
	If this is equal to the `at size',
	the font has not been scaled up or down.
\item[ScanPostAmble(FILE *\var*{f},
%%% the next two overfill the line, alas
%void (*\var{hf})(struct PostAmbleInfo *),
%void (*\var{ff})(struct PostAmbleFont *)
\var*{headerfn}, \var{fontfn})]
	calls {\tt FindPostAmble(\var{f})} to locate the postamble.
	If {\tt FindPostAmble} fails,
	or if the next byte is not {\tt DVI_POST},
	{\tt ScanPostAmble} aborts by calling
	{\tt GripeCannotFindPostamble}.
	Otherwise it reads the fixed information
	into a {\tt struct PostAmbleInfo},
	and then calls {\tt void (*\var{headerfn})(struct PostAmbleInfo *)}
	with the address of this structure.
	It is up to \var{headerfn} to save the values it needs,
	as the structure is destroyed immediately afterwards.
	{\tt ScanPostAmble} then reads each font definition in turn,
	and calls {\tt void (*\var{fontfn})(struct PostAmbleFont *)}
	once for each,
	passing the address of a {\tt struct PostAmbleFont}
	containing the information from the \dvi\ file.
	Again, this structure is immediately destroyed,
	so it is up to the called function to save any values it needs.
\end{dtt}

\subsection{Searches: Mapping {\tt i32} Values to Data}\label{sec:search}
Many programs need to map from \dvi\ numbers ({\tt i32} values)
to some arbitrary data structure.
The \mctex\ library
provides a {\em search\/} system,
defined and declared in {\tt lib/search.c} and {\tt h/search.h},
for doing this.
A search table---a mapping from {\tt i32} keys to data---is created
with the {\tt SCreate} routine,
which returns a new object of type {\tt struct search *}.
None of the fields in this object
are directly accessible to user code.
The {\tt SSearch} function searches for, and optionally installs,
a key,
and the {\tt SClear} function empties out a search table
(wipes out all the mappings).
There is, at the moment, no way to completely discard a search table.
The {\tt SSearch} function also takes and returns a number of flags,
defined below.
\begin{dtt}
\item[struct search *SCreate(unsigned \var{dsize})]
	creates a new search table
	which will map {\tt i32} keys
	to objects of size \var*{dsize}.
	Typically this size comes from {\tt sizeof}.
	{\tt SCreate} arranges for these objects to be well-aligned;
	the alignment code may need adjustment
	on some machines,
	but the current code should work
	for machines with up to 128-bit-wide data paths.
	{\tt SCreate} returns the new table,
	or a nil pointer if it is not able to get enough memory.
\item[char *SSearch(struct search *\var*{s},
i32 \var*{key}, int *\var{disp})]
	searches the table \var{s}
	for the data associated with the key \var*{key}.
	The action taken upon finding or not finding the key
	depends on the value of {\tt *\var{disp}};
	{\tt *\var{disp}} is altered upon return
	to describe the ultimate disposition of the key.
	{\tt SSearch} returns a pointer to the associated data
	(if found or created),
	or a nil pointer
	(if not found, or if it should not have been found).
	The disposition flags are as follows:
\begin{dtt}
\item[S_LOOKUP] is a flag that indicates a lookup operation.
	That is, {\tt SSearch} is not supposed to insert the key,
	but merely to see if it is already there.
\item[S_CREATE] is the counterpart to {\tt S_LOOKUP}:
	it indicates that {\tt SSearch} is to insert the key
	(creating space for its data)
	if it is not already there.
\item[S_EXCL] is only for use with {\tt S_CREATE}.
	It indicates that, if the key already exists,
	{\tt SSearch} should return a nil pointer
	and indicate that the key did indeed already exist.
\item[S_COLL] is set upon return
	if and only if {\tt S_EXCL} was given
	and the key already existed
	(and {\tt SSearch} therefore returned a nil pointer).
\item[S_FOUND] is set upon return
	if and only if {\tt S_EXCL} was not given
	and the key already existed
	(and {\tt SSearch} therefore returned a data pointer).
\item[S_NEW] is set upon return
	if and only if {\tt S_CREATE} was given
	and the key did not already exist
	(and {\tt SSearch} therefore returned a data pointer).
\item[S_ERROR] is set upon return
	if and only if {\tt S_CREATE} was given
	and the key did not already exist,
	but something went wrong ({\tt malloc} failed)
	when creating space for the associated data.
	In this case, {\tt SSearch} returns a nil pointer;
	the existing key/data pairs always remain intact.
\end{dtt}
\item[SClear(struct search *\var{s})]
	is a macro that
	removes all the key/data pairs from \var*{s}.
\lin{SEnumerate(struct search *\var{s},
void (*\var{f})(char *\var*{data}, i32 \var{key}))}
	iterates over all the key/data pairs in \var*{s},
	calling the supplied function \var{f} on each pair.
\end{dtt}
Most programs use search tables only indirectly,
via {\tt DVISetState} and {\tt DVIFindFont}.

\subsection{Font Routines}\label{sec:fonts}
The font routines are built out of a number of smaller routines
that handle each font format (GF, PK, etc.),
with the details of the actual format
hidden from the code using the font.
All fonts share the basic information found in the header {\tt h/font.h}.
This contains definitions for a {\tt struct font},
which retains the details about a font in general,
and for a {\tt struct glyph},
which retains the specifics about one particular glyph in some font.
Fonts are obtained with the library routine {\tt GetFont}
(described below)
and glyphs are obtained by using a character code index into a font.

Some of the items in each glyph or font are labelled
`reserved to user code'.
This means that the \mctex\ library itself
makes no use of the item;
it is merely provided as a convenience to callers.
For instance, the \ps\ driver
uses the {\tt FF_USR0} flag in the {\tt f_flags} field
to remember whether some font
is a reference to a \TeX\ bitmap font
or to a built-in \ps\ font.

The basic font fields are listed here.
Except for the pointers and flags,
and except for those fields marked otherwise,
all have type {\tt i32}.
\begin{dtt}
\item[f_flags]
	can contain up to four user flags
	{\tt FF_USR0}, {\tt FF_USR1}, {\tt FF_USR2}, and {\tt FF_USR3}.
	It also retains the {\tt FF_RASTERS} flag,
	which is set if the font was read by {\tt GetFont}
	and clear if the font was read by {\tt GetRasterlessFont}.
	Thus, the user flags must be set and cleared
	using bitwise operations.
\item[f_un]
	is a union of three possibilities,
	{\tt int f_int}, {\tt i32 f_i32}, and {\tt char *f_ptr}.
	These are reserved to user code.
	\TeX\ numbers the fonts in \dvi\ files discontiguously
	and tends to use large numbers.
	Many devices have a limited set of fonts
	and require low numbers.
	Most drivers use the {\tt f_un.f_int} field to `renumber' the fonts
	in some manner.
\item[f_ops]
	is a pointer to the per-font operations:
	that is, the routines and data for using a font
	of whatever kind this may be (PK, GF, PXL, or TFM).
	These operations are not yet described in this manual.
\item[f_details]
	is a pointer to private data for use by the per-font operations.
\item[f_path]
	is a pointer to the path name of the \Unix\ file
	containing the font.
\item[f_dvimag]
	is the font magnification from a \dvi\ file.
\item[f_dvidsz]
	is the font design size from a \dvi\ file.
\item[f_font]
	is a pointer to \TeX 's name for the font (e.g., {\tt cmr10}).
\item[f_scaled]
	is an {\tt int} holding the ratio of the magnification
	to the design size, times 1000 (to make it match \TeX's
	\verb|\font| commands, e.g.,
	\verb|\font \cmr12 = cmr10 scaled 1200|).
\item[f_design_size]
	is the design size of the font as recorded in the font itself
	(which may, but should not, differ from what \TeX\ thought it
	was).
\item[f_checksum]
	is the checksum of the font as recorded in the font itself.
\item[f_pspace]
	is the amount of positive space
	that is considered to be a `kern'
	according to the \dvi\ translation rules.
\item[f_nspace]
	is the amount of negative space
	that is considered to be a `kern',
	and is equal to $-4 \cdot {\tt f\_pspace}$.
\item[f_vspace]
	is the amount of vertical space (positive or negative)
	that is considered to be `in-baseline' motion,
	and is equal to $5 \cdot {\tt f\_pspace}$.
\item[f_hppp]
	is the number of `horizontal pixels per (scaled) point'
	from the font.
	Some fonts do not say; these have an {\tt f_hppp} of 0.
\item[f_vppp]
	is the number of `vertical pixels per (scaled) point'
	from the font.
	Some fonts do not say; these have an {\tt f_vppp} of 0.
\item[f_lowch]
	is an {\tt int} giving the index of the lowest-numbered
	valid glyph, typically 0.
\item[f_highch]
	is an {\tt int} giving one more than the index of the highest-numbered
	valid glyph, typically 128.
	That is, the valid glyphs of any font are in
	the half-open interval
	$[\hbox{\tt f\_lowch}..\hbox{\tt f\_highch})$.
\item[f_gly]
	and {\tt f_glybase} are pointers to a block of glyph data.
	{\tt f_glybase} is the actual pointer,
	while {\tt f_gly} is offset by {\tt -f_lowch},
	to make the glyph accessor macro faster.
	This is not strictly kosher
	but has not yet broken anywhere;
	in the event that it does blow up,
	it can be fixed by redefining the {\tt GLYPH} macro
	to add the offset to the character code
	and eliding the adjustment of {\tt f_gly}
	in {\tt lib/font_subr.c}.
	These two fields are not used anywhere
	outside the internal font routines;
	all normal glyph access goes through the {\tt GLYPH} macro.
\end{dtt}
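
Because {\tt f_flags} mixes the user flags
with the library's own {\tt FF_RASTERS} bit,
assigning to the whole field would destroy library state.
The following is a minimal sketch of the bitwise style this implies.
The flag values and the cut-down {\tt struct demo_font}
are stand-ins invented for illustration;
the real definitions are those in {\tt h/font.h}.
\begin{quote}
\begin{verbatim}
/* Hypothetical stand-ins; the real values live in h/font.h. */
#define FF_RASTERS 0x01   /* assumed bit for the library flag */
#define FF_USR0    0x10   /* assumed bit for one user flag */

struct demo_font { int f_flags; };

/* Record that the font is a built-in device font. */
static void mark_builtin(struct demo_font *f)
{
    f->f_flags |= FF_USR0;    /* set; other bits untouched */
}

/* Record that the font is an ordinary bitmap font. */
static void mark_bitmap(struct demo_font *f)
{
    f->f_flags &= ~FF_USR0;   /* clear; other bits untouched */
}

static int is_builtin(const struct demo_font *f)
{
    return (f->f_flags & FF_USR0) != 0;
}
\end{verbatim}
\end{quote}
Which meaning a driver attaches to each user flag is its own affair;
the `built-in font' reading used here
merely echoes the \ps\ driver convention mentioned earlier.
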
The {\tt f_pspace}, {\tt f_nspace}, and {\tt f_vspace} fields
are used to help drivers get the pixel spacing right.
Here is what \dvitype\ has to say about this:
\begin{quotation}
Rounding to the nearest pixel is best done in [this] manner\dots
so as to be inoffensive to the eye:
When [a] horizontal motion is small, like a kern,
[the device position in pixels] changes by rounding the kern;
but when the motion is large,
[the device position] changes by rounding the [\dvi\ position]
so that accumulated rounding errors disappear.
We allow a larger space in the negative direction
than in the positive one,
because \TeX\ makes comparatively large backspaces
when it positions accents.

Vertical motion is done similarly,
but with the threshold between ``small'' and ``large''
increased by a factor of five.
The idea is to make fractions like ``$1\over2$'' round consistently,
but to absorb accumulated rounding errors in the baseline-skip moves.

\dots A sequence of consecutive rules,
or consecutive characters in a fixed-width font
whose width is not an integer number of pixels,
can cause [the device position] to drift far away
from a correctly rounded value.
\dvitype\ ensures that the amount of drift will never exceed
[a small number of] pixels.
\end{quotation}
\label{sec:drift}
{\tt font.h} defines two macros,
{\tt F_SMALLH(\var*{f}, \var{m})} and
{\tt F_SMALLV(\var*{f}, \var{m})},
which test whether the \dvi\ motion \var{m}
is `small' (should be treated like a kern)
in the horizontal and vertical directions
respectively.
The drift-correction code itself appears later,
in \S\ref{sec:ds}.
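
As a rough illustration of the tests these macros perform,
here is a sketch written as ordinary functions.
It follows the \dvitype\ rules quoted above
and the relations given for {\tt f_nspace} and {\tt f_vspace};
the names {\tt demo_font}, {\tt set_spaces}, {\tt small_h},
and {\tt small_v} are hypothetical,
and the use of strict inequalities at the thresholds is an assumption,
not a transcription of {\tt font.h}.
\begin{quote}
\begin{verbatim}
typedef long i32;   /* stand-in for the i32 of h/types.h */

struct demo_font {
    i32 f_pspace;   /* positive kern threshold */
    i32 f_nspace;   /* negative threshold, -4 * f_pspace */
    i32 f_vspace;   /* vertical threshold, 5 * f_pspace */
};

static void set_spaces(struct demo_font *f, i32 pspace)
{
    f->f_pspace = pspace;
    f->f_nspace = -4 * pspace;
    f->f_vspace = 5 * pspace;
}

/* Nonzero if horizontal motion m is small (kern-like). */
static int small_h(const struct demo_font *f, i32 m)
{
    return m < f->f_pspace && m > f->f_nspace;
}

/* Nonzero if vertical motion m is small (in-baseline). */
static int small_v(const struct demo_font *f, i32 m)
{
    return m < f->f_vspace && m > -f->f_vspace;
}
\end{verbatim}
\end{quote}
A driver would round a small motion by itself
and add the result to the device position,
but recompute the device position from the \dvi\ position
after a large one.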

Each glyph includes the following fields.
Again, all are type {\tt i32},
except flags and pointers,
and except where noted.
\begin{dtt}
\item[g_flags]
	can contain up to four user flags
	{\tt GF_USR0}, {\tt GF_USR1}, {\tt GF_USR2}, and {\tt GF_USR3}.
	It also contains the flag {\tt GF_SEEN},
	which is needed by most drivers
	to avoid sending unused glyphs to the device;
	this flag is also reserved to user code.
	There is one more flag, however
	({\tt GF_VALID}, set if the glyph really exists),
	so these flags must be set and cleared only with bitwise
	operations.
\item[g_rotation]
	is a {\tt short} holding the current rotation index
	of the current raster (if any) for the glyph.
\item[g_raster]
	is a pointer to the glyph's current raster, if any.
\item[g_height]
	gives the height in pixels of the glyph's raster.
\item[g_width]
	gives the width in pixels of the glyph's raster.
\item[g_xorigin]
	gives the glyph's x origin.
\item[g_yorigin]
	gives the glyph's y origin.

	Small nonnegative x and y origin values
	are `within' the glyph;
	negative values are to the left of or above the pixels,
	and large positive values are to the right of or below the pixels.
	The raster for the glyph is `top down';
	that is, the first few bytes give the top part of a letter `T',
	for instance.
\item[g_rawtfmwidth]
	gives the glyph's `tfm' width in {\sc fix}es,
	which are units peculiar to \tfm\ files.
	This field can generally be ignored in favour of {\tt g_tfmwidth}.
\item[g_tfmwidth]
	gives the glyph's `tfm' width in scaled points.
	(The conversion from raw widths to tfm widths
	is done by {\tt lib/scaletfm.c}.)
	This is \TeX's idea of the width of the character.
	The actual width in pixels will usually differ slightly,
	being rounded to the nearest whole number.
\item[g_xescapement]
	gives the width of the glyph in scaled\footnote
{Why this is so remains a mystery,
since it is not generally possible to move by less than a whole pixel.}
	pixels (65536 scaled pixels = 1 pixel),
	unless it has the value {\tt NO_ESCAPEMENT}.
	The GF and PK font formats
	are the only ones with escapement values.
\item[g_yescapement]
	gives the `y escapement', in scaled pixels.
	\TeX\ does not deal with fonts that have y escapements,
	so this can generally be ignored.
\item[g_pixwidth]
	is reserved to user code,
	and intended to hold the width of each glyph in pixels,
	from {\tt (g->g_xescapement >> 16)}
	(if it is not the special value {\tt NO_ESCAPEMENT})\footnote
{If there are any values with something other than zero in the low 16
bits---the author has not seen any such---it might be better to round
the value by adding {\tt 1 << 15} before shifting right.
At the moment, however, all the drivers ignore {\tt g_xescapement}
entirely, computing {\tt g_pixwidth} from {\tt g_tfmwidth}.
It will be up to whoever eventually adds code to use {\tt g_xescapement}
to discover whether this rounding is in fact necessary.}
	or from {\tt fromSP(g->g_tfmwidth)};
	the results are normally identical.
	Most drivers will compute {\tt g_pixwidth}
	upon first encountering the glyph,
	possibly also downloading it to the device,
	and then set {\tt GF_SEEN} in {\tt g_flags}
	to avoid re-doing the computation (and the download).
\item[g_index]
	is the glyph index within the font.
	That is, {\tt g->g_index} for the glyph obtained with
	{\tt GLYPH(\var*{f}, \var{c})} is equal to \var*{c}.
\item[g_un]
	is a union of three possibilities.
	The first two,
	{\tt char *g_details} and {\tt i32 g_integer},
	are reserved to the font reading routines.
	The last, {\tt struct glyph *g_next},
	is used by the font code
	to keep a linked list of free glyph structures.
\end{dtt}
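
The rounding suggested in the footnote to {\tt g_pixwidth}
can be sketched as follows.
This is only an illustration of the arithmetic:
the sentinel value {\tt DEMO_NO_ESCAPEMENT}
and the function names are hypothetical,
the fallback width is assumed to have been computed already
(for instance by {\tt fromSP}),
and escapements are assumed to be nonnegative.
\begin{quote}
\begin{verbatim}
typedef long i32;   /* stand-in for the i32 of h/types.h */

/* Hypothetical sentinel; the real NO_ESCAPEMENT may differ. */
#define DEMO_NO_ESCAPEMENT (-1L)

/* Round a nonnegative scaled-pixel value to whole pixels. */
static i32 escapement_to_pixels(i32 xesc)
{
    return (xesc + ((i32)1 << 15)) >> 16;
}

/* Pick the pixel width: the escapement if the font gave one,
   otherwise the width already derived from g_tfmwidth. */
static i32 pixwidth(i32 xesc, i32 tfm_pixels)
{
    if (xesc != DEMO_NO_ESCAPEMENT)
        return escapement_to_pixels(xesc);
    return tfm_pixels;
}
\end{verbatim}
\end{quote}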

Glyph rasters come in four {\em rotations},
called `normal', `left', `down', and `right'
(see also Table~\ref{tab:rot}).
The normal rotation
puts the bits in the raster upright:
an `{\sf F}' looks like an `F'.
Rotating it left gives it a $1/4$ turn counterclockwise:
the `{\sf F}' becomes a horizontal line
with two vertical lines jutting up,
the long one on the left and the short one in the middle.
Effectively, the characters `rest on their backs'.
Rotating it down gives it a $1/2$ turn,
inverting it---turning it upside down.
Rotating it right gives it a $3/4$ turn counterclockwise,
or a $1/4$ turn clockwise,
so that it `rests on its front',
and our `{\sf F}' becomes a horizontal line
with two vertical lines hanging down,
the long one on the right and the short one in the middle.
Rotations other than normal
are computed by {\tt lib/rotate.c}.
At present,
all such rotations are performed by repeating a rightward rotation
until the character is properly oriented,
so that the normal rotation is fastest,
right rotation is not as fast,
down rotation is slow,
and left rotation is very slow.
All the existing drivers that print landscaped output
use right rotation
(if any---some devices can rotate internally),
so this has not been a problem.

\begin{table*}
\centering
\begin{tabular}{@{\coltt}ll}
\multicolumn1c{\bf Name} & \multicolumn1c{\bf Rotation} \\
\hline
ROT\_NORM & Normal (unrotated) \\
ROT\_LEFT & Left ($1/4$ turn ccw) \\
ROT\_DOWN & Down ($1/2$ turn) \\
ROT\_RIGHT & Right ($3/4$ turn ccw, or $1/4$ turn cw)
\end{tabular}
\caption{Rotation Codes}\label{tab:rot}
\end{table*}
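
To make the right rotation concrete,
here is a small self-contained sketch of a $1/4$ turn clockwise
applied to a packed raster.
It is not the code in {\tt lib/rotate.c};
the function names are hypothetical,
and it assumes that the leftmost pixel of each row
is the most significant bit of the row's first byte.
\begin{quote}
\begin{verbatim}
#include <stdlib.h>

/* Bytes per row for a raster w pixels wide. */
static int row_bytes(int w) { return (w + 7) / 8; }

static int get_bit(const unsigned char *r, int w, int x, int y)
{
    return (r[y * row_bytes(w) + x / 8] >> (7 - x % 8)) & 1;
}

static void set_bit(unsigned char *r, int w, int x, int y)
{
    r[y * row_bytes(w) + x / 8] |=
        (unsigned char)(0x80 >> (x % 8));
}

/* Return a new raster, h wide and w high, holding the source
   turned 1/4 turn clockwise; NULL if out of memory. */
static unsigned char *
rotate_right(const unsigned char *src, int w, int h)
{
    unsigned char *dst = calloc((size_t)w * row_bytes(h), 1);
    int x, y;

    if (dst == NULL)
        return NULL;
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
            if (get_bit(src, w, x, y))
                set_bit(dst, h, h - 1 - y, x);
    return dst;
}
\end{verbatim}
\end{quote}
Under this scheme a down rotation is two such steps
and a left rotation is three,
which matches the relative costs described above.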

Returning to {\tt font.h},
we find various macros and function declarations.
The functions themselves are found in {\tt lib/font.c}
and {\tt lib/font_subr.c}.
The following are available to user code:
\begin{dtt}
\item[struct glyph *GLYPH(struct font *\var*{f}, \var{c})]
	is a macro that locates the glyph for character \var{c}
	(\var{c} being any integral expression) in font \var*{f}.
	The result is not necessarily valid.
	It is not necessary to check that \var{c}
	is between \var{f}'s {\tt f_lowch} and {\tt f_highch};
	if \var{c} is out of range,
	the result will be invalid.
\item[int GVALID(struct glyph *\var{g})]
	is a macro that is true if and only if the glyph \var{g}
	(a value produced by {\tt GLYPH}) is valid.
	No fields of \var{g} may be used unless it is valid.
\item[int HASRASTER(struct glyph *\var{g})]
	is a macro that is true if and only if the glyph \var{g}
	has an imageable raster (one with nonzero width and height).
\item[char *RASTER(struct glyph *\var*{g}, struct font *\var*{f}, int \var{r})]
	is a macro that obtains the raster
	for glyph \var{g} in font \var{f}
	in the rotation given by \var*{r},
	which must be one of the {\tt ROT} values
	from Table~\ref{tab:rot}.
	The result of this macro
	is a pointer to the topmost and leftmost byte of the raster.
	There are {\tt g_width} (`$w$') pixels
	in each row of the raster, with extra zero bits added
	to make each row a whole number of bytes,
	and there are {\tt g_height} (`$h$') rows.
	The total number of bytes in the raster is thus
	$h \cdot ((w+7)/8)$ (where `$/$' denotes integer division).
\item[struct font *GetFont(char *\var*{name},
i32 \var*{dvimag}, i32 \var*{dvidsz}, char *\var*{dev}, char **\var{path})]
\leavevmode\\ % change overfull hbox to quiet underfull   XXX
	is a function that obtains and returns a font.
	The global conversion (\S\ref{sec:conv})
	must be set at this time.
	The four arguments
	\var*{name}, \var*{dvimag}, \var*{dvidsz}, and \var{dev}
	determine which font is used.
	The first three come directly from the \dvi\ file
	(usually via {\tt DVISetState});
	the fourth is a string
	giving the name of the print engine used,
	to be matched against `spec' fields in a fontdesc file.
	The \var{path} argument
	must be the address of a character pointer;
	into this will be stored the path name of the font used
	(the same as {\tt f_path}), or, if no font could be found,
	a `canonical' path name to be used in error messages.
	If {\tt GetFont} returns {\tt NULL},
	the caller can use {\tt GripeCannotGetFont} to complain.
	Otherwise {\tt GetFont} returns a pointer to a font
	suitable for use with the macros and fields described above.
%\item[GetRasterlessFont(char *\var*{name},
%i32 \var*{dvimag}, i32 \var*{dvidsz}, char *\var*{dev}, char **\var{path})]
\item[struct font *GetRasterlessFont]
	is just like {\tt GetFont}---its arguments
	are identical---but the resulting font, if any,
	cannot be used with the {\tt RASTER} macro.
	A few drivers need glyph information, but not rasters;
	these can use {\tt GetRasterlessFont},
	which is in some cases faster than {\tt GetFont}.
\item[char *Font_TeXName(struct font *\var{f})]
	is a function that returns a pointer to a `\TeX\ style' name
	for the font \var*{f},
	such as {\tt cmr10 scaled 1200}.
	The return value points into static storage
	which will be overwritten by the next call
	to {\tt Font_TeXName}.
\item[FreeRaster(struct glyph *\var{g})]
	is a \void\ function that discards the raster for glyph \var*{g},
	making the space available for re-use.
	Drivers for devices that retain the rasters internally
	should free each raster after downloading it
	to reduce the demand for system memory.
\item[FreeGlyph(struct font *\var*{f}, struct glyph *\var{g})]
	is a \void\ function that discards the glyph \var{g},
	including its raster (if any),
	from font \var*{f},
	and makes the glyph structure available for re-use.
\item[FreeFont(struct font *\var{f})]
	is a \void\ function that discards the font \var*{f},
	including all its undiscarded glyphs and glyph-rasters,
	making its space available for re-use.
\item[fontinit(char *\var{file})]
	is a \void\ function that reads the named fontdesc file \var*{file},
	or the system default file if \var{file} is a nil pointer
	of type {\tt char~*}.
	If it has not been called already,
	this function is called automatically
	on the first {\tt GetFont};
	most programs will have no need to call this,
	particularly since it is (at least currently)
	an error to call it more than once.\footnote
	{Someday it may reset the configuration.}
\end{dtt}
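
The raster layout described under {\tt RASTER}
can be checked with a short sketch
that computes the size of a raster
and renders one row as text.
The function names are hypothetical,
and the bit order within each byte
(leftmost pixel in the most significant bit)
is an assumption.
\begin{quote}
\begin{verbatim}
/* Total bytes in a w by h raster, rows padded to bytes. */
static long raster_bytes(int w, int h)
{
    return (long)h * ((w + 7) / 8);
}

/* Write row y of the raster into out as w characters and a
   terminating NUL; out must have room for w + 1 bytes. */
static void row_to_text(const unsigned char *raster,
                        int w, int y, char *out)
{
    const unsigned char *row = raster + (long)y * ((w + 7) / 8);
    int x;

    for (x = 0; x < w; x++)
        out[x] = (row[x / 8] & (0x80 >> (x % 8))) ? '#' : '.';
    out[w] = '\0';
}
\end{verbatim}
\end{quote}
A raster 10 pixels wide thus occupies two bytes per row,
the last six bits of each second byte being padding.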

\subsection{DVI Interpreter Global State}\label{sec:ds}
The header {\tt h/dvistate.h},
together with the library file {\tt lib/dvistate.c},
defines most of the state variables a driver needs
to interpret \dvi\ files,
and declares and defines functions
that take care of much of the work involved.
The first definition is for a \dvi\ stack structure type {\tt DviStack},
which has eight {\tt i32} fields
{\tt h}, {\tt v}, {\tt w}, {\tt x}, {\tt y}, and {\tt z}
(corresponding to the values that appear in \dvi\ files)
and {\tt hh} and {\tt vv}
(versions of {\tt h} and {\tt v} rounded to pixels).
After this comes the global state structure {\tt ds},
which is a variable of type {\tt struct dvi_state}.
The first two members of {\tt ds} must be set by the driver
before calling {\tt DVISetState}.
The rest are then set by {\tt DVISetState} based on its arguments
and on the input \dvi\ file:
\begin{dtt}
\item[int ds_usermag]
	is the user-supplied magnification
	(magnification in addition to that found in the \dvi\ file),
	typically from a {\tt -m} flag.
	This should be set to 1000 if no magnification is to be applied.
\item[int ds_maxdrift]
	is the maximum drift value in pixels,
	normally enough to make $1/100$th of an inch.
	On a 300 dpi printer, for instance,
	{\tt ds_maxdrift} should be set to 3.
\item[i32 ds_num]
	is set to the numerator from the \dvi\ file.
	For \TeX\ this is always the value $25400000$.
\item[i32 ds_denom]
	is set to the denominator from the \dvi\ file.
	For \TeX\ this is always the value $473628672$.
	Multiplying a scaled-points value
	by $\hbox{\it num}/\hbox{\it denom}$ converts it
	to units of $10^{-7}$ meters.
\item[i32 ds_dvimag]
	is set to the magnification from the \dvi\ file,
	typically $1000$.
\item[i32 ds_maxheight]
	is set to the maximum page height value
	from the \dvi\ file postamble.
\item[i32 ds_maxwidth]
	is set to the maximum page width value
	from the \dvi\ file postamble.
\item[i32 ds_prevpage]
	is initially set to point to the last page of the \dvi\ file,
	according to the file's postamble.
	{\tt DVIBeginPage} updates it according to each page,
	and if such behaviour is appropriate for the device,
	a driver can use {\tt ds_prevpage}
	to iterate backwards through the file;
	cf.\ Matthew 19:30.\footnote{kilroy was here}
\item[int ds_npages]
	is set to the number of pages in the \dvi\ file,
	according to the file's postamble.
\item[DviStack *ds_stack]
	is set to point to a stack of {\tt DviStack} structures
	that is large enough to hold all values pushed and popped
	in the \dvi\ file (according to its postamble, at least).
\item[DviStack *ds_sp]
	is set by {\tt DVIBeginPage} to the base of the stack,
	as required by the \dvi\ interpretation rules.
\item[DviStack ds_cur]
	holds the current values
	of {\tt h}, {\tt v}, etc.
	There are some shorthand macros to obviate the need
	to type {\tt ds.ds_cur.h}, etc., constantly.
\item[DviStack ds_fresh]
	holds the `zero' values
	to be used at the beginning of each page.
	Its {\tt w}, {\tt x}, {\tt y}, and {\tt z} values
	are actually zero,
	but its {\tt h} and {\tt v} and {\tt hh} and {\tt vv} values
	have a margin precomputed.
	This works because all \dvi\ motion
	is relative to the current position;
	starting at a point
	one inch from the upper left corner of the page
	will automatically leave a proper margin.
	The position of this point
	is computed by {\tt DVISetState}; see below for details.
\item[struct search *ds_fonts]
	is a pointer to a search structure (\S\ref{sec:search}).
	{\tt DVIFindFont} uses this search structure
	to convert {\tt i32} \dvi\ font index values
	to pointers to fonts.
\item[FILE *ds_fp]
	is set to the stdio file to be read,
	which may differ from the one given to {\tt DVISetState}
	(possibly being a temporary copy instead).
\end{dtt}
The two maximum page size values should not be used
for positioning the output from a driver;
experience has shown that it is better to use a fixed margin
and allow the user to adjust it if necessary.
See the source to \dvitype\ for more specific arguments.
However, they are here if needed.
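
Two of the values in the list above
lend themselves to a quick numerical check.
The sketch below derives a drift limit of $1/100$th of an inch
from the device resolution
and applies the $\hbox{\it num}/\hbox{\it denom}$ ratio
to a scaled-points value.
Both function names are hypothetical,
and rounding the drift limit to the nearest pixel
is an assumption rather than a rule of the library.
\begin{quote}
\begin{verbatim}
typedef long i32;   /* stand-in for the i32 of h/types.h */

/* Pixels in 1/100 inch at the given resolution, rounded. */
static int maxdrift_for(int dpi)
{
    return (dpi + 50) / 100;
}

/* Convert scaled points to units of 1e-7 metre using the DVI
   numerator and denominator; double avoids i32 overflow. */
static double sp_to_e7m(i32 sp, i32 num, i32 denom)
{
    return (double)sp * (double)num / (double)denom;
}
\end{verbatim}
\end{quote}
With \TeX's values for the numerator and denominator,
one point ($65536$ scaled points)
comes out at roughly $3514.6$ units,
or about $0.35$ millimetres.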

Following these declarations are some shorthand macros,
which define the names {\tt dvi_h}, {\tt dvi_v},
{\tt dvi_hh}, {\tt dvi_vv},
{\tt dvi_w}, {\tt dvi_x}, {\tt dvi_y}, and {\tt dvi_z}
as their longer counterparts in {\tt ds.ds_cur}.
Finally, {\tt dvistate.h} declares
the routine {\tt DVIFindFont},
which converts its single {\tt i32} argument
into a pointer to a font
(without fail, as it aborts on error),
the routines {\tt DVIRule} and {\tt DVIBeginPage},
and the macro {\tt FIXDRIFT},
which limits (to {\tt ds.ds_maxdrift}) the amount of drift
(see \S\ref{sec:drift}, p.~\pageref{sec:drift})
between {\tt dvi_h} and {\tt dvi_hh}
and between {\tt dvi_v} and {\tt dvi_vv}.
The first argument to {\tt FIXDRIFT} must be the device location
({\tt dvi_hh} or {\tt dvi_vv});
the second must be the value of {\tt h} or {\tt v}
rounded to the nearest pixel.
An example that uses these functions appears in \S\ref{sec:tour};
here are their definitions.
\begin{dtt}
\item[DVISetState(FILE *\var*{fp},
%struct font *(*\var{fn})(char *\var*{name},
%	i32 \var*{mag}, i32 \var{dsz}),
\var*{fn},
int \var*{dpi}, int \var*{xoff}, int \var{yoff})]
	is a \void-valued function that
	reads the postamble and preamble of the \dvi\ file
	given by the {\tt stdio} file \var*{fp}.
	It stores in {\tt ds.ds_fp}
	a version of \var{fp} that is randomly readable (\S\ref{sec:seek}),
	sets up the search structure {\tt ds.ds_fonts},
	and then scans through the file's postamble (\S\ref{sec:spa})
	to find out which fonts the \dvi\ file needs.
	At this time the \var{dpi} argument
	is combined with the numerator and denominator
	from the postamble
	to set the global conversion (\S\ref{sec:conv}).
	The margins in {\tt ds.ds_fresh} are set based on \var{dpi}
	and on x and y offsets \var{xoff} and \var*{yoff}.
	These offsets are measured in $1/1000$ths of an inch;
	positive values move the margin down and right.
	Each font listed in the postamble
	is obtained by calling
	{\tt struct font *(*\var{fn})(char *\var*{name},
		i32 \var*{mag}, i32 \var{dsz})},
	passing the font's name,
	\dvi\ magnification, and \dvi\ design size;
	\var{fn} must return a pointer to a font,
	or, if no font can be found, a nil pointer.
	It is the responsibility of \var{fn}
	to generate an appropriate error if no font is found;
	{\tt DVISetState} will only summarise such errors
	(and subsequently abort).
	{\tt DVISetState} verifies the checksums of the fonts returned,
	producing a warning if there is a mismatch
	between the \dvi\ file and the font itself.
\item[struct font *DVIFindFont(i32 \var{n})]
	returns the same font pointer for \dvi\ index \var{n}
	that the font function given to {\tt DVISetState} returned
	when font \var{n} was defined in the postamble.
	If font \var{n} was not defined,
	{\tt DVIFindFont} aborts by calling {\tt GripeNoSuchFont}.
	The translation from \dvi\ index to pointer
	is done in part by {\tt SSearch}.
\item[DVIRule(void (*\var{fn})(\var*{height}, \var{width}), int \var{advance})]
	is a \void-valued function
	that reads (from {\tt ds.ds_fp})
	a rule height and width,
	then calls the function \var{fn},
	passing the result of converting these to pixels
	using the global conversion.
	Then, if \var{advance} is not zero,
	{\tt DVIRule} adjusts the \dvi\ {\tt h} and {\tt hh} values
	and corrects for over-drift.
\item[DVIBeginPage(void (*\var{fn})(i32 *\var{count}))]
	is a \void-valued function
	that reads the ten \cccount\ register values
	from the \dvi\ file {\tt ds.ds_fp},
	storing these in a local array,
	then reads the previous page pointer into {\tt ds.ds_prevpage}.
	It then resets the \dvi\ position variables
	to {\tt ds.ds_fresh},
	resets the stack pointer {\tt ds.ds_sp},
	and finally calls the given function \var*{fn},
	passing it the address of the first element of the array
	holding the ten \cccount\ values.
\end{dtt}
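
The effect of drift limiting can be sketched in a few lines.
{\tt FIXDRIFT} itself is a macro
whose definition is in {\tt dvistate.h};
the function below, under the hypothetical name {\tt fix_drift},
only illustrates the rule from \dvitype\
and returns the corrected position rather than storing it.
\begin{quote}
\begin{verbatim}
typedef long i32;   /* stand-in for the i32 of h/types.h */

/* Return the device position hh pulled back to lie within
   maxdrift pixels of target, where target is the DVI
   position rounded to the nearest pixel. */
static i32 fix_drift(i32 hh, i32 target, i32 maxdrift)
{
    if (hh - target > maxdrift)
        return target + maxdrift;
    if (target - hh > maxdrift)
        return target - maxdrift;
    return hh;
}
\end{verbatim}
\end{quote}
With a limit of 3 pixels,
a device position of 110 against a rounded \dvi\ position of 100
is pulled back to 103,
while a device position of 102 is left alone.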

\subsection{{\tt\string\special} Decoding}
The header file {\tt h/sdecode.h}
and the library file {\tt lib/sdecode.c}
provide decoding routines for \TeX\ \verb|\special|s
that allow a reasonably simple and uniform interface
to driver-specific functions.\footnote
{With any luck, the interface will prove to be expandable
when (if?) the \TeX\ Users Group put together
a standard for {\tt\string\special} interpretation.
People using the library should be aware, however,
that this interface is subject to change if necessary.
(Of course, the same goes for the rest of the library,
although most of the other code is less likely to be affected.)}
The library routine {\tt SDecode} reads a sequence of strings
with a simple syntax as suggested by the following BNF:
\begin{center}
\begin{tabular}{@{}l@{}l@{}l}
list & ::= & words $\mid$ list semicolon words \\
words & ::= & word $\mid$ words space word \\
\end{tabular}
\end{center}
A `word' is a sequence of characters other than `space' and `semicolon';
these two classes are by default set to whitespace (space, tab, etc.)\
and a semicolon,
respectively.\footnote
{Actually, newline is also considered a `semicolon' by default.}
The set of characters treated as spaces and semicolons can be overridden
by calling the {\tt SDsetclass} function, described below.

After breaking the \verb|\special| text into a sequence of `words',
{\tt SDecode} looks up the first word in a table.
The table associates the word with argument types and with a C function.
The remaining words, if any,
are interpreted according to the argument type field in the table entry.
If all goes well,
{\tt SDecode} calls the C function,
passing arguments depending on the argument type field.
The argument list always includes the name of the function itself
(i.e., the first word) as the first argument.
Except for {\tt sda_rest},
the remaining arguments, if any, correspond
in order
to the words decoded.
See Table~\ref{tab:sda} for a complete list of types.
Symbolically,
`d' stands for a decimal ({\tt i32}) argument,
`f' for a floating-point ({\tt double}) argument,
`s' for a string ({\tt char *}) argument,
`x' for a hexadecimal ({\tt i32}) argument,
and an `n' prefix indicates an array (of any size) of the following type.
Only a fixed set of types is allowed,
because of the way C functions work.
(There is no way to build a `variadic' argument list
and pass it to a called function.)

\begin{table*}
\centering
\begin{tabular}{|@{\coltt}l@{\coltt}l|}
\hline
\multicolumn1{|c}{\bf Type} & \multicolumn1{c|}{\bf Function Type} \\
\hline
sda\_none & void (char *) \\
sda\_s & void (char *, char *) \\
sda\_d & void (char *, i32) \\
sda\_f & void (char *, double) \\
sda\_dd & void (char *, i32, i32) \\
sda\_ff & void (char *, double, double) \\
sda\_ddddff & void (char *, i32, i32, i32, i32, double, double) \\
sda\_nd & void (char *, int \var*{arraysize}, i32 *) \\
sda\_nx & void (char *, int \var*{arraysize}, i32 *) \\
\hline
sda\_rest & void (char *, i32 \var*{len1}, char *, i32 \var{len2}) \\
\hline
\end{tabular}
\caption{{\tt SDecode} Argument Types}\label{tab:sda}
\end{table*}

The table argument to {\tt SDecode}
should be the address of the first element of an array of type
{\tt struct sdecode}.
This structure contains the following members (in order):
{\tt char *sd_name}, {\tt enum sd_args sd_args}, and
{\tt void (*sd_fn)()}.
The {\tt sd_name} field is the word to match.
The {\tt sd_args} field is one of the {\tt sda} values from
Table~\ref{tab:sda}.
After decoding the arguments,
{\tt SDecode} calls {\tt sd_fn} with arguments as in the table.
If the first word is not in the table,
or if one or more arguments are missing or malformed---for instance,
if {\tt sd_args} calls for a floating point number,
but the argument word is not a proper (C-style)
double precision number---{\tt SDecode} complains
(via {\tt error}) and skips to the next `semicolon'.
If there are extra arguments,
{\tt SDecode} prints a warning (again via {\tt error}),
but calls the C function.
In either case,
{\tt SDecode} resumes after the `semicolon'.

An {\tt sda_rest} function is `extra special'.
It receives as its arguments its own name,
then a length \var*{len1}, a pointer, and a second length \var*{len2}.
The first length \var{len1} is the number of bytes of \verb|\special|
stored in the buffer given by the pointer argument.
Following these,
the rest of the \verb|\special| is \var{len2} bytes long.
It is up to the function to read all of those bytes.
It is possible for such a function to seek backwards \var{len1} bytes,
and call {\tt SDecode} with a different table,
or after calling {\tt SDsetclass},
for certain kinds of backward compatibility.
In general, however,
{\tt sda_rest} functions
are best avoided whenever possible.

Literal semicolons, white space, and other special characters
can be included by quoting,
so that in \TeX,
the command
\begin{quote}
\begin{verbatim}
\special{"this is a strange word"}
\end{verbatim}
\end{quote}
is interpreted as a single word.
Backquote always quotes the next (single) character,
both inside and outside quotes:
\begin{quote}
\begin{verbatim}
\special{text "He said ``Hello'"}
\end{verbatim}
\end{quote}
expands to the two words `{\tt text}' and `{\tt He said `Hello'}'.
Octal and C-style escapes are {\em not\/} provided,
since \TeX\ can produce any character code directly.
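
The quoting rules can be illustrated with a small word splitter.
This is a sketch of the rules as stated here,
not the code in {\tt lib/sdecode.c};
the name {\tt next_word} is hypothetical,
the default character classes are hard-wired,
and the caller is assumed to step over each semicolon itself.
\begin{quote}
\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Copy the next word from *pp into out (capacity outsz,
   assumed nonzero) and advance *pp past it.  Returns 1 if
   a word was read, 0 at a semicolon or at the end. */
static int next_word(const char **pp, char *out, size_t outsz)
{
    const char *p = *pp;
    size_t n = 0;
    int inquote = 0, seen = 0;

    while (*p != '\0' && strchr(" \t", *p) != NULL)
        p++;                     /* skip leading spaces */
    for (; *p != '\0'; p++) {
        if (*p == '`' && p[1] != '\0') {
            p++;                 /* next char is literal */
        } else if (*p == '"') {
            inquote = !inquote;  /* delimiter, not copied */
            seen = 1;
            continue;
        } else if (!inquote && strchr(" \t;\n", *p) != NULL) {
            break;               /* unquoted separator */
        }
        if (n + 1 < outsz)
            out[n++] = *p;
        seen = 1;
    }
    out[n] = '\0';
    *pp = p;
    return seen;
}
\end{verbatim}
\end{quote}
Applied to the text of the second example above,
successive calls yield the word {\tt text},
then the quoted phrase with its doubled backquote reduced to one,
and then report that no words remain.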

As an example,
under the standard decoding,
\begin{quote}
\begin{verbatim}
\special{size 1.5; text "hello world"; put}
\end{verbatim}
\end{quote}
would, given a suitable decode table,
call the function associated with `size'
with the {\tt double} argument 1.5,
then call the function associated with `text'
with the \mbox{\tt char *} argument `hello world',
then call the function associated with `put'.
All three functions
would also receive their names
({\tt size}, {\tt text}, and {\tt put})
as their first argument (type \mbox{\tt char *}).
This last is sometimes used
to allow a single C function to implement several similar functions.
The \ps\ driver, for instance,
uses the same function for {\tt hsize}, {\tt vsize}, and so on,
simply passing the parameter to a \ps\ function
whose name is derived directly from the first word.

The functions in {\tt lib/sdecode.c}, then, are as follows:
\begin{dtt}
\item[int SDsize(struct sdecode *\var{table})]
	is a macro that computes the number of elements
	in the given \var*{table}.
	{\tt SDsize} actually works on any array,
	but is meant to be used
	to compute the \var*{tsize} argument to {\tt SDecode}.
	The result is an {\tt int}
	rather than a {\tt size_t}
	for compatibility with {\tt SDecode}
	and with older C implementations.
\item[SDecode(FILE *\var*{fp}, i32 \var*{len},
struct sdecode *\var*{tbl}, int \var{tsize})]
	is a \void-valued function that decodes a stream of
	\verb|\special|s from the file \var*{fp}.
	The stream is \var{len} bytes long,
	and the functions that will be recognised
	are those in the given \var*{tbl}.
	The table has \var{tsize} entries.
	The entries must be sorted
	according to the machine's collating sequence.\footnote
	{Thus, anyone using EBCDIC, for instance,
	may have to reorder tables made for ASCII.
	This is another area that is subject to future change.}
\item[SDsetclass(char *\var*{spaces}, char *\var{semis})]
	is a \void-valued function
	that tells {\tt SDecode}
	which characters are to be treated as whitespace
	and which characters are to be treated as semicolons.
	Whitespace separates words,
	while semicolons separate sequences of words.
	Either parameter may be given as {\tt NULL}
	to indicate that {\tt SDsetclass} should use the default.
\end{dtt}
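
The requirement that the table be sorted
suggests a binary search on the first word,
although this manual does not say how {\tt SDecode} performs the lookup.
The sketch below shows such a lookup with the standard {\tt bsearch};
{\tt struct demo_entry} is a cut-down stand-in for {\tt struct sdecode},
and {\tt DEMO_SIZE}, {\tt lookup}, and {\tt do_any}
are hypothetical names.
\begin{quote}
\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct sdecode, without the sd_args field. */
struct demo_entry {
    const char *name;
    void (*fn)(const char *name);
};

static int hits;                 /* counts calls, for the demo */
static void do_any(const char *name) { (void)name; hits++; }

/* Entries in ascending strcmp order, as bsearch requires. */
static struct demo_entry demo_table[] = {
    { "put",  do_any },
    { "size", do_any },
    { "text", do_any },
};

/* Number of elements in an array, in the manner of SDsize. */
#define DEMO_SIZE(t) ((int)(sizeof(t) / sizeof((t)[0])))

static int cmp_entry(const void *key, const void *elem)
{
    return strcmp((const char *)key,
                  ((const struct demo_entry *)elem)->name);
}

/* Find the entry for word, or NULL if there is none. */
static struct demo_entry *lookup(const char *word)
{
    return bsearch(word, demo_table,
                   (size_t)DEMO_SIZE(demo_table),
                   sizeof demo_table[0], cmp_entry);
}
\end{verbatim}
\end{quote}
Because the comparison uses {\tt strcmp},
the required order is that of the machine's character codes,
which is why a table written for ASCII
may need reordering elsewhere.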