DataMuseum.dk presents historical artifacts from the history of: DKUUG/EUUG Conference tapes
Length: 16796 (0x419c) Types: TextFile Names: »rst.doc«
└─⟦52210d11f⟧ Bits:30007239 EUUGD2: TeX 3 1992-12 └─⟦c319c2751⟧ »unix3.0/TeX3.0.tar.Z« └─⟦036c765ac⟧ └─⟦this⟧ »TeX3.0/TeXcontrib/salkind/doc/rst.doc« └─⟦060c9c824⟧ Bits:30007080 DKUUG TeX 2/12/89 └─⟦this⟧ »./tex82/TeXcontrib/salkind/doc/rst.doc« └─⟦52210d11f⟧ Bits:30007239 EUUGD2: TeX 3 1992-12 └─⟦63303ae94⟧ »unix3.14/TeX3.14.tar.Z« └─⟦c58930e5c⟧ └─⟦this⟧ »TeX3.14/TeXcontrib/salkind/doc/rst.doc«
1. Raster Font File Format

1.1 Introduction

A raster font (RST) file contains a description of a font suitable for use on a specific output device. The images of characters are described by sequences of bits specifying which pixels in a bounding rectangle should be blackened when the character is painted. RST files are not sufficient for typesetting, which requires information about kerning and ligatures (as supplied by TFM files, for example, to the TeX typesetting system). However, RST files contain enough information to print listings nicely, for example.

1.1.1 Fonts, Glyphs, Characters, Symbols

The term font can name many concepts. We will use it below in its more restricted sense of a collection of pictures or glyphs in a particular style, at a particular point size, for a particular output device. An RST file thus describes a font by this definition.

A symbol is an ideal picture with which one or more meanings are commonly associated. The word ``commonly'' is used because a symbol such as the Greek alpha symbol may have a meaning arising from the context of its use, independent of its common meaning. It is ``ideal'' in the sense that it is the same symbol no matter what the style in which it is written or drawn.

The word character is used in different ways. Strictly, a character is an alphabetic symbol, but the word is often used to denote a wider class of symbols, as in the phrase ``the ASCII character set.'' Sometimes it is used to mean a symbol written in a certain style, for example, ``the character Times Roman A.'' We will use the phrase character number in a fairly specific sense: the ordinal of a character in a given font, starting with zero. For example, the character number of the symbol upper-case A in the ASCII character set is 65.

A glyph is a particular instance of a symbol, a picture captured by a graphic description.
In RST files, glyphs are represented as bit maps, and thus, for a given printer with fixed resolution and shape of dot, a glyph always has the same size and shape.

1.1.2 Bit Map

Glyphs are represented in RST files as bit maps, that is, lists of rows; each row is a list of black and white dots. To view the glyph, the rows should be stacked vertically and displayed.

1.1.3 Units of Measure in RST Files

A fix is an integer unit equal to 2^-20 points. There are 72.27 points per inch. To convert a fix, f, to a real number of points, r, use the formula

    r = f / 2^20

1.1.4 Strings

A string in an RST file is represented as a sequence of 8-bit bytes. The first byte contains the number of bytes in the rest of the string. Thus a string can contain at most 255 characters.

1.1.5 File Addressing

An RST file is a sequence of 8-bit bytes (often combined to form multiple-byte fields). All pointers in the file are offsets from the beginning of the file, starting with zero. Thus, a pointer to the first byte in the file would be zero.

1.1.6 Host RST File Representation

This document can say nothing about the actual representation of an RST file on a given file system. Instead, it describes a virtual representation which is a randomly addressable sequence of 8-bit bytes.

In practice, RST files have a host file-system name descriptive of their contents. For example, on some systems, a Computer Modern Roman font, point size 10, magnification 1, would have the filename CMR10.R10 (the R10 encodes the unity magnification). This allows typesetting software to access the correct RST file for its needs (based on name, size and magnification) without searching all the RST files in a font database (but this is very host-environment-dependent).

1.2 RST File Structure

An RST file has four portions: the file mark, the preamble, the glyph directory and the raster section. At the beginning of the file there is a file mark, declaring the file to be an RST file.
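The fix unit of section 1.1.3 and the length-prefixed strings of section 1.1.4 can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and decoding string bytes as Latin-1 is an assumption, since this document does not name a character encoding.

```python
FIX_PER_POINT = 2 ** 20      # one fix is 2^-20 points (section 1.1.3)
POINTS_PER_INCH = 72.27      # as stated in section 1.1.3


def fix_to_points(f):
    """Convert an integer number of fixes to a real number of points."""
    return f / FIX_PER_POINT


def fix_to_inches(f):
    """Convert an integer number of fixes to inches."""
    return fix_to_points(f) / POINTS_PER_INCH


def read_string(data, offset):
    """Read a length-prefixed string starting at `offset`.

    Returns (text, next_offset), where next_offset is the offset of the
    byte following the string. Latin-1 decoding is an assumption.
    """
    length = data[offset]            # first byte: number of bytes that follow
    start = offset + 1
    end = start + length
    if end > len(data):
        raise ValueError("string runs past the end of the data")
    return data[start:end].decode("latin-1"), end
```

Because the length occupies a single byte, `read_string` can never return more than 255 characters, which matches the limit stated in section 1.1.4.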
The preamble follows, giving vital information about the font, such as how many characters are in it, as well as nonessential information, such as who created the file. Next, the glyph directory contains the size of each glyph and a pointer into the last section, which contains the raster data for each glyph.

1.2.1 File Mark

The file mark is 8 bytes long. The first four bytes contain the ASCII letters Rast, identifying the file as a raster-format file. The next four bytes are currently unused, and should contain zero.

1.2.2 Preamble

The preamble begins at the ninth byte (index 8) of the RST file with the number of bytes occupied by the rest of the preamble, followed by the version number of the RST format. The RST file format is version dependent. Version 0 has 18 fields of preamble information in at least 40 bytes.

Below are the fields of the preamble, followed by a discussion of some of the fields. If the type of a field is not specified, it is an integer of the given size.

- the number of bytes in the preamble, not including the two used by this field (2 bytes, offset 8)
- format version number (currently zero) (1 byte, offset 10)
- pointer to the glyph directory (>= 46) (3 bytes, offset 11)
- character number of the first glyph in the font, usually 0 (2 bytes, offset 14)
- character number of the last glyph in the font, usually 127 (2 bytes, offset 16)
- font magnification, in units of 1/1000 (dimensionless); for example, an unmagnified font will have magnification 1000, and a font that is twice as big as its designsize will have magnification 2000 (a 0 magnification should be interpreted as 1000) (4 bytes, offset 18)
- the designsize of the font, in fix units; if the font is not magnified, this will be the intended size of this font (4 bytes, offset 22)
- the interline spacing for the font, in fix units; if this field is zero, try designsize * 1.2 (4 bytes, offset 26)
- the width of a good looking interword space, in fixes.
  If this parameter is zero, try designsize/1.2 (4 bytes, offset 30)
- rotation of the font in counter-clockwise positive degrees; normal fonts have rotation 0; fonts read while standing the page on its right edge have rotation 270 (2 bytes, offset 34)
- character advance direction relative to the font's rotation; this tells where to place a ``next character'' on a ``line of characters;'' 0 means advance to the right, 1 downward, 2 to the left, 3 upward (the English fonts have character advance direction 0, Chinese 1, Hebrew 2) (1 byte, offset 36)
- line advance direction relative to the font's rotation; this tells where to place a ``next line of characters'' on the page, relative to the last line: 0 to the right, 1 downward, 2 to the left, 3 upward (the English fonts and Hebrew have line advance direction 1, Chinese 0) (1 byte, offset 37)
- check identifier: this is used with Metafont files to associate an RST file with a specific TFM file, ensuring that the TFM and RST files are describing the same font; if this field is zero, no check is to be done (4 bytes, offset 38)
- font resolution in pixels per inch: 240 for the IMPRINT-10 (2 bytes, offset 42)
- font identifier string (variable size, offset 44)
- string describing the face-type encoding (variable size, offset >= 45)
- string naming the intended output device (variable size, offset >= 46)
- string naming the creator of this file (variable size, offset >= 47)

The first five fields (size, version, pointer to the glyph directory, character numbers of the first and last glyphs in the font) must be correct for any RST file to be usable.

Though the magnification can presumably be obtained from the file name in some host environments, it is included for verification and to allow a magnification of greater precision than the file name might allow. The widths of the characters at their designsize are stored in the glyph directory.
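The preamble layout listed above can be read with a short Python sketch such as the one below. Several things here are assumptions rather than statements of this document: multi-byte integers are taken to be stored high-order byte first and unsigned, string bytes are decoded as Latin-1, and all field and function names are hypothetical.

```python
def read_uint(data, offset, size):
    """Unsigned integer of `size` bytes; big-endian order is an assumption."""
    return int.from_bytes(data[offset:offset + size], "big")


# (hypothetical field name, offset, size in bytes), following the list above
FIXED_FIELDS = [
    ("preamble_length", 8, 2),
    ("version", 10, 1),
    ("directory_pointer", 11, 3),
    ("first_glyph", 14, 2),
    ("last_glyph", 16, 2),
    ("magnification", 18, 4),
    ("designsize", 22, 4),
    ("interline", 26, 4),
    ("space_width", 30, 4),
    ("rotation", 34, 2),
    ("char_advance", 36, 1),
    ("line_advance", 37, 1),
    ("check_id", 38, 4),
    ("resolution", 42, 2),
]
STRING_FIELDS = ["font_id", "face_type", "device", "creator"]


def parse_preamble(data):
    """Return the 18 preamble fields of a version 0 RST file as a dict."""
    if data[:4] != b"Rast":
        raise ValueError("not an RST file: missing 'Rast' file mark")
    fields = {name: read_uint(data, off, size)
              for name, off, size in FIXED_FIELDS}
    if fields["version"] != 0:
        raise ValueError("only version 0 is described here")
    if fields["magnification"] == 0:
        fields["magnification"] = 1000   # 0 is to be read as unmagnified
    offset = 44                          # the four strings follow back to back
    for name in STRING_FIELDS:
        length = data[offset]            # length byte, then that many bytes
        fields[name] = data[offset + 1:offset + 1 + length].decode("latin-1")
        offset += 1 + length
    return fields
```

The sketch leaves zero interline spacing and space width as they are; a caller that needs them can apply the suggested fallbacks from the field list.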
If the font is at magnification 2.0, the characters are actually twice as big as their designsize. The correct printing width of a character, then, is its given width multiplied by the magnification. (The individual character widths are given without the magnification figured in so that a printing program can substitute a font at one magnification for one at another magnification. If TeX calls for an RST font file at magnification 2.4, but such a font does not exist and one at 2.5 does exist, the latter can be used. The 2.5 magnification characters will be printed, but the 2.4 widths used by multiplying by 2.4 instead of 2.5. This will make words with these characters slightly cramped, but won't affect the spacing of the rest of the document. Alternatively, the characters could be spaced correctly if the space between words is shortened.)

The designsize, interline spacing and space width are included to enable use of the font in simpler applications, such as listing generation or document production with software not concerned with kerns or ligatures. The designsize is the generic size of the font, and also tells the distance between baselines. An n point font normally has n/72 inches between baselines. The interline spacing distance field may differ from the designsize. In proper fonts they should be identical, but often fonts must be tuned for different devices. The interline spacing is a subjective spacing based on the look of text. (Metafont, for example, does not know about interline spacing, but in the RST files it generates, this field is set to 1.2 times the design size.) The width of a space determines the size of spaces and tabs.

The rest of the fields are present to encourage correctness. A verification program can check that a font's file name corresponds to its contents. A device dependent program assembling text can ensure the font was made for it, or at least for a printer of its resolution.
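The width rule described above (multiply the stored width by the magnification that was called for, even when a font at a nearby magnification is substituted) can be sketched in Python. The function name and parameters are hypothetical; the constants 2^20 fixes per point and 72.27 points per inch come from section 1.1.3.

```python
def printing_width_pixels(width_fix, magnification, resolution):
    """Printing width of a character in device pixels (unrounded).

    width_fix      stored character width in fix units, at designsize
    magnification  in units of 1/1000; 0 is treated as 1000
    resolution     device resolution in pixels per inch
    """
    if magnification == 0:
        magnification = 1000
    points = width_fix / 2 ** 20 * (magnification / 1000)
    return points / 72.27 * resolution


# Substitution as described above: magnification 2.4 is called for, but
# only the 2.5 rasters exist. The 2.5 glyphs are painted, while spacing
# still uses the 2.4 width.
stored_width = 10 * 2 ** 20          # hypothetical glyph 10 points wide
called_for = printing_width_pixels(stored_width, 2400, 240)
substitute = printing_width_pixels(stored_width, 2500, 240)
```

Here `called_for` is the spacing to use and `substitute` is the space the painted 2.5 glyph was designed for; their difference is the amount by which the substituted characters are cramped.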
A user can find out who made a font in case it needs to be improved.

1.2.3 Glyph Directory

Each glyph in the font has an entry in the glyph directory. Each entry takes up 15 bytes, thus the directory size in bytes is 15 times the number of glyphs in the font. A glyph is defined in the font if its character number is between the first and last character numbers (inclusive) as given in the preamble and if it has a non-zero directory entry (described below).

Let fg be the character number of the first glyph in the font file, let lg be that of the last glyph, and let DirPtr be the pointer to the first byte of the directory as given in the preamble. Then the address (offset in the RST file) of the directory entry for character n, assuming fg <= n <= lg, is

    DirPtr + ((n - fg) * 15)

The format of a directory entry is:

- h, height (in pixels) of raster picture (2 bytes, offset 0)
- w, width (in pixels) of raster picture (2 bytes, offset 2)
- y, distance (in pixels) from the top of the raster array to the glyph's reference point (2 bytes, offset 4)
- x, distance (in pixels) from the left of the raster array to the glyph's reference point (2 bytes, offset 6)
- fw, advance width of the character, in fix units, also called the printing width or nominal width; note that this is a signed quantity (4 bytes, offset 8)
- p, pointer to raster data (3 bytes, offset 12)

The fields h and w are the height and width in pixels of the raster array containing the glyph. The glyph is stored as a sequence of h rows, each row being (w + 7) div 8 bytes long, that is, w/8 rounded up to a whole number of bytes. Again, the pixel height and width have no connection with the nominal height and width of the glyph as far as a typesetting system is concerned; rather, they denote the size of the smallest bounding box that fits around the black pixels comprising the character's raster representation.

A character has a reference point often near, or just left of, its center. The fields y and x are the distances from the top left corner of the raster array to this point, measured in pixels.
Both of these numbers may be negative, in two's complement representation. For the y-coordinate, positive is downward; for the x, positive is rightward.

The field fw is the advance width of the character in fix units (a signed quantity). Remember to multiply this width by the font magnification to get the true physical width. This is also called the printing width or nominal width (and is, for example, the same as the width of the glyph in a TeX TFM file).

The final value contains a pointer to the character's raster picture in the raster data section. This pointer is an absolute byte address (the first byte of the file has address 0).

1.2.4 Raster Data

The last section of the RST file is the raster data; each glyph is represented by a bit map in this section, but in no particular order (i.e., if you want to know where a particular glyph's raster data lives, look in the glyph's directory entry).

Let bw be the width of the character in bytes, or (w + 7) div 8 (w/8 rounded up), where w is the pixel width of the raster data for the glyph as given in the directory entry. All rows start on byte boundaries and are packed to the left, so the last byte in a row will not be fully used unless the width of the character is a multiple of 8. Thus a glyph's ``picture'' (raster data, bit map) takes up bw*h bytes, where h is the pixel height of the glyph, also from the directory entry.

Here is a letter ``Q'' as represented in the raster section of some RST file:

Row  0  00001111 11100000 0.......
     1  00011111 11110000 0.......
     2  00111100 01111000 0.......
     3  01110000 00011100 0.......
     4  11110000 00011110 0.......
     5  11100000 00001110 0.......
     6  11100000 00001110 0.......
     7  11100111 11001110 0.......
     8  11111111 11011110 0.......
     9  01111100 11111100 0.......
    10  00111100 01111000 0.......
    11  00011111 11110000 0.......
    12  00X01111 11100011 1.......
    13  00000000 11100111 0.......
    14  00000000 01111100 0.......
    15  00000000 00111000 0.......
Col     76543210 76543210 76543210
        (Bit number within a byte, high order bit is number 7)

Here, the rows and columns are numbered, and the reference point of the character is marked with an X, but only the 0's and 1's are actually part of the character; periods represent padding zero bits that are not part of the character. In this example h is 16 pixels, w 17 pixels, y 12 pixels, x 2 pixels, and bw 3 bytes. Note that the pixel width is just large enough to span the leftmost and rightmost black pixels in the character. Likewise, the pixel height is just large enough to span the topmost and bottommost black pixels.

If this ``Q'' had a TeX TFM width of 5620393 fixes, it would print on the IMPRINT-10 at a width of

    5620393 fixes / (2^20 fixes/point) / (72.27 points/inch) * 240 pixels/inch = 17.8 pixels

which would be rounded to 18.

Suppose this were a font 5.4 points high. If this font were made from a font 10.8 points high and thus at magnification .5 (one half), then the nominal width would have been 11240786, twice as much as before, and only by multiplying it by the magnification (.5) would one get the actual printing width.

Table of Contents

1. Raster Font File Format
  1.1 Introduction
    1.1.1 Fonts, Glyphs, Characters, Symbols
    1.1.2 Bit Map
    1.1.3 Units of Measure in RST Files
    1.1.4 Strings
    1.1.5 File Addressing
    1.1.6 Host RST File Representation
  1.2 RST File Structure
    1.2.1 File Mark
    1.2.2 Preamble
    1.2.3 Glyph Directory
    1.2.4 Raster Data
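As an illustration of sections 1.2.3 and 1.2.4, the following Python sketch locates a directory entry, decodes it, and unpacks the glyph's raster rows. It assumes that multi-byte fields are stored high-order byte first, which this document does not state. The function names are hypothetical, and treating h, w and p as unsigned is likewise an assumption.

```python
ENTRY_SIZE = 15   # bytes per glyph directory entry


def directory_entry_offset(dir_ptr, first_glyph, last_glyph, n):
    """File offset of the directory entry for character number n."""
    if not first_glyph <= n <= last_glyph:
        raise ValueError("character number outside the font's range")
    return dir_ptr + (n - first_glyph) * ENTRY_SIZE


def parse_directory_entry(data, offset):
    """Decode one 15-byte entry; big-endian byte order is an assumption."""
    def field(start, size, signed=False):
        raw = data[offset + start:offset + start + size]
        return int.from_bytes(raw, "big", signed=signed)
    return {
        "h": field(0, 2),
        "w": field(2, 2),
        "y": field(4, 2, signed=True),    # may be negative
        "x": field(6, 2, signed=True),    # may be negative
        "fw": field(8, 4, signed=True),   # fix units, signed
        "p": field(12, 3),                # absolute offset of raster data
    }


def row_bytes(w):
    """Bytes per raster row: w/8 rounded up."""
    return (w + 7) // 8


def decode_glyph(data, entry):
    """Return the glyph as h strings of w characters, '1' for black."""
    bw = row_bytes(entry["w"])
    rows = []
    for r in range(entry["h"]):
        start = entry["p"] + r * bw
        bits = "".join(format(b, "08b") for b in data[start:start + bw])
        rows.append(bits[:entry["w"]])    # drop the padding bits
    return rows
```

For the ``Q'' above, `row_bytes(17)` gives 3, in agreement with the bw stated in the example.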