DataMuseum.dk presents historical artifacts from the history of: DKUUG/EUUG Conference tapes
Length: 5596 (0x15dc) Types: TextFile Names: »pgmg«
└─⟦87ddcff64⟧ Bits:30001253 CPHDIST85 Tape, 1985 Autumn Conference Copenhagen └─ ⟦this⟧ »cph85dist/stat/doc/pgmg«
.TC 4
.DR
.CW "Gary Perlman
.ds AU Gary Perlman
.(T "UNIX|STAT Programming Notes
Gary Perlman
School of Information Technology
Wang Institute of Graduate Studies
Tyngsboro, MA 01879
.)T
.ls 1
.bp 1
.!T
.P
This document provides an overview of the program structure
of most UNIX|STAT programs.
I say \fImost\fR because there was never a plan for all the programs.
They just happened,
and the similarity of their operation is mostly due to
reworking of the code.
.P
After reading this document,
you should have a clear idea about how to write programs
that behave similarly to existing UNIX|STAT programs.
Most of the similarity is in the processing of the input and output.
Unfortunately, there are few general routines for numerical computation.
.LH "General Program Structure
.P
The main programs in UNIX|STAT begin by including
a standard header file "unixstat.h" (in the stat/src directory)
that sets up many of the conventions used in the programs.
.P
The main function processes the command line options
with the familiar
.T argc ,
.T argv
convention.
An important early statement sets the external variable,
.T Argv0 ,
to the name of the program, using the
.T ARGV0
macro.
Then an initializing function is called to process command line options
(see the later discussion)
and do any necessary operations.
Then data are read in and processed.
.(D "Example Main Program
#include "unixstat.h"
PGM (name, purpose, version, date)
main (argc, argv) char **argv;
	{
	ARGV0;                    /* used in error messages */
	initial (argc, argv);     /* set options and initialize */
	readdata ();
	printresults ();
	exit (0);
	}
.)D
.LH "Standard I/O Library
.P
The standard input/output (I/O) library is used by all the UNIX|STAT programs.
In general, the programs read data from the standard input
and write results to the standard output.
There are few exceptions to this.
Error messages and diagnostics are printed to the standard error output
using conventions defined in the header file
.T unixstat.h .
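.P
The stream conventions described in this section can be sketched in a few lines of C.
This is only an illustration under stated assumptions,
not code from the UNIX|STAT sources:
the function name summarize is hypothetical,
it is written in ANSI C rather than the older style used elsewhere in this document,
and it formats its own diagnostics with fprintf
instead of the macros in unixstat.h, whose definitions are not reproduced here.

```c
#include <stdio.h>

/*
 * summarize: hypothetical helper, NOT part of unixstat.h.
 * Reads whitespace-separated numbers from `in`, writes the count
 * and mean to `out`, and reports problems on `err`, prefixed with
 * the program name.  Returns 0 on success, 1 on error.
 */
int
summarize (FILE *in, FILE *out, FILE *err, const char *pgm)
{
	double	x;
	double	sum = 0.0;     /* accumulate in double precision */
	long	n = 0;
	int 	status;

	while ((status = fscanf (in, "%lf", &x)) == 1)
		{
		sum += x;
		n++;
		}
	if (status != EOF)     /* stopped on a field that is not a number */
		{
		fprintf (err, "%s: non-numeric input after %ld values\n", pgm, n);
		return (1);
		}
	if (n == 0)
		{
		fprintf (err, "%s: no input data\n", pgm);
		return (1);
		}
	fprintf (out, "n = %ld  mean = %g\n", n, sum / n);
	return (0);
}
```

.P
A main program following the conventions above would call
summarize (stdin, stdout, stderr, argv[0])
and pass the result to exit,
so that results can be piped onward while diagnostics still reach the terminal.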
.LH "Command Line Options
.P
Command line options are handled in two ways.
The older programs,
and in particular the ones that do not process files on the command line,
use handmade parsers.
The handmade parsers do not require a minus before command line options
and largely ignore spaces on the command line.
The newer commands, and the ones that process files, use the
.T getopt
command line option parser.
.(D "Example Use of getopt
int 	C;
int 	opterr = 0;
extern	int 	optind;
extern	char	*optarg;
while ((C = getopt (argc, argv, "abx:y:")) != EOF)
	switch (C)
		{
		default: opterr++; break;
		case 'a': Aflag = 1; break;
		...
		case 'x': Xflag = convert (optarg); break;
		...
		}
if (opterr)
	USAGE ([-ab] [-x xx] [-y yy] ...)
while (optind < argc)
	process (argv[optind++]);
.)D
.LH "Checking of Input Source
.P
UNIX|STAT programs that read from the standard input
check the standard input to see if they are reading from a terminal.
If input is from the terminal,
it often means that the program user forgot to redirect or pipe input,
and the
.T checkstdin
function warns the user of this.
For piped or redirected input, the function is silent.
.(D "Prompt for Terminal Input
checkstdin (Argv0);
.)D
.LH "Parsing Input Lines
.P
There are two functions for parsing input lines into fields or columns.
The older
.T sstrings
function copies quoted or whitespace delimited fields
into an argument character matrix.
This required a lot of space.
The newer
.T parseline
takes a different argument, an array of pointers to strings.
.T parseline
uses the raw input line for storage space,
and so modifies it, unlike
.T sstrings .
.(D "Parsing the Input Data
char	*array[MAXARRAY];
int 	ncols;
ncols = parseline (line, array, MAXARRAY);
if (ncols == 0) continue; /* ignore blank lines */
if (ncols > MAXARRAY)
	ERRMANY (columns, MAXARRAY)
.)D
.LH "Type Checking and Conversion
.P
UNIX|STAT programs must verify that their inputs are valid.
Any fields that should be numbers must be checked with the
.T number
function.
A standard error message is defined in
.T unixstat.h .
.(D
if (!number (input))
	ERRNUM (input)
.)D
.LH "Calculations
.P
All calculations in UNIX|STAT programs are done in double precision.
In some cases, when many data points must be stored
(on the order of thousands),
single precision
.T float
storage is used.
.LH "Error Messages
.P
Standard error messages are defined in
.T unixstat.h
and these make use of the external variable
.T Argv0
that should be set to the name of the program.
Most standard error messages cause an exit with a non-zero status.
The
.T WARNING
macro is the only one that does not.
Rather than describing all the error messages here,
I refer the reader to the
.T unixstat.h
header file.
The error messages all make use of the
.T ERRMSG#
macros in which
.T #
is replaced by a digit 0-3.
The
.T ERRMSG#
macro takes an unquoted
.T printf
format string followed by 0-3 arguments.
.(D "Using Standard Error Messages
#include "unixstat.h"
extern char *Argv0;
if (...)
	ERRMSG0 (Something went wrong)
if (...)
	ERRMSG1 (Something went wrong on line %d, Lineno)
.)D
.LH "Appendixes
.P
Following pages include the header file
.T unixstat.h ,
a skeleton UNIX|STAT program,
.T unixstat.c ,
and manual entries for several utility functions:
.(D
.ta 12n
barplot	create a barplot
cor	compute correlation coefficient
getopt	command line option parser
getword	read a word from a file
histogram	print a histogram
number	check string as a number
parseline	parse a line into fields
pof	probability of F-ratio
scatterplot	print a scatter plot
strings	parse a line into fields
.)D
.TC