⟦a8c677325⟧

TextFile

.TH DM 1 "March 5, 1985" "UNIX|STAT 5.0" "UNIX User's Manual"
.SH NAME
dm \- data manipulator: formatting and conditional transformations
.SH SYNOPSIS
.B dm
[Efile] [expressions]
.SH DESCRIPTION
.I dm
is a data manipulating program that allows you
to extract columns from a file, possibly based on conditions,
and produce algebraic combinations of columns.
.I dm
reads from the standard input (via redirection with < or piped
with |) and writes to the standard output.
To use 
.I dm,
you write a series of expressions,
and for each line of the input,
.I dm
reevaluates and prints the values of those expressions.
.I dm
provides you with much of the power of an interpreted
general purpose programming language, but is much more convenient.
.I dm
does many transformations of data one would usually
need to write and compile a one-shot program for.
.PP
.I dm
allows you to access the fields on each line of its input.
Numerical values of fields on a line can be accessed by
the letter 'x'
followed by a column number.
Character strings can be accessed by the letter 's' followed
by a column number.
For example, for the input line:
.ce
12 45.2 red
s1 is the string '12',
x2 is the number 45.2 (which is not the same as s2, the string '45.2'),
and s3 is the string 'red'.

.I "Column Extraction."
Simple column extraction can be accomplished by typing
the strings from the columns desired.
For example, to extract the third, eighth, first and second columns
(in that order) from "file," one would type:
.ce
dm s3 s8 s1 s2 < file

.I "Algebraic Transformations."
Algebraic operations involving the numerical values of fields
can be accomplished in a straightforward manner.
To print, in order, the sum of the first two columns,
the difference of the next two columns, and the square root of the sum
of squares of the first and third columns, one could type the command:
.ce
dm "x1+x2" "x3-x4" "(x1*x1+x3*x3)^.5"
There are a number of the usual mathematical functions that allow
expressions like:
.ce
dm "exp(x1) + log(log(x2))" "floor (x1/x2)"

.I "Testing Conditions."
Expressions can be conditionally evaluated by testing
the values of other expressions.
For example, to print the ratio of x1 and x2, one might want to
check the value of x2 before division and print 'error' if x2 is 0.0.
This could be done by the command:
.ce
dm "if x2 = 0 then 'error' else x1/x2"
Or one might want to extract only those lines in which two columns
have the same lexical value,
such as searching for matching responses.
If the obtained response is in column five and the correct
response is in column two, this could be accomplished with:
.ce
dm "if s5 = s2 then INPUT else KILL"
INPUT is a string variable that is equal to the current input line
and KILL is a control primitive that terminates execution for the current line.

.I "Other Features."
.I dm
offers a full set of comparison, algebraic, and logical operators.
It also features a set of special variables that hold useful
information for you and allow taking control in exceptional conditions.
These include:
INPUT, the current input line;
N, the number of fields in INPUT;
SUM, the sum of the columns in the INPUT;
RAND, a uniform random number;
NIL, an expression that causes no output;
KILL, which terminates evaluation on INPUT and goes to the next line;
and EXIT, which terminates all processing.
.SH SEE\ ALSO
unixstat(1),
abut(1), maketrix(1), transpose(1), series(1), reverse(1), colex(1), trans(1)
.br
G. Perlman,
.I "dm - A data manipulator,"
Cognitive Science Laboratory.
.SH AUTHOR
Gary Perlman
.SH KEYWORDS
statistics, data analysis, data manipulation, column extraction/trasformation
DataMuseum.dk

DKUUG/EUUG Conference tapes

⟦a8c677325⟧ TextFile

Derivation

TextFile