└─⟦87ddcff64⟧ Bits:30001253 CPHDIST85 Tape, 1985 Autumn Conference Copenhagen
└─⟦this⟧ »cph85dist/stat/doc/tutorial«
.ls 1
.de EX
.ce
.ft B
\\$1
.ft
..
.LH "Plotting a Function
.P
Suppose you want to make a plot of the function
.EX "Y = X**2 - 30*X + 10
First, use SERIES to create a set of numbers to work with.
.EX "series -100 100
Then, transform this data using DM.
.EX "series -100 100 | dm x1 "x1*x1-30*x1+10"
Then, plot this data using the ``p'' option of PAIR.
.EX "series -100 100 | dm x1 "x1*x1-30*x1+10" | pair -p
The result is shown below.
.nf
|--------------------------------------------------|13010
|3 |
|21 |
| 3 |
| 4 |
| 3 |
| 12 |
| 22 |
| 21 |
| 31 |
| 31 4|
| 31 4 |
| 32 14 |
| 22 13 |
| 24 33 |
| 41 41 |
| 33 24 |
| 142 142 |
| 242 143 |
| 2441 443 |
| 3444444444444 |
|--------------------------------------------------|-215
-100.000 100.000
.fi
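.P
As a cross-check on the plot limits, the same pipeline can be sketched in Python.
This is only an illustration, not part of the package; the names xs, ys and pairs are made up for the example.
.nf
```python
# Rough equivalent of: series -100 100 | dm x1 "x1*x1-30*x1+10"
xs = list(range(-100, 101))               # series -100 100
ys = [x * x - 30 * x + 10 for x in xs]    # the dm expression
pairs = list(zip(xs, ys))                 # the (x, y) points handed to pair -p

y_min, y_max = min(ys), max(ys)
print(y_min, y_max)                       # -215 13010, the plot's Y limits
```
.fi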
.bp
.LH "Correlations
.P
Suppose you want to see the correlations between
X, the square of X, its logarithm, and its square root.
First, you create a series of numbers to work with.
.EX "series 1 100
Then you use DM to create your transformed columns.
.EX "series 1 100 | dm x1 "x1*x1" "log(x1)" "x1^.5"
Then you can pipe this output to CORR to get correlations.
.EX "series 1 100 | dm x1 "x1*x1" "log(x1)" "x1^.5" |
.EX "     corr x "x*x" "log(x)" "sqrt(x)"
The result is shown below.
.nf
Analysis for 100 points of 4 variables:
VARIABLE   :          x        x*x     log(x)    sqrt(x)
MIN        :     1.0000     1.0000     0.0000     1.0000
MAX        :   100.0000 10000.0000     4.6052    10.0000
MEAN       :    50.5000  3383.5000     3.6374     6.7146
SD         :    29.0115  3024.3558     0.9281     2.3385
CORRELATION MATRIX:
x          :     1.0000
x*x        :     0.9689     1.0000
log(x)     :     0.8959     0.7786     1.0000
sqrt(x)    :     0.9815     0.9076     0.9621     1.0000
VARIABLE   :          x        x*x     log(x)    sqrt(x)
.fi
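.P
The correlation matrix can be checked independently with a short Python sketch.
It assumes CORR reports the ordinary Pearson product-moment correlation; the helper pearson and the dictionary columns are names invented for this illustration.
.nf
```python
import math

def pearson(a, b):
    # Pearson product-moment correlation of two equal-length lists.
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

x = [float(i) for i in range(1, 101)]     # series 1 100
columns = {                               # the four columns built by dm
    "x": x,
    "x*x": [v * v for v in x],
    "log(x)": [math.log(v) for v in x],
    "sqrt(x)": [v ** 0.5 for v in x],
}

names = list(columns)
for i, row in enumerate(names):           # print the lower triangle only
    cells = ["%7.4f" % pearson(columns[row], columns[col])
             for col in names[: i + 1]]
    print("%-8s: %s" % (row, " ".join(cells)))
```
.fi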
.bp
.LH "Analysis of Variance
.P
Suppose you want to use ANOVA on some multifactor data.
First, you may have to set up a file of labels for the variables.
Suppose you have three variables:
subject name (12 in all),
dosage (low, medium, and high),
and hours without sleep (10, 20, 30, 40, 50).
You have 12*3*5 or 180 data points for each of your measures.
Suppose your N measures are in a file of N columns,
and that each subject's data is reported in successive lines.
Within each subject,
low dosages are reported for each fatigue level,
then high, then medium.
You would set up files called fatigue and dosage
with the lines:
.nf
.ta 1i 2i 3i 4i 5i
\fIfatigue\fR	\fIdosage\fR
10	low
20	low
30	low
40	low
50	low
	high
	high
	high
	high
	high
	medium
	medium
	medium
	medium
	medium
.fi
.bp
Then you would create a label file using DM and ABUT.
.EX "series 0 179 | dm "floor(x1/15)+1" | abut -c - fatigue dosage
Suppose this label file is called ``label.''
The lines for the first subject are shown below.
.nf
1 10 low
1 20 low
1 30 low
1 40 low
1 50 low
1 10 high
1 20 high
1 30 high
1 40 high
1 50 high
1 10 medium
1 20 medium
1 30 medium
1 40 medium
1 50 medium
.fi
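.P
The way the label file is assembled can be sketched in Python.
The sketch assumes, as the listing above suggests, that the cycling option of ABUT reuses the 5-line fatigue file and the 15-line dosage file until the 180 subject numbers are exhausted, and that subjects are numbered from 1; the variable names are invented for this illustration.
.nf
```python
fatigue = ["10", "20", "30", "40", "50"]                  # the 5-line fatigue file
dosage = ["low"] * 5 + ["high"] * 5 + ["medium"] * 5      # the 15-line dosage file

labels = []
for i in range(180):                                      # series 0 179
    subject = i // 15 + 1                                 # 15 lines per subject
    labels.append("%d %s %s" % (subject,
                                fatigue[i % len(fatigue)],  # recycle short file
                                dosage[i % len(dosage)]))

for line in labels[:15]:                                  # first subject only
    print(line)
```
.fi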
.bp
.P
Now you can use ANOVA on the different measures.
Suppose you want to work on the third measure.
.EX "dm s3 < data | abut label - | anova S fatigue dose Var3
You could do all the variables using a shell script loop.
.nf
n=7
i=1
while test $i -le $n
do
	dm s$i < data | abut label - | anova S fatigue dose Var$i > $i.out
	i=`expr $i + 1`
done
pr [1-$n].out
.fi
.LH "Transformations and Paired T-Tests
.P
Suppose you wanted to compare the numbers in the first
half of a file with those in the second half.
The PAIR program assumes X and Y numbers alternate,
so some reformatting is needed.
Assuming not too many numbers are involved,
the MAKETRIX and TRANSPOSE commands can be used.
Suppose you have 50 numbers
on the first two lines of the file,
.nf
1 2 3 4 5 6 7 8 9 8 7 6 5 4 3 2 1 2 3 4 5 6 7 8 7
3 2 4 3 5 4 6 5 7 6 8 7 9 8 7 6 5 6 7 6 5 4 3 2 2
.fi
and 50 on the second two.
For this example, the second two will be generated using
DM and MAKETRIX.
.EX "maketrix 1 | dm "floor(10*log(x1))" | maketrix 25
.nf
0 6 10 13 16 17 19 20 21 20 19 17 16 13 10 6 0 6 10 13 16 17 19 20 19
10 6 13 10 16 13 17 16 19 17 20 19 21 20 19 17 16 17 19 17 16 13 10 6 6
.fi
The following will make a matrix with 50 columns,
and transpose it to make a two-column input to PAIR.
.EX "maketrix 50 | transpose | pair -ps
.bp
.nf
                Column 1    Column 2  Difference
Minimums          1.0000      0.0000    -12.0000
Maximums          9.0000     21.0000      1.0000
Sums            253.0000    716.0000   -463.0000
SumSquares     1513.0000  11704.0000   4853.0000
Means             5.0600     14.3200     -9.2600
SDs               2.1798      5.4415      3.3975
t(49)            16.4143     18.6085    -19.2722
p                -0.0000     -0.0000     -0.0000
             Correlation   r-squared       t(48)           p
                  0.9619      0.9252     24.3656      0.0000
               Intercept       Slope
                  2.1701      2.4012
|--------------------------------------------------|21
| 5 2|
| 8 |
| |
| 8 |
| 7 |
| |
| |
| 6 |
| |
| |
| 6 |
| |
| |
| |
| 6 |
| |
| |
| |
| |
|2 |
|--------------------------------------------------|0
1.000 9.000
.fi
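.P
The reshaping and the paired t statistic can be sketched in Python as a check on the summary above.
The sketch assumes the usual paired t formula (mean difference divided by its standard error, with n-1 in the variance); the names first, second and pairs are invented for this illustration.
.nf
```python
import math

first = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 7, 6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6, 7, 8, 7,
         3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, 7, 6, 5, 6, 7, 6, 5, 4, 3, 2, 2]
second = [math.floor(10 * math.log(v)) for v in first]   # dm "floor(10*log(x1))"

# maketrix 50 gives two rows of 50 numbers; transpose turns them
# into 50 rows of (x, y), which is the layout pair expects.
rows = [first, second]
pairs = list(zip(*rows))

n = len(pairs)
diffs = [a - b for a, b in pairs]
mean_d = sum(diffs) / n
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t = mean_d / (sd_d / math.sqrt(n))                        # paired t, n-1 d.f.
print("%.4f %.4f %.4f" % (mean_d, sd_d, t))
```
.fi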