DataMuseum.dk

Presents historical artifacts from the history of:

DKUUG/EUUG Conference tapes

This is an automatic "excavation" of a thematic subset of
artifacts from Datamuseum.dk's BitArchive.

Excavated with: AutoArchaeologist - Free & Open Source Software.



⟦ce8384357⟧ TextFile

    Length: 14813 (0x39dd)
    Types: TextFile
    Names: »example.out«

Derivation

└─⟦a0efdde77⟧ Bits:30001252 EUUGD11 Tape, 1987 Spring Conference Helsinki
    └─ ⟦this⟧ »EUUGD11/stat-5.3/eu/stat/example/example.out« 

TextFile

$ ff -dc -w 79 example.txt
            Annotated Example from Chapter 2 of the |STAT Handbook
                          Copyright 1986 Gary Perlman

      A concrete example with several |STAT programs is worked in detail.
        The example shows the style of analysis in |STAT.  New users of
      |STAT should not try to understand all the details in the examples.
   Details about all the programs can be found in the online manual entries
  and more examples of program use appear in other chapters of the Handbook.

    The example is based on a familiar problem: grades in a course based on
    two midterm exams and a final exam.  Scores on exams are broken down by
      student gender and by the lab section taught by one of two teaching
   assistants: John or Jane.  The data are in the file exam.dat.  Each line
  in exam.dat contains a student ID number, the student's teaching assistant,
   the student's gender, and scores (out of 100) on the midterms and final.

    We will compute final grades based on the exam scores, compare male and
  female students, and compare the two teaching assistants.  The annotations
            in Chapter 2 of the Handbook will provide more details.
-------------------- Section 2.1    Data in exam.dat
$ cat exam.dat
S-1	john	male	56	42	58
S-2	john	male	96	90	91
S-3	john	male	70	59	65
S-4	john	male	82	75	78
S-5	john	male	85	90	92
S-6	john	male	69	60	65
S-7	john	female	82	78	60
S-8	john	female	84	81	82
S-9	john	female	89	80	68
S-10	john	female	90	93	91
S-11	jane	male	42	46	65
S-12	jane	male	28	15	34
S-13	jane	male	49	68	75
S-14	jane	male	36	30	48
S-15	jane	male	58	58	62
S-16	jane	male	72	70	84
S-17	jane	female	65	61	70
S-18	jane	female	68	75	71
S-19	jane	female	62	50	55
S-20	jane	female	71	72	87

-------------------- Section 2.2    Computing Final Scores
$ dm INPUT ".2*x4 + .3*x5 + .5*x6" < exam.dat > scores.dat
-------------------- Examine Scores File: scores.dat
$ cat scores.dat
S-1	john	male	56	42	58	52.8
S-2	john	male	96	90	91	91.7
S-3	john	male	70	59	65	64.2
S-4	john	male	82	75	78	77.9
S-5	john	male	85	90	92	90
S-6	john	male	69	60	65	64.3
S-7	john	female	82	78	60	69.8
S-8	john	female	84	81	82	82.1
S-9	john	female	89	80	68	75.8
S-10	john	female	90	93	91	91.4
S-11	jane	male	42	46	65	54.7
S-12	jane	male	28	15	34	27.1
S-13	jane	male	49	68	75	67.7
S-14	jane	male	36	30	48	40.2
S-15	jane	male	58	58	62	60
S-16	jane	male	72	70	84	77.4
S-17	jane	female	65	61	70	66.3
S-18	jane	female	68	75	71	71.6
S-19	jane	female	62	50	55	54.9
S-20	jane	female	71	72	87	79.3
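
The weighting applied by the dm expression (.2*x4 + .3*x5 + .5*x6) can be checked outside |STAT. The following is a minimal Python sketch, not dm itself: the function name add_final_score and the output formatting are assumptions made for illustration, and only two rows from exam.dat are used.

```python
# Weights copied from the dm expression: .2*x4 + .3*x5 + .5*x6
WEIGHTS = (0.2, 0.3, 0.5)

def add_final_score(line):
    """Append the weighted score as a seventh tab-separated field.

    Hypothetical helper; it only mimics the effect of the dm command.
    """
    fields = line.rstrip("\n").split("\t")
    exams = [float(x) for x in fields[3:6]]   # columns 4, 5 and 6 (1-based)
    score = sum(w * x for w, x in zip(WEIGHTS, exams))
    # Rounding removes binary floating-point noise; "g" drops a trailing ".0",
    # so 90.0 is written as 90, as in scores.dat.
    return "\t".join(fields + [format(round(score, 4), "g")])

# Two rows copied from exam.dat
rows = [
    "S-1\tjohn\tmale\t56\t42\t58",
    "S-5\tjohn\tmale\t85\t90\t92",
]
scored = [add_final_score(r) for r in rows]
for line in scored:
    print(line)
```

The final exam carries half the weight, so a student's final score tracks column 6 more closely than either midterm.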

-------------------- Sort Records by Final Scores
$ reverse -f < scores.dat | sort
27.1	34	15	28	male	jane	S-12
40.2	48	30	36	male	jane	S-14
52.8	58	42	56	male	john	S-1
54.7	65	46	42	male	jane	S-11
54.9	55	50	62	female	jane	S-19
60	62	58	58	male	jane	S-15
64.2	65	59	70	male	john	S-3
64.3	65	60	69	male	john	S-6
66.3	70	61	65	female	jane	S-17
67.7	75	68	49	male	jane	S-13
69.8	60	78	82	female	john	S-7
71.6	71	75	68	female	jane	S-18
75.8	68	80	89	female	john	S-9
77.4	84	70	72	male	jane	S-16
77.9	78	75	82	male	john	S-4
79.3	87	72	71	female	jane	S-20
82.1	82	81	84	female	john	S-8
90	92	90	85	male	john	S-5
91.4	91	93	90	female	john	S-10
91.7	91	90	96	male	john	S-2
-------------------- Another Way Using dsort
$ dsort n7 < scores.dat
S-12	jane	male	28	15	34	27.1
S-14	jane	male	36	30	48	40.2
S-1	john	male	56	42	58	52.8
S-11	jane	male	42	46	65	54.7
S-19	jane	female	62	50	55	54.9
S-15	jane	male	58	58	62	60
S-3	john	male	70	59	65	64.2
S-6	john	male	69	60	65	64.3
S-17	jane	female	65	61	70	66.3
S-13	jane	male	49	68	75	67.7
S-7	john	female	82	78	60	69.8
S-18	jane	female	68	75	71	71.6
S-9	john	female	89	80	68	75.8
S-16	jane	male	72	70	84	77.4
S-4	john	male	82	75	78	77.9
S-20	jane	female	71	72	87	79.3
S-8	john	female	84	81	82	82.1
S-5	john	male	85	90	92	90
S-10	john	female	90	93	91	91.4
S-2	john	male	96	90	91	91.7
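
The reverse-and-sort pipeline relies on plain sort, which compares lines as text. That works here because every final score has two digits before the decimal point. A numeric key, which is presumably what the n7 argument to dsort requests, does not depend on that. The Python sketch below illustrates the difference; it is not how dsort is implemented, the helper name final_score is hypothetical, and the values 100 and 9.5 are made up to show the pitfall.

```python
# Four rows copied from scores.dat, deliberately out of order
rows = [
    "S-1\tjohn\tmale\t56\t42\t58\t52.8",
    "S-2\tjohn\tmale\t96\t90\t91\t91.7",
    "S-12\tjane\tmale\t28\t15\t34\t27.1",
    "S-5\tjohn\tmale\t85\t90\t92\t90",
]

def final_score(line):
    """Return column 7 (1-based) as a number; hypothetical helper."""
    return float(line.split("\t")[6])

by_score = sorted(rows, key=final_score)
ids = [line.split("\t")[0] for line in by_score]
print(ids)

# Made-up values of different widths: text order and numeric order disagree.
lexical = sorted(["27.1", "100", "9.5"])
numeric = sorted(["27.1", "100", "9.5"], key=float)
print(lexical)
print(numeric)
```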

-------------------- Section 2.3    Summary of Final Scores
$ dm  s7  <  scores.dat | desc  -o  -t 75  -h  -i 10  -m 0
------------------------------------------------------------
 Under Range    In Range  Over Range     Missing         Sum
           0          20           0           0    1359.200
------------------------------------------------------------
        Mean      Median    Midpoint   Geometric    Harmonic
      67.960      68.750      59.400      65.564      62.529
------------------------------------------------------------
          SD   Quart Dev       Range     SE mean
      16.707      10.575      64.600       3.736
------------------------------------------------------------
     Minimum  Quartile 1  Quartile 2  Quartile 3     Maximum
      27.100      57.450      68.750      78.600      91.700
------------------------------------------------------------
        Skew     SD Skew    Kurtosis     SD Kurt
      -0.586       0.548       2.844       1.095
------------------------------------------------------------
   Null Mean           t    prob (t)           F    prob (F)
      75.000      -1.884       0.075       3.551       0.075
------------------------------------------------------------
       Midpt    Freq
       5.000       0 
      15.000       0 
      25.000       1 *
      35.000       0 
      45.000       1 *
      55.000       4 ****
      65.000       5 *****
      75.000       5 *****
      85.000       2 **
      95.000       2 **
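
Several figures in the desc summary follow directly from the 20 final scores. As a rough cross-check, and not a reimplementation of desc, the Python sketch below recomputes the mean, median, SD, standard error and the t statistic against the null mean of 75 shown in the output. It assumes the SD uses the n - 1 denominator, which is consistent with the printed values.

```python
import math
import statistics

# Column 7 of scores.dat
final_scores = [52.8, 91.7, 64.2, 77.9, 90.0, 64.3, 69.8, 82.1, 75.8, 91.4,
                54.7, 27.1, 67.7, 40.2, 60.0, 77.4, 66.3, 71.6, 54.9, 79.3]
null_mean = 75.0                          # "Null Mean" in the desc output

n = len(final_scores)
mean = statistics.mean(final_scores)
median = statistics.median(final_scores)
sd = statistics.stdev(final_scores)       # sample SD, n - 1 denominator
se_mean = sd / math.sqrt(n)               # standard error of the mean
t = (mean - null_mean) / se_mean          # one-sample t statistic, n - 1 df

print("mean=%.3f median=%.3f sd=%.3f se=%.3f t=%.3f"
      % (mean, median, sd, se_mean, t))
```

With a single sample, the F reported by desc is the square of t, which is why both carry the same probability (0.075).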

-------------------- Section 2.4    Predicting Final Exam Scores
$ dm x6 x4 x5 < scores.dat | regress -e final midterm1 midterm2
Analysis for 20 cases of 3 variables:
Variable        final   midterm1   midterm2 
Min           34.0000    28.0000    15.0000 
Max           92.0000    96.0000    93.0000 
Sum         1401.0000  1354.0000  1293.0000 
Mean          70.0500    67.7000    64.6500 
SD            15.3502    18.6720    20.4303 

Correlation Matrix:
final          1.0000 
midterm1       0.7586     1.0000 
midterm2       0.8838     0.9190     1.0000 
Variable        final   midterm1   midterm2 

Regression Equation for final:
final  =  -0.2835 midterm1  +  0.9022 midterm2  +  30.9177

Significance test for prediction of final
    Mult-R  R-Squared      SEest    F(2,17)   prob (F) 
    0.8942     0.7996     7.2640    33.9228     0.0000 
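
The regression equation can be recovered, to rounding, from the means, SDs and correlations printed above. The sketch below uses the textbook two-predictor formulas in Python. It is not the algorithm used by regress, the variable names are chosen for illustration, and because the inputs are the 4-decimal values from the output, the results agree only to about three decimals.

```python
# Summary values copied from the regress output (rounded to 4 decimals)
r_y1, r_y2, r_12 = 0.7586, 0.8838, 0.9190   # final~m1, final~m2, m1~m2
mean_y, mean_1, mean_2 = 70.05, 67.70, 64.65
sd_y, sd_1, sd_2 = 15.3502, 18.6720, 20.4303
n, k = 20, 2                                # cases, predictors

# Standardized weights for two correlated predictors
den = 1.0 - r_12 ** 2
beta_1 = (r_y1 - r_y2 * r_12) / den
beta_2 = (r_y2 - r_y1 * r_12) / den

# Raw slopes and intercept
b_1 = beta_1 * sd_y / sd_1
b_2 = beta_2 * sd_y / sd_2
intercept = mean_y - b_1 * mean_1 - b_2 * mean_2

# Proportion of variance explained, and the F test with (k, n - k - 1) df
r_squared = beta_1 * r_y1 + beta_2 * r_y2
f_ratio = (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

print("final = %.4f midterm1 + %.4f midterm2 + %.4f" % (b_1, b_2, intercept))
print("R-Squared = %.4f  F(%d,%d) = %.2f" % (r_squared, k, n - k - 1, f_ratio))
```

The negative weight on midterm1 does not mean that the first midterm predicts lower finals; its simple correlation with final is 0.7586. The two midterms correlate 0.9190 with each other, so once midterm2 is in the equation, midterm1 adds little and its coefficient changes sign.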

-------------------- Predicted Plot From Regression Equation in regress.eqn
$ dm x6 x4 x5 < scores.dat | dm Eregress.eqn |
	pair -p -h 10 -w 30 -x final -y predicted
|------------------------------|89.3045
|                             3|
|                   1    1     |
|             1   1   11  1 1  |
|                              |
|              1 2 1           |predicted
|          1     1             |
|            1                 |
|       1                      |
|                              |
|1                             |
|------------------------------|36.5121
34.000                    92.000
        final  r= 0.894
-------------------- Residual Plot
$ dm x6 x4 x5 < scores.dat | dm Eregress.eqn | dm x2 x1-x2 |
	pair -p -h 10 -w 30 -x predicted -y residuals
|------------------------------|11.2546
|                     11       |
|                           1  |
|         1   1   1    1      1|
|      1        1        1    1|
|1               1      1      |residuals
|            1    1            |
|                        1     |
|                       1      |
|                              |
|                       1      |
|------------------------------|-18.0399
36.512                    89.304
      predicted  r= 0.000

-------------------- Section 2.5    Failures by Assistant and Gender
$ dm s2 s3 "if x7 GE 75 then 'pass' else 'fail'" 1 < scores.dat |
	contab assistant gender success count
FACTOR:  assistant     gender    success      count 
LEVELS:          2          2          2         20 

assistan   count
john          10
jane          10
Total         20
NOTE: Yates' correction for continuity applied
	chisq       0.000000     df   1      p  1.000000

gender     count
male          12
female         8
Total         20
NOTE: Yates' correction for continuity applied
	chisq       0.450000     df   1      p  0.502335

success    count
fail          12
pass           8
Total         20
NOTE: Yates' correction for continuity applied
	chisq       0.450000     df   1      p  0.502335

SOURCE: assistant gender 
            male  female  Totals
john           6       4      10
jane           6       4      10
Totals        12       8      20
Analysis for assistant x gender:
	NOTE: Yates' correction for continuity applied
	WARNING: 2 of 4 cells had expected frequencies < 5
	chisq       0.000000     df   1      p  1.000000
	Fisher Exact One-Tailed Probability     0.675042
	Fisher Exact Other-Tail Probability     0.675042
	Fisher Exact Two-Tailed Probability     1.000000
	phi Coefficient == Cramer's V           0.000000
	Contingency Coefficient                 0.000000

SOURCE: assistant success 
            fail    pass  Totals
john           4       6      10
jane           8       2      10
Totals        12       8      20
Analysis for assistant x success:
	NOTE: Yates' correction for continuity applied
	WARNING: 2 of 4 cells had expected frequencies < 5
	chisq       1.875000     df   1      p  0.170904
	Fisher Exact One-Tailed Probability     0.084901
	Fisher Exact Other-Tail Probability     0.084901
	Fisher Exact Two-Tailed Probability     0.169802
	phi Coefficient == Cramer's V           0.306186
	Contingency Coefficient                 0.292770

SOURCE: gender success 
            fail    pass  Totals
male           8       4      12
female         4       4       8
Totals        12       8      20
Analysis for gender x success:
	NOTE: Yates' correction for continuity applied
	WARNING: 3 of 4 cells had expected frequencies < 5
	chisq       0.078125     df   1      p  0.779855
	Fisher Exact One-Tailed Probability     0.886759
	Fisher Exact Other-Tail Probability     0.259609
	Fisher Exact Two-Tailed Probability     1.000000
	phi Coefficient == Cramer's V           0.062500
	Contingency Coefficient                 0.062378

SOURCE: assistant gender success 
assistan  gender success
    john    male    fail       3
    john    male    pass       3
    john  female    fail       1
    john  female    pass       3
    jane    male    fail       5
    jane    male    pass       1
    jane  female    fail       3
    jane  female    pass       1
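
The chi-square values for the two-way tables can be reproduced from the cell counts. The Python sketch below applies the usual 2x2 formula with Yates' correction and gets the 1-df probability from the complementary error function. It is an independent check, not contab's code; the function name is hypothetical, and clamping the corrected difference at zero is an assumption.

```python
import math

def yates_chisq_2x2(a, b, c, d):
    """Chi-square with Yates' correction for the table [[a, b], [c, d]].

    Hypothetical helper; returns (chisq, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # The correction shrinks |ad - bc| by n/2; clamping at zero is an assumption.
    diff = max(0.0, abs(a * d - b * c) - n / 2.0)
    chisq = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-square with 1 df, expressed through the normal curve
    p = math.erfc(math.sqrt(chisq / 2.0))
    return chisq, p

# Counts copied from the "assistant success" and "gender success" tables
assistant_success = yates_chisq_2x2(4, 6, 8, 2)
gender_success = yates_chisq_2x2(8, 4, 4, 4)
print("assistant x success: chisq=%.6f p=%.6f" % assistant_success)
print("gender x success:    chisq=%.6f p=%.6f" % gender_success)
```

Neither table reaches conventional significance, and the warnings about expected frequencies below 5 are a reminder that, with 20 students, the Fisher exact probabilities are the safer figures to read.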

-------------------- Section 2.6    Effects of Assistant and Gender
$ dm s1 s2 s3 "'m1'" s4 s1 s2 s3 "'m2'" s5 s1 s2 s3 "'final'" s6 < scores.dat |
	maketrix 5 | anova student assistant gender exam score
SOURCE: grand mean
assista gender  exam       N       MEAN         SD         SE
                          60    67.4667    18.0981     2.3365

SOURCE: assistant 
assista gender  exam       N       MEAN         SD         SE
john                      30    76.7000    13.7869     2.5171
jane                      30    58.2333    17.3179     3.1618

SOURCE: gender 
assista gender  exam       N       MEAN         SD         SE
        male              36    62.8611    20.1085     3.3514
        female            24    74.3750    11.9120     2.4315

SOURCE: assistant gender 
assista gender  exam       N       MEAN         SD         SE
john    male              18    73.5000    15.4053     3.6311
john    female            12    81.5000     9.6153     2.7757
jane    male              18    52.2222    18.8541     4.4440
jane    female            12    67.2500     9.6684     2.7910

SOURCE: exam 
assista gender  exam       N       MEAN         SD         SE
                m1        20    67.7000    18.6720     4.1752
                m2        20    64.6500    20.4303     4.5684
                final     20    70.0500    15.3502     3.4324

SOURCE: assistant exam 
assista gender  exam       N       MEAN         SD         SE
john            m1        10    80.3000    11.9355     3.7743
john            m2        10    74.8000    16.3761     5.1786
john            final     10    75.0000    13.4247     4.2453
jane            m1        10    55.1000    15.5167     4.9068
jane            m2        10    54.5000    19.5973     6.1972
jane            final     10    65.1000    16.2101     5.1261

SOURCE: gender exam 
assista gender  exam       N       MEAN         SD         SE
        male    m1        12    61.9167    20.7822     5.9993
        male    m2        12    58.5833    22.5931     6.5221
        male    final     12    68.0833    17.1329     4.9459
        female  m1         8    76.3750    11.1475     3.9413
        female  m2         8    73.7500    13.1557     4.6512
        female  final      8    73.0000    12.7167     4.4960

SOURCE: assistant gender exam 
assista gender  exam       N       MEAN         SD         SE
john    male    m1         6    76.3333    14.1516     5.7774
john    male    m2         6    69.3333    19.1172     7.8046
john    male    final      6    74.8333    14.4418     5.8959
john    female  m1         4    86.2500     3.8622     1.9311
john    female  m2         4    83.0000     6.7823     3.3912
john    female  final      4    75.2500    13.8894     6.9447
jane    male    m1         6    47.5000    15.8461     6.4692
jane    male    m2         6    47.8333    21.9127     8.9458
jane    male    final      6    61.3333    18.1071     7.3922
jane    female  m1         4    66.5000     3.8730     1.9365
jane    female  m2         4    64.5000    11.3871     5.6936
jane    female  final      4    70.7500    13.0735     6.5368

FACTOR  :    student  assistant     gender       exam      score 
LEVELS  :         20          2          2          3         60 
TYPE    :     RANDOM    BETWEEN    BETWEEN     WITHIN       DATA 

SOURCE                SS     df             MS         F      p
===============================================================
mean	     273105.0667      1    273105.0667   443.734  0.000 ***
s/ag	       9847.5278     16       615.4705

assista	       5115.2667      1      5115.2667     8.311  0.011 *
s/ag	       9847.5278     16       615.4705

gender 	       1909.0028      1      1909.0028     3.102  0.097 
s/ag	       9847.5278     16       615.4705

ag	        177.8028      1       177.8028     0.289  0.598 
s/ag	       9847.5278     16       615.4705

exam   	        293.2333      2       146.6167     4.564  0.018 *
es/ag	       1027.8889     32        32.1215

ae	        610.4333      2       305.2167     9.502  0.001 ***
es/ag	       1027.8889     32        32.1215

ge	        314.5722      2       157.2861     4.897  0.014 *
es/ag	       1027.8889     32        32.1215

age	         29.2056      2        14.6028     0.455  0.639 
es/ag	       1027.8889     32        32.1215

-------------------- Scheffe 95% Confidence Interval:
$ echo "sqrt ($df1 * $critf * $MSerror * 2 / $N)" | calc
sqrt(((((2 * 3.294537) * 32.1215) * 2) / 10)) =	6.506165391
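
The calc line above can be repeated in Python with the substituted values read off the printed expression. The variable names mirror the shell variables. The comments on what each value represents are an interpretation of the ANOVA table, not something the transcript states: 32.1215 matches the es/ag error mean square, 2 matches the df of the exam effects, and 10 matches the cell size in the assistant by exam means.

```python
import math

df1 = 2              # $df1: numerator degrees of freedom
critf = 3.294537     # $critf: critical F value
ms_error = 32.1215   # $MSerror: error mean square (es/ag rows)
n = 10               # $N: scores per mean being compared (interpretation)

# Same expression as the one piped to calc
half_width = math.sqrt(df1 * critf * ms_error * 2 / n)
print("%.9f" % half_width)
```

Under that reading, two assistant by exam means would need to differ by more than about 6.5 points for the difference to fall outside the Scheffe interval.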