.TH CORR/REGRESS 1 "March 5, 1985" "UNIX|STAT 5.0" "UNIX User's Manual"
.SH NAME
corr/regress \- multivariate linear regression and correlation
.SH SYNOPSIS
.B regress
[-scp] [column names]
.SH DESCRIPTION
Note that this is a new version of
.I regress
and that it replaces the old one.
.I regress
performs a general linear correlation analysis
and a multiple linear regression analysis.
.I regress
reads from the standard input
(via redirection with < or piped with |)
and writes to the standard output.
.PP
.I regress
prints various summary statistics
and correlations for up to ten (maybe more) variables.
.PP
The program assumes that its input is a file of lines,
each containing an equal number of numerical fields.
Optionally, names for these fields can be supplied as arguments
when you call the program;
if none are specified, the names REG, A, B, C, etc. are used.
.PP
For regression analysis,
the first column is predicted from all the others.
This differs from the old version of the program,
which predicted all variables simultaneously.
The columns in your input file can be reordered for
.I regress
with several filters including dm, colex, and awk.
.SH OPTIONS
.TP
.B -s
Print the matrix of raw sums of squares and cross products.
You would never want to use this option.
.TP
.B -c
Print the covariance matrix.
.TP
.B -p
Do a partial correlation analysis to determine the
contribution of individual predictors after the others have been included.
For each predictor,
the regression weight (b) and
the standardized regression weight (beta)
are reported.
The Rsq value is the squared multiple correlation
of the predictor with all the other predictors;
if there is only one predictor, this will be zero,
and if there are only two predictors, both Rsq values will be identical.
The significance test answers the question:
``After all the other variables have been taken into account,
does this variable significantly improve prediction?''
Note that t*t=F.
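.PP
The quantities reported by
.B -p
can be illustrated with a minimal Python sketch.
It is not the code used by
.IR regress ;
the function names
.IR solve ,
.IR ols ,
and
.I partial_analysis
and the sample data are assumptions made for illustration.
The sketch fits the first variable on the others by least squares,
computes Rsq for each predictor against the remaining predictors,
and derives t from the regression weight and F from the change
in the residual sum of squares, so that t*t=F can be checked.
.PP
.nf
```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(y, xs):
    """Fit y on the columns in xs plus an intercept.

    Returns (coefficients, residual sum of squares, R squared).
    """
    n, p = len(y), len(xs) + 1
    design = [[1.0] + [x[i] for x in xs] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(design)) for a in range(p)]
    coef = solve(xtx, xty)
    mean_y = sum(y) / n
    sse = sum((y[i] - sum(c * v for c, v in zip(coef, design[i]))) ** 2
              for i in range(n))
    sst = sum((v - mean_y) ** 2 for v in y)
    return coef, sse, 1.0 - sse / sst

def sum_sq(v):
    """Sum of squared deviations from the mean."""
    mean = sum(v) / len(v)
    return sum((t - mean) ** 2 for t in v)

def partial_analysis(y, xs):
    """Return one dict per predictor with b, beta, rsq, t and F."""
    n, k = len(y), len(xs)
    coef, sse_full, _ = ols(y, xs)
    mse = sse_full / (n - k - 1)      # residual mean square, full model
    rows = []
    for j, xj in enumerate(xs):
        others = xs[:j] + xs[j + 1:]
        rsq = ols(xj, others)[2]      # predictor j regressed on the others
        b = coef[j + 1]               # coef[0] is the intercept
        se = math.sqrt(mse / (sum_sq(xj) * (1.0 - rsq)))
        # F: drop in residual SS when predictor j is added last.
        f = (ols(y, others)[1] - sse_full) / mse
        beta = b * math.sqrt(sum_sq(xj) / sum_sq(y))
        rows.append({"b": b, "beta": beta, "rsq": rsq, "t": b / se, "F": f})
    return rows

# Made-up sample data: y is predicted from x1 and x2.
y = [3, 5, 4, 8, 9, 10]
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 4]
for row in partial_analysis(y, [x1, x2]):
    print(row)
```
.fi
.PP
With two predictors the two rsq values in this sketch are equal,
and with a single predictor rsq is zero,
which matches the description of Rsq given for
.BR -p .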
.SH DIAGNOSTICS
.I regress
will complain about "ragged input"
if the input lines do not all contain the same number of fields.
If some fields are not numerical,
.I regress
will complain about "non-numerical input."
If
.I regress
tries to do a regression analysis,
and finds that two variables are perfectly correlated,
.I regress
will complain about a "singular correlation matrix."
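.PP
The three conditions above can be checked on a data file before running
.IR regress .
The following Python sketch is one hypothetical way to do so;
the function names
.I read_table
and
.I has_perfect_correlation
and the tolerance value are assumptions,
and the actual checks inside
.I regress
may differ.
.PP
.nf
```python
import math

def read_table(lines):
    """Parse whitespace-separated numeric fields, one row per line."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are skipped
        try:
            row = [float(f) for f in fields]
        except ValueError:
            raise ValueError("non-numerical input") from None
        if rows and len(row) != len(rows[0]):
            raise ValueError("ragged input")
        rows.append(row)
    return rows

def has_perfect_correlation(rows, tol=1e-9):
    """Return True if any two columns have |r| within tol of 1."""
    cols = list(zip(*rows))
    n = len(rows)
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            mi, mj = sum(cols[i]) / n, sum(cols[j]) / n
            sxy = sum((a - mi) * (b - mj) for a, b in zip(cols[i], cols[j]))
            sxx = sum((a - mi) ** 2 for a in cols[i])
            syy = sum((b - mj) ** 2 for b in cols[j])
            if sxx == 0 or syy == 0:
                continue  # constant column: r is undefined, skipped here
            if abs(sxy / math.sqrt(sxx * syy)) > 1 - tol:
                return True
    return False

# Made-up data: the second column is twice the first.
table = read_table(["1 2 3", "2 4 1", "3 6 5"])
print(has_perfect_correlation(table))
```
.fi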
.SH ALGORITHM
The program is based on the methods described by
Kerlinger and Pedhazur (1973) in
.I "Multiple Regression in Behavioral Research."
.SH SEE\ ALSO
unixstat(1), dm(1), pair(1), abut(1), maketrix(1)
.SH AUTHOR
Gary Perlman
.SH KEYWORDS
inferential statistics, data analysis