
                                                  CORR/REGRESS(1)

NAME
     corr/regress - multivariate linear regression and correlation

SYNOPSIS
     regress [-scp] [column names]

DESCRIPTION
     This is a new version of regress that replaces the old one.
     regress performs a general linear correlation analysis with
     multiple linear regression analysis. It reads from the
     standard input (redirected with < or piped with |) and
     writes to the standard output.

     _r_e_g_r_e_s_s prints various summary statistics  and  correlations
     for up to ten (maybe more) variables.
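
     As an illustration of the kind of matrices involved, the
     following Python sketch derives a covariance matrix and a
     correlation matrix from raw sums of squares and cross
     products. It is not taken from the program's source; the
     function names, the n - 1 divisor, and the sample data are
     assumptions made for this example.

```python
import math

def raw_sscp(rows):
    """Raw sums of squares and cross products: S[i][j] = sum of x_i * x_j."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]

def covariance(rows):
    """Sample covariance matrix (divisor n - 1, an assumption) from the raw SSCP."""
    n, k = len(rows), len(rows[0])
    sums = [sum(r[i] for r in rows) for i in range(k)]
    s = raw_sscp(rows)
    # Subtracting sum_i * sum_j / n turns raw cross products into deviation form.
    return [[(s[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(k)]
            for i in range(k)]

def correlation(rows):
    """Pearson correlation matrix: covariances scaled by the standard deviations."""
    c = covariance(rows)
    k = len(c)
    sd = [math.sqrt(c[i][i]) for i in range(k)]
    return [[c[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

# Hypothetical input: five lines, three numerical fields each.
rows = [[2.0, 1.0, 3.0], [4.0, 2.0, 1.0], [5.0, 4.0, 4.0],
        [7.0, 3.0, 2.0], [9.0, 5.0, 5.0]]
for line in correlation(rows):
    print(" ".join("%7.4f" % v for v in line))
```

     The diagonal of the printed matrix is 1 and the matrix is
     symmetric, which is a quick sanity check on any such
     computation.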

     The program assumes that its input is a file of lines,  each
     containing an equal number of numerical fields.  Optionally,
     names for these fields can be supplied in your call  to  the
     program,  but  if none are specified, REG, A, B, C, etc. are
     used.
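
     The naming rule above can be sketched in Python as follows.
     The helper names are hypothetical, and the handling of a
     mismatched name count or of more than 26 predictors is an
     assumption of this sketch, not documented behavior.

```python
import string

def field_names(n_fields, supplied=None):
    """Return one name per column: the caller's names, else REG, A, B, C, ..."""
    if supplied:
        if len(supplied) != n_fields:
            # Assumption: this sketch rejects a mismatch; the program may differ.
            raise ValueError("expected %d names, got %d" % (n_fields, len(supplied)))
        return list(supplied)
    if n_fields - 1 > len(string.ascii_uppercase):
        # Assumption: single-letter defaults run out after Z.
        raise ValueError("too many columns for single-letter default names")
    return ["REG"] + list(string.ascii_uppercase[:n_fields - 1])

def read_rows(lines):
    """Split each non-empty line on whitespace and convert the fields to floats."""
    return [[float(f) for f in line.split()] for line in lines if line.strip()]

rows = read_rows(["3 1 4", "1 5 9", "2 6 5"])
print(field_names(len(rows[0])))                   # ['REG', 'A', 'B']
print(field_names(3, ["score", "age", "iq"]))      # ['score', 'age', 'iq']
```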

     For regression analysis, the first column is predicted from
     all the others. This differs from the old version of the
     program, which simultaneously predicted all variables. The
     columns in your input file can be reordered for regress
     with filters such as dm, colex, and awk.
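
     To make "the first column is predicted from all the others"
     concrete, here is a minimal least-squares sketch in Python.
     It solves the normal equations by Gaussian elimination,
     which is a choice made for this example and not necessarily
     the method the program uses; all function names are
     hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def regress_first_on_rest(rows):
    """Least-squares fit of column 0 on the remaining columns plus an intercept."""
    y = [r[0] for r in rows]
    x = [[1.0] + r[1:] for r in rows]          # leading 1.0 is the intercept term
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)                     # [intercept, b1, b2, ...]

def move_first(rows, index):
    """Reorder columns so that column `index` becomes the predicted (first) one."""
    return [[r[index]] + r[:index] + r[index + 1:] for r in rows]

# Hypothetical data built so that column 0 = 1 + 2 * column 1 + 3 * column 2.
rows = [[9.0, 1.0, 2.0], [8.0, 2.0, 1.0], [19.0, 3.0, 4.0],
        [18.0, 4.0, 3.0], [29.0, 5.0, 6.0]]
print(regress_first_on_rest(rows))
```

     The move_first helper plays the role that dm, colex, or awk
     play on the command line: it puts the variable to be
     predicted into the first column before the fit.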

OPTIONS
     -s   Print the matrix of raw sums of squares and cross
          products.  You would never want to use this option.

     -c   Print the covariance matrix.

     -p   Do a partial correlation analysis to determine the
          contribution of individual predictors after the others
          have been included.  For each predictor, the regression
          weight (b) and the standardized regression weight
          (beta) are reported.  The Rsq value is the squared
          multiple correlation of the predictor with all the
          others; if there is only one predictor, this will be
          zero, and if there is only one other, all Rsq's will be
          identical.  The significance test answers the question:
          "After all the other variables have been taken into
          account, does this variable significantly improve
          prediction?"  Note that t*t=F.
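
     The relation t*t=F can be checked numerically. The Python
     sketch below fits the full model, computes t for each
     regression weight from its standard error, and computes F
     from the increase in residual sum of squares when that
     predictor is dropped. It is a sketch under textbook
     ordinary-least-squares assumptions, not the program's own
     code; the function names and data are hypothetical, and
     beta and Rsq are not computed here.

```python
import math

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit(y, xs):
    """OLS of y on the columns in xs plus an intercept.
    Returns (coefficients, residual sum of squares, inverse of X'X)."""
    n, k = len(y), len(xs) + 1
    x = [[1.0] + [col[i] for col in xs] for i in range(n)]
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    inv = invert(xtx)
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(x, y))
    return coef, rss, inv

def partial_tests(y, xs):
    """For each predictor return (b, t, F); F tests dropping that predictor."""
    n, p = len(y), len(xs)
    coef, rss, inv = fit(y, xs)
    s2 = rss / (n - p - 1)                     # residual mean square, full model
    results = []
    for j in range(p):
        t = coef[j + 1] / math.sqrt(s2 * inv[j + 1][j + 1])
        _, rss_reduced, _ = fit(y, xs[:j] + xs[j + 1:])
        f = (rss_reduced - rss) / s2           # one numerator degree of freedom
        results.append((coef[j + 1], t, f))
    return results

# Hypothetical data: one predicted variable and two predictors.
y = [3.0, 5.0, 4.0, 8.0, 9.0, 12.0]
xs = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]]
for b, t, f in partial_tests(y, xs):
    print("b=%.4f  t*t=%.4f  F=%.4f" % (b, t * t, f))
```

     For each predictor the two printed statistics agree up to
     rounding error, which is the identity stated above.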

DIAGNOSTICS
     _r_e_g_r_e_s_s will complain about "ragged input" if there are  not
     an  equal number of fields on each line.  If some fields are

CORR/REGRESS(1)

     not numerical, _r_e_g_r_e_s_s will  complain  about  "non-numerical
     input."  If  _r_e_g_r_e_s_s  tries to do a regression analysis, and
     finds that two variables are perfectly  correlated,  _r_e_g_r_e_s_s
     will complain about a "singular correlation matrix."
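
     The three conditions above can be checked before running an
     analysis. The Python sketch below is one way to do so; the
     function names, the skipping of blank lines, and the
     numerical tolerance are assumptions of the sketch, and only
     the quoted message texts come from this page.

```python
def check_input(lines):
    """Parse lines into rows of floats, raising ValueError on bad input."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue                           # assumption: blank lines are skipped
        try:
            row = [float(f) for f in fields]
        except ValueError:
            raise ValueError("non-numerical input") from None
        if rows and len(row) != len(rows[0]):
            raise ValueError("ragged input")
        rows.append(row)
    return rows

def perfectly_correlated(rows):
    """Return the first column pair (i, j) with correlation +1 or -1, else None.
    A constant column is also flagged, since its correlations are undefined."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(k)]
    dev = [[r[i] - means[i] for i in range(k)] for r in rows]
    for i in range(k):
        for j in range(i + 1, k):
            sxy = sum(d[i] * d[j] for d in dev)
            sxx = sum(d[i] ** 2 for d in dev)
            syy = sum(d[j] ** 2 for d in dev)
            # r squared equals 1 exactly when sxy^2 == sxx * syy.
            if sxy * sxy >= (1.0 - 1e-12) * sxx * syy:
                return (i, j)
    return None

rows = check_input(["1 2 5", "2 4 3", "3 6 4"])
if perfectly_correlated(rows) is not None:
    print("singular correlation matrix")
```

     In this hypothetical input the second column is twice the
     first, so the check reports a singular correlation matrix.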

ALGORITHM
     The program is based on the methods described by Kerlinger
     and Pedhazur (1973) in Multiple Regression in Behavioral
     Research.

SEE ALSO
     unixstat(1), dm(1), pair(1), abut(1), maketrix(1)

AUTHOR
     Gary Perlman

KEYWORDS
     inferential statistics, data analysis

