MODEL Statement
<label:> MODEL response=independents < / options > ;
<label:> MODEL events/trials=independents < / options > ;
The MODEL statement names the response variable
and the independent variables.
Additionally, you can specify the distribution used to model the
response, as well as other options.
More than one MODEL statement can be
specified with the PROBIT procedure.
The optional label identifies the
output from the corresponding MODEL statement.
The response can be a single variable whose value
indicates the level of the observed response.
Such a response variable must be listed in the CLASS statement.
For example, the response might be a variable
called Symptoms that takes on the values
'None', 'Mild', or 'Severe'.
Note that, for dichotomous response variables, the probability of
the lower sorted value is modeled by default (see the "Details" section).
Because the model fit by the PROBIT procedure requires
ordered response levels, you may need to use either the
ORDER=DATA option in the PROC statement or a numeric coding
of the response to get the desired ordering of levels.
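As a minimal sketch of the single-variable form, the following step assumes a hypothetical data set named Trial that contains the Symptoms variable described above and a hypothetical numeric covariate named Dose. It also assumes that the observations appear in the data set in the order None, Mild, Severe, so that ORDER=DATA yields the intended ordering of levels.

```sas
/* Hypothetical data set Trial; Dose is an assumed covariate. */
proc probit data=Trial order=data;
   class Symptoms;          /* the response must be listed in CLASS   */
   model Symptoms = Dose;   /* intended order: None, Mild, Severe     */
run;
```

If the data are not stored in the desired order, a numeric coding of the response (for example, 1, 2, 3) is the alternative mentioned above.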
Alternatively, the response can be specified as a
pair of variable names separated by a slash (/).
The value of the first variable, events, is the
number of positive responses (or events). The value
of the second variable, trials, is the number of trials.
Both variables must be numeric and nonnegative, and
the ratio of the first variable value to the second
variable value must be between 0 and 1, inclusive.
For example, the variables might be hits, a variable
containing the number of hits for a baseball player, and
AtBats, a variable containing the number of times at bat.
A model for hitting proportion (batting average)
as a function of age could be specified as
model hits/AtBats=age;
If no independent variables are specified, PROC PROBIT fits
an intercept-only model.
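The events/trials form can be sketched as a complete step. The data set name Batting and the labels Average and Logit are hypothetical; the second MODEL statement is included only to illustrate that several labeled models can appear in one step.

```sas
/* Hypothetical data set Batting with variables hits, AtBats, and age. */
proc probit data=Batting;
   /* Default probit model; "Average" labels this model's output. */
   Average: model hits/AtBats = age;
   /* Same response and covariate, fit with the logistic distribution. */
   Logit:   model hits/AtBats = age / dist=logistic;
run;
```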
The following options are available in the MODEL statement.
- CONVERGE=value
-
specifies the convergence criterion.
Convergence is declared when
the maximum change in the parameter estimates between
Newton-Raphson steps is less than the value specified.
The change is a relative change if the parameter is greater than
0.01 in absolute value; otherwise, it is an absolute change.
By default, CONVERGE=0.001.
- CORRB
-
displays the estimated correlation matrix of the parameter estimates.
- COVB
-
displays the estimated covariance matrix of the parameter estimates.
- DISTRIBUTION=distribution-type
- DIST=distribution-type
- D=distribution-type
-
specifies the cumulative distribution function
used to model the response probabilities.
The distributions are described in the "Details" section.
Valid values for distribution-type are
- NORMAL
- the normal distribution for the probit model
- LOGISTIC
- the logistic distribution for the logit model
- EXTREMEVALUE | EXTREME | GOMPERTZ
- the extreme value, or Gompertz, distribution for the gompit model
By default, DISTRIBUTION=NORMAL.
- HPROB=value
-
specifies a minimum probability level
for the Pearson chi-square
to indicate a good fit. The default value is 0.10.
The LACKFIT option must also be specified
for this option to have any effect.
For Pearson goodness of fit chi-square values with
probability greater than the HPROB= value, the fiducial
limits, if requested with the INVERSECL option,
are computed using a critical value of 1.96.
For chi-square values with probability less than the value of the
HPROB= option, the critical value is a 0.95 two-sided quantile
value taken from the t distribution with degrees of
freedom equal to (k - 1) × m - q, where k is
the number of levels for the response variable, m is the
number of different sets of independent variable values,
and q is the number of parameters fit in the model.
If you specify the HPROB= option in both the PROC and MODEL
statements, the MODEL statement option takes precedence.
- INITIAL=values
-
sets initial values for the parameters in the model other than the
intercept. The values must be given in the order in which the
variables are listed in the MODEL statement.
If some of the independent variables listed in the MODEL statement
are classification variables, then there must be as many values
given for that variable as there are classification levels minus 1.
The INITIAL option can be specified as follows.
Type of List | Specification
list separated by blanks | initial=3 4 5
list separated by commas | initial=3,4,5
By default, all parameters have initial estimates of zero.
- INTERCEPT=value
-
initializes the intercept parameter to value.
By default, INTERCEPT=0.
- INVERSECL
-
computes confidence limits for the values of the
first continuous independent variable (such as
dose) that yield selected response rates.
If the algorithm fails to converge (this can happen when the natural
response rate C is nonzero), missing values are reported for the confidence limits.
See the section "Inverse Confidence Limits" for details.
- ITPRINT
-
displays the iteration history, the final evaluation of
the gradient, and the second derivative matrix (Hessian).
- LACKFIT
-
performs two goodness-of-fit tests (a Pearson chi-square test
and a log-likelihood ratio chi-square test) for the fitted model.
Note: The data set must be sorted by the independent variables before
the PROBIT procedure is run if you want to perform a test of fit.
These tests are not appropriate if the data are very sparse, with
only a few observations at each set of the independent variable values.
If the Pearson chi-square test statistic is significant, then the
covariance estimates and standard error estimates are adjusted.
See the "Lack of Fit Tests" section for a description of the tests.
If you specify the LACKFIT option in both the PROC and MODEL
statements, the MODEL statement option takes precedence.
- MAXITER=value
-
specifies the maximum number of iterations to
be performed in estimating the parameters.
By default, MAXITER=50.
- NOINT
-
fits a model with no intercept parameter.
If the INTERCEPT= option is also specified, the intercept
is fixed at the specified value; otherwise, it is set to zero.
The NOINT option is most useful when the response is binary.
When the response has k levels,
k-1 intercept parameters are fit.
The NOINT option sets the intercept parameter
corresponding to the lowest response level equal to zero.
A Lagrange multiplier, or score, test for the restricted
model is computed when the NOINT option is specified.
- SINGULAR=value
-
specifies the singularity criterion for determining linear
dependencies in the set of independent variables.
The sum of squares and crossproducts matrix of
the independent variables is formed and swept.
If the relative size of a pivot becomes less than
the value specified, then the variable corresponding
to the pivot is considered to be linearly dependent
on the previous set of variables considered.
By default, SINGULAR=1E-12.
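Several of the options above can be combined in one MODEL statement. The following is a sketch only: the data set Assay and the variables r, n, and dose are hypothetical, and the option values are chosen for illustration rather than as recommendations. The data are sorted first because the LACKFIT tests require it.

```sas
/* Hypothetical data set Assay: r responders out of n subjects per dose. */
proc sort data=Assay;
   by dose;                  /* sort by the independent variable for LACKFIT */
run;

proc probit data=Assay;
   model r/n = dose / dist=logistic   /* logit model instead of default probit */
                      lackfit         /* Pearson and likelihood ratio tests    */
                      inversecl       /* limits for doses at selected rates    */
                      hprob=0.20      /* probability level for the Pearson test */
                      intercept=-1    /* starting value for the intercept      */
                      initial=0.5     /* starting value for the dose slope     */
                      converge=1e-6   /* tighter than the default 0.001        */
                      maxiter=100     /* more than the default 50 iterations   */
                      itprint         /* display the iteration history         */
                      covb;           /* display the covariance matrix         */
run;
```

Because dose is the only independent variable and is not a classification variable, INITIAL= takes a single value here.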
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.