Analytical or numerical derivatives are used to calculate the Hessian
matrix, which when inverted provides a covariance matrix of the
free (i.e., estimated) parameters.
The Hessian matrix is the matrix of second partial derivatives
of the likelihood function with respect to the values of the freely
estimated parameters.
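As an illustration only, the following minimal Python sketch shows one common way to approximate a Hessian with central finite differences and invert it to obtain a covariance matrix. It is not SmartStats code: the function `numerical_hessian`, the quadratic `neg_log_likelihood`, and the values of `fit`, `precision`, and the step sizes are all hypothetical, and the sketch assumes the function being differentiated is the negative ln-likelihood.

```python
import numpy as np

def numerical_hessian(f, theta, steps):
    """Approximate the matrix of second partial derivatives of f at theta
    using central finite differences with one step size per parameter."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    hessian = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = steps[i]
            ej[j] = steps[j]
            hessian[i, j] = (
                f(theta + ei + ej) - f(theta + ei - ej)
                - f(theta - ei + ej) + f(theta - ei - ej)
            ) / (4.0 * steps[i] * steps[j])
    return hessian

# Hypothetical two-parameter negative ln-likelihood, quadratic around the
# accepted fit so that its true Hessian equals `precision`.
fit = np.array([1.0, 2.0])
precision = np.array([[4.0, 1.0], [1.0, 2.0]])

def neg_log_likelihood(theta):
    d = np.asarray(theta, dtype=float) - fit
    return 0.5 * d @ precision @ d

hessian = numerical_hessian(neg_log_likelihood, fit, steps=[1e-3, 1e-3])
covariance = np.linalg.inv(hessian)             # inverted Hessian
standard_errors = np.sqrt(np.diag(covariance))  # one per free parameter
```

Because the example function is exactly quadratic, the finite-difference Hessian matches `precision` up to rounding error; for a real model the result depends on the step sizes chosen.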
When calculating the covariance matrix for the parameter values
for an accepted fit, you will be asked a few questions.
First, you will be asked if you prefer to use analytical or numerical
derivatives.
However, analytical derivatives are available for use only if they
were included in the compiled model code – numerical derivatives
are always available.
(Note that a covariance matrix using analytical derivatives cannot
be used in conjunction with Bayesian prior distributions on the
parameter values, since the prior distributions are not incorporated
into the calculation of the analytical derivatives; numerical
derivatives, however, can be used.)
After an informative message, you will next be given an opportunity
to modify the parameter values and step sizes in the
algorithm file.
Typically, you would not make any changes to the parameter values,
but depending upon your recent success with the current step sizes,
you might be motivated to fiddle with the step sizes.
If you have chosen numerical derivatives, you will next be
asked, in sequence, for three factors by which a parameter’s
step size will be multiplied to govern the size of the step taken
away from the parameter’s value at the accepted fit.
Each of the three factored step sizes will be used to calculate its
own Hessian matrix, giving three separate Hessian matrices for three
separate ‘grids’ of parameter deviations from their values at the
accepted model fit.
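The idea of three grids can be sketched in Python as follows. This is a hedged illustration rather than the SmartStats implementation: `hessian_on_grid`, the example `neg_log_likelihood`, the base step sizes, and the factors 0.5, 1.0, and 2.0 are all assumptions chosen for the example. The example function is deliberately non-quadratic in its second parameter so that the three grids give slightly different answers.

```python
import numpy as np

def hessian_on_grid(f, theta, steps):
    """Central-difference Hessian of f at theta for one set of step sizes."""
    n = len(theta)
    unit = np.eye(n)
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            di = steps[i] * unit[i]
            dj = steps[j] * unit[j]
            h[i, j] = (
                f(theta + di + dj) - f(theta + di - dj)
                - f(theta - di + dj) + f(theta - di - dj)
            ) / (4.0 * steps[i] * steps[j])
    return h

def neg_log_likelihood(theta):
    # Hypothetical objective with its minimum at (0, 0); the exponential
    # term makes the surface asymmetric in the second parameter.
    return theta[0] ** 2 + np.exp(theta[1]) - theta[1]

fit = np.array([0.0, 0.0])          # accepted fit (assumed)
base_steps = np.array([0.1, 0.1])   # step sizes from the algorithm file (assumed)
grid_factors = (0.5, 1.0, 2.0)      # the three multipliers (assumed)

hessians = [
    hessian_on_grid(neg_log_likelihood, fit, factor * base_steps)
    for factor in grid_factors
]
std_errors = [np.sqrt(np.diag(np.linalg.inv(h))) for h in hessians]

for factor, se in zip(grid_factors, std_errors):
    print(f"factor {factor}: standard errors {se}")
```

In this sketch the first parameter's standard error is the same on every grid, while the second parameter's standard error drifts as the factor grows, which is the kind of divergence the three grids are meant to reveal.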
Next you will be asked to provide the constant which converts the
likelihood function or sum-of-squares used for minimization to the
negative ln-likelihood of your data.
Typically this factor is provided for you in the compiled model
code; if it is, then use that value, otherwise it can usually
be inferred from the likelihood definition presented on the
'Initialization form'.
For example, if the likelihood function is twice the negative ln-likelihood,
then the appropriate constant is 0.5.
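The effect of the constant can be sketched with a few lines of Python. This assumes the constant simply rescales the Hessian of the minimized function to the Hessian of the negative ln-likelihood before inversion; the matrix values and variable names are hypothetical, and the sketch is not taken from the SmartStats code.

```python
import numpy as np

# Hypothetical Hessian of the minimized function, assumed here to be
# twice the negative ln-likelihood.
hessian_of_objective = np.array([[8.0, 2.0], [2.0, 4.0]])
likelihood_constant = 0.5  # converts the objective to the negative ln-likelihood

# Scaling the objective by a constant scales its Hessian by the same constant.
hessian_of_nll = likelihood_constant * hessian_of_objective
covariance = np.linalg.inv(hessian_of_nll)
standard_errors = np.sqrt(np.diag(covariance))

# Omitting the constant would halve every variance in this example.
uncorrected_covariance = np.linalg.inv(hessian_of_objective)
```

An incorrect constant therefore rescales every variance and covariance by the same factor, so it is worth confirming before accepting the reported standard errors.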
Once you confirm the likelihood constant, you will be asked if
you want a full covariance matrix, or just the conditional diagonal
variances under the (no doubt false) assumption that all parameters
except the one whose variance is being evaluated have a standard
error of zero (0).
A diagonal of only the parameter variances is not recommended and
will produce optimistic standard errors – it has value only
in its speed of calculation that might assist you in a preliminary,
ad hoc, analysis of parameter uncertainty.
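A small Python sketch shows why the diagonal-only option is optimistic. It assumes the conditional variance of a parameter is the reciprocal of the corresponding diagonal element of the Hessian of the negative ln-likelihood, which is the standard result when all other parameters are held fixed; the Hessian values are hypothetical.

```python
import numpy as np

# Hypothetical Hessian of the negative ln-likelihood at the accepted fit.
hessian = np.array([[4.0, 1.0], [1.0, 2.0]])

# Full covariance matrix: invert the whole Hessian.
full_covariance = np.linalg.inv(hessian)
full_se = np.sqrt(np.diag(full_covariance))

# Conditional variances: treat every other parameter as known exactly,
# so only the diagonal of the Hessian is used.
conditional_se = np.sqrt(1.0 / np.diag(hessian))

print("full standard errors:       ", full_se)
print("conditional standard errors:", conditional_se)
```

The conditional standard errors are never larger than those from the full matrix, and they are strictly smaller whenever the parameters are correlated.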
Once the covariance matrices on the three grids have been calculated,
a quick comparison of their determinants will inform you of their
similarity – the more similar the determinants, the more probable
that the covariance matrices are similar.
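The determinant comparison can be sketched as follows, using three made-up covariance matrices in place of real output. The matrices and the choice to express each determinant as a ratio to the first grid are assumptions for illustration only.

```python
import numpy as np

# Hypothetical covariance matrices from the three grids.
covariances = [
    np.array([[0.040, 0.006], [0.006, 0.010]]),
    np.array([[0.041, 0.006], [0.006, 0.010]]),
    np.array([[0.060, 0.012], [0.012, 0.016]]),
]

determinants = [np.linalg.det(c) for c in covariances]
# Ratios near 1 suggest similar matrices; large departures suggest divergence.
ratios = [d / determinants[0] for d in determinants]

for grid, (d, r) in enumerate(zip(determinants, ratios), start=1):
    print(f"grid {grid}: determinant {d:.6f}, ratio to grid 1 {r:.3f}")
```

Here the second grid agrees closely with the first, while the third does not. Similar determinants are only a quick screen, so the standard errors themselves should still be compared.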
Ideally, if sample sizes are sufficient and the model’s
behaviour is ‘close-to-linear’ near the accepted fit,
all three Hessian matrices, and consequently their associated covariance
matrices, will be almost identical.
In that case the standard errors reported for each parameter will
be very similar among the three grids and will represent a Gaussian
(normal) distribution of parameter error.
Otherwise, strong differences among the matrices may indicate that the
parameter errors are not multivariate normal, i.e., sample size and/or
model behaviour are not such that asymptotic conditions can be assumed,
so the central limit theorem cannot be relied on to govern the behaviour
of the covariance matrix.
A practical interpretation of such a divergent outcome for the
three grids is that the error distribution for at least one parameter
is not normal but skewed.
In that case the standard errors reported for each skewed error
distribution will differ on each grid.
Though initially such an outcome may seem frustrating, in reality
this numerical method can provide more realistic representations
of the error distribution around a parameter if that parameter’s
step size and the corresponding grid factors are carefully chosen.
In addition to a covariance matrix, SmartStats © output includes
a correlation matrix and z-tests for each estimated parameter value
against the values of zero (0) or one (1).
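For readers who want to reproduce these quantities by hand, the following Python sketch derives a correlation matrix and two-sided z-tests from a covariance matrix. The estimates and covariance values are hypothetical, the helper `z_test` is not a SmartStats function, and the sketch assumes a standard normal reference distribution for the test statistic.

```python
import math
import numpy as np

# Hypothetical parameter estimates and their covariance matrix.
estimates = np.array([1.2, 0.4])
covariance = np.array([[0.04, 0.006], [0.006, 0.01]])

standard_errors = np.sqrt(np.diag(covariance))
# Correlation: each covariance divided by the product of the two standard errors.
correlation = covariance / np.outer(standard_errors, standard_errors)

def z_test(estimate, standard_error, null_value):
    """Return the z statistic and two-sided p-value against null_value."""
    z = (estimate - null_value) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

z_vs_zero = [z_test(e, s, 0.0) for e, s in zip(estimates, standard_errors)]
z_vs_one = [z_test(e, s, 1.0) for e, s in zip(estimates, standard_errors)]
```

In this example the first estimate differs clearly from zero but not from one, while the second differs clearly from both.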
Other options for assessing parameter uncertainty in SmartStats ©
include likelihood profiling and Bayesian statistics.
More explanation on calculating covariances using numerical derivatives
can be found in:
Mittertreiner, A., and J. Schnute.
1985. Simplex: a manual and software package for easy nonlinear
parameter estimation and interpretation in fishery research. Canadian
Technical Report of Fisheries and Aquatic Sciences 1384, Ottawa,
Ontario, Canada.
Though most of the implementation aspects of this technical report
are out-of-date, perhaps even obsolete, the conceptual descriptions
of the simplex method, algorithm
files, and covariance calculations are informative.