The NLIN Procedure

Getting Started

The NLIN procedure performs univariate nonlinear regression using the least squares method. Nonlinear regression analysis is indicated when you have information specifying that the functional relationship between the predictor and response variables is nonlinear in the parameters. Such information might come from direct knowledge of the true model, theoretical developments, or previous studies. Nonlinear, in this sense, means that the mathematical relationship between the variables and parameters is not required to have a linear form. For example, consider the following two models:

$Y = a X^2 + b$

$Y = \frac{1}{a} X + b$

where a and b are parameters and X and Y are random variables. The first model is linear in the parameters; the second model is nonlinear.
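
One way to check which case applies, offered here as an illustrative sketch, is to differentiate each model with respect to its parameters. A model is linear in the parameters when none of these derivatives depends on a parameter:

$\frac{\partial}{\partial a} ( a X^2 + b ) = X^2, \qquad \frac{\partial}{\partial b} ( a X^2 + b ) = 1$

$\frac{\partial}{\partial a} ( \frac{1}{a} X + b ) = -\frac{X}{a^2}, \qquad \frac{\partial}{\partial b} ( \frac{1}{a} X + b ) = 1$

In the second model the derivative with respect to a still contains a, which is what makes that model nonlinear in the parameters even though it is linear in X.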

Estimating the Nonlinear Model

As an example of a nonlinear regression analysis, consider the following theoretical model of enzyme kinetics. The model relates the initial velocity of an enzymatic reaction to the substrate concentration.

$f(x, \theta) = \frac{\theta_1 x_i}{\theta_2 + x_i}, \quad \text{for } i = 1, 2, \ldots, n$

where $x_i$ represents the amount of substrate in the $i$th of $n$ trials and $f(x, \theta)$ is the velocity of the reaction. The vector $\theta$ contains the rate parameters.

Suppose that you want to study the relationship between concentration and velocity for a particular enzyme/substrate pair. You record the reaction rate (velocity) observed at different substrate concentrations. Your data set is as follows:

   data Enzyme;
      input Concentration Velocity @@;
      datalines;
   0.26 124.7   0.30 126.9   0.48 135.9   0.50 137.6
   0.54 139.6   0.68 141.1   0.82 142.8   1.14 147.6
   1.28 149.8   1.38 149.4   1.80 153.9   2.30 152.5
   2.44 154.5   2.48 154.7
   ;

The SAS data set Enzyme contains the two variables Concentration (substrate concentration) and Velocity (reaction rate). The double trailing at sign (@@) in the INPUT statement specifies that observations are input from each line until all of the values are read.

The following statements request a nonlinear regression analysis:

   proc nlin data=Enzyme method=marquardt hougaard;
      parms theta1=155
            theta2=0 to 0.07 by 0.01;
      model Velocity = theta1*Concentration / (theta2 + Concentration);
   run;

The DATA= option specifies that the SAS data set Enzyme be used in the analysis. The METHOD= option directs PROC NLIN to use the MARQUARDT iterative method. The HOUGAARD option requests that a skewness measure be calculated for the parameters.

The MODEL statement specifies the enzymatic reaction model

$V = \frac{\theta_1 C} { \theta_2 + C}$

where V represents the velocity or reaction rate and C represents the substrate concentration.

The PARMS statement declares the parameters and specifies their initial values. In this example, the initial estimates in the PARMS statement are obtained as follows. Since the model is a monotonic increasing function in C, and

$\lim_{C \to \infty} ( \frac{\theta_1 C}{\theta_2 + C} ) = \theta_1$

take the largest observed value of the variable Velocity (154.7) as the initial value for the parameter Theta1. Thus, the PARMS statement specifies 155 as the initial value for Theta1, which is approximately equal to the maximum observed velocity.
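
The monotonicity claim can be checked with a short derivative, sketched here under the assumption that both rate parameters are positive:

$\frac{dV}{dC} = \frac{\theta_1 (\theta_2 + C) - \theta_1 C}{(\theta_2 + C)^2} = \frac{\theta_1 \theta_2}{(\theta_2 + C)^2} > 0$

Under that assumption the curve rises toward $\theta_1$ without exceeding it, so a value at or slightly above the largest observed velocity is a reasonable starting point for Theta1.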

To obtain an initial value for the parameter Theta2, first rearrange the model equation to solve for $\theta_2$:

$\theta_2 = \frac{\theta_1 C} { V } - C$

By substituting the initial value of Theta1 for $\theta_1$ and taking each pair of observed values of Concentration and Velocity for C and V, respectively, you obtain a set of possible starting values for Theta2 ranging from about 0.01 to 0.07.
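
The candidate values can be computed directly from the data. The following statements are a minimal sketch of that calculation; the data set name Theta2Start and the variable name Theta2Guess are illustrative names that do not appear elsewhere in this example.

   data Theta2Start;
      set Enzyme;
      /* theta2 = theta1*C/V - C, with theta1 fixed at its starting value of 155 */
      Theta2Guess = 155*Concentration/Velocity - Concentration;
   run;

   /* Summarize the range of candidate starting values */
   proc means data=Theta2Start min max;
      var Theta2Guess;
   run;

For the first observation, for instance, the candidate value is 155(0.26)/124.7 - 0.26, or approximately 0.063.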

You can choose any value within this range as a starting value for Theta2, or you can direct PROC NLIN to perform a preliminary search for the best initial Theta2 value within that range of values. The PARMS statement specifies a range of values for Theta2, which results in a search over the grid points from 0 to 0.07 in increments of 0.01. The output from this PROC NLIN invocation is displayed in the following figures.

PROC NLIN evaluates the model at each point on the specified grid for the Theta2 parameter. Figure 45.1 displays the calculations resulting from the grid search.

The NLIN Procedure

Grid Search

Dependent Variable Velocity

theta1	theta2	Sum of Squares
155.0	0	3075.4
155.0	0.0100	2074.1
155.0	0.0200	1310.3
155.0	0.0300	752.0
155.0	0.0400	371.9
155.0	0.0500	147.2
155.0	0.0600	58.1130
155.0	0.0700	87.9662

The NLIN Procedure

Iterative Phase

Dependent Variable Velocity

Method: Marquardt

Iter	theta1	theta2	Sum of Squares
0	155.0	0.0600	58.1130
1	158.0	0.0736	19.7017
2	158.1	0.0741	19.6606
3	158.1	0.0741	19.6606

NOTE: Convergence criterion met.

Figure 45.1: Nonlinear Least Squares Grid Search from the NLIN Procedure

The parameter Theta1 is held constant at its specified initial value of 155, the grid is traversed, and the residual sums of squares are computed at each point. The "best" starting value is the point that corresponds to the smallest value of the residual sum of squares. Figure 45.1 shows that the best starting value for Theta2 is 0.06. PROC NLIN uses this point as the initial value for Theta2 in the following iterative phase.
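
The grid search sums of squares can be reproduced outside PROC NLIN as a check on the calculation. The following statements are a sketch only; the data set name GridCheck and the variables k and SqResid are hypothetical names introduced for illustration. An integer loop index is used so that the grid values are not affected by accumulated floating-point increments.

   data GridCheck;
      set Enzyme;
      theta1 = 155;                       /* held at its initial value */
      do k = 0 to 7;
         theta2 = k/100;                  /* grid points 0, 0.01, ..., 0.07 */
         SqResid = (Velocity - theta1*Concentration/(theta2 + Concentration))**2;
         output;
      end;
   run;

   /* One residual sum of squares per grid point */
   proc means data=GridCheck sum;
      class theta2;
      var SqResid;
   run;

The sums should agree with the Sum of Squares column in Figure 45.1 up to display rounding. At theta2=0, for example, the model reduces to the constant 155, and the sum of squared differences between each observed velocity and 155 is about 3075.4.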

PROC NLIN determines convergence using the relative offset measure of Bates and Watts (1981). When this measure is less than $10^{-5}$, convergence is declared. Figure 45.1 displays the iteration history.
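
If a different threshold is wanted, the CONVERGE= option in the PROC NLIN statement sets the value against which the relative offset measure is compared. The following sketch tightens it to 1E-8; that value is an arbitrary choice for illustration, and the single starting value of 0.06 for Theta2 is the best grid point identified above.

   /* Same model as before, with a stricter relative offset criterion */
   proc nlin data=Enzyme method=marquardt converge=1e-8;
      parms theta1=155 theta2=0.06;
      model Velocity = theta1*Concentration / (theta2 + Concentration);
   run;

A stricter criterion can require more iterations before convergence is declared, so the iteration history from this sketch need not match Figure 45.1.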

The NLIN Procedure

Estimation Summary
Method	Marquardt
Iterations	3
R	5.861E-6
PPC(theta2)	8.569E-7
RPC(theta2)	0.000078
Object	2.902E-7
Objective	19.66059
Observations Read	14
Observations Used	14
Observations Missing	0

Figure 45.2: Estimation Summary from the NLIN Procedure

Figure 45.2 displays a summary of the estimation, including several convergence measures: R, PPC, RPC, and Object. The R measure is the relative offset convergence measure of Bates and Watts. A PPC value of 8.569E-7 indicates that the parameter Theta2 (which has the largest PPC value of all the parameters) would change by that relative amount were PROC NLIN to take an additional iteration step. The RPC value indicates that Theta2 changed by 0.000078, relative to its value in the last iteration. The Object measure indicates that the objective function value changed 2.902E-7 in relative value from the last iteration.

The NLIN Procedure

NOTE: An intercept was not specified for this model.

Source	DF	Sum of Squares	Mean Square	F Value	Approx Pr > F
Regression	2	290116	145058	88537.2	<.0001
Residual	12	19.6606	1.6384
Uncorrected Total	14	290135

Corrected Total	13	1269.7

Figure 45.3: Nonlinear Least Squares Summary from the NLIN Procedure

Figure 45.3 displays the least squares summary statistics for the model. The degrees of freedom, sums of squares, and mean squares are listed.
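
The entries in Figure 45.3 are related by simple arithmetic. The following worked check uses the rounded values as displayed, so the last digits can differ slightly from the printed output:

$MS_{Regression} = \frac{290116}{2} = 145058, \qquad MS_{Residual} = \frac{19.6606}{12} \approx 1.6384$

$F = \frac{MS_{Regression}}{MS_{Residual}} \approx \frac{145058}{1.6384} \approx 8.85 \times 10^4$

The uncorrected total is the sum of the regression and residual sums of squares, 290116 + 19.6606, or approximately 290135.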

The NLIN Procedure

Parameter	Estimate	Approx Std Error	Approximate 95% Confidence Limits		Skewness
theta1	158.1	0.6737	156.6	159.6	0.0152
theta2	0.0741	0.00313	0.0673	0.0809	0.0362

Figure 45.4: Parameter Estimates from the NLIN Procedure

Figure 45.4 displays the estimates for each parameter, the associated asymptotic standard error, and the upper and lower values for the asymptotic 95% confidence interval. PROC NLIN also displays the asymptotic correlations between the estimated parameters (not shown).
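
The limits in Figure 45.4 are consistent with the usual form of estimate plus or minus a t quantile times the approximate standard error, using the 12 residual degrees of freedom from Figure 45.3. The following DATA step is a sketch of that calculation from the rounded values as displayed; the variable names are illustrative.

   data _null_;
      t = quantile('T', 0.975, 12);       /* two-sided 95% quantile, 12 df */
      Lower1 = 158.1  - t*0.6737;
      Upper1 = 158.1  + t*0.6737;
      Lower2 = 0.0741 - t*0.00313;
      Upper2 = 0.0741 + t*0.00313;
      /* Write the limits to the SAS log */
      put 'theta1: ' Lower1 8.1 Upper1 8.1;
      put 'theta2: ' Lower2 8.4 Upper2 8.4;
   run;

Because the inputs are rounded, the results should match the limits in Figure 45.4 only to about the displayed precision.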

The skewness measures of 0.0152 and 0.0362 indicate that the parameters are nearly linear and that their standard errors and confidence intervals can be safely used for inferences.

Thus, the estimated nonlinear model relating reaction velocity and substrate concentration can be written as

$\hat{V} = \frac{158.105 C} {0.0741 + C}$

where $\hat{V}$ represents the predicted velocity or rate of the reaction, and C represents the substrate concentration.
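
To obtain fitted velocities for the observed concentrations, one option is to add an OUTPUT statement to the PROC NLIN step. The following statements are a sketch; the output data set name EnzymePred and the variable names VelocityHat and Resid are hypothetical names chosen for illustration.

   proc nlin data=Enzyme method=marquardt;
      parms theta1=155 theta2=0.06;
      model Velocity = theta1*Concentration / (theta2 + Concentration);
      /* Save predicted values and residuals for each observation */
      output out=EnzymePred predicted=VelocityHat residual=Resid;
   run;

   proc print data=EnzymePred;
      var Concentration Velocity VelocityHat Resid;
   run;

Each value of VelocityHat is the fitted model evaluated at the corresponding observed Concentration, and Resid is the difference between the observed and fitted velocities.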
