|
Chapter Contents |
Previous |
Next |
| General Statistics Examples |
The technique of parameter estimation in linear models using
the notion of regression quantiles is a generalization of
the LAE or LAV least absolute value estimation technique.
For a given quantile q, the estimate
b* of
in the model


/* Routine to find Regression Quantiles */
/* yname: name of dependent variable */
/* y: dependent variable */
/* xname: names of independent variables */
/* X: independent variables */
/* b: estimates */
/* predict: predicted values */
/* error: difference of y and predicted */
/* q: quantile */
/* */
/* notes: This subroutine finds the estimates b */
/* that minimize */
/* */
/* q * (y - Xb) * e - (1-q) * (Xb - y) * ^e */
/* */
/* where e = ( Xb <= y ). When q = .5 this is equivalent */
/* to minimizing the sum of the absolute deviations. */
/* */
/* This subroutine follows the approach given in: */
/* */
/* Koenker, R. and G. Bassett (1978) */
/* */
/* Bassett, G. and R. Koenker (1982). */
/* */
start rq(yname,y,xname,x,b,predict,error,q);
bound=1.0e10;
coef=x`;
m=nrow(coef);
n=ncol(coef);
/* Build rhs and bounds */
r=repeat(0,m+2,1);
L=repeat(q-1,1,n)||repeat(0,1,m)||-bound||-bound;
u=repeat(q,1,n)||repeat(.,1,m)||{ . . };
/* Build coefficient matrix and basis */
a=(y`||repeat(0,1,m)||{ -1 0 })//
(repeat(0,1,n)||repeat(-1,1,m)||{ 0 -1 })//
(coef||I(m)||repeat(0,m,2));
basis=n+m+2-(0:n+m+1);
/* Find a feasible solution */
call lp(rc,p,d,a,r,,u,L,basis);
/* Find the optimal solution */
L=repeat(q-1,1,n)||repeat(0,1,m)||-bound||{0};
u=repeat(q,1,n)||repeat(0,1,m)||{ . 0 };
call lp(rc,p,d,a,r,n+m+1,u,L,basis);
/* Report the solution */
variable=xname`;
b=d[3:m+2];
predict=X*b;
error=y-predict;
wsum=sum(choose(error<0,(q-1)*error,q*error));
print ,,'Regression Quantile Estimation' ,
'Dependent Variable: ' yname ,
'Regression Quantile: ' q ,
'Number of Observations: ' n ,
'Sum of Weighted Absolute Errors: ' wsum ,
variable b,
X y predict error;
finish rq;
The following example uses data on the
United States population from 1790 to 1970:
z = { 3.929 1790 ,
5.308 1800 ,
7.239 1810 ,
9.638 1820 ,
12.866 1830 ,
17.069 1840 ,
23.191 1850 ,
31.443 1860 ,
39.818 1870 ,
50.155 1880 ,
62.947 1890 ,
75.994 1900 ,
91.972 1910 ,
105.710 1920 ,
122.775 1930 ,
131.669 1940 ,
151.325 1950 ,
179.323 1960 ,
203.211 1970 };
y=z[,1];
x=repeat(1,19,1)||z[,2]||z[,2]##2;
run rq('pop',y,{'intercpt' 'year' 'yearsq'},x,b1,pred,resid,.5);
The output is shown below.
/* Compare L1 residuals with least squares residuals */
/* Compute the least squares residuals */
resid2=y-x*inv(x`*x)*x`*y;
/* x axis of plot */
xx=repeat(x[,2] ,3,1);
/* y axis of plot */
yy=resid//resid2//repeat(0,19,1);
/* plot character*/
id=repeat('1',19,1)//repeat('2',19,1)//repeat('-',19,1);
call pgraf(xx||yy,id,'Year','Residual',
'1=L(1) residuals, 2=least squares residual');
The output generated is shown below.
|
|
|
Chapter Contents |
Previous |
Next |
Top |
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.