Chapter Contents |
Previous |
Next |
HISTOGRAM Statement |
See FNORM2 in the SAS/QC Sample Library |
This example shows how to display a fitted curve that is not supported by the HISTOGRAM statement.
The offset of an attachment point is measured
(in mm) for a number of manufactured assemblies, and the
measurements are saved in a data set named ASSEMBLY.
data assembly; label offset = 'Offset (in mm)'; input offset @@; datalines; 11.11 13.07 11.42 3.92 11.08 5.40 11.22 14.69 6.27 9.76 9.18 5.07 3.51 16.65 14.10 9.69 16.61 5.67 2.89 8.13 9.97 3.28 13.03 13.78 3.13 9.53 4.58 7.94 13.51 11.43 11.98 3.90 7.67 4.32 12.69 6.17 11.48 2.82 20.42 1.01 3.18 6.02 6.63 1.72 2.42 11.32 16.49 1.22 9.13 3.34 1.29 1.70 0.65 2.62 2.04 11.08 18.85 11.94 8.34 2.07 0.31 8.91 13.62 14.94 4.83 16.84 7.09 3.37 0.49 15.19 5.16 4.14 1.92 12.70 1.97 2.10 9.38 3.18 4.18 7.22 15.84 10.85 2.35 1.93 9.19 1.39 11.40 12.20 16.07 9.23 0.05 2.15 1.95 4.39 0.48 10.16 4.81 8.28 5.68 22.81 0.23 0.38 12.71 0.06 10.11 18.38 5.53 9.36 9.32 3.63 12.93 10.39 2.05 15.49 8.12 9.52 7.77 10.70 6.37 1.91 8.60 22.22 1.74 5.84 12.90 13.06 5.08 2.09 6.41 1.40 15.60 2.36 3.97 6.17 0.62 8.56 9.36 10.19 7.16 2.37 12.91 0.95 0.89 3.82 7.86 5.33 12.92 2.64 7.92 14.06 ;
The assembly process is in statistical control, and it is decided to fit a folded normal distribution to the offset measurements. A variable X has a folded normal distribution if X=|Y|, where Y is distributed as . The fitted density is
You can use SAS/IML software to compute preliminary estimates of and based on a method of moments given by Elandt (1961). These estimates are computed by solving equation (19) of Elandt (1961), which is given by
Begin by using the MEANS procedure to compute the first and second moments and using the DATA step to compute the constant A.
proc means data=assembly noprint; var offset; output out=stat mean=m1 var=var n=n min=min; * Compute constant A from equation (19) of Elandt (1961) ; data stat; keep m2 a min; set stat; a = (m1*m1); m2 = ((n-1)/n)*var + a; a = a/m2;
Next, use the SAS/IML subroutine NLPDD to solve equation (19) by minimizing , and compute and .
proc iml; use stat; read all var {m2} into m2; read all var {a} into a; read all var {min} into min; * f(t) is the function in equation (19) of Elandt (1961) ; start f(t) global(a); y = 0.39894*exp(-0.5*t*t); y = (2*y-(t*(1-2*probnorm(t))))**2/(1+t*t); y = (y-a)**2; return(y); finish; * Minimize (f(t)-A)**2 and estimate mu and sigma ; if ( min < 0 ) then do; print "Warning: Observations are not all nonnegative."; print "The folded normal is inappropriate."; stop; end; if ( a < 0.6374 ) then do; print "Warning: Estimates may be unreliable"; end; opt = { 0 0 }; con = { 1e-6 }; x0 = { 2.0 }; tc = { . . . . . 1e-12 . . . . . . .}; call nlpdd(rc,etheta0,"f",x0,opt,con,tc); esig0 = sqrt(m2/(1+etheta0*etheta0)); emu0 = etheta0*esig0; create prelim var {emu0 esig0 etheta0}; append; close prelim;The preliminary estimates are saved in the data set PRELIM, as shown in Output 4.7.1. Output 4.7.1: Preliminary Estimates of , , and
|
* Define the log likelihood of the folded normal ; start g(p) global(x); y = 0.0; do i = 1 to nrow(x); z = exp( (-0.5/p[2])*(x[i]-p[1])*(x[i]-p[1]) ); z = z + exp( (-0.5/p[2])*(x[i]+p[1])*(x[i]+p[1]) ); y = y + log(z); end; y = y - nrow(x)*log( sqrt( p[2] ) ); return(y); finish; * Maximize the log likelihood with subroutine NLPDD ; use assembly; read all var {offset} into x; esig0sq = esig0*esig0; x0 = emu0||esig0sq; opt = { 1 0 }; con = { . 0.0, . . }; call nlpdd(rc,xr,"g",x0,opt,con); emu = xr[1]; esig = sqrt(xr[2]); etheta = emu/esig; create parmest var{emu esig etheta}; append; close parmest; quit;
The data set PARMEST saves the maximum likelihood estimates and (as well as ), as shown in Output 4.7.2.
Output 4.7.2: Final Estimates of , , and
|
proc capability data=assembly noprint; histogram offset / outfit=out normal(noprint) noplot; run;
Output 4.7.3 provides a listing of the data set OUT. The width and endpoints of the histogram bars are saved as values of the variables _WIDTH_, _MIDPT1_, and _MIDPTN_. See "Output Data Sets".
Output 4.7.3: The OUTFIT= Data Set OUT
The following statements create an annotate data set named
ANNO, which contains the coordinates of the fitted curve:
data anno; merge parmest out; length function color $ 8; function = 'point'; color = 'black'; size = 2; xsys = '2'; ysys = '2'; when = 'a'; constant = 39.894*_width_; left = _midpt1_ - 0.5*_width_; right = _midptn_ + 0.5*_width_; inc = (right-left)/100; do x = left to right by inc; z1 = (x-emu)/esig; z2 = (x+emu)/esig; y = (constant/esig)*(exp(-0.5*z1*z1)+exp(-0.5*z2*z2)); output; function = 'draw'; end; run;
The following statements read the ANNOTATE= data set and display the histogram and fitted curve, as shown in Output 4.7.4:
title "Folded Normal Distribution"; legend1 frame cframe=ligr cborder=black position=center; proc capability data=assembly noprint; spec usl=27 cusl=black lusl=2 wusl=2; histogram offset / annotate = anno cfill = red cframe = ligr legend = legend1; run;Output 4.7.4: Histogram with Annotated Folded Normal Curve
Chapter Contents |
Previous |
Next |
Top |
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.