BOXCHART Statement

Creating Box Charts from Subgroup Summary Data

See SHWBOXA in the SAS/QC Sample Library

The previous example illustrates how you can create box charts using raw data (process measurements). However, in many applications the data are provided as subgroup summary statistics. This example illustrates how you can use the BOXCHART statement with data of this type.

The following data set (OILSUM) provides the data from the preceding example in summarized form. There is exactly one observation for each subgroup (note that the subgroups are still indexed by DAY).

   data oilsum;
      input day kwattsl kwatts1 kwattsx kwattsm
                kwatts3 kwattsh kwattsr kwattsn;
      informat day date7. ;
      format day date5. ;
      label day    ='Date of Measurement'
            kwattsl='Minimum Power Output'
            kwatts1='25th Percentile'
            kwattsx='Average Power Output'
            kwattsm='Median Power Output'
            kwatts3='75th Percentile'
            kwattsh='Maximum Power Output'
            kwattsr='Range of Power Output'
            kwattsn='Subgroup Sample Size';
      datalines;
   04JUL94 3180 3340.0 3487.40 3490.0 3610.0 4050 870 20
   05JUL94 3179 3333.5 3471.65 3419.5 3605.0 3849 670 20
   06JUL94 3304 3376.0 3488.30 3456.5 3604.5 3781 477 20
   07JUL94 3045 3390.5 3434.20 3447.0 3550.0 3629 584 20
   08JUL94 2968 3321.0 3475.80 3487.0 3611.5 3916 948 20
   09JUL94 3047 3425.5 3518.10 3576.0 3615.0 3881 834 20
   10JUL94 3002 3368.5 3492.65 3495.5 3621.5 3787 785 20
   11JUL94 3196 3346.0 3496.40 3473.5 3592.5 3994 798 20
   12JUL94 3115 3188.5 3398.50 3426.0 3568.5 3731 616 20
   13JUL94 3263 3340.0 3456.05 3444.0 3505.5 4040 777 20
   14JUL94 3215 3336.0 3493.60 3441.5 3616.0 3872 657 20
   15JUL94 3182 3409.5 3563.30 3561.0 3719.5 3850 668 20
   16JUL94 3212 3378.0 3519.05 3515.0 3682.5 3769 557 20
   17JUL94 3077 3329.0 3474.20 3501.5 3599.5 3812 735 20
   18JUL94 3061 3315.5 3443.60 3435.0 3614.5 3815 754 20
   19JUL94 3288 3426.5 3586.35 3546.0 3762.5 3877 589 20
   20JUL94 3114 3373.0 3486.45 3474.5 3635.5 3928 814 20
   21JUL94 3167 3400.5 3492.90 3488.0 3582.5 3801 634 20
   22JUL94 3056 3322.0 3432.80 3460.0 3561.0 3800 744 20
   23JUL94 3145 3308.5 3496.90 3495.0 3652.0 3917 772 20
   ;

A listing of OILSUM is shown in Figure 32.4.

 
Summary Data Set for Power Outputs

day    kwattsl  kwatts1  kwattsx  kwattsm  kwatts3  kwattsh  kwattsr  kwattsn
04JUL     3180   3340.0  3487.40   3490.0   3610.0     4050      870       20
05JUL     3179   3333.5  3471.65   3419.5   3605.0     3849      670       20
06JUL     3304   3376.0  3488.30   3456.5   3604.5     3781      477       20
07JUL     3045   3390.5  3434.20   3447.0   3550.0     3629      584       20
08JUL     2968   3321.0  3475.80   3487.0   3611.5     3916      948       20
09JUL     3047   3425.5  3518.10   3576.0   3615.0     3881      834       20
10JUL     3002   3368.5  3492.65   3495.5   3621.5     3787      785       20
11JUL     3196   3346.0  3496.40   3473.5   3592.5     3994      798       20
12JUL     3115   3188.5  3398.50   3426.0   3568.5     3731      616       20
13JUL     3263   3340.0  3456.05   3444.0   3505.5     4040      777       20
14JUL     3215   3336.0  3493.60   3441.5   3616.0     3872      657       20
15JUL     3182   3409.5  3563.30   3561.0   3719.5     3850      668       20
16JUL     3212   3378.0  3519.05   3515.0   3682.5     3769      557       20
17JUL     3077   3329.0  3474.20   3501.5   3599.5     3812      735       20
18JUL     3061   3315.5  3443.60   3435.0   3614.5     3815      754       20
19JUL     3288   3426.5  3586.35   3546.0   3762.5     3877      589       20
20JUL     3114   3373.0  3486.45   3474.5   3635.5     3928      814       20
21JUL     3167   3400.5  3492.90   3488.0   3582.5     3801      634       20
22JUL     3056   3322.0  3432.80   3460.0   3561.0     3800      744       20
23JUL     3145   3308.5  3496.90   3495.0   3652.0     3917      772       20
Figure 32.4: The Summary Data Set OILSUM

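There are eight summary variables in OILSUM: KWATTSL (subgroup minimum), KWATTS1 (25th percentile, or first quartile), KWATTSX (subgroup mean), KWATTSM (subgroup median), KWATTS3 (75th percentile, or third quartile), KWATTSH (subgroup maximum), KWATTSR (subgroup range), and KWATTSN (subgroup sample size).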
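
When only raw measurements are on hand, a data set with this layout can be derived from them. The following statements are a minimal sketch, not the method used to build OILSUM. They assume a raw data set named TURBINE (a hypothetical name) that holds one measurement per observation in a variable KWATTS, together with the subgroup variable DAY. The output data set name OILSUM_NEW is also hypothetical. Quartiles follow the default percentile definition of PROC SUMMARY, so they may not reproduce the listed values exactly.

   /* Sketch only: TURBINE and OILSUM_NEW are assumed names */
   proc summary data=turbine nway;
      class day;                       /* one output observation per subgroup */
      var kwatts;                      /* raw process measurements            */
      output out=oilsum_new(drop=_type_ _freq_)
             min    = kwattsl          /* subgroup minimum      */
             q1     = kwatts1          /* first quartile        */
             mean   = kwattsx          /* subgroup mean         */
             median = kwattsm          /* subgroup median       */
             q3     = kwatts3          /* third quartile        */
             max    = kwattsh          /* subgroup maximum      */
             range  = kwattsr          /* subgroup range        */
             n      = kwattsn;         /* subgroup sample size  */
   run;

Each output variable name combines the prefix KWATTS with one suffix character, which is the naming pattern that the BOXCHART statement expects in a HISTORY= data set.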

You can read this data set by specifying it as a HISTORY= data set in the PROC SHEWHART statement, as illustrated by the following statements, which create the box chart shown in Figure 32.5:

   title 'Box Chart for Power Output';
   symbol v=dot c=salmon;
   proc shewhart history=oilsum;
      boxchart kwatts*day / cinfill  = ligr
                            cboxfill = ywh
                            cboxes   = dagr
                            cframe   = vligb;
   run;

Note that the process KWATTS is not the name of a SAS variable in the data set but is, instead, the common prefix for the names of the eight summary variables. The suffix characters L, 1, X, M, 3, H, R, and N indicate the contents of the variable. For example, the suffix characters 1 and 3 indicate first and third quartiles. The name DAY specified after the asterisk is the name of the subgroup-variable.

Figure 32.5: Box Chart for Power Output Data

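In general, a HISTORY= input data set used with the BOXCHART statement must contain the following variables: a subgroup variable; subgroup minimum, first-quartile, mean, median, third-quartile, and maximum variables; a subgroup sample size variable; and either a subgroup range variable or a subgroup standard deviation variable.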


Furthermore, the names of the summary variables must begin with the process name specified in the BOXCHART statement and end with the appropriate suffix character. If the names do not follow this convention, you can apply the RENAME= data set option to the HISTORY= data set in the PROC SHEWHART statement to rename the variables for the duration of the SHEWHART procedure step (see "Creating Charts for Means and Ranges from Summary Data").
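
As a sketch of the renaming step, suppose a summary data set named OILSTATS stores the same statistics under names that do not share the KWATTS prefix. OILSTATS and all of its variable names other than DAY are hypothetical and serve only as an illustration. The RENAME= data set option maps each name to the expected prefix and suffix for this step only; OILSTATS itself is unchanged.

   /* Sketch only: OILSTATS and the P* variable names are assumed */
   proc shewhart history=oilstats(rename=(pmin   = kwattsl
                                          pq1    = kwatts1
                                          pmean  = kwattsx
                                          pmed   = kwattsm
                                          pq3    = kwatts3
                                          pmax   = kwattsh
                                          prange = kwattsr
                                          pn     = kwattsn));
      boxchart kwatts*day;
   run;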

If you specify the STDDEVIATIONS option in the BOXCHART statement, the HISTORY= data set must contain a subgroup standard deviation variable; otherwise, the HISTORY= data set must contain a subgroup range variable. The STDDEVIATIONS option specifies that the estimate of the process standard deviation σ is to be calculated from subgroup standard deviations rather than subgroup ranges. For example, in the following statements, the data set OILSUM2 must contain a subgroup standard deviation variable named KWATTSS:

   title 'Box Chart for Power Output';
   symbol v=dot;
   proc shewhart history=oilsum2;
      boxchart kwatts*day / stddeviations;
   run;
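
OILSUM2 is not constructed in this section. One way to obtain it, sketched here under the assumption that the raw measurements are available in a data set named TURBINE (a hypothetical name) with variables DAY and KWATTS, is to compute the subgroup standard deviations and merge them with OILSUM. OILSTD is likewise a hypothetical intermediate data set.

   /* Sketch only: TURBINE and OILSTD are assumed names */
   proc summary data=turbine nway;
      class day;
      var kwatts;
      output out=oilstd(keep=day kwattss) std=kwattss;   /* subgroup std. dev. */
   run;

   /* Both inputs are in DAY order: OILSUM as entered, OILSTD from CLASS/NWAY */
   data oilsum2;
      merge oilsum oilstd;
      by day;
   run;

The standard deviations cannot be recovered from the eight variables in OILSUM alone, which is why the sketch returns to the raw measurements.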

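In summary, the interpretation of process depends on the input data set. If raw data are read with the DATA= option (as in the previous example), process is the name of the SAS variable that contains the process measurements. If summary data are read with the HISTORY= option (as in this example), process is the common prefix for the names of the summary variables.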
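
The two readings of process can be placed side by side. In this sketch, TURBINE is an assumed name for a raw data set that contains the measurement variable KWATTS; OILSUM is the summary data set from this example. The BOXCHART statement is identical in both steps, and only the input data set option changes.

   /* Raw data: KWATTS is an actual variable in TURBINE (assumed name) */
   proc shewhart data=turbine;
      boxchart kwatts*day;
   run;

   /* Summary data: KWATTS is the common prefix of KWATTSL, KWATTS1, ..., KWATTSN */
   proc shewhart history=oilsum;
      boxchart kwatts*day;
   run;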


For more information, see "HISTORY= Data Set".

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.