EWMACHART Statement

Example 20.3: Working with Unequal Subgroup Sample Sizes

See MACEW4 in the SAS/QC Sample Library

This example uses measurements from the metal clip manufacturing process (introduced in "Creating EWMA Charts from Raw Data"). The following statements create a SAS data set named CLIPS4, which contains additional clip gap measurements taken on a daily basis:

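   data clips4;
      input day @;                  /* read the date; hold the line for the gaps */
      length dayc $2.;
      informat day ddmmyy8.;        /* dates are entered as day/month/year       */
      format   day date5.;          /* display as ddMMM, for example 01APR       */
      dayc=put(day,date5.);         /* truncated to length 2: the day of month   */
      dayc=substr(dayc,1,2);
      do i=1 to 5;                  /* one observation per gap measurement       */
         input gap @;
         output;
      end;
      drop i;
      label dayc='April';           /* label for the horizontal axis             */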
   datalines;
    1/4/86  14.93  14.65  14.87  15.11  15.18
    2/4/86  15.06  14.95  14.91  15.14  15.41
    3/4/86  14.90  14.90  14.96  15.26  15.18
    4/4/86  15.25  14.57  15.33  15.38  14.89
    7/4/86  14.68  14.63  14.72  15.32  14.86
    8/4/86  14.48  14.88  14.98  14.74  15.48
    9/4/86  14.99  15.16  15.02  15.53  14.66
   10/4/86  14.88  15.44  15.04  15.10  14.89
   11/4/86  15.14  15.33  14.75  15.23  14.64
   14/4/86  15.46  15.30  14.92  14.58  14.68
   15/4/86  15.23  14.63    .      .      .
   16/4/86  15.13  15.25    .      .      .
   17/4/86  15.06  15.25  15.28  15.30  15.34
   18/4/86  15.22  14.77  15.12  14.82  15.29
   21/4/86  14.95  14.96  14.65  14.87  14.77
   22/4/86  15.01  15.11  15.11  14.79  14.88
   23/4/86  14.97  15.50  14.93  15.13  15.25
   24/4/86  15.23  15.21  15.31  15.07  14.97
   25/4/86  15.08  14.75  14.93  15.34  14.98
   28/4/86  15.07  14.86  15.42  15.47  15.24
   29/4/86  15.27  15.20  14.85  15.62  14.67
   30/4/86  14.97  14.73  15.09  14.98  14.46
   ;

Note that only two gap measurements were recorded on April 15 and April 16.
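One way to confirm the subgroup sample sizes before charting is to count the nonmissing GAP values for each day. The following is a minimal sketch using PROC MEANS; the output data set name SIZES is a hypothetical name chosen for illustration and is not part of the original example:

```sas
/* Count nonmissing and missing gap measurements per subgroup */
proc means data=clips4 noprint nway;
   class dayc;
   var gap;
   output out=sizes n=n nmiss=nmiss;
run;

/* List only the subgroups with fewer than five measurements */
proc print data=sizes noobs;
   where n < 5;
   var dayc n nmiss;
run;
```

For CLIPS4 as defined above, this should list only the subgroups for April 15 and April 16, each with two nonmissing measurements.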

A listing of CLIPS4 is shown in Output 20.3.1. This data set contains three variables: DAY is a numeric variable that contains the date (day, month, and year) on which the measurement was taken, DAYC is a character variable that contains the two-digit day of the month on which the measurement was taken, and GAP is a numeric variable that contains the measurement.

Output 20.3.1: The Data Set CLIPS4
 
The Data Set CLIPS4

day dayc gap
01APR 01 14.93
01APR 01 14.65
01APR 01 14.87
01APR 01 15.11
01APR 01 15.18
02APR 02 15.06
02APR 02 14.95
02APR 02 14.91
02APR 02 15.14
02APR 02 15.41
03APR 03 14.90
03APR 03 14.90
03APR 03 14.96
03APR 03 15.26
03APR 03 15.18
04APR 04 15.25
04APR 04 14.57
04APR 04 15.33
04APR 04 15.38
04APR 04 14.89
07APR 07 14.68
07APR 07 14.63
07APR 07 14.72
07APR 07 15.32
07APR 07 14.86
08APR 08 14.48
08APR 08 14.88
08APR 08 14.98
08APR 08 14.74
08APR 08 15.48
09APR 09 14.99
09APR 09 15.16
09APR 09 15.02
09APR 09 15.53
09APR 09 14.66
10APR 10 14.88
10APR 10 15.44
10APR 10 15.04
10APR 10 15.10
10APR 10 14.89
11APR 11 15.14
11APR 11 15.33
11APR 11 14.75
11APR 11 15.23
11APR 11 14.64
14APR 14 15.46
14APR 14 15.30
14APR 14 14.92
14APR 14 14.58
14APR 14 14.68
15APR 15 15.23
15APR 15 14.63
15APR 15 .
15APR 15 .
15APR 15 .
16APR 16 15.13
16APR 16 15.25
16APR 16 .
16APR 16 .
16APR 16 .
17APR 17 15.06
17APR 17 15.25
17APR 17 15.28
17APR 17 15.30
17APR 17 15.34
18APR 18 15.22
18APR 18 14.77
18APR 18 15.12
18APR 18 14.82
18APR 18 15.29
21APR 21 14.95
21APR 21 14.96
21APR 21 14.65
21APR 21 14.87
21APR 21 14.77
22APR 22 15.01
22APR 22 15.11
22APR 22 15.11
22APR 22 14.79
22APR 22 14.88
23APR 23 14.97
23APR 23 15.50
23APR 23 14.93
23APR 23 15.13
23APR 23 15.25
24APR 24 15.23
24APR 24 15.21
24APR 24 15.31
24APR 24 15.07
24APR 24 14.97
25APR 25 15.08
25APR 25 14.75
25APR 25 14.93
25APR 25 15.34
25APR 25 14.98
28APR 28 15.07
28APR 28 14.86
28APR 28 15.42
28APR 28 15.47
28APR 28 15.24
29APR 29 15.27
29APR 29 15.20
29APR 29 14.85
29APR 29 15.62
29APR 29 14.67
30APR 30 14.97
30APR 30 14.73
30APR 30 15.09
30APR 30 14.98
30APR 30 14.46

The following statements request an EWMA chart, shown in Output 20.3.2, for these gap measurements:

   title 'EWMA Chart for Gap Measurements';
   symbol v=dot c=salmon;
   proc macontrol data=clips4;
      ewmachart gap*dayc / weight   = 0.3
                           cframe   = vibg
                           cinfill  = ligr
                           coutfill = yellow
                           cconnect = salmon;
   run;

The character variable DAYC (rather than the numeric variable DAY) is specified as the subgroup-variable in the preceding EWMACHART statement. If DAY were the subgroup-variable, each day during April would appear on the horizontal axis, including the weekend days of April 5 and April 6, for which no measurements were taken. To avoid this problem, the subgroup-variable DAYC is created from DAY using the PUT and SUBSTR functions. Since DAYC is a character subgroup-variable, a discrete axis is used for the horizontal axis, and as a result, April 5 and April 6 do not appear on the horizontal axis in Output 20.3.2. A LABEL statement is used to specify the label April for the horizontal axis, indicating the month in which these measurements were taken.
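As an aside, the same two-character day value can also be obtained in a single step with the DAY function and the Z2. format. The sketch below only illustrates that equivalence; the variable names D, DAYC_A, and DAYC_B are hypothetical and do not appear in the original example:

```sas
data _null_;
   d = '07APR1986'd;                        /* a sample SAS date value   */
   dayc_a = substr(put(d, date5.), 1, 2);   /* approach used in CLIPS4   */
   dayc_b = put(day(d), z2.);               /* day of month, zero-padded */
   put dayc_a= dayc_b=;                     /* write both to the log     */
run;
```

Either form yields a zero-padded character value, so the values also sort in calendar order within a single month.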

Output 20.3.2: EWMA Chart with Varying Sample Sizes
[Image: ewmaex3b.gif]

Note that the control limits vary with the subgroup sample size: they widen at April 15 and April 16, where only two measurements are available. The sample size legend in the lower-left corner displays the minimum and maximum subgroup sample sizes.
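The dependence of the limits on the subgroup sample sizes can be written out explicitly. The following is a sketch rather than a quotation from this example. It assumes the usual EWMA recursion $E_i = r\bar{X}_i + (1-r)E_{i-1}$ with weight $r$ (WEIGHT=0.3 here), a fixed starting value $E_0$, independent measurements with common standard deviation $\sigma$, and $k$-sigma limits. The symbols $n_i$ (sample size of the $i$th subgroup), $\hat{\mu}$ (estimated process mean), and $\hat{\sigma}$ are notation introduced for this illustration:

```latex
\begin{gather*}
\operatorname{Var}(E_i) = \sigma^2 r^2 \sum_{j=0}^{i-1} \frac{(1-r)^{2j}}{n_{i-j}} \\
\text{limits}_i = \hat{\mu} \pm k\,\hat{\sigma}\, r
   \sqrt{\sum_{j=0}^{i-1} \frac{(1-r)^{2j}}{n_{i-j}}}
\end{gather*}
```

When every $n_i$ equals a common value $n$, the sum is geometric and the limits reduce to $\hat{\mu} \pm k\hat{\sigma}\sqrt{r[1-(1-r)^{2i}]/[n(2-r)]}$. A subgroup with a small $n_i$ contributes a larger term to the sum, so the limits widen at that subgroup and then narrow again as that term is damped by successive factors of $(1-r)^2$.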

The EWMACHART statement provides various options for working with unequal subgroup sample sizes. For example, you can use the LIMITN= option to specify a fixed (nominal) sample size for computing control limits, as illustrated by the following statements:

   proc macontrol data=clips4;
      ewmachart gap*dayc / weight   = 0.3
                           limitn   = 5
                           cframe   = vibg
                           cinfill  = ligr
                           coutfill = yellow
                           cconnect = salmon;
   run;

The resulting chart is shown in Output 20.3.3.

Output 20.3.3: Control Limits Based on Fixed Sample Size
[Image: ewmaex3c.gif]

Note that the only points displayed are those corresponding to subgroups whose sample size matches the nominal sample size of five. Therefore, points are not displayed for April 15 and April 16. To plot points for all subgroups (regardless of subgroup sample size), you can specify the ALLN option, as follows:

   proc macontrol data=clips4;
      ewmachart gap*dayc / weight   = 0.3
                           limitn   = 5
                           alln
                           nmarkers
                           cframe   = vibg
                           cinfill  = ligr
                           coutfill = yellow
                           cconnect = salmon;
   run;

The chart is shown in Output 20.3.4. The NMARKERS option requests special symbols to identify points for which the subgroup sample size differs from the nominal sample size.

Output 20.3.4: Displaying All Subgroups Regardless of Sample Size
[Image: ewmaex3d.gif]

You can use the SMETHOD= option to determine how the process standard deviation $\sigma$ is to be estimated when the subgroup sample sizes vary. The default method computes $\hat{\sigma}$ as an unweighted average of subgroup estimates of $\sigma$. Specifying SMETHOD=MVLUE requests a minimum variance linear unbiased estimate (MVLUE), which assigns greater weight to estimates of $\sigma$ from subgroups with larger sample sizes. Specifying SMETHOD=RMSDF requests a weighted root-mean-square estimate. If the unknown standard deviation $\sigma$ is constant across subgroups, the root-mean-square estimate is more efficient than the MVLUE. For more information, see "Methods for Estimating the Standard Deviation".
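For orientation, the three estimators can be sketched in their standard forms as follows. These formulas are not given in this example and are stated here as an assumption about the usual definitions; "Methods for Estimating the Standard Deviation" remains the authoritative source. Here $s_i$ and $n_i$ denote the sample standard deviation and sample size of the $i$th subgroup, $N$ is the number of subgroups with $n_i \ge 2$, and $c_4(\cdot)$ is the usual unbiasing constant for the sample standard deviation of normal data:

```latex
\begin{gather*}
c_4(n) = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)} \\
\hat{\sigma}_{\text{default}} = \frac{1}{N}\sum_{i=1}^{N}\frac{s_i}{c_4(n_i)} \\
\hat{\sigma}_{\text{MVLUE}} =
   \frac{\sum_{i=1}^{N} h_i\, s_i / c_4(n_i)}{\sum_{i=1}^{N} h_i},
   \qquad h_i = \frac{c_4(n_i)^2}{1 - c_4(n_i)^2} \\
\hat{\sigma}_{\text{RMSDF}} =
   \frac{1}{c_4(n)}\sqrt{\frac{\sum_{i=1}^{N}(n_i-1)\,s_i^2}{\sum_{i=1}^{N}(n_i-1)}},
   \qquad n = 1 + \sum_{i=1}^{N}(n_i-1)
\end{gather*}
```

In CLIPS4, twenty subgroups have $n_i = 5$ and two have $n_i = 2$, so under these definitions the MVLUE and RMSDF methods give the two small subgroups less influence than the default method does.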

The following statements apply all three methods:

   proc macontrol data=clips4;
      ewmachart gap*dayc / outlimits = cliplim1
                           outindex  = 'Default'
                           weight    = 0.3
                           nochart;
      ewmachart gap*dayc / smethod   = mvlue
                           outlimits = cliplim2
                           outindex  = 'MVLUE'
                           weight    = 0.3
                           nochart;
      ewmachart gap*dayc / smethod   = rmsdf
                           outlimits = cliplim3
                           outindex  = 'RMSDF'
                           weight    = 0.3
                           nochart;
   run;

   title 'Estimating the Process Standard Deviation';
   data climits;
      set cliplim1 cliplim2 cliplim3;
   run;

The data set CLIMITS is listed in Output 20.3.5.

Output 20.3.5: Listing of the Data Set CLIMITS
 
Estimating the Process Standard Deviation

_VAR_  _SUBGRP_  _INDEX_  _TYPE_    _LIMITN_  _ALPHA_     _SIGMAS_  _MEAN_   _STDDEV_  _WEIGHT_
gap    dayc      Default  ESTIMATE  V         .002699796  3         15.0354  0.26503   0.3
gap    dayc      MVLUE    ESTIMATE  V         .002699796  3         15.0354  0.26096   0.3
gap    dayc      RMSDF    ESTIMATE  V         .002699796  3         15.0354  0.25959   0.3

Note that the estimate of the process standard deviation (stored in the variable _STDDEV_) is slightly different depending on the estimation method. The variable _LIMITN_ is assigned the special missing value V in the OUTLIMITS= data set, indicating that the subgroup sample sizes vary.
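The statements that produce the listing of CLIMITS are not shown in this example. A minimal sketch, assuming a plain PROC PRINT of selected variables is sufficient for comparing the methods, is:

```sas
/* List the combined control limit parameters, one row per method */
proc print data=climits noobs;
   var _index_ _limitn_ _sigmas_ _mean_ _stddev_ _weight_;
run;
```

Restricting the VAR list to these columns keeps the three values of _STDDEV_ side by side with the _INDEX_ labels assigned by the OUTINDEX= option.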


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.