Example 25.4: Linear Discriminant Analysis of Remote-Sensing Data on Crops
In this example, the remote-sensing data described
at the beginning of the section are used.
In the first PROC DISCRIM statement, the DISCRIM procedure
uses normal-theory methods (METHOD=NORMAL)
assuming equal within-class covariance matrices (POOL=YES) for the five crops.
The PRIORS statement, PRIORS PROP, sets the prior
probabilities proportional to the sample sizes.
The LIST option lists the resubstitution
classification results for each observation (Output 25.4.2).
The CROSSVALIDATE option displays
cross-validation error-rate estimates (Output 25.4.3).
The OUTSTAT= option stores the calibration information
in a new data set to classify future observations.
A second PROC DISCRIM statement uses this calibration
information to classify a test data set.
Note that the values of the identification variable,
xvalues, are obtained by rereading the x1 through x4
fields in the data lines as a single character variable.
The following statements produce Output 25.4.1
through Output 25.4.3.
data crops;
   title 'Discriminant Analysis of Remote Sensing Data on Five Crops';
   input Crop $ 4-13 x1-x4 xvalues $ 14-24;
   datalines;
   Corn      16 27 31 33
   Corn      15 23 30 30
   Corn      16 27 27 26
   Corn      18 20 25 23
   Corn      15 15 31 32
   Corn      15 32 32 15
   Corn      12 15 16 73
   Soybeans  20 23 23 25
   Soybeans  24 24 25 32
   Soybeans  21 25 23 24
   Soybeans  27 45 24 12
   Soybeans  12 13 15 42
   Soybeans  22 32 31 43
   Cotton    31 32 33 34
   Cotton    29 24 26 28
   Cotton    34 32 28 45
   Cotton    26 25 23 24
   Cotton    53 48 75 26
   Cotton    34 35 25 78
   Sugarbeets22 23 25 42
   Sugarbeets25 25 24 26
   Sugarbeets34 25 16 52
   Sugarbeets54 23 21 54
   Sugarbeets25 43 32 15
   Sugarbeets26 54 2 54
   Clover    12 45 32 54
   Clover    24 58 25 34
   Clover    87 54 61 21
   Clover    51 31 31 16
   Clover    96 48 54 62
   Clover    31 31 11 11
   Clover    56 13 13 71
   Clover    32 13 27 32
   Clover    36 26 54 32
   Clover    53 08 06 54
   Clover    32 32 62 16
   ;

proc discrim data=crops outstat=cropstat
             method=normal pool=yes
             list crossvalidate;
   class Crop;
   priors prop;
   id xvalues;
   var x1-x4;
   title2 'Using Linear Discriminant Function';
run;
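The fixed-column layout that the INPUT statement relies on can be illustrated outside SAS. The following Python sketch is only an analogy, not a description of how SAS parses data lines, and the helper name `parse_crop_line` is hypothetical. It slices one data line at the column positions named in the INPUT statement (Crop in columns 4-13, xvalues in columns 14-24) and also splits the same numeric fields, which mirrors how the x1 through x4 fields are read twice.

```python
def parse_crop_line(line):
    """Split one fixed-column data line at the positions used by the INPUT statement."""
    crop = line[3:13].strip()        # columns 4-13 (1-based) hold the crop name
    xvalues = line[13:24].strip()    # columns 14-24 reread as one character string
    x = [int(field) for field in line[13:].split()]  # the same fields read as x1-x4
    return crop, x, xvalues

# A data line with three leading blanks, so the crop name starts in column 4.
crop, x, xvalues = parse_crop_line("   Sugarbeets22 23 25 42")
print(crop, x, repr(xvalues))
```

Because the crop name Sugarbeets fills all ten of its columns, the first numeric field starts immediately after it, which is why column positions rather than blanks have to separate the two.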
Output 25.4.1: Linear Discriminant Function on Crop Data
Discriminant Analysis of Remote Sensing Data on Five Crops
Using Linear Discriminant Function

| Observations | 36 | DF Total           | 35 |
| Variables    | 4  | DF Within Classes  | 31 |
| Classes      | 5  | DF Between Classes | 4  |

Class Level Information

| Crop       | Variable Name | Frequency | Weight  | Proportion | Prior Probability |
|------------|---------------|-----------|---------|------------|-------------------|
| Clover     | Clover        | 11        | 11.0000 | 0.305556   | 0.305556          |
| Corn       | Corn          | 7         | 7.0000  | 0.194444   | 0.194444          |
| Cotton     | Cotton        | 6         | 6.0000  | 0.166667   | 0.166667          |
| Soybeans   | Soybeans      | 6         | 6.0000  | 0.166667   | 0.166667          |
| Sugarbeets | Sugarbeets    | 6         | 6.0000  | 0.166667   | 0.166667          |
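
With PRIORS PROP, each prior probability equals the class frequency divided by the total number of observations, which is why the Proportion and Prior Probability columns agree. A minimal Python sketch of that arithmetic, using the frequencies from the table (the variable names are illustrative only):

```python
# Class frequencies copied from the Class Level Information table.
counts = {"Clover": 11, "Corn": 7, "Cotton": 6, "Soybeans": 6, "Sugarbeets": 6}

total = sum(counts.values())

# Prior for each crop = its share of the calibration sample.
priors = {crop: n / total for crop, n in counts.items()}

for crop, p in priors.items():
    print(f"{crop:<10} {p:.6f}")
```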

Pooled Covariance Matrix Information

| Covariance Matrix Rank | Natural Log of the Determinant of the Covariance Matrix |
|------------------------|----------------------------------------------------------|
| 4                      | 21.30189                                                 |

Generalized Squared Distance to Crop

| From Crop  | Clover  | Corn    | Cotton  | Soybeans | Sugarbeets |
|------------|---------|---------|---------|----------|------------|
| Clover     | 2.37125 | 7.52830 | 4.44969 | 6.16665  | 5.07262    |
| Corn       | 6.62433 | 3.27522 | 5.46798 | 4.31383  | 6.47395    |
| Cotton     | 3.23741 | 5.15968 | 3.58352 | 5.01819  | 4.87908    |
| Soybeans   | 4.95438 | 4.00552 | 5.01819 | 3.58352  | 4.65998    |
| Sugarbeets | 3.86034 | 6.16564 | 4.87908 | 4.65998  | 3.58352    |
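
The diagonal of this table is not zero. The values are consistent with the generalized squared distance adding a term of -2 ln(prior) for the target crop to the Mahalanobis distance based on the pooled covariance matrix when the priors are unequal. The Python check below is a sketch under that assumption: it recomputes the adjustment from the priors and shows that subtracting it from two mirrored off-diagonal entries leaves the same symmetric distance.

```python
import math

priors = {"Clover": 11 / 36, "Corn": 7 / 36, "Cotton": 6 / 36,
          "Soybeans": 6 / 36, "Sugarbeets": 6 / 36}

# Prior adjustment added to every distance measured *to* a given crop.
adjustment = {crop: -2.0 * math.log(p) for crop, p in priors.items()}

# Two mirrored off-diagonal entries copied from the table.
clover_to_corn = 7.52830
corn_to_clover = 6.62433

# Removing the adjustment should leave the symmetric Mahalanobis part.
d2_a = clover_to_corn - adjustment["Corn"]
d2_b = corn_to_clover - adjustment["Clover"]

print(round(adjustment["Clover"], 5), round(d2_a, 4), round(d2_b, 4))
```

The three crops with six observations each share the same prior, so they share the same diagonal value.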

Linear Discriminant Function for Crop

| Variable | Clover    | Corn     | Cotton    | Soybeans  | Sugarbeets |
|----------|-----------|----------|-----------|-----------|------------|
| Constant | -10.98457 | -7.72070 | -11.46537 | -7.28260  | -9.80179   |
| x1       | 0.08907   | -0.04180 | 0.02462   | 0.0000369 | 0.04245    |
| x2       | 0.17379   | 0.11970  | 0.17596   | 0.15896   | 0.20988    |
| x3       | 0.11899   | 0.16511  | 0.15880   | 0.10622   | 0.06540    |
| x4       | 0.15637   | 0.16768  | 0.18362   | 0.14133   | 0.16408    |

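The coefficients in the Linear Discriminant Function table can be turned into posterior probabilities by hand. The Python sketch below is an illustration, not the procedure's internal code: it evaluates the linear score for each crop and normalizes the exponentials of the scores. It assumes that the Constant row already includes the log of the prior probability, an assumption that is consistent with the posterior probabilities listed for the first corn observation (16 27 31 33). The function name `posteriors` is hypothetical.

```python
import math

crops = ["Clover", "Corn", "Cotton", "Soybeans", "Sugarbeets"]

# Rows of the Linear Discriminant Function table: constant, then x1-x4.
constant = [-10.98457, -7.72070, -11.46537, -7.28260, -9.80179]
coef = [
    [0.08907, -0.04180, 0.02462, 0.0000369, 0.04245],  # x1
    [0.17379, 0.11970, 0.17596, 0.15896, 0.20988],     # x2
    [0.11899, 0.16511, 0.15880, 0.10622, 0.06540],     # x3
    [0.15637, 0.16768, 0.18362, 0.14133, 0.16408],     # x4
]

def posteriors(x):
    """Posterior probabilities as normalized exponentials of the linear scores."""
    scores = [constant[k] + sum(coef[j][k] * x[j] for j in range(4))
              for k in range(len(crops))]
    top = max(scores)                      # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {crop: w / total for crop, w in zip(crops, weights)}

post = posteriors([16, 27, 31, 33])
for crop in crops:
    print(f"{crop:<10} {post[crop]:.4f}")
```

Because the table coefficients are rounded, the recomputed probabilities can differ from the listed ones in the last displayed digit.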
Output 25.4.2: Misclassified Observations: Resubstitution
The DISCRIM Procedure
Classification Results for Calibration Data: WORK.CROPS
Resubstitution Results using Linear Discriminant Function

Posterior Probability of Membership in Crop

| xvalues     | From Crop  | Classified into Crop | Clover | Corn   | Cotton | Soybeans | Sugarbeets |
|-------------|------------|----------------------|--------|--------|--------|----------|------------|
| 16 27 31 33 | Corn       | Corn                 | 0.0894 | 0.4054 | 0.1763 | 0.2392   | 0.0897     |
| 15 23 30 30 | Corn       | Corn                 | 0.0769 | 0.4558 | 0.1421 | 0.2530   | 0.0722     |
| 16 27 27 26 | Corn       | Corn                 | 0.0982 | 0.3422 | 0.1365 | 0.3073   | 0.1157     |
| 18 20 25 23 | Corn       | Corn                 | 0.1052 | 0.3634 | 0.1078 | 0.3281   | 0.0955     |
| 15 15 31 32 | Corn       | Corn                 | 0.0588 | 0.5754 | 0.1173 | 0.2087   | 0.0398     |
| 15 32 32 15 | Corn       | Soybeans *           | 0.0972 | 0.3278 | 0.1318 | 0.3420   | 0.1011     |
| 12 15 16 73 | Corn       | Corn                 | 0.0454 | 0.5238 | 0.1849 | 0.1376   | 0.1083     |
| 20 23 23 25 | Soybeans   | Soybeans             | 0.1330 | 0.2804 | 0.1176 | 0.3305   | 0.1385     |
| 24 24 25 32 | Soybeans   | Soybeans             | 0.1768 | 0.2483 | 0.1586 | 0.2660   | 0.1502     |
| 21 25 23 24 | Soybeans   | Soybeans             | 0.1481 | 0.2431 | 0.1200 | 0.3318   | 0.1570     |
| 27 45 24 12 | Soybeans   | Sugarbeets *         | 0.2357 | 0.0547 | 0.1016 | 0.2721   | 0.3359     |
| 12 13 15 42 | Soybeans   | Corn *               | 0.0549 | 0.4749 | 0.0920 | 0.2768   | 0.1013     |
| 22 32 31 43 | Soybeans   | Cotton *             | 0.1474 | 0.2606 | 0.2624 | 0.1848   | 0.1448     |
| 31 32 33 34 | Cotton     | Clover *             | 0.2815 | 0.1518 | 0.2377 | 0.1767   | 0.1523     |
| 29 24 26 28 | Cotton     | Soybeans *           | 0.2521 | 0.1842 | 0.1529 | 0.2549   | 0.1559     |
| 34 32 28 45 | Cotton     | Clover *             | 0.3125 | 0.1023 | 0.2404 | 0.1357   | 0.2091     |
| 26 25 23 24 | Cotton     | Soybeans *           | 0.2121 | 0.1809 | 0.1245 | 0.3045   | 0.1780     |
| 53 48 75 26 | Cotton     | Clover *             | 0.4837 | 0.0391 | 0.4384 | 0.0223   | 0.0166     |
| 34 35 25 78 | Cotton     | Cotton               | 0.2256 | 0.0794 | 0.3810 | 0.0592   | 0.2548     |
| 22 23 25 42 | Sugarbeets | Corn *               | 0.1421 | 0.3066 | 0.1901 | 0.2231   | 0.1381     |
| 25 25 24 26 | Sugarbeets | Soybeans *           | 0.1969 | 0.2050 | 0.1354 | 0.2960   | 0.1667     |
| 34 25 16 52 | Sugarbeets | Sugarbeets           | 0.2928 | 0.0871 | 0.1665 | 0.1479   | 0.3056     |
| 54 23 21 54 | Sugarbeets | Clover *             | 0.6215 | 0.0194 | 0.1250 | 0.0496   | 0.1845     |
| 25 43 32 15 | Sugarbeets | Soybeans *           | 0.2258 | 0.1135 | 0.1646 | 0.2770   | 0.2191     |
| 26 54 2 54  | Sugarbeets | Sugarbeets           | 0.0850 | 0.0081 | 0.0521 | 0.0661   | 0.7887     |
| 12 45 32 54 | Clover     | Cotton *             | 0.0693 | 0.2663 | 0.3394 | 0.1460   | 0.1789     |
| 24 58 25 34 | Clover     | Sugarbeets *         | 0.1647 | 0.0376 | 0.1680 | 0.1452   | 0.4845     |
| 87 54 61 21 | Clover     | Clover               | 0.9328 | 0.0003 | 0.0478 | 0.0025   | 0.0165     |
| 51 31 31 16 | Clover     | Clover               | 0.6642 | 0.0205 | 0.0872 | 0.0959   | 0.1322     |
| 96 48 54 62 | Clover     | Clover               | 0.9215 | 0.0002 | 0.0604 | 0.0007   | 0.0173     |
| 31 31 11 11 | Clover     | Sugarbeets *         | 0.2525 | 0.0402 | 0.0473 | 0.3012   | 0.3588     |
| 56 13 13 71 | Clover     | Clover               | 0.6132 | 0.0212 | 0.1226 | 0.0408   | 0.2023     |
| 32 13 27 32 | Clover     | Clover               | 0.2669 | 0.2616 | 0.1512 | 0.2260   | 0.0943     |
| 36 26 54 32 | Clover     | Cotton *             | 0.2650 | 0.2645 | 0.3495 | 0.0918   | 0.0292     |
| 53 08 06 54 | Clover     | Clover               | 0.5914 | 0.0237 | 0.0676 | 0.0781   | 0.2392     |
| 32 32 62 16 | Clover     | Cotton *             | 0.2163 | 0.3180 | 0.3327 | 0.1125   | 0.0206     |

* Misclassified observation

The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.CROPS
Resubstitution Summary using Linear Discriminant Function

Number of Observations and Percent Classified into Crop (count, with row percent in parentheses)

| From Crop  | Clover     | Corn      | Cotton    | Soybeans  | Sugarbeets | Total       |
|------------|------------|-----------|-----------|-----------|------------|-------------|
| Clover     | 6 (54.55)  | 0 (0.00)  | 3 (27.27) | 0 (0.00)  | 2 (18.18)  | 11 (100.00) |
| Corn       | 0 (0.00)   | 6 (85.71) | 0 (0.00)  | 1 (14.29) | 0 (0.00)   | 7 (100.00)  |
| Cotton     | 3 (50.00)  | 0 (0.00)  | 1 (16.67) | 2 (33.33) | 0 (0.00)   | 6 (100.00)  |
| Soybeans   | 0 (0.00)   | 1 (16.67) | 1 (16.67) | 3 (50.00) | 1 (16.67)  | 6 (100.00)  |
| Sugarbeets | 1 (16.67)  | 1 (16.67) | 0 (0.00)  | 2 (33.33) | 2 (33.33)  | 6 (100.00)  |
| Total      | 10 (27.78) | 8 (22.22) | 5 (13.89) | 8 (22.22) | 5 (13.89)  | 36 (100.00) |
| Priors     | 0.30556    | 0.19444   | 0.16667   | 0.16667   | 0.16667    |             |

Error Count Estimates for Crop

|        | Clover | Corn   | Cotton | Soybeans | Sugarbeets | Total  |
|--------|--------|--------|--------|----------|------------|--------|
| Rate   | 0.4545 | 0.1429 | 0.8333 | 0.5000   | 0.6667     | 0.5000 |
| Priors | 0.3056 | 0.1944 | 0.1667 | 0.1667   | 0.1667     |        |

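The error-count estimates follow directly from the resubstitution counts. As a minimal Python sketch (variable names are illustrative), the per-crop rate is the share of each row that falls off the diagonal, and the Total value is the sum of those rates weighted by the prior probabilities:

```python
crops = ["Clover", "Corn", "Cotton", "Soybeans", "Sugarbeets"]

# Resubstitution counts: rows are the true crop, columns the assigned crop.
confusion = [
    [6, 0, 3, 0, 2],   # Clover
    [0, 6, 0, 1, 0],   # Corn
    [3, 0, 1, 2, 0],   # Cotton
    [0, 1, 1, 3, 1],   # Soybeans
    [1, 1, 0, 2, 2],   # Sugarbeets
]
priors = [11 / 36, 7 / 36, 6 / 36, 6 / 36, 6 / 36]

# Per-class error rate: share of each row that is off the diagonal.
rates = [1 - row[i] / sum(row) for i, row in enumerate(confusion)]

# Overall estimate: class rates weighted by the prior probabilities.
total_rate = sum(p * r for p, r in zip(priors, rates))

for crop, r in zip(crops, rates):
    print(f"{crop:<10} {r:.4f}")
print(f"{'Total':<10} {total_rate:.4f}")
```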
Output 25.4.3: Misclassified Observations: Cross Validation
The DISCRIM Procedure
Classification Summary for Calibration Data: WORK.CROPS
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into Crop (count, with row percent in parentheses)

| From Crop  | Clover    | Corn      | Cotton    | Soybeans  | Sugarbeets | Total       |
|------------|-----------|-----------|-----------|-----------|------------|-------------|
| Clover     | 4 (36.36) | 3 (27.27) | 1 (9.09)  | 0 (0.00)  | 3 (27.27)  | 11 (100.00) |
| Corn       | 0 (0.00)  | 4 (57.14) | 1 (14.29) | 2 (28.57) | 0 (0.00)   | 7 (100.00)  |
| Cotton     | 3 (50.00) | 0 (0.00)  | 0 (0.00)  | 2 (33.33) | 1 (16.67)  | 6 (100.00)  |
| Soybeans   | 0 (0.00)  | 1 (16.67) | 1 (16.67) | 3 (50.00) | 1 (16.67)  | 6 (100.00)  |
| Sugarbeets | 2 (33.33) | 1 (16.67) | 0 (0.00)  | 2 (33.33) | 1 (16.67)  | 6 (100.00)  |
| Total      | 9 (25.00) | 9 (25.00) | 3 (8.33)  | 9 (25.00) | 6 (16.67)  | 36 (100.00) |
| Priors     | 0.30556   | 0.19444   | 0.16667   | 0.16667   | 0.16667    |             |

Error Count Estimates for Crop

|        | Clover | Corn   | Cotton | Soybeans | Sugarbeets | Total  |
|--------|--------|--------|--------|----------|------------|--------|
| Rate   | 0.6364 | 0.4286 | 1.0000 | 0.5000   | 0.8333     | 0.6667 |
| Priors | 0.3056 | 0.1944 | 0.1667 | 0.1667   | 0.1667     |        |

Now use the calibration information stored in the Cropstat
data set to classify a test data set.
The TESTLIST option lists the classification
results for each observation in the test data set.
The following statements produce Output 25.4.4
and Output 25.4.5:
data test;
   input Crop $ 1-10 x1-x4 xvalues $ 11-21;
   datalines;
Corn      16 27 31 33
Soybeans  21 25 23 24
Cotton    29 24 26 28
Sugarbeets54 23 21 54
Clover    32 32 62 16
;

proc discrim data=cropstat testdata=test testout=tout
             testlist;
   class Crop;
   testid xvalues;
   var x1-x4;
   title2 'Classification of Test Data';
run;

proc print data=tout;
   title2 'Output Classification Results of Test Data';
run;
Output 25.4.4: Classification of Test Data
The DISCRIM Procedure
Classification Results for Test Data: WORK.TEST
Classification Results using Linear Discriminant Function

Posterior Probability of Membership in Crop

| xvalues     | From Crop  | Classified into Crop | Clover | Corn   | Cotton | Soybeans | Sugarbeets |
|-------------|------------|----------------------|--------|--------|--------|----------|------------|
| 16 27 31 33 | Corn       | Corn                 | 0.0894 | 0.4054 | 0.1763 | 0.2392   | 0.0897     |
| 21 25 23 24 | Soybeans   | Soybeans             | 0.1481 | 0.2431 | 0.1200 | 0.3318   | 0.1570     |
| 29 24 26 28 | Cotton     | Soybeans *           | 0.2521 | 0.1842 | 0.1529 | 0.2549   | 0.1559     |
| 54 23 21 54 | Sugarbeets | Clover *             | 0.6215 | 0.0194 | 0.1250 | 0.0496   | 0.1845     |
| 32 32 62 16 | Clover     | Cotton *             | 0.2163 | 0.3180 | 0.3327 | 0.1125   | 0.0206     |

* Misclassified observation

The DISCRIM Procedure
Classification Summary for Test Data: WORK.TEST
Classification Summary using Linear Discriminant Function

Number of Observations and Percent Classified into Crop (count, with row percent in parentheses)

| From Crop  | Clover     | Corn       | Cotton     | Soybeans   | Sugarbeets | Total      |
|------------|------------|------------|------------|------------|------------|------------|
| Clover     | 0 (0.00)   | 0 (0.00)   | 1 (100.00) | 0 (0.00)   | 0 (0.00)   | 1 (100.00) |
| Corn       | 0 (0.00)   | 1 (100.00) | 0 (0.00)   | 0 (0.00)   | 0 (0.00)   | 1 (100.00) |
| Cotton     | 0 (0.00)   | 0 (0.00)   | 0 (0.00)   | 1 (100.00) | 0 (0.00)   | 1 (100.00) |
| Soybeans   | 0 (0.00)   | 0 (0.00)   | 0 (0.00)   | 1 (100.00) | 0 (0.00)   | 1 (100.00) |
| Sugarbeets | 1 (100.00) | 0 (0.00)   | 0 (0.00)   | 0 (0.00)   | 0 (0.00)   | 1 (100.00) |
| Total      | 1 (20.00)  | 1 (20.00)  | 1 (20.00)  | 2 (40.00)  | 0 (0.00)   | 5 (100.00) |
| Priors     | 0.30556    | 0.19444    | 0.16667    | 0.16667    | 0.16667    |            |

Error Count Estimates for Crop

|        | Clover | Corn   | Cotton | Soybeans | Sugarbeets | Total  |
|--------|--------|--------|--------|----------|------------|--------|
| Rate   | 1.0000 | 0.0000 | 1.0000 | 0.0000   | 1.0000     | 0.6389 |
| Priors | 0.3056 | 0.1944 | 0.1667 | 0.1667   | 0.1667     |        |

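The Total error rate of 0.6389 for the test data is not the plain fraction of misclassified test observations (3 of 5). The reported values are consistent with a prior-weighted sum of the per-crop rates, using the priors carried over from the calibration data. A short Python sketch of the two calculations, with illustrative variable names:

```python
# Priors from the calibration data (Clover, Corn, Cotton, Soybeans, Sugarbeets).
priors = [11 / 36, 7 / 36, 6 / 36, 6 / 36, 6 / 36]

# Test-set error rate for each crop, in the same order.
rates = [1.0, 0.0, 1.0, 0.0, 1.0]

weighted = sum(p * r for p, r in zip(priors, rates))  # matches the Total entry
unweighted = sum(rates) / len(rates)                  # one test row per crop, so 3/5

print(f"{weighted:.4f} {unweighted:.4f}")
```

The two figures differ here because Clover, the crop with the largest prior, is misclassified in the test data.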
Output 25.4.5: Output Data Set of the Classification Results for Test Data
| Obs | Crop       | x1 | x2 | x3 | x4 | xvalues     | Clover  | Corn    | Cotton  | Soybeans | Sugarbeets | _INTO_   |
|-----|------------|----|----|----|----|-------------|---------|---------|---------|----------|------------|----------|
| 1   | Corn       | 16 | 27 | 31 | 33 | 16 27 31 33 | 0.08935 | 0.40543 | 0.17632 | 0.23918  | 0.08972    | Corn     |
| 2   | Soybeans   | 21 | 25 | 23 | 24 | 21 25 23 24 | 0.14811 | 0.24308 | 0.11999 | 0.33184  | 0.15698    | Soybeans |
| 3   | Cotton     | 29 | 24 | 26 | 28 | 29 24 26 28 | 0.25213 | 0.18420 | 0.15294 | 0.25486  | 0.15588    | Soybeans |
| 4   | Sugarbeets | 54 | 23 | 21 | 54 | 54 23 21 54 | 0.62150 | 0.01937 | 0.12498 | 0.04962  | 0.18452    | Clover   |
| 5   | Clover     | 32 | 32 | 62 | 16 | 32 32 62 16 | 0.21633 | 0.31799 | 0.33266 | 0.11246  | 0.02056    | Cotton   |

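In this output data set, the _INTO_ value for each observation is the crop with the largest posterior probability. The Python sketch below reproduces that assignment from the listed probabilities and flags the rows where it disagrees with the actual crop; it illustrates the decision rule and is not the procedure's own code.

```python
crops = ["Clover", "Corn", "Cotton", "Soybeans", "Sugarbeets"]

# (actual crop, posterior probabilities) copied from the output data set rows.
rows = [
    ("Corn",       [0.08935, 0.40543, 0.17632, 0.23918, 0.08972]),
    ("Soybeans",   [0.14811, 0.24308, 0.11999, 0.33184, 0.15698]),
    ("Cotton",     [0.25213, 0.18420, 0.15294, 0.25486, 0.15588]),
    ("Sugarbeets", [0.62150, 0.01937, 0.12498, 0.04962, 0.18452]),
    ("Clover",     [0.21633, 0.31799, 0.33266, 0.11246, 0.02056]),
]

# Assign each observation to the crop with the largest posterior probability.
into = [crops[post.index(max(post))] for _, post in rows]

# Collect the actual crops of the observations that were assigned elsewhere.
misclassified = [actual for (actual, _), pred in zip(rows, into) if actual != pred]

print(into)
print(misclassified)
```

The cotton observation shows how close some of these decisions are: its Soybeans posterior (0.25486) exceeds its Clover posterior (0.25213) by less than 0.003.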
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.