The CLUSTER Procedure
This first example clusters ten American cities based on the flying mileages between them. Six clustering methods are shown with corresponding tree diagrams produced by the TREE procedure. The EML method cannot be used because it requires coordinate data. Each of the other omitted methods produces the same clusters as one of the illustrated methods, although not the same distances between clusters: complete linkage and the flexible-beta method yield the same clusters as Ward's method, McQuitty's similarity analysis produces the same clusters as average linkage, and the median method corresponds to the centroid method.
All of the methods suggest a division of the cities into two clusters along the east-west dimension. There is disagreement, however, about which cluster Denver should belong to. Some of the methods indicate a possible third cluster containing Denver and Houston. The following statements produce Output 23.1.1:
title 'Cluster Analysis of Flying Mileages Between 10 American Cities';
data mileages(type=distance);
input (atlanta chicago denver houston losangeles
miami newyork sanfran seattle washdc) (5.)
@55 city $15.;
datalines;
    0                                                 ATLANTA
  587    0                                            CHICAGO
 1212  920    0                                       DENVER
  701  940  879    0                                  HOUSTON
 1936 1745  831 1374    0                             LOS ANGELES
  604 1188 1726  968 2339    0                        MIAMI
  748  713 1631 1420 2451 1092    0                   NEW YORK
 2139 1858  949 1645  347 2594 2571    0              SAN FRANCISCO
 2182 1737 1021 1891  959 2734 2408  678    0         SEATTLE
  543  597 1494 1220 2300  923  205 2442 2329    0    WASHINGTON D.C.
;
/*---------------------- Average linkage --------------------*/
proc cluster data=mileages method=average pseudo;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
/*---------------------- Centroid method --------------------*/
proc cluster data=mileages method=centroid pseudo;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
/*-------- Density linkage with 3rd-nearest-neighbor --------*/
proc cluster data=mileages method=density k=3;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
/*--------------------- Single linkage ----------------------*/
proc cluster data=mileages method=single;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
/*--- Two-stage density linkage with 3rd-nearest-neighbor ---*/
proc cluster data=mileages method=twostage k=3;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
/*--- Ward's minimum variance with pseudo F and t**2 statistics ---*/
proc cluster data=mileages method=ward pseudo;
id city;
run;
proc tree horizontal spaces=2;
id city;
run;
Output 23.1.1: Statistics and Tree Diagrams for
Six Different Clustering Methods
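To look at the cluster memberships behind the two-cluster and possible three-cluster solutions discussed above, the tree can be cut at a chosen number of clusters. The following statements are a minimal sketch, not part of the original example: the data set names avgtree and avgclus are hypothetical, and average linkage is used only as an illustration. The sketch assumes the mileages data set created by the DATA step above.

```sas
/* Sketch only: avgtree and avgclus are hypothetical data set names. */
/* Save the average-linkage tree without printing the history.       */
proc cluster data=mileages method=average outtree=avgtree noprint;
   id city;
run;

/* Cut the tree at three clusters and write the memberships to an    */
/* output data set instead of drawing the diagram.                   */
proc tree data=avgtree nclusters=3 out=avgclus noprint;
   id city;
run;

/* List each city with its assigned cluster number. */
proc sort data=avgclus;
   by cluster;
run;
proc print data=avgclus;
   var city cluster;
run;
```

Changing NCLUSTERS= to 2, or METHOD= to one of the other methods shown above, should make it possible to compare where Denver and Houston are placed under each solution.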
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.