The FASTCLUS Procedure

Computational Resources

Let
n = number of observations
v = number of variables
c = number of clusters
p = number of passes over the data set

Memory

The memory required is approximately 4(19v + 12cv + 10c + 2 max(c + 1, v)) bytes.

If you request the DISTANCE option, an additional 4c(c + 1) bytes of space is needed.
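As a quick way to apply the two memory formulas, the following Python sketch evaluates them for given v and c. The function name fastclus_memory_bytes and its distance flag are hypothetical helpers for illustration and are not part of SAS; the result is only the approximation stated above.

```python
def fastclus_memory_bytes(v, c, distance=False):
    """Approximate memory (bytes) for v variables and c clusters.

    Hypothetical helper: it only evaluates the documented approximation
    4(19v + 12cv + 10c + 2 max(c + 1, v)), plus 4c(c + 1) when the
    DISTANCE option is requested.
    """
    base = 4 * (19 * v + 12 * c * v + 10 * c + 2 * max(c + 1, v))
    extra = 4 * c * (c + 1) if distance else 0
    return base + extra


if __name__ == "__main__":
    # 10 variables, 5 clusters, without and with the DISTANCE option
    print(fastclus_memory_bytes(10, 5))
    print(fastclus_memory_bytes(10, 5, distance=True))
```

Note that n does not appear: under this approximation the memory requirement does not grow with the number of observations.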

Time

The overall time required by PROC FASTCLUS is roughly proportional to nvcp if c is small with respect to n.

Initial seed selection requires one pass over the data set. If the observations are in random order, the time required is roughly proportional to

nvc + vc²
unless you specify REPLACE=NONE. In that case, a complete pass may not be necessary, and the time is roughly proportional to mvc, where c ≤ m ≤ n.

The DRIFT option, each iteration, and the final assignment of cluster seeds each require one pass, with time for each pass roughly proportional to nvc.
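The pass-by-pass estimates above can be combined into a rough, unitless cost figure. The Python sketch below is illustrative only: the function name and parameters are hypothetical, the seed-selection term is read as nvc plus v times c squared, and the proportionality constants are unknown, so only ratios between two configurations are meaningful.

```python
def fastclus_relative_cost(n, v, c, iterations=1, drift=False, m=None):
    """Unitless time estimate built from the documented proportionalities.

    Hypothetical helper. Pass m (with c <= m <= n) to model seed selection
    under REPLACE=NONE, where a complete pass may not be needed.
    """
    per_pass = n * v * c
    if m is None:
        # Seed selection with observations in random order
        seed = n * v * c + v * c ** 2
    else:
        if not c <= m <= n:
            raise ValueError("m must satisfy c <= m <= n")
        seed = m * v * c
    # One pass per iteration, one for the final assignment,
    # and one more if the DRIFT option is used
    passes = iterations + 1 + (1 if drift else 0)
    return seed + passes * per_pass


if __name__ == "__main__":
    plain = fastclus_relative_cost(1000, 10, 5)
    with_drift = fastclus_relative_cost(1000, 10, 5, drift=True)
    print(with_drift / plain)  # relative slowdown from adding DRIFT
```

Because every term scales with v and c, doubling either the number of variables or the number of clusters roughly doubles the estimate when c is small relative to n.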

For greatest efficiency, you should list the variables in the VAR statement in order of decreasing variance.
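One way to obtain that ordering is to compute each variable's sample variance outside SAS and then write the VAR statement accordingly. The Python sketch below assumes the data are available as a mapping from variable name to a list of values; the function name var_statement and the sample variable names are hypothetical.

```python
from statistics import variance


def var_statement(columns):
    """Build a VAR statement listing variables by decreasing sample variance.

    Hypothetical helper: columns maps each variable name to a list of at
    least two numeric values.
    """
    ordered = sorted(columns, key=lambda name: variance(columns[name]),
                     reverse=True)
    return "var " + " ".join(ordered) + ";"


if __name__ == "__main__":
    data = {
        "height": [1, 2, 3],
        "income": [10, 20, 30],
        "age": [1, 1, 2],
    }
    print(var_statement(data))
```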


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.