The G3D Procedure

SCATTER Statement


Creates three-dimensional scatter plots using values of three numeric variables from the input data set.

Requirements: Exactly one plot request is required.
Global statements: FOOTNOTE, TITLE
Alias: SCAT



Description

The SCATTER statement specifies one plot request that identifies the three numeric variables to plot. This statement automatically

You can use statement options to modify any of the three plot axes as well as the general appearance of the graph, control the viewing angle, and specify characteristics for reference lines. In addition, if the needles drawn from the data points to the base plane complicate a graph, you can suppress them.

You can use global statements to add text to the graph, and an Annotate data set to enhance the plot.

Syntax

SCATTER plot-request </ option(s)>;

plot-request must be

y*x=z

option(s) can be one or more options from any or all of the following categories:


Required Arguments

y*x=z
specifies three numeric variables from the input data set:

y
is one of the variables that is plotted on the horizontal (x-y) plane.

x
is another of the variables that is plotted on the horizontal (x-y) plane.

z
is the variable that is plotted on the vertical (z) axis.

The SCATTER statement does not require a full grid of observations for the horizontal variables (x and y).
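
As a minimal sketch of a complete plot request, the following step plots a small data set. The data set name WORK.READINGS and its values are hypothetical and chosen only for illustration; the observations deliberately do not form a full grid of x and y values.

```sas
/* Hypothetical sample data: three numeric variables, not a full x-y grid */
data work.readings;
   input x y z;
   datalines;
1 1 2.5
1 3 4.0
2 2 3.1
3 1 1.8
4 3 5.2
;
run;

proc g3d data=work.readings;
   /* y and x lie on the horizontal plane; z is plotted on the vertical axis */
   scatter y*x=z;
run;
quit;
```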


Options

Options in a SCATTER statement affect all graphs that are produced by that statement. You can specify as many options as you want and list them in any order.

ANNOTATE=Annotate-data-set
ANNO=Annotate-data-set
specifies a data set to annotate plots that are produced by the SCATTER statement.
See also: The Annotate Data Set

CAXIS=axis-color
specifies a color for axis lines and tick marks. By default, axes display in the second color in the colors list.
Featured in: Rotating a Scatter Plot

COLOR='data-point-color' | data-point-color-variable
specifies a color name or a character variable in the input data set whose values are color names. These color values determine the color or colors of the shapes that represent a plot's data points. Color values must be valid color names for the device that is used. By default, plot shapes display in the third color in the current colors list.

If you specify COLOR='data-point-color', all shapes are drawn in that color. For example, the procedure uses BLUE for all graph shapes when you specify

color='blue'

If you specify COLOR=data-point-color-variable, the color of the symbol is determined by the value of the color variable for that observation. For example, the procedure uses the value of the variable CLASS as the color for each data point shape when you specify

color=class

Using COLOR=data-point-color-variable enables you to assign different colors to the shapes to classify data.
Featured in: Using Shapes in Scatter Plots

CTEXT=text-color
specifies a color for all text on the axes, including tick mark values and axis labels. If you omit this option, a color specification is searched for in this order:
  1. the CTEXT= option in a GOPTIONS statement

  2. the default, the first color in the colors list.

DESCRIPTION='entry-description'
DES='entry-description'
specifies the description of the catalog entry for the chart. The maximum length for entry-description is 40 characters. The description does not appear on the chart. By default, the procedure assigns a description of the form SCATTER OF y*x=z, where y*x=z is the request that is specified in the SCATTER statement.

GRID
draws reference lines at the major tick marks on all axes.
Featured in: Using Shapes in Scatter Plots

NAME='entry-name'
specifies the name of the catalog entry for the graph. The maximum length for entry-name is eight characters. The default name is G3D. If the specified name duplicates the name of an existing entry, SAS/GRAPH software adds a number to the duplicate name to create a unique entry, for example, G3D1.

NOAXIS
NOAXES
specifies that a plot have no axes, axis labels, or tick mark values.

NOLABEL
specifies that a plot have no axis labels or tick mark values. Use this option if you want to generate axis labels and tick mark values with an Annotate data set.

NONEEDLE
specifies that a plot have no lines that connect the shapes representing data points to the x-y plane. The NONEEDLE option has no effect when SHAPE='PILLAR' or SHAPE='PRISM'.
Featured in: Using Shapes in Scatter Plots

ROTATE=angle-list
specifies one or more angles at which to rotate the x-y plane about the perpendicular z axis. The units for angle-list are degrees. By default, ROTATE=70. Angle-list is either an explicit list of values, or a starting and an ending value with an interval increment, or a combination of both forms:
n <...n>
n TO n <BY increment>
n <...n> TO n <BY increment > <n <...n> >

The values specified in angle-list can be negative or positive and can be larger than 360°. For example, a rotation angle of 45° can also be expressed as

rotate=405
rotate=-315

You can specify a sequence of angles to produce separate graphs for each angle. The angles that are specified in the ROTATE= option are paired with any angles that are specified with the TILT= option. If one option contains fewer values than the other, the last value in the shorter list is paired with the remaining values in the longer list.
See also: TILT= option.
Featured in: Rotating a Scatter Plot
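
The pairing rule for ROTATE= and TILT= can be sketched as follows. The data set WORK.PAIRDEMO is hypothetical. ROTATE= expands to three angles and TILT= supplies only two, so, by the rule described above, the last tilt value should be reused for the third graph.

```sas
/* Hypothetical sample data */
data work.pairdemo;
   input x y z;
   datalines;
1 1 1
1 2 3
2 1 2
2 2 4
3 3 5
;
run;

proc g3d data=work.pairdemo;
   /* rotate = 0, 45, 90 and tilt = 80, 60                  */
   /* expected pairs (rotate, tilt): (0,80) (45,60) (90,60) */
   scatter y*x=z / rotate=0 to 90 by 45
                   tilt=80 60;
run;
quit;
```

Each pair produces a separate graph, so this request should yield three graphs.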

SHAPE='symbol-name' | shape-variable
specifies a symbol name or a character variable whose values are symbol names. Symbols represent a scatter plot's data points. By default, SHAPE='PYRAMID'.

Values for symbol-name are
BALLOON    DIAMOND    PRISM
CLUB       FLAG       PYRAMID
CROSS      HEART      SPADE
CUBE       PILLAR     SQUARE
CYLINDER   POINT      STAR

Scatter Plot Symbols illustrates these symbol types with needles.

Scatter Plot Symbols

[IMAGE]

If you specify SHAPE='symbol-name', all data points are drawn in that shape. For example, the procedure draws all data points as balloons when you specify

shape='balloon'

If you specify SHAPE=shape-variable, the shape of the data point is determined by the value of the shape variable for that observation. For example, the procedure uses the value of the variable CLASS for a particular observation as the shape for that data point when you specify

shape=class

Using SHAPE=shape-variable enables you to assign different shapes to the data points to classify data.
Featured in: Using Shapes in Scatter Plots

SIZE=symbol-size | size-variable
specifies either a constant or a numeric variable, the values of which determine the size of symbol shapes on the scatter plot.

If you specify SIZE=symbol-size, all data points are drawn in that size, expressed as a multiple of the default symbol size. For example, if you specify SIZE=3, the procedure draws all symbol shapes three times the normal size. By default, SIZE=1.0.

If you specify SIZE=size-variable, the size of the data point is determined by the value of the size variable for that observation. For example, when you specify SIZE=CLASS, the procedure uses the value of the variable CLASS for each observation as the size of that data point. If you use SIZE=size-variable, you can assign different sizes to the data points to classify data.
Featured in: Rotating a Scatter Plot

TILT=angle-list
specifies one or more angles at which to tilt the graph toward you. The units for angle-list are degrees. By default, TILT=70. Angle-list is either an explicit list of values, or a starting and an ending value with an interval increment, or a combination of both forms:
n <...n>
n TO n <BY increment>
n <...n> TO n <BY increment > <n <...n> >

The values that are specified in angle-list must be 0 through 90.

You can specify a sequence of angles to produce separate graphs for each angle. The angles that are specified in the TILT= option are paired with any angles that are specified with the ROTATE= option. If one option contains fewer values than the other, the last value in the shorter list is paired with the remaining values in the longer list.
See also: ROTATE= option

XTICKNUM=number-of-ticks
YTICKNUM=number-of-ticks
ZTICKNUM=number-of-ticks
specify the number of major tick marks that are located on a plot's x, y, or z axis, respectively. The value for number-of-ticks must be 2 or greater. By default, XTICKNUM=4, YTICKNUM=4, and ZTICKNUM=4.
Featured in: Rotating a Scatter Plot

ZMAX=max-value
ZMIN=min-value
specify the maximum and minimum values that are displayed on a plot's z axis. By default, the z axis is defined by the minimum and maximum z values in the data. You can use the ZMIN= and ZMAX= options to extend the z axis beyond this range. The value that is specified by ZMAX= must be greater than that specified by ZMIN=. If you specify a ZMAX= or ZMIN= value within the actual range of the z variable values, the plot's data values are clipped at the specified level.
Featured in: Rotating a Scatter Plot
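
As an illustration of the axis options, the following sketch extends the z axis beyond the range of the data and changes the number of major tick marks. The data set WORK.ZDEMO and the chosen limits are hypothetical; the z values run from 2 to 8, so ZMIN=0 and ZMAX=10 lie outside the data range and no values should be clipped.

```sas
/* Hypothetical sample data with z values between 2 and 8 */
data work.zdemo;
   input x y z;
   datalines;
1 1 2
2 1 4
3 2 5
4 2 7
5 3 8
;
run;

proc g3d data=work.zdemo;
   scatter y*x=z / zmin=0 zmax=10   /* extend the z axis past the data range */
                   zticknum=6       /* six major tick marks on the z axis    */
                   xticknum=5
                   yticknum=3
                   grid;            /* reference lines at the major ticks    */
run;
quit;
```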


Changing the Appearance of the Points

Use the COLOR=, SHAPE=, and SIZE= options to change the appearance of your scatter plot or to classify data using color, shape, size, or any combination of these features. Scatter Plot Symbols illustrates the shape names that you can specify in the SHAPE= option.

For example, to make all of the data points red balloons at twice the normal size, use

scatter y*x=z / color='red' shape='balloon' size=2;

To size your points according to the values of the variable TYPE in your input data set, use

scatter y*x=z / size=type;

For an example, see Using Shapes in Scatter Plots.
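
The three options can also be combined so that each observation carries its own appearance. The following is a sketch only: the data set WORK.CLASSED and the variables PTSHAPE, PTCOLOR, and PTSIZE are hypothetical names, and the color names are assumed to be valid for the device in use.

```sas
/* Hypothetical data: each observation names its own shape, color, and size */
data work.classed;
   length ptshape ptcolor $ 8;
   input x y z ptshape $ ptcolor $ ptsize;
   datalines;
1 1 3 CUBE    BLUE  1
2 1 4 CUBE    BLUE  1.5
1 2 5 BALLOON RED   1
2 2 6 BALLOON RED   2
3 3 7 STAR    GREEN 1
;
run;

proc g3d data=work.classed;
   scatter y*x=z / shape=ptshape   /* character variable of symbol names */
                   color=ptcolor   /* character variable of color names  */
                   size=ptsize;    /* numeric variable of relative sizes */
run;
quit;
```

SHAPE= and COLOR= take character variables, and SIZE= takes a numeric variable, which is why the DATA step reads PTSIZE without a $ modifier.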


Simulating an Overlaid Scatter Plot

You can approximate an overlaid scatter plot by plotting more than one value of the vertical (z) variable at what appears to be a single (x, y) position in a single scatter plot. To do this, add a small value to one of the horizontal variables (x or y) so that each observation has a slightly different (x, y) position. This enables the procedure to plot every value of the vertical (z) variable. Represent each set of z values with a different symbol, size, or color. The resulting plot appears to be multiple plots overlaid on the same axes.

For example, suppose you want to graph a data set that contains two values for the vertical variable Z for each combination of variables X and Y. You could produce the original data set with a DATA step like this:

data planes;
   input x y z shape $;
   datalines;
1 1 1 PRISM
1 2 1 PRISM
1 3 1 PRISM
2 1 1 PRISM
2 2 1 PRISM
2 3 1 PRISM
3 1 1 PRISM
3 2 1 PRISM
3 3 1 PRISM
1 1 2 BALLOON
1 2 2 BALLOON
1 3 2 BALLOON
2 1 2 BALLOON
2 2 2 BALLOON
2 3 2 BALLOON
3 1 2 BALLOON
3 2 2 BALLOON
3 3 2 BALLOON
;

The SHAPE variable is assigned a different value for each different Z value for a single combination of X and Y values.

Ordinarily, the SCATTER statement plots only the Z value for the last observation for a single combination of X and Y. However, you can use a DATA step to assign a slightly different (x, y) position to all observations where Z is greater than 1:

data planes2;
   set planes;
   if z > 1 then x = x + .000001;
run;

Then you can use a SCATTER statement to produce a plot like the one in Simulated Overlaid Scatter Plot:

proc g3d data=planes2;
   scatter x*y=z / zmin=0 shape=shape;
run;
quit;
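
The same technique can presumably be extended to more than two layers by scaling the offset with a layer index. The following sketch assumes a hypothetical data set WORK.LAYERS with a numeric LAYER variable (1, 2, 3) and one symbol name per layer; neither the data set nor the variable comes from the example above.

```sas
/* Hypothetical data: three layers over the same 3-by-3 grid */
data work.layers;
   length shape $ 8;
   do layer = 1 to 3;
      do x = 1 to 3;
         do y = 1 to 3;
            z = layer;
            /* pick the symbol name that corresponds to the layer */
            shape = scan('PRISM BALLOON CUBE', layer);
            output;
         end;
      end;
   end;
run;

/* Give each layer its own tiny x offset so no (x, y) pair repeats */
data work.layers_offset;
   set work.layers;
   x = x + (layer - 1) * 1e-6;
run;

proc g3d data=work.layers_offset;
   scatter x*y=z / zmin=0 shape=shape;
run;
quit;
```

Layer 1 keeps its original x values, and layers 2 and 3 are shifted by 0.000001 and 0.000002, which should be too small to see at the scale of the plot.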

Simulated Overlaid Scatter Plot

[IMAGE]


Reversing Values on an Axis

Although you can use the SCATTER statement's ROTATE= option to alter the view of a plot, and therefore the general orientation of the axis values, you cannot use SCATTER statement options to reverse axis values for one of the plot variables. To do this, you can multiply that variable's values by -1 to reverse the values themselves, which has the result of reversing the axis when those values are used to generate a plot. You should then use PROC FORMAT to define a format that displays the variable's values as they exist in the original data.

For example, the following code generates the scatter plot shown in Default Y-axis Order:

data original;
   input y x z;
   datalines;
-1.15 1 .01
-1.00 2 .02
 1.20 3 .03
 1.25 4 .04
 1.50 5 .05
 2.10 1 .06
 2.15 2 .07
 2.20 3 .08
 2.25 4 .09
 2.30 5 .10
;

title1 'Default Y Axis Order';

/* default Y axis order */
proc g3d data=original;
   scatter y * x = z;
run;

Default Y-axis Order

[IMAGE]

To reverse the Y axis in the plot that is shown in Default Y-axis Order, you can write a DATA step like the following to reverse the Y values and, therefore, reverse the Y axis when the values are plotted:

data minus_y;
   set original;
   y=-y;
run;

The previous code creates the MINUS_Y data set by reading the ORIGINAL data set, and then multiplying the values of variable Y by -1. Although plotting Y values from the MINUS_Y data set would reverse values on the Y axis, it would misrepresent the original data. Such a plot would label the axis with the negative-Y values. You can correct the problem by using PROC FORMAT to display Y values as they are stored in the ORIGINAL data set:

proc format;
  picture reverse
     low - < 0  = '09.00'
     0 < - high = '09.00' (prefix='-')
     0          = '09.00';
run;

Here, the PICTURE statement defines a picture format named REVERSE, which you can refer to in DATA and PROC steps by using the name followed by a period. A picture format is a template for printing numbers. The '09.00' specifications are digit selectors that indicate which digits or columns in the variable values will display in output; columns that do not have a specified digit selector will not be displayed in output. Thus, a picture format for displaying the values of variable Y needs a column for a minus sign, a column for units, and two columns for decimals. The digit selector 0 specifies that no leading zeros will display in a column, and the digit selector 9 specifies that a leading zero will display in a column.

The PICTURE statement defines this new picture format for three data ranges. The lowest value in the data up to but not including zero will display with no prefix, which means negative values will display without a minus sign. All values above (but not including) zero to the highest value in the data will be displayed with the specified prefix, which in this case is a minus sign. Because zero is excluded from both ranges, it is assigned its own picture with no prefix.

You can now assign the REVERSE format to the Y values from the MINUS_Y data set and use Y to generate a scatter plot. The resulting plot displays Y's negative values without a prefix, and its positive values display with a minus sign prefix. This effectively represents Y values as they are stored internally in the ORIGINAL data set, thus correcting the data misrepresentation that results from multiplying Y by -1.
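
Before plotting, it can be useful to check how a picture format renders a few values. The following sketch repeats the REVERSE picture (written here with the compact -< and <- range operators) so that it runs on its own, then writes some hypothetical test values and their formatted text to the SAS log. The variable SHOWN and the test values are illustrative only.

```sas
proc format;
   picture reverse
      low -< 0  = '09.00'
      0 <- high = '09.00' (prefix='-')
      0         = '09.00';
run;

/* Write each test value and its formatted form to the log */
data _null_;
   do y = -2.30, -1.20, 0, 1.00, 1.15;
      shown = put(y, reverse.);
      put y= shown=;
   end;
run;
```

Negative values of Y should appear without a sign and positive values with a minus sign. Picture formats truncate rather than round unless the ROUND option is specified, so check the last displayed digit for values that cannot be represented exactly.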

The following code generates the scatter plot shown in Reverse Y-axis Order:

title1 'Reverse Y Axis Order';

/* reverses order of default Y axis */
proc g3d data=minus_y;
   format y reverse.;
   scatter y * x = z;
run;
quit;

Reverse Y-axis Order

[IMAGE]


Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.