The GMAP Procedure

Concepts

The GMAP procedure requires a map data set and a response data set. These two data sets must contain the required variables or the procedure stops with an error message. You can use the same data set as both the map data set and the response data set, as long as the requirements are met. If a different data set is used as the response data set, it must contain an ID variable that is identical to the ID variable in the map data set.


About Map Data Sets

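A map data set is a SAS data set that contains coordinates that define the boundaries of map areas, such as states or counties. A map data set must contain at least these variables: a numeric variable named X that contains the horizontal coordinates of the boundary points, a numeric variable named Y that contains the vertical coordinates of the boundary points, and one or more identification variables that uniquely identify the map areas.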

The X and Y variable values in the map data set do not have to be in any specific units because they are rescaled by the GMAP procedure based on the minimum and maximum values in the data set. The minimum X and Y values are in the lower-left corner of the map, and the maximum X and Y values are in the upper-right corner.

Map data sets in which the X and Y variables contain longitude and latitude should be projected before you use them with PROC GMAP. See The GPROJECT Procedure for details.

Optionally, the map data set also can contain a variable named SEGMENT to identify map areas that comprise noncontiguous polygons. Each unique value of the SEGMENT variable within a single map area defines a distinct polygon. If the SEGMENT variable is not present, each map area is drawn as a separate closed polygon that indicates a single segment.

The observations for each segment of a map area in the map data set must occur in the order in which the points are to be joined. The GMAP procedure forms map area outlines by connecting the boundary points of each segment in the order in which they appear in the data set, eventually joining the last point to the first point to complete the polygon.
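
As a minimal sketch of these requirements, the following DATA steps build a small map data set by hand and draw it. The names WORK.SQUARES, WORK.RESP, AREAID, and VALUE are invented for illustration. Area 2 consists of two noncontiguous polygons, so it uses two SEGMENT values, and the points of each segment are listed in the order in which they are to be joined.

```sas
/* Hypothetical map data set: two map areas, the second with two segments. */
data work.squares;
   input areaid segment x y;
   datalines;
1 1 0 0
1 1 0 1
1 1 1 1
1 1 1 0
2 1 2 0
2 1 2 1
2 1 3 1
2 1 3 0
2 2 4 0
2 2 4 1
2 2 5 1
2 2 5 0
;
run;

/* Hypothetical response data set: one observation per map area. */
data work.resp;
   input areaid value;
   datalines;
1 10
2 20
;
run;

/* Each polygon is closed by joining its last point to its first point. */
proc gmap map=work.squares data=work.resp;
   id areaid;
   choro value / discrete;
run;
quit;
```

The coordinates here are arbitrary units; as described above, the procedure rescales X and Y from the minimum and maximum values in the data set.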

Any variables in the map data set other than the ones mentioned above are ignored for the purpose of determining map boundaries.


About SAS/GRAPH Map Data Sets

In addition to the variables described in About Map Data Sets, the SAS/GRAPH map data sets may also contain the variables LONG and LAT, which hold the unprojected longitude and latitude of the boundary points.

The GMAP procedure uses the values of the X and Y variables to draw the map. Therefore, if you want to produce an unprojected map by using the values in LONG and LAT, you would have to rename LONG and LAT to X and Y first.

SAS/GRAPH includes a number of predefined map data sets. These data sets are described in SAS/GRAPH Map Data Sets.

Map Data Sets Containing X, Y, LONG, and LAT

Most Institute-supplied map data sets contain four coordinate variables (X, Y, LONG, and LAT). In this case, X and Y always contain projected values, and the SAS/GRAPH procedures use them by default. To use the unprojected values in the LONG and LAT variables instead, rename LONG and LAT to X and Y, because the GMAP procedure automatically uses X and Y. See Input Map Data Sets that Contain Both Projected and Unprojected Values for more details.
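
The renaming described above can be sketched in a single DATA step. This example assumes that MAPS.STATES in your release contains all four coordinate variables; the output name WORK.STATES_UNPROJ is invented for illustration.

```sas
/* DROP= is applied before RENAME=, so the projected X and Y are     */
/* discarded first and LONG and LAT then take over the names X and Y. */
/* Assumption: MAPS.STATES contains X, Y, LONG, and LAT.              */
data work.states_unproj;
   set maps.states(drop=x y rename=(long=x lat=y));
run;
```

The resulting data set can then be named in the MAP= option of PROC GMAP or passed to PROC GPROJECT.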

Map Data Sets Containing Only X and Y

The Institute-supplied map data sets that contain X and Y variables (and no LONG and LAT variables) are usually projected maps. However, a few map data sets for the US and Canada contain X and Y values that are unprojected longitude and latitude. In this case, use the GPROJECT procedure to project the map (see The GPROJECT Procedure).
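
As a rough sketch of projecting such a map, the following example builds a tiny unprojected data set and runs it through PROC GPROJECT. The data set names, the AREAID variable, and the coordinate values are all invented for illustration, and the example assumes that the coordinates are in radians, which is what PROC GPROJECT expects unless you tell it otherwise.

```sas
/* Hypothetical unprojected map: X is longitude and Y is latitude, in radians. */
data work.raw;
   input areaid x y;
   datalines;
1 1.40 0.60
1 1.40 0.65
1 1.35 0.65
1 1.35 0.60
;
run;

/* Project the coordinates; the output data set contains projected X and Y. */
proc gproject data=work.raw out=work.projected;
   id areaid;
run;
```

WORK.PROJECTED can then be used as the map data set in PROC GMAP.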

Note: You can determine whether a SAS map data set is projected or unprojected by looking at the description of each variable that is displayed when you use the CONTENTS procedure or by browsing the MAPS.METAMAPS data set.
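
For example, the two checks mentioned in the note might look like this, using MAPS.US as the data set to inspect:

```sas
/* List the variables in a map data set along with their descriptions. */
proc contents data=maps.us;
run;

/* Browse the descriptions of the supplied map data sets. */
proc print data=maps.metamaps;
run;
```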

Specialty Map Data Sets

There are several map data sets available with SAS/GRAPH that allow you to easily label maps:

MAPS.USCENTER
contains the X and Y coordinates of the visual center of each state in the U.S. and Washington, D.C., as well as points in the ocean for states that are too small to contain a label. You can use MAPS.USCENTER with the MAPS.US, MAPS.USCOUNTY, MAPS.COUNTIES, and MAPS.COUNTY data sets.

MAPS.USCITY
contains the X and Y coordinates of selected cities in the U.S. Many city names occur in more than one state, so you may have to subset by state to avoid duplication. You can use MAPS.USCITY with the MAPS.US, MAPS.USCOUNTY, MAPS.COUNTIES, and MAPS.COUNTY data sets.

MAPS.CANCENS
contains the names of the Canadian census divisions. You can use MAPS.CANCENS with the MAPS.CANADA and MAPS.CANADA3 data sets.

See the MAPS.METAMAPS data set for details on each of the Institute-supplied map data sets.
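
Subsetting MAPS.USCITY by state, as suggested above, can be sketched as follows. The example assumes that the STATE variable in MAPS.USCITY holds the numeric FIPS state code (37 is North Carolina); the output name WORK.NC_CITIES is invented for illustration.

```sas
/* Keep only the cities in one state so that duplicate city names */
/* from other states are excluded.                                */
/* Assumption: STATE is the numeric FIPS state code.              */
data work.nc_cities;
   set maps.uscity;
   where state = 37;
run;
```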


About Response Data Sets

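A response data set is a SAS data set that contains one or more response variables, whose values are associated with map areas, and the identification variables that also appear in the map data set.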

The response data set can contain other variables in addition to these required variables.

The values of the map area identification variables in the response data set determine the map areas to be included on the map unless you use the ALL option in the PROC GMAP statement. That is, unless you use ALL in the PROC GMAP statement, only the map areas with response values are shown on the map. As a result, you do not need to subset your map data set if you are mapping only a small section of the map. However, if you map the same small section frequently, create a subset of the map data set for efficiency.
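
The effect of the ALL option can be sketched with a response data set that covers only three states. The data set WORK.SALES and its SALES values are invented for illustration; the STATE values are the FIPS codes for North Carolina, South Carolina, and Virginia, which is how MAPS.US identifies states.

```sas
/* Hypothetical response data set for three states. */
data work.sales;
   input state sales;
   datalines;
37 120
45 80
51 150
;
run;

/* Without ALL: only the three states in WORK.SALES are drawn. */
proc gmap map=maps.us data=work.sales;
   id state;
   choro sales;
run;
quit;

/* With ALL: every state in MAPS.US is drawn, and states that */
/* are not in WORK.SALES are left empty.                      */
proc gmap map=maps.us data=work.sales all;
   id state;
   choro sales;
run;
quit;
```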

For choropleth, block, and prism maps, the response variables can be either character or numeric. For surface maps, the response variables must be numeric with only positive values.

About Response Variables

The GMAP procedure can produce block, choropleth, and prism maps for both numeric and character response variables. Numeric variables fall into two categories: discrete and continuous.

Numeric response variables are always treated as continuous variables unless the DISCRETE option is used in the action statement.
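
The difference between the default treatment and the DISCRETE option can be sketched as follows. WORK.RATINGS and its RATING values are invented for illustration; the STATE values are FIPS codes.

```sas
/* Hypothetical response data set with a small set of numeric codes. */
data work.ratings;
   input state rating;
   datalines;
13 1
37 1
45 2
47 2
51 3
;
run;

/* Default: RATING is treated as a continuous variable. */
proc gmap map=maps.us data=work.ratings;
   id state;
   choro rating;
run;
quit;

/* DISCRETE: each distinct value of RATING is its own response level. */
proc gmap map=maps.us data=work.ratings;
   id state;
   choro rating / discrete;
run;
quit;
```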

About Response Levels

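Response levels are the values that identify categories of data on the graph. The categories that are shown on the graph are based on the values of the response variable. Based on the type of the response variable, a response level can represent a range of numeric values (for a continuous numeric variable), a single numeric value (for a discrete numeric variable), or a character value (for a character variable).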

The BLOCK, CHORO, and PRISM statements assign patterns to response levels. In CHORO and PRISM maps, response levels are shown as map areas. However, in BLOCK maps, response levels are shown as blocks. The default fill pattern for the response level is solid.

PATTERN statements can define the fill patterns and colors for both blocks and map areas. PATTERN definitions that define valid block patterns are applied to the blocks (response levels), and PATTERN definitions that define valid map patterns are applied to map areas.

See PATTERN Statement for more information on fill pattern values and default pattern rotation.
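
As a small sketch of overriding the default patterns on a choropleth map, the following example defines three PATTERN statements, one for each response level. The data set WORK.RATINGS and the choice of colors are invented for illustration; VALUE=MSOLID requests a solid map pattern.

```sas
/* Hypothetical response data set with three distinct response levels. */
data work.ratings;
   input state rating;
   datalines;
37 1
45 2
51 3
;
run;

/* One map pattern per response level, assigned in order. */
pattern1 value=msolid color=blue;
pattern2 value=msolid color=green;
pattern3 value=msolid color=red;

proc gmap map=maps.us data=work.ratings;
   id state;
   choro rating / discrete;
run;
quit;
```

PATTERN statements are global, so these definitions stay in effect for later procedures until they are redefined or canceled.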


About Identification Variables

Identification (ID) variables are common to both the map data set and the response data set. They identify the map areas (for example, counties, states, or provinces) that make up the map. A unit area or map area is a group of observations with the same ID value. The GMAP procedure matches the value of the response variables for each map area in the response data set to the corresponding map area in the map data set to create the output graphs.


Displaying Map Areas and Response Data

Whether the GMAP procedure draws a map area and whether it displays patterns for response values depend on the contents of the response data set and on the ALL and MISSING options. The table Displaying Map Areas and Response Data describes the conditions under which the procedure does or does not display map areas and response data.

Displaying Map Areas and Response Data

| If the response data set... | And if... | Then the procedure... |
|---|---|---|
| includes the map area | the map area has a response value | draws the map area and displays the response data |
| includes the map area | the map area has no response value (that is, the value is missing) | draws the map area but leaves it empty |
| includes the map area | the map area has no response value and the MISSING option is used in the map statement | draws the map area and displays a response level for the missing value |
| does not include the map area | the ALL option is used in the PROC GMAP statement | draws the map area but leaves it empty |
| does not include the map area | the ALL option is not used | does not draw the map area |
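
The following sketch exercises several of these conditions at once. The data set WORK.SALES and its values are invented for illustration: South Carolina (FIPS code 45) is in the response data set but has a missing response value, and every other state is absent from the response data set.

```sas
/* Hypothetical response data set; state 45 has a missing response value. */
data work.sales;
   input state sales;
   datalines;
37 120
45 .
51 150
;
run;

/* ALL draws the states that are not in WORK.SALES and leaves them empty. */
/* MISSING gives the missing value for state 45 its own response level.   */
proc gmap map=maps.us data=work.sales all;
   id state;
   choro sales / missing;
run;
quit;
```

Removing the MISSING option leaves state 45 drawn but empty, and removing ALL restricts the map to the three states in WORK.SALES.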


Summary of Use

To use the GMAP procedure, you must do the following:

  1. If necessary, issue a LIBNAME statement for the SAS data library that contains the map data set that you want to display.

  2. Determine what processing needs to be done to the map data set before it is displayed. Use the GPROJECT, GREDUCE, and GREMOVE procedures or a DATA step to perform the necessary processing.

  3. Issue a LIBNAME statement for the SAS data library that contains the response data set, or use a DATA step to create a response data set.

  4. Use the PROC GMAP statement to identify the map and response data sets.

  5. Use the ID statement to name the identification variable(s).

  6. Use a BLOCK, CHORO, PRISM, or SURFACE statement to identify the response variable and generate the map.
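
The steps above can be sketched end to end as follows. The response data set WORK.SALES, its values, and the DENSITY cutoff are invented for illustration, and the LIBNAME path is a placeholder, so that statement is left commented out.

```sas
/* Step 1: assign a libref only if the map library is not already available. */
/* libname maps 'path-to-map-library'; */

/* Step 2: preprocess the map data set. PROC GREDUCE adds a DENSITY */
/* variable that can be used to thin out boundary points.           */
proc greduce data=maps.us out=work.us_reduced;
   id state;
run;

/* Step 3: create a hypothetical response data set. */
data work.sales;
   input state sales;
   datalines;
37 120
45 80
51 150
;
run;

/* Steps 4 through 6: name the map and response data sets, identify */
/* the ID variable, and request a block map of the response variable. */
proc gmap map=work.us_reduced(where=(density <= 3)) data=work.sales;
   id state;
   block sales;
run;
quit;
```

MAPS.US is already projected, so this sketch skips PROC GPROJECT; an unprojected map would need that step before PROC GMAP.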



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.