The UNIVARIATE Procedure

Example 7: Creating Schematic Plots and an Output Data Set with BY Groups


Procedure features:
   PROC UNIVARIATE statement options:
      NEXTROBS=
      PLOT
      PLOTSIZE=
   BY statement
   OUTPUT statement
Other features:
   FORMAT statement
   FORMAT procedure
   PRINT procedure
   SORT procedure
Data set: STATEPOP

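This example creates side-by-side schematic (box) plots for the BY groups and an output data set that contains statistics for each BY group.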


Program

options nodate pageno=1 linesize=120 pagesize=80;
/* Create the Regnfmt. format, which labels the numeric Region codes. */
proc format;
   value Regnfmt 1='Northeast'
                 2='South'
                 3='Midwest'
                 4='West';
run;
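/* Write two observations for each observation in STATEPOP, one per census
   year. PopulationCount is the sum of that year's CityPop_ and NonCityPop_
   variables. */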
data metropop;
   set statepop;
   keep Region Decade Populationcount;
   label PopulationCount='US Census Population (millions)'
         Decade='Census year';
   decade=1980;
   populationcount=sum(citypop_80,noncitypop_80);
   output;
   decade=1990;
   populationcount=sum(citypop_90,noncitypop_90);
   output;
run;
/* Sort by the BY variables so that PROC UNIVARIATE can process BY groups. */
proc sort data=metropop;
   by region decade;
run;
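/* NEXTROBS=0 suppresses the table of extreme observations. PLOTS requests
   plots, and PLOTSIZE=20 sets the approximate number of rows in each plot. */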
proc univariate data=metropop nextrobs=0
                plots plotsize=20 ;
   /* Analyze the PopulationCount variable. */
   var populationcount;
   /* Produce a separate analysis for each Region and Decade combination. */
   by region decade;
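   /* Save the sum, mean, standard deviation, and the 50th, 75th, and 100th
      percentiles (named with the Pop_ prefix) for each BY group in CENSTAT. */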
   output out=censtat sum=PopulationTotal mean=PopulationMean
          std=PopulationStdDeviation pctlpts=50 to 100 by 25 
          pctlpre=Pop_ ;
   /* Display Region values with the Regnfmt. format and set the title. */
   format region regnfmt.;
   title 'United States Census of Population and Housing';
run;
/* Print the output data set. */
proc print data=censtat;
   title1 'Statistics for Census Data by Decade and Region';
   title2 'Output Dataset From PROC UNIVARIATE';
run;


Output
The BY statement requests a separate report for each BY group. The first report contains univariate statistics for the Northeast region in the 1980 Census. Using both the BY statement and the PLOTS option in the PROC UNIVARIATE statement also produces side-by-side schematic (box) plots on the last page of the output, which let you compare the distribution of the data for each region-year combination.
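
Because each report corresponds to one Region-Decade combination, it can help to see which combinations exist before reading the output. The following is a minimal sketch, not part of the original example. It assumes that the METROPOP data set and the Regnfmt. format from the program above are available, and it uses PROC FREQ only as a convenient way to list the BY groups.

```sas
/* Sketch (assumes METROPOP and the Regnfmt. format already exist):  */
/* list each Region-Decade combination and its observation count.   */
proc freq data=metropop;
   tables region*decade / list nocum nopercent;
   format region regnfmt.;
   title 'Observations per BY Group';
run;
```

Each row of the resulting list should correspond to one report from PROC UNIVARIATE and to one box in the side-by-side schematic plots.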

The CENSTAT data set includes the BY variables Region and Decade and contains eight observations, one for each BY group.
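
The OUTPUT statement names each percentile variable by appending the PCTLPTS= value to the PCTLPRE= prefix, so CENSTAT should contain Pop_50, Pop_75, and Pop_100 in addition to the sum, mean, and standard deviation variables. The following sketch, which is not part of the original example, shows one way to check the variable names and print the statistics with explicit formats. The 8.2 format and the title text are illustrative choices only.

```sas
/* Sketch (assumes CENSTAT was created by the program above):       */
/* list the variables in creation order to confirm their names.     */
proc contents data=censtat varnum;
run;

/* Print the BY variables and the saved statistics. The 8.2 format  */
/* is an illustrative choice, not taken from the original example.  */
proc print data=censtat noobs;
   var Region Decade PopulationTotal PopulationMean
       PopulationStdDeviation Pop_50 Pop_75 Pop_100;
   format Region regnfmt.
          PopulationTotal PopulationMean PopulationStdDeviation
          Pop_50 Pop_75 Pop_100 8.2;
   title 'Selected Statistics from CENSTAT';
run;
```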



Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.