Chapter Contents

Previous

Next
SAS/ACCESS Interface to CA-DATACOM/DB: Reference

Calculating Statistics

You can also use statistical procedures with CA-DATACOM/DB data. This section shows simple examples using the FREQ and MEANS procedures.


Using the FREQ Procedure

Suppose you want to find what percentage of your invoices went to each country so that you can decide where to increase your overseas marketing. The following example calculates the percentage of invoices for each country appearing in the CA-DATACOM/DB table INVOICE using the view descriptor VLIB.INV:

proc freq data=vlib.inv;
   tables country;
   title 'Data Described by VLIB.INV';
run;
Frequency Table for Field COUNTRY described by View Descriptor VLIB.INV shows the one-way frequency table this example generates.

Frequency Table for Field COUNTRY described by View Descriptor VLIB.INV
                           Data Described by VLIB.INV                          1
                                    COUNTRY

                                                 Cumulative  Cumulative
         COUNTRY            Frequency   Percent   Frequency    Percent
         --------------------------------------------------------------
         Argentina                 2      11.8           2       11.8
         Australia                 1       5.9           3       17.6
         Brazil                    4      23.5           7       41.2
         USA                      10      58.8          17      100.0

For more information on the FREQ procedure, see the SAS Procedures Guide.


Using the MEANS Procedure

Still analyzing recent orders, suppose you want to determine some statistics for each USA customer. The view descriptor VLIB.USAORDR accesses records from the ORDER table that have a SHIPTO value beginning with a 1, indicating a USA customer.

The following example generates the mean and sum of the length of material ordered and the fabric charges for each USA customer. Also included are the number of rows (N) and the number of missing values (NMISS).

proc means data=vlib.usaordr mean sum n nmiss maxdec=0;
   by shipto;
   var length fabricch;
   title 'Data Described by VLIB.USAORDR';
run;
The BY statement causes the interface view engine to generate ordering criteria so that the data are sorted. Statistics on Fabric Length and Charges for Each USA Customer shows some of the information produced by this example.

Statistics on Fabric Length and Charges for Each USA Customer
                         Data Described by VLIB.USAORDR                        1
-------------------------------- SHIPTO=14324742 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0          1095          4380
         FABRICCH  FABRICCHARGES   2      2       1934460       3868920
         --------------------------------------------------------------

-------------------------------- SHIPTO=14898029 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          2      0          2500          5000
         FABRICCH  FABRICCHARGES   2      0       1400825       2801650
         --------------------------------------------------------------

-------------------------------- SHIPTO=15432147 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0           725          2900
         FABRICCH  FABRICCHARGES   2      2        252149        504297
         --------------------------------------------------------------

-------------------------------- SHIPTO=18543489 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          6      0           303          1820
         FABRICCH  FABRICCHARGES   4      2      11063836      44255344
         --------------------------------------------------------------

-------------------------------- SHIPTO=19783482 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          4      0           450          1800
         FABRICCH  FABRICCHARGES   4      0        252149       1008594
         --------------------------------------------------------------

-------------------------------- SHIPTO=19876078 -------------------------------


         Variable  Label           N  Nmiss          Mean           Sum
         --------------------------------------------------------------
         LENGTH    LENGTH          2      0           690          1380
         FABRICCH  FABRICCHARGES   0      2             .             .
         --------------------------------------------------------------

For more information on the MEANS procedure, see the SAS Procedures Guide.


Using the RANK Procedure

You can also use more advanced statistics procedures with CA-DATACOM/DB data. The following example uses the RANK procedure with data described by the view descriptor VLIB.EMPS to calculate the order of birthdays for a set of employees. This example creates a SAS data file MYDATA.RANKEX from the view descriptor VLIB.EMPS. It assigns the column name DATERANK to the new field created by the procedure. (The VLIB.EMPS view descriptor includes a WHERE clause to select only the employees whose job code is 602.)

proc rank data=vlib.emps out=vlib.rankexam;
   var birthdat;
   ranks daterank;
run;
proc print data=vlib.rankexam;
   title 'Order of Employee Birthdays';
run;
VLIB.EMPS is based on the CA-DATACOM/DB table EMPLOYEES. Ranking of Employee Birthdays shows the result of this example.

Ranking of Employee Birthdays
                          Order of Employee Birthdays                          1
     OBS     EMPID    JOBCODE    BIRTHDAT    LASTNAME              DATERANK

       1    456910       602     24SEP53     ARDIS                     5
       2    237642       602     13MAR54     BATTERSBY                 6
       3    239185       602     28AUG59     DOS REMEDIOS              7
       4    321783       602     03JUN35     GONZALES                  2
       5    120591       602     12FEB46     HAMMERSTEIN               4
       6    135673       602     21MAR61     HEMESLY                   8
       7    456921       602     12MAY62     KRAUSE                    9
       8    457232       602     15OCT63     LOVELL                   11
       9    423286       602     31OCT64     MIFUNE                   12
      10    216382       602     24JUL63     PURINTON                 10
      11    234967       602     21DEC67     SMITH                    13
      12    212916       602     29MAY28     WACHBERGER                1
      13    119012       602     05JAN46     WOLF-PROVENZA             3

For more information on the RANK procedure and other advanced statistics procedures, see the SAS Procedures Guide.


Chapter Contents

Previous

Next

Top of Page

Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.