
The BUEC 333 Project

If you are having trouble replicating the Instructor's output, try this new link:
Replicate Output


TABLE OF CONTENTS

Section 1 - How to get going on your project

How to get the Explanatory Variables

How to get the Dependent Variable

Notes on the Functional Form and the Variables

An example Excel File

Section 2 - How to get some extra data for your project

How Data is Organized in Cansim

How to download the data associated with specific CANSIM Series Numbers using the World Wide Web

How to Find Extra data for your project - the quick and easy way

Guide to a Great Project

Notes on Getting Help with Your Project from the Lab TA and your own TA.



How to get the Explanatory Variables

It is really easy. Simply download this file and save it to your disk:

expl_var.xls


How to get the Dependent Variable

This is a little more complicated. The following file contains all the dependent variables.

dep_vars.xls

You will need to follow these steps:

Download the explanatory variables and the dependent variables.

Start Excel and open the explanatory variables file from your disk. Remember to identify it as a "tab" and "space" delimited file, as you did for the first assignments.

Highlight the whole "B" column by clicking on the "B". Go to the Insert menu and select "Column". You should now have an empty column between your observation numbers and the first explanatory variable.

Go to the File menu and select Open. Open the dependent variables file. Excel will automatically put it into a second Excel window. Go through the same process you went through to load the explanatory variables.

Find your assigned dependent variable in the list of twenty. Remember that you cannot pick a different variable, and you must be able to at least replicate exactly the output you have been given.

Select the whole column, go to the Edit menu and select Copy.

Go to the Window menu and switch back to the Excel window with the explanatory variables.

Paste your dependent variable into the empty "B" column.

You are now ready to go.
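
For anyone who wants to see the logic of the steps above outside Excel, here is a minimal sketch in Python. It is only an illustration, not part of the required workflow: the function name insert_dependent and the tiny data set are made up, and it assumes both files list the same observations in the same order.

```python
def insert_dependent(expl_rows, dep_column):
    """Return new rows with dep_column placed as the second column ("B").

    expl_rows  : list of rows; the first entry of each row is the observation number
    dep_column : one value per row (header first), in the same order as expl_rows
    """
    if len(expl_rows) != len(dep_column):
        raise ValueError("row counts differ; check that both files cover the same observations")
    # Keep column A, put the dependent variable in B, shift everything else right.
    return [[row[0], dep] + list(row[1:]) for row, dep in zip(expl_rows, dep_column)]


# Hypothetical miniature data, not taken from expl_var.xls or dep_vars.xls.
expl_rows = [
    ["obs", "X1", "X2"],
    [1, 10.0, 0.5],
    [2, 12.0, 0.7],
]
dep_column = ["Y", 100.0, 110.0]

merged = insert_dependent(expl_rows, dep_column)
for row in merged:
    print(row)
```

The length check matters: if the two files do not have the same number of rows, pasting one column beside the other silently misaligns your observations.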


Notes on the Functional Form and the Variables

Make sure you figure out what your dependent and explanatory variables are.

This is critically important.

All the TAs are in Economics. They will be looking for at least some economic theory. If you do not know what each column of explanatory variables represents, you will not be able to come up with any theory. The theory doesn't have to be complicated. It can be as simple as "When demand increases and supply stays the same, the price should go up."

How do you figure out what "P70000" represents? You can look at these two files:

dep_vars.lab

expl_var.lab

The data has not been organized into any kind of logical functional form. To replicate the output on your handout, you will have to use this illogical linear form. After that, however, you should think carefully about how you might want to transform the data. For instance, you might want to deflate some of the variables using the CPI.
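
As an illustration of what deflating means, here is a short Python sketch. It assumes the CPI is an index with a base value of 100 and uses made-up numbers; the function name deflate is hypothetical, and in the project itself you would do the same arithmetic with an Excel formula.

```python
def deflate(nominal, cpi, base=100.0):
    """Convert nominal values to real values: real = nominal / cpi * base."""
    if len(nominal) != len(cpi):
        raise ValueError("nominal series and CPI must cover the same periods")
    return [n / c * base for n, c in zip(nominal, cpi)]


# Made-up example: a price series and a CPI with base period = 100.
nominal_price = [200.0, 210.0, 231.0]
cpi = [100.0, 105.0, 110.0]

real_price = deflate(nominal_price, cpi)
print(real_price)
```

In this example the nominal price rises 5% in the second period, but so does the CPI, so the real price does not change. That is exactly the kind of difference that can alter your regression results.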


An example Excel File

Here is an example Excel file. It has examples of some, but not all, of the tests that you need to perform to complete your project.

You should be able to directly load it into Excel, scroll through it and get an idea of what you should be aiming for.

p-example.xls



How Data is Organized in Cansim

You risk failing your project if you do not understand exactly what variables you have retrieved from Cansim. Cansim is not organized in a very straightforward manner, so it is easy to misunderstand what data you are actually using. As a result, it is crucial that you read this section thoroughly.

Cansim Data is organized into large groups called Matrices. Each Matrix contains several different data series. A series is a collection of data observations for a particular concept or element; comparable to a column in a table.

An example of a Cansim Matrix is as follows:

Matrix 440

The Title is :

STARTS, COMPLETIONS & UNDER CONSTRUCTION; DWELLING UNITS BY TYPE, METROPOLITAN AREAS, ANNUAL, ACTUAL DATA.

Here are some examples of some of the series in matrix 440:

  • D847002 - Calgary Starts Singles
  • D847047 - Vancouver Starts Semi-Detached
  • D847173 - Vancouver Completions Rows
  • D847404 - Vancouver Completions Total

What data do we have in each of these series?

The first series contains annual actual data on the number of starts of single homes in Calgary. Note that the data is actual as opposed to seasonally adjusted. In this example, it would be important to determine how Cansim defines the difference between housing starts and houses under construction.

Several students in previous classes have done very poorly on their projects because they failed to understand the difference between the Matrix title and the actual series they used.

When I was a TA, a couple of my students said that their dependent variable was the number of starts, under-construction and completions. What they gave as their dependent variable was the title of the matrix, not the title of the data series. How can a house be counted as both completed and under construction?

Both students received a near-failing mark on their paper.

It is important that you read the matrix title, because it will tell you whether the data is annual vs. monthly or actual vs. seasonally adjusted, etc., and whether it is data on people, cash flows, interest rates, etc. However, the series title is the place to find out what data you actually have.


How to Find CANSIM Series Numbers on the Web

The University of Toronto has a great CANSIM homepage. The address is http://datacenter.chass.utoronto.ca:5680/cansim/cansim.htm

  • Open up the UofT Web Page.
  • Scroll down the page and go to the section called "Retrieve multiple Cansim Series by Label". That means click on the blue words Retrieve Multiple Cansim Series by Label.
  • Then enter your series numbers in the box at the top of the page. For example:

    (D847002 , D847003 , D847004, D847005)

    Make sure that you do not forget to include the brackets.

  • Select the appropriate start and end dates
  • Choose the data output format. You will want to do this twice.

    The first time you will need to get the data in Spreadsheet format. Once the data comes up on the screen, save it to disk, just like you saved the real_estate_sales.data file.

    The second time you get the data, you will want to save it in plain format. Saving it in plain format will give you a record of all the header information on the data.

  • Click on the Retrieve Button
  • Once you have got the data, you can save it to your disk directly from Netscape, just as you did with earlier assignments.
  • If you are using a Mac, you will have to use DROP-TEXT to clean out the formatting, as you did with your first assignment.
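
If you have a long list of series numbers, it is easy to mistype the bracketed list. Here is a small Python sketch that builds the text to paste into the box. The function name series_query is made up for illustration, and the exact spacing the UofT page accepts is an assumption based on the example above.

```python
def series_query(series_numbers):
    """Build the bracketed, comma-separated list of series numbers."""
    # Drop blanks and stray spaces; series numbers are written in upper case.
    cleaned = [s.strip().upper() for s in series_numbers if s.strip()]
    if not cleaned:
        raise ValueError("no series numbers given")
    # The brackets are required, so add them here rather than by hand.
    return "(" + " , ".join(cleaned) + ")"


query = series_query(["D847002", "D847003", "D847004", "D847005"])
print(query)
```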

Notes:

  1. Some of you are confused as to exactly what a header is. A header is simply a file which tells you what all the numbers mean. In your past assignments, you needed to look at the header to work out what the numbers in the first, second, third, etc. columns actually represented. Was it the first column or the second column that contained the sales price observations? Sometimes it is difficult to remember, so you look at your header to make sure you know what each column of numbers represents.
  2. Make sure that you do not save the two data formats to the same file name. For instance, do not save the Spreadsheet version of the data to Project.data and then save the Plain format version of the data to a file that is also called Project.data. Maybe you can call them Project-spread.data and Project-plain.data.

Warning:

TAs do not take kindly to people who have not bothered to figure out what their series numbers actually represent. The little series titles that were given to you with the information on your project are not enough information. The titles do not tell you whether the data is RAW or SEASONALLY ADJUSTED. The titles do not tell you the units.

How do you find out exactly what your numbers represent?

Read the section above on how CANSIM organizes data. Or better yet, get a feel for the relationship between the title of the MATRIX and the title of the SERIES by looking at the Alphabetical Matrix Index. Note, however, that this only shows you how the data is organized. It is not a very good way to look for extra data. See below for more information on that.

Once you have got your data, make sure that you look at all the header information in the plain format version of your data file, and make sure that you fully understand and explain what your explanatory and dependent variables are. Econometrics is about economics first and statistics second. In order to get your economics right, you need to know what your data is; otherwise, how can you explain it with any kind of economic theory?


How to Find Extra data for your project - the quick and easy way

The best way to find an extra variable for your project is to use a keyword search.

Look for the link entitled search CANSIM Index Files on the UofT homepage, or simply follow this link: http://datacenter.chass.utoronto.ca:5680/cansim/search.html

Here are a few pointers for speeding up your search:

  • Choose Main Index or Full Text Index from the Index Files Box.
  • For the Binary Search Operator, use the word "and"
  • The Best Tip of All: If, for instance, you are looking for monthly data on beer sales, type in beer and month or beer and monthly as your key words. If you don't do this, you might spend a long time looking through data that will be difficult to use because it is quarterly or annual, and not monthly.

DO NOT USE the ALPHABETIC MATRIX INDEX to look for extra data. It is an incredibly slow and confusing way of looking through the database.


Steps to a Great Project

The steps for completing a project are as follows:

  1. Find a good economic theory that helps you to explain your dependent variable. Your TA will help you with this. It is not a good idea to simply make up this theory yourself. Instead, there are many texts and articles which have addressed the issue that you will be looking at.

    YOU MUST USE A REASONABLE ECONOMIC THEORY! If you decide to regress cement production on rainfall, you might get a decent fit, but you will fail your project. What exactly is a good theory? The theory must give you two things. First, it must provide you with a list of explanatory variables. For example, demand for ice cream is related to its price and the temperature. Second, the theory must give the functional form of the relationship between the dependent variable and the independent variables. This relationship will be expressed in a formula, such as

    lnY=a*lnK+(1-a)*lnL.

    or

    Y=a+bK+cL.

    Above are examples of two different functional forms. Note that both functional forms are linear in the parameters.

  2. Get the actual data from Cansim.
  3. Import the Data into Excel, and start by running a Multiple Regression. After de-bugging, you get your first draft of econometric results.
  4. After interpreting these initial results, you will want to conduct further econometric inquiries, correct for problems, etc.

    This is where you will pick up most of your marks for your project. Talk to your TA about your initial regressions. Inevitably, there will be some problems with the regressions. For instance, you might have problems with low explanatory power, heteroscedasticity, multicollinearity or autocorrelation. You can best demonstrate your understanding of econometrics by successfully identifying and, in some cases, correcting for these problems. You will also want to conduct t-tests on each variable and F-tests on combinations of variables to see if they are significant.

  5. When your project is handed out to you, you will be given a portion of output that you should aim to replicate. Replicating this output is only a beginning. The main aim of this portion of the project is to give you an indication that you are on the right track. If your output does not match the Instructor's output, then you have done something wrong.

    It is probably a good idea to place your replication of the Instructor's results in an appendix. Maybe Appendix A. Thus, Appendix A would be output from a workbook, where the output exactly matched your instructor's output.

    Appendix B could include output from a second EXCEL workbook with additional tests that you might want to perform on the original model. For instance, you may feel that the Instructor has left out some important F-tests, or should have checked for something else. Why would the Instructor do this to you? The project handout is only meant to be a guide; it is not designed to do all the work for you.

    Appendix C could then include output from a third EXCEL workbook in which you run a regression on an expanded version of the model, with extra variables and a full battery of appropriate tests. As mentioned above, this is where you will pick up most of your marks.

  6. Your TAs will be looking for your use of logic more than anything else. Simply performing various tests without explaining why you have chosen to perform those tests entirely misses the point. For example, a classic mistake is to perform a whole series of t-tests and F-tests and then draw conclusions about whether or not to include various variables without first checking for heteroscedasticity, autocorrelation or multicollinearity. All three problems cause difficulties in inference. In other words, if you have heteroscedasticity, multicollinearity or autocorrelation, your testing procedures are no longer valid. The t-tests and F-tests do not work properly, and their conclusions need to be used cautiously or ignored.

    Thus, the keys are :

    1. Why am I doing this test?

    2. What does the test tell me?

    3. What are the implications of the test results?
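
To make the steps above concrete, here is a minimal Python sketch of a multiple regression of the form Y=a+bK+cL, with t-statistics and a Durbin-Watson statistic for first-order autocorrelation. It is an illustration only, not the Instructor's method: the project itself is done in Excel, the function names are made up, and the data are invented so that the answer is known in advance.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: perfect multicollinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(y, regressors):
    """Ordinary least squares with a constant; regressors is a list of columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(X[0])
    Xt = transpose(X)
    xtx_inv = invert(matmul(Xt, X))
    beta = [row[0] for row in matmul(xtx_inv, matmul(Xt, [[v] for v in y]))]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, X)]
    ssr = sum(e * e for e in resid)
    s2 = ssr / (n - k)  # estimated error variance
    se = [math.sqrt(s2 * xtx_inv[j][j]) for j in range(k)]
    t = [b / s for b, s in zip(beta, se)]
    # Durbin-Watson: values near 2 suggest no first-order autocorrelation.
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / ssr
    return {"beta": beta, "se": se, "t": t, "dw": dw}


# Invented data: Y = 2 + 3K + 0.5L plus a small error term.
K = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
L = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
Y = [6.1, 8.4, 12.9, 15.6, 20.1, 22.4, 26.9, 29.6]

result = ols(Y, [K, L])
print("coefficients:", result["beta"])
print("t-statistics:", result["t"])
print("Durbin-Watson:", result["dw"])
```

The invented data are built so that the fitted coefficients come out close to 2, 3 and 0.5. Note that this sketch only reports the numbers. It does not check for heteroscedasticity or multicollinearity, and the three questions above still have to be answered by you for every test you run.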


Notes on Getting Help with Your Project from the Lab TA and your own TA.

Once you have run your initial regressions, having a printout of your workbook ready to show the TA makes it easier for them to give you useful advice on your project.



Created: 11/4/96 Updated: 2/25/97