Grading
Critiques 10%
Replications 30%
Midterm 20%
Final 40%
You are graded directly on your replications, and indirectly through the exams, which will have code- and paper-based questions drawn from the replications that you have undertaken. As you can see, the purpose of this course is to get you to do econometrics, via the replications.
Assignments
A Replication is an exercise in which you attempt to repeat the empirical exercise undertaken by the author of the paper. Typically, you will only replicate a small number of lines in a table of results. A good start is to do the Stata tutorials linked below. Before you ask for Stata help, google "help stata whatever" and type "help whatever" in Stata. In Stata, the help files have examples at the bottom. Good places to start are "help use", "help recode", "help keep" and "help regress".
Reading
1. Greene, William, Econometric Analysis, Prentice Hall, 5th/6th Edition, 2008.
2. Kennedy, Peter, A Guide to Econometrics, 5th or 6th Edition (Paperback).
3. Angrist, Joshua and Jorn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist's Companion (Paperback).
I will try to let you know where to look for relevant supporting material in these textbooks. In addition, I will post links to other material you may find helpful.
Lecture Notes to Introduce Ordinary Least Squares.
Useful links for learning Stata are at UCLA economics Stata Tutorials.
Benjamin Philips' Tips for Using Stata.
A nice introduction to quantile regression is in Koenker and Hallock.
Assignment 1a is due in-class Thursday 17 Jan
Write a 2-page critical review of the paper, focusing on empirical strengths, shortcomings and improvements. Do the authors try to do something interesting? Do they succeed? What would you have done?
Assignment 2b due in-class Thursday 7 Feb
These data are given at the state-year level. "nf1" gives the basic no-fault indicator. This replication will use the regress command in Stata, with weights and robust standard errors. Consider whether or not the authors should have clustered their standard errors and respond accordingly. "Count" gives the number of observations used to compute each datum in each state-year, and so should be used to weight all regressions (we'll learn about this later). Use heteroskedasticity-robust standard errors with the ",r" option (we'll learn about this later). Other variables give statistics about the distribution of the age at first marriage, computed at the state-year level. "p_**" gives the ** percentile of age at first marriage.
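To make the weighting and robust standard errors described above concrete, here is a minimal sketch in Python (standard library only) of weighted least squares of an outcome on a constant and one regressor, with an HC1-style sandwich variance. The function name and arguments are hypothetical, and the n/(n-k) scaling is an assumption about the small-sample adjustment. Treat it as an illustration of the idea rather than a reproduction of what regress reports.

```python
def weighted_ols_robust(y, x, w):
    """Weighted least squares of y on [1, x] with HC1-style robust SEs.

    y, x, w are equal-length lists; w holds the observation weights
    (for example, the cell counts behind each state-year datum).
    Returns ((b0, b1), (se0, se1)).
    """
    n = len(y)
    # Weighted cross-products that make up X'WX and X'Wy for X = [1, x].
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    # Inverse of the 2x2 matrix X'WX (the "bread" of the sandwich).
    det = sw * swxx - swx ** 2
    a11, a12, a22 = swxx / det, -swx / det, sw / det

    # Coefficients: b = (X'WX)^{-1} X'Wy.
    b0 = a11 * swy + a12 * swxy
    b1 = a12 * swy + a22 * swxy

    # "Meat" of the sandwich: sum_i (w_i e_i)^2 x_i x_i'.
    m11 = m12 = m22 = 0.0
    for yi, xi, wi in zip(y, x, w):
        e = yi - b0 - b1 * xi
        g = (wi * e) ** 2
        m11 += g
        m12 += g * xi
        m22 += g * xi * xi

    # Sandwich A M A with an n/(n-k) degrees-of-freedom scaling, k = 2.
    c = n / (n - 2)
    v0 = c * (a11 * (m11 * a11 + m12 * a12) + a12 * (m12 * a11 + m22 * a12))
    v1 = c * (a12 * (m11 * a12 + m12 * a22) + a22 * (m12 * a12 + m22 * a22))
    return (b0, b1), (v0 ** 0.5, v1 ** 0.5)
```

Multiplying every weight by the same constant leaves both the coefficients and these standard errors unchanged, so only the relative sizes of the weights matter.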
Here is the Stata code that I used to do the in-class Monte Carlo exercise on Thursday 24 Jan.
Lecture Notes on Panels.
Lecture Notes on GLS/FGLS.
Lecture Notes on Seemingly Unrelated Regression.
Useful Reading:
Sources for OLS: Greene, 5th Ed, Chapters 1-3; Kennedy, 5th Ed, Chapters 1-3; Angrist-Pischke Ch 1-3.
Sources for GLS, Heteroskedasticity, Panel Methods: Greene, Chapters 10-13; Kennedy, 5th Ed, Chapters 8, 14, 17, Appendix B; Angrist-Pischke Ch 5 (for panel stuff).
Sources for Endogeneity (and also SUR): Greene, Chapters 14-15; Kennedy, 5th Ed, Chapters 9, 10; Angrist-Pischke Ch 4.
Sources for Testing: Greene, 5th Ed, Chapters 5,6,4; Kennedy, 5th Ed, Chapter 4.
Sources for SUR: Greene, Chapter 14; Kennedy, 5th Ed, Chapter 10.
Sources for Selection Correction (Heckman Two-Step): Greene, Chapter 19.5 "Sample Selection".
Study Questions for the Midterm. You may also be interested in the midterms for 2010 and 2012 (I seem to have misplaced 2011). Grading keys are 2010 and 2012. The midterm will consist of 3-4 questions from the study questions, and 3-4 other questions. I recommend that you work on the study questions in groups. You should NOT delegate study questions and then report back; rather you should work on them together, so that you can understand them better.
Here is a grading key.
Lecture Notes on Endogeneity.
More on Venn Diagrams for Regression, Peter Kennedy, 2002. This paper presents the Ballentine Diagrams discussed in class, relating to Multicollinearity and Endogeneity.
Assignment 3: Due at the beginning of class Thursday 28 Feb
Assignment 3b: Due at the beginning of class Monday 11 March
The data are encrypted, with the encryption key emailed to you. Please note that the suffix "_i" means that the variable has had missing values imputed from the marginal distribution of values. You may choose to use imputed or non-imputed versions of variables as you see fit. Please do not keep the data on any computer after you are finished with them.
Assignment 4a: Due at the beginning of class Monday 18 March
Replication, due at the beginning of class Thursday 28 March
p* are natural logs of prices for goods 1-13, which vary across province and year only, and are all equal to 0 in Ontario in 2002. Note that good 3 is rental shelter, so p3 is the log of the rental price and s3 is the household expenditure on rental shelter. Consequently, lots of households have s3=0, because they don't pay rent. That is in fact the whole point of this exercise. pricerural and pricebigcity are the natural logs of nonurban and urban price indices, respectively, for owned accommodation. In Ontario in 2002, pricebigcity = 0 and pricerural is less than zero, because accommodation is cheaper outside big cities.
z* are 22 demographic controls, with self-explanatory labels.
yearbuip takes 6 values, one per decade, with 6 being the most recent. typdwelp is categorical (single-detached, condo, etc.). hhinctot is the total income of the household. hhszd31p is the number of people in the household at December 31. numbedrp is the number of bedrooms. numbthrp is the number of bathrooms. rpagegrp is the age of the household head minus 40. rpmarp is the marital status of the respondent.
Note that total nonshelter consumption is the sum of 9 of the 10 included spending categories (s*), and does not include s3 (rental shelter expenditures).
redurent is an indicator of reduced rent and rental tenure. This is used to figure out who has t_i=1.
weight is the sample weight. Use aweight=weight to compute poverty statistics.
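As a rough illustration of what weighting a poverty statistic means, the sketch below computes a weighted headcount ratio in Python: the share of total sample weight held by households below a poverty line. The function name, the example numbers and the poverty line are all hypothetical, and the paper's own poverty measure may be defined differently.

```python
def weighted_poverty_rate(incomes, weights, poverty_line):
    """Share of total sample weight on households with income below the line.

    incomes and weights are equal-length lists; poverty_line is a
    hypothetical threshold in the same units as incomes.
    """
    total_weight = sum(weights)
    poor_weight = sum(w for y, w in zip(incomes, weights) if y < poverty_line)
    return poor_weight / total_weight


# Made-up example: three households, the last one counting double.
rate = weighted_poverty_rate([5.0, 15.0, 25.0], [1.0, 1.0, 2.0], 10.0)
print(rate)  # 0.25
```

Rescaling all the weights by a constant does not change the rate, since the constant cancels in the ratio.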
Replicate just the first 7 rows of Table 3 (note that, because you are using a homoskedastic model, rho and sigma are constants, but lambda is not), and the "All" columns of Table 6 (without standard errors).
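In the textbook homoskedastic Heckman two-step, lambda is the inverse Mills ratio evaluated at each observation's first-stage probit index, which is why it varies across observations even when rho and sigma do not. The sketch below shows that calculation in Python. The function names and the example index values are hypothetical, and the paper's notation and sign conventions may differ.

```python
import math


def normal_pdf(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def normal_cdf(v):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def inverse_mills(index):
    """Inverse Mills ratio phi(index) / Phi(index) for a selected observation.

    index stands for the fitted first-stage probit index of one
    observation (a hypothetical input here).
    """
    return normal_pdf(index) / normal_cdf(index)


# Made-up index values: lambda differs for each one.
lambdas = [inverse_mills(v) for v in (-1.0, 0.0, 1.0)]
```

The ratio falls as the index rises, so observations that are very likely to be selected get a small correction term.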
Please note: use a homoskedastic Heckman correction rather than the heteroskedastic Heckman correction (so that sigma_1 and sigma_2 do NOT depend on v1_i), and use the Stone index = exp(lnp'w) as the price deflator, where lnp is the log-price vector and w is the expenditure share vector (with w_j = s_j / Sum_k s_k). This means you will implement a simple version of section 3.2 (because it will be homoskedastic) and a super simple version of section 3.3 (because you will do no demand estimation, and simply deflate by the Stone index). Thus the equation on page 18 defining y^hat_i uses A=0.
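The Stone index defined above can be sketched in a few lines of Python. The function name and the example numbers are hypothetical. Which goods enter the shares in your replication is for you to work out from the paper.

```python
import math


def stone_index(log_prices, spending):
    """Stone price index exp(lnp'w), with shares w_j = s_j / sum_k s_k.

    log_prices and spending are equal-length lists for one household:
    the log price and the expenditure on each good.
    """
    total = sum(spending)
    shares = [s / total for s in spending]
    return math.exp(sum(w * lp for w, lp in zip(shares, log_prices)))


# Made-up example: good 1 costs twice its base price, good 2 is at base,
# and the household splits its spending equally between them.
index = stone_index([math.log(2.0), 0.0], [100.0, 100.0])
real_spending = 200.0 / index  # deflate nominal spending by the index
```

When every log price is 0, as in the base province and year, the index equals 1 and deflating leaves spending unchanged.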
Lecture Notes on Limited Dependent Variables.
Maximum Likelihood Notes are now found in Lecture Notes to Introduce Ordinary Least Squares (end of the doc, where ML is introduced) and Lecture Notes on Endogeneity (end of the doc, with application to endogeneity and selection correction), both above.
Lecture Notes on Confidence Intervals and Testing.
Assignment 5, due under my door by 4pm on 16 April.
Alternatively, you may choose to undertake any small econometric project you like. In this case, you MUST talk to me about it before 28 April to ensure that the data are available and the project is do-able.
Lecture Notes on OLS approaches to time-series econometrics.
The Final Exam
Please consult suggested readings above.
Half the exam will be drawn from study questions for the midterm (above) plus new study questions for the final.
There will be a long Stata question, as in the midterm, based on one of the assignments.
There will be one question from the midterm.
There will be at least one question on the 4 papers you replicated for the course.
There will be no surprises (unless you thought the midterm was surprising---in that case, there will be exactly that many surprises).
Here is the final from 2011, and an answer key.
Please note that I will have extra office hours from 10-11am on Wednesday 17 and Thursday 18 April.