**** ROBOCALL STUDY ****
File created March 12, 2012
****
This readme summarizes which files are used to create this dataset and what each file contributes. It does not describe the do files used to run the regressions, but those should be self-explanatory.
****

* MATLAB FILES:
* changenames.m takes the pollbypoll results provided by Elections Canada and replaces each candidate name with the associated party name. In the case of Independents, the first Independent in a district is Independent, the second is independent2, and so on. Before this can run, you need to run insheet.tc for 2006, 2008, and 2011. insheet.tc won't work the first time, but it will generate the list of csv files in each directory that this Matlab code calls.
* strrepbatch replaces N strings in a cell array in a single call instead of one string at a time, in order to save computation time and lines of code.
* cell2csv.m is a function available on MathWorks, as is its documentation.

* STATA DO FILES:
* insheet.tc These files each read in 308 csv files, each containing the poll-by-poll results for a particular district. They then save each poll as a separate data file and combine them into a master file. They also calculate some basic variables of interest, such as turnout.
* merge This file does five things. First, it loads the raw voting data from the raw data files (2008_master_data.dta, 2011_master_data.dta) and calculates statistics of interest. Second, it manually inputs the districts that were affected by robocalls. Third, it creates lags for all 2011 variables using the 2008 data. Fourth, it sets the sample by determining which polls are absentee, mobile, or advance, and by dropping two districts. Finally, it calculates the weights and saves the result.
* merge2006 This does the same thing as merge, except for 2006 and 2008 data.
* mergewexpensesfinal This file combines the voting data in final_data.dta with information on Conservative and opposition spending in the 2006 and 2008 elections.
It collects data from two sources. The first is information on total election expenses in 2008 and 2011. The second is information on the 2011 spending limits.
* montecarlo2 This file runs a Monte Carlo analysis on the districts that were reported as not affected by robocalls. It does this by sampling with replacement from the 279 districts where robocall==0 in our original data. 27 districts are randomly assigned to be the robocalled districts, and our analysis is performed for 2011 and 2008 (on the same sample of districts). This is repeated 1000 times. We can then calculate a distribution of the resulting t-statistics for 2011 and 2008.

* CSV FILES:
* (folder) These folders contain all the poll-by-poll data from . This data is downloadable from the Elections Canada website.

* XLSX FILES:
* table12masterlist This file contains the table12 file for 2006-2011. This allows us to match candidates to parties. These files are available on the Elections Canada website.

* STATA DTA FILES:
* spendinglimit.dta This file contains the 2011 spending limits for each district. These were obtained from the Elections Canada website.
* exp_riding_important.dta This is the data on expenses as submitted by candidates in 2008. The raw data is available on the Elections Canada website.
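
For readers who want to see the shape of the placebo exercise that montecarlo2 performs, the following is a minimal Python sketch. It is not the Stata code in montecarlo2. It assumes one reading of the description: each replication draws 279 districts with replacement, treats the first 27 draws as the placebo "robocalled" districts, and computes a t-statistic for 2011 and for 2008 on that same draw. The names welch_t and placebo_t_stats, the district-level outcome dictionaries, and the synthetic data are all hypothetical, and a Welch difference-in-means t-statistic stands in for the study's actual regression, which this readme does not specify.

```python
import random
import statistics


def welch_t(treated, control):
    """Welch t-statistic for a difference in means.

    Assumption: this is only a stand-in for the study's regression.
    """
    var_t = statistics.variance(treated) / len(treated)
    var_c = statistics.variance(control) / len(control)
    diff = statistics.mean(treated) - statistics.mean(control)
    return diff / (var_t + var_c) ** 0.5


def placebo_t_stats(outcomes_2011, outcomes_2008, n_placebo=27, n_reps=1000, seed=0):
    """Return two lists of placebo t-statistics (2011, 2008), one entry per replication.

    outcomes_2011 / outcomes_2008: hypothetical dicts mapping district id -> outcome,
    restricted to districts where robocall == 0, with identical keys.
    """
    rng = random.Random(seed)
    districts = list(outcomes_2011)
    t_2011, t_2008 = [], []
    for _ in range(n_reps):
        # Draw districts with replacement; draws are i.i.d., so taking the
        # first n_placebo of them is a random placebo assignment.
        sample = rng.choices(districts, k=len(districts))
        placebo, rest = sample[:n_placebo], sample[n_placebo:]
        # Same draw of districts is used for both election years.
        t_2011.append(welch_t([outcomes_2011[d] for d in placebo],
                              [outcomes_2011[d] for d in rest]))
        t_2008.append(welch_t([outcomes_2008[d] for d in placebo],
                              [outcomes_2008[d] for d in rest]))
    return t_2011, t_2008


# Synthetic outcomes for 279 districts (illustrative only, not real data).
data_rng = random.Random(1)
y11 = {d: data_rng.gauss(60, 5) for d in range(279)}
y08 = {d: data_rng.gauss(58, 5) for d in range(279)}

t11, t08 = placebo_t_stats(y11, y08)
```

The two returned lists play the role of the t-statistic distributions mentioned in the montecarlo2 entry; a t-statistic from the real robocalled districts could then be compared against their quantiles.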