The assignment asks you to analyse the determinants of average school test scores using a dataset with
information on 500 schools in California, USA.
There are 4 different tasks you should perform:
1. Import the data into Stata and inspect it
2. Estimate a simple model of the determinants of test scores
3. Run additional specifications
4. Interpret the results
The data is provided in the "CAschools.xls" Excel file, where the first row consists of variable names.
You should write a short report as described below (no more than 2,000 words) and submit it via TurnItIn on Blackboard, with your do-file included as an appendix, by 23:59 on Friday 2nd December.
Part 1: Import the data into Stata and inspect it
1. At the beginning of the do-file, clearly note your name and student number.
2. Import the data from the Excel file.
3. Label the variables according to the accompanying 'CAschools Description' document.
4. Produce a table of summary statistics for the variables in the dataset.
5. Generate and export some scatter plots that will help you decide what independent variables to include
in the models you use in Part 3 below.
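The steps in Part 1 can be sketched as a do-file along the following lines. This is only a minimal sketch: the variable names (testscore, stratio) and the log and graph file names are hypothetical placeholders, so replace them with the names in your data and in the 'CAschools Description' document.

```stata
* Name: <your name>    Student number: <your student number>
clear all

* Open a log so the session is recorded (file name is an assumption)
log using "Assignment_log.smcl", replace

* Import the Excel file; firstrow treats the first row as variable names
import excel using "CAschools.xls", firstrow clear

* Inspect the variables and their storage types
describe

* Label variables (hypothetical names; repeat for every variable)
label variable testscore "Average test score"
label variable stratio   "Student-teacher ratio"

* Summary statistics for all variables in the dataset
summarize

* Scatter plot with a fitted line, exported as a PNG file
twoway (scatter testscore stratio) (lfit testscore stratio)
graph export "scatter_testscore_stratio.png", replace
```

Repeating the last two commands for each candidate regressor gives one plot per variable to include in the report.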
Part 2: Estimate a simple model of the determinants of test scores
1. Run a regression of test scores on the student-teacher ratio, the share of students receiving free or
reduced-price school meals, the share of English language learners, and zip code median income.
2. Interpret the economic and statistical significance of your estimates.
3. Is the model successful in explaining the variation in test scores? Do you have reason to believe
omitted variable bias may be affecting any of your estimates?
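A minimal sketch of the regression in step 1, with test scores as the dependent variable. All variable names (testscore, stratio, mealshare, ellshare, income) are hypothetical and should be replaced with the names in your dataset.

```stata
* Baseline model: test scores on the four explanatory variables
* (variable names are hypothetical placeholders)
regress testscore stratio mealshare ellshare income

* R-squared and adjusted R-squared are stored after regress
display "R-squared          = " e(r2)
display "Adjusted R-squared = " e(r2_a)
```

The coefficient table reports the estimates, standard errors, t-statistics and p-values needed for step 2. The R-squared values are one input to step 3, although a high R-squared does not by itself rule out omitted variable bias.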
Part 3: Run additional specifications
1. Extend the model from Part 2 to include additional explanatory variables (note: run at least 4 additional
specifications).
2. Perform appropriate tests to investigate whether these new variables add explanatory power to the
model.
3. Does zip code median income have a non-linear relationship with test scores?
4. Investigate whether the relationship between average teacher experience and test scores differs between
schools with above and below median shares of English language learners.
5. Save the data set, under the name 'Assignment'. Save the do-file and close the log file.
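One possible way to approach the steps in Part 3 is sketched below. It is an illustration under stated assumptions, not a required method: the variable names (including the extra regressors teachexp and compstu) are hypothetical, a quadratic term is only one of several ways to check for non-linearity, and an interaction with an above-median indicator is only one way to compare the two groups of schools.

```stata
* Baseline model from Part 2 (hypothetical variable names throughout)
regress testscore stratio mealshare ellshare income
estimates store base

* Extended specification with two hypothetical additional regressors
regress testscore stratio mealshare ellshare income teachexp compstu
estimates store ext1

* Joint F-test: do the new variables add explanatory power?
test teachexp compstu

* Compare the stored specifications side by side
estimates table base ext1, b se stats(N r2_a)

* Non-linearity: add a squared income term via factor-variable notation
regress testscore stratio mealshare ellshare c.income##c.income
testparm c.income#c.income

* Indicator for schools with an above-median share of English learners
summarize ellshare, detail
generate byte high_ell = ellshare > r(p50) if !missing(ellshare)

* Let the teacher-experience slope differ between the two groups
regress testscore stratio mealshare income c.teachexp##i.high_ell
testparm i.high_ell#c.teachexp

* Save the data set and close the log, if one was opened earlier
save "Assignment", replace
capture log close
```

A significant squared term suggests a non-linear income relationship, and a significant interaction term suggests the experience slope differs between the two groups. The remaining specifications required in step 1 follow the same pattern as the extended model above.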
Part 4: Interpret the results
Write a report on your analysis, including your tables, graphs and interpretations from Parts 1-3. Divide the
report into 3 sections corresponding to these Parts. The report should be no more than 2,000 words, excluding
the appendix. Copy your clearly commented do-file into an appendix.
