Classification and Regression Trees
1. Boston Housing data. Randomly sample a training data set that contains 90% of the original data points.
(i) Fit a regression tree (CART) on the training data. Report the model's in-sample MSE performance.
(ii) Test the out-of-sample performance. Using the tree model built in (i) on the training data, predict on the remaining 10% testing data. Report the out-of-sample model MSE.
(iii) Fit a linear regression using all explanatory variables except "indus" and "age" on the training data. Report the model's in-sample MSE. Then test the out-of-sample performance on the remaining 10% testing data and report the out-of-sample model MSE.
(iv) What do you find when comparing the CART fit with the linear regression fit from (iii)?
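
The evaluation protocol shared by parts (i) to (iii) can be sketched as follows. This is a minimal illustration in plain Python, not a prescribed solution: the helper names split_indices and mse, the seed value, and the use of Python itself are assumptions made here, and the assignment does not fix a language. The sketch covers only the random 90/10 split and the MSE calculation. The fitted tree or linear model would supply the predictions.

```python
import random


def split_indices(n, train_frac=0.9, seed=0):
    """Randomly split the row indices 0..n-1 into training and testing sets.

    The seed is an arbitrary choice (an assumption) so the split is reproducible.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(train_frac * n)
    return idx[:n_train], idx[n_train:]


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# The Boston Housing data has 506 rows, so a 90% split
# leaves 455 rows for training and 51 rows for testing.
train_idx, test_idx = split_indices(506)
```

In-sample MSE is mse applied to the training responses and the model's predictions on the training rows. Out-of-sample MSE uses the testing rows instead. Both models should be scored on the same split so that the comparison in (iv) is fair.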