Learning Goal: I’m working on an R discussion question and need an explanation and answer to help me learn.

1. You are given the dataset final_reg_new_sysm590_final.csv. Use Linear Regression to
find the best model for Y. Fully discuss and justify your findings.
2. For the dataset in Question 1 above, use Ridge Regression to find the best model. Why is
it the best?
3. Fit a Lasso regression model to the dataset given to you in Question 1 above. Which
model do you think is the best overall (i.e., linear regression, ridge regression or lasso)?
Why?
4. You are given VAL.xlsx, which has patient data on a dangerous disease. To manage the
disease, patients have been given two types of drugs (1 or 2). Is drug 2 (which is new and
supposed to be great) beneficial to patients? Why or why not? [Hint: you have to use all that
we studied in Survival Analysis. That is, fit 5 parametric models, the Cox model, and the
Buckley-James model; then compare and contrast them, choose the best one, and say why you
think it’s the best.]
5. For the birth weight data (birthweight.xlsx), what is the best logit model that can be fit to
explain lowbwt? Why do you think this is your best model? Based on your model, what is
the predicted probability that Mom 1 and Mom 37 will have babies with low birthweight?
6. For the dataset in Question 5 above, fit the best linear probability model (LPM) to
explain lowbwt. Which is the better model, logit or LPM? Why? With your best LPM, what are
the predicted probabilities for Mom ID 300 and Mom ID 1683?
7. Data on the number of ear infections is in final_Q4.xlsx. What is the best model
explaining ear infections?
8. You are given the dataset en_sysm590_4K1K70pct.csv where the number of variables
exceeds the number of observations. Find the best model that fits this data. Why do you
think your model is the best?
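
As background for Questions 2, 3 and 8, the sketch below illustrates how ridge and lasso penalties treat coefficients differently. It is not a solution to the assignment: it assumes the textbook special case of orthonormal predictors (X'X = I), with the ridge objective written as (1/2)||y - Xb||^2 + (lambda/2)||b||^2 and the lasso objective as (1/2)||y - Xb||^2 + lambda*||b||_1. The coefficient values and the penalty are hypothetical, not taken from any of the datasets. In R, the general (non-orthonormal) case is usually fit with the glmnet package, where alpha = 0 gives ridge and alpha = 1 gives lasso.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge estimate under an orthonormal design: proportional shrinkage."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols, lam):
    """Lasso estimate under an orthonormal design: soft-thresholding."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return magnitude if beta_ols >= 0 else -magnitude


# Hypothetical OLS coefficients and penalty, chosen only for illustration.
ols_coefficients = [3.0, -1.5, 0.4]
penalty = 0.5

ridge_coefficients = [ridge_shrink(b, penalty) for b in ols_coefficients]
lasso_coefficients = [lasso_shrink(b, penalty) for b in ols_coefficients]

print("ridge:", ridge_coefficients)
print("lasso:", lasso_coefficients)
```

Under these assumptions ridge shrinks every coefficient toward zero but leaves all of them nonzero, while lasso sets the smallest one exactly to zero. That variable-selection behavior is one reason lasso-type penalties are often considered when the number of variables exceeds the number of observations.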
Requirements: detailed explanations
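
One point that usually matters when comparing the logit model and the LPM in Questions 5 and 6 is the range of the predicted probabilities. The sketch below is a minimal illustration with a single hypothetical predictor x and made-up coefficients; none of the numbers come from birthweight.xlsx. In R, the corresponding fits would typically come from glm(..., family = binomial) for the logit model and lm(...) for the LPM.

```python
import math


def logit_probability(intercept, slope, x):
    """Predicted probability from a logit model: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def lpm_probability(intercept, slope, x):
    """Predicted 'probability' from a linear probability model: unbounded."""
    return intercept + slope * x


# Hypothetical coefficients and predictor value, for illustration only.
x_value = 60.0
logit_prediction = logit_probability(2.0, -0.1, x_value)
lpm_prediction = lpm_probability(0.9, -0.02, x_value)

print("logit:", logit_prediction)
print("LPM:", lpm_prediction)
```

With these made-up values the LPM prediction is negative, which cannot be a probability, while the logit prediction stays inside (0, 1). Whether this happens for the requested Mom IDs depends on the fitted coefficients and their covariate values.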
