## STAT430 : HomeworkThree

##### Homework 3

Some of the questions involve material that has not yet been covered in class.

Problems
1. This first problem deals with the temperature data. The average January minimum temperature (JanTemp) over the years 1931 to 1960 is provided for 29 non-coastal US cities. Latitude (Lat) and longitude (Long) are also provided.
   1. Using manual calculations (a calculator or Excel can help you with this), fit the simple linear model $Y = \beta_0 + \beta_1 x + \varepsilon$, where $Y$ is the January temperature and $x$ is either the longitude or the latitude of the city. Give estimates of $\beta_0$, $\beta_1$, and $\sigma_\varepsilon^2$. Which of longitude and latitude affects temperature more? What is the relationship? Show your work.
   2. Write down the predicted regression equation $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x$. Use the equation that captures the stronger of the two relationships to predict the January temperature in Key West, FL (latitude: 25; longitude: 82). How does it compare to the observed 65? Is Key West an outlier? Give the prediction interval for the Key West location.
   3. Test if $\beta_1 = 0$ for the stronger relationship.
2. This second problem deals with the birth weight data [1]. Use R to carry out the analysis. The goal of this study was to identify risk factors associated with giving birth to low birth weight babies (less than 2500 g). Columns are id code, low birth weight indicator, mother's age, weight at last menstrual period, race (1=white, 2=black, 3=other), smoking indicator, history of premature labor indicator, hypertension indicator, uterine irritability indicator, number of physician visits in first trimester, and birth weight in grams.
   1. Explore the relationship between the explanatory (predictor) variables and birth weight. Indicate which predictors have a significant impact on birth weight.
   2. What is the coefficient of determination?
   3. Plot the relationship between one of the important predictors and birth weight. Also plot residuals for the estimated regression equation. Can you identify outliers?
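
For Problem 1, the manual calculations can be organized around the usual least-squares formulas for simple linear regression. What follows is a sketch of those formulas, not a worked solution. The shorthand $S_{xx}$ and $S_{xy}$, the new point $x_0$, and the quantile notation $t_{n-2,\,1-\alpha/2}$ are introduced here and do not come from the assignment; $n = 29$ is the number of cities.

$$
\begin{aligned}
S_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y}), \\
\hat{\beta}_1 &= \frac{S_{xy}}{S_{xx}}, \qquad
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{x}, \\
\hat{\sigma}_\varepsilon^2 &= \frac{1}{n-2} \sum_{i=1}^{n} \bigl(Y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr)^2, \\
\text{prediction interval at } x_0:\quad
&\hat{\beta}_0 + \hat{\beta}_1 x_0 \;\pm\; t_{n-2,\,1-\alpha/2}\,
\hat{\sigma}_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}, \\
\text{test of } \beta_1 = 0:\quad
&T = \frac{\hat{\beta}_1}{\hat{\sigma}_\varepsilon / \sqrt{S_{xx}}} \sim t_{n-2}
\text{ under the null hypothesis.}
\end{aligned}
$$

With $n = 29$ the reference distribution has 27 degrees of freedom, and $x_0$ is Key West's latitude (25) or longitude (82), whichever predictor gives the stronger fit.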

[1] Source: Hosmer and Lemeshow (2000), Applied Logistic Regression, Second Edition. These data are copyrighted by John Wiley & Sons Inc. and must be acknowledged and used accordingly. Data were collected at Baystate Medical Center, Springfield, Massachusetts, during 1986.
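
For Problem 2, a minimal R sketch of the workflow might look like the following. It assumes the `birthwt` data frame shipped with the MASS package is an acceptable stand-in for the course file. That data frame has no id column, and its `ptl` column counts previous premature labours rather than indicating them, so column names and coding may differ from the file handed out in class. The names `bw`, `fit`, `coef_table`, and `r_squared` are illustrative only.

```r
# Sketch only: MASS::birthwt is assumed to be a stand-in for the course data file.
library(MASS)

bw <- birthwt
# Treat race as a factor so lm() does not read the codes 1, 2, 3 as numeric.
bw$race <- factor(bw$race, levels = 1:3, labels = c("white", "black", "other"))

# Regress birth weight on the candidate predictors. `low` is left out because
# it is derived from bwt itself.
fit <- lm(bwt ~ age + lwt + race + smoke + ptl + ht + ui + ftv, data = bw)

coef_table <- summary(fit)$coefficients  # estimates, std. errors, t values, p-values
r_squared  <- summary(fit)$r.squared     # coefficient of determination

print(coef_table)
print(r_squared)

# Scatterplot of birth weight against one predictor, with its fitted line.
plot(bwt ~ lwt, data = bw,
     xlab = "Mother's weight at last menstrual period (lb)",
     ylab = "Birth weight (g)")
abline(lm(bwt ~ lwt, data = bw))

# Residuals against fitted values for the full model.
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

Predictors with small p-values in `coef_table` are the natural candidates for the scatterplot, and points that sit far from zero in the residual plot are the ones to examine as possible outliers.
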