This homework differs from all future homeworks in that I have some special requests/instructions: (1) the first problem is due later than the others, and (2) problems two and three are to be solved independently of your colleagues. I will add at least one more question based on new material next week.
Due 9/14/07 (except problem 1)
- [Due 9/28/07] Form a group (1-4 people). After discussing with your group and your professors and reading the relevant literature, choose a problem that is close to your research area and makes significant use of statistical techniques. Submit a write-up in which you (1) describe the problem, (2) summarize what is done in the literature, and (3) justify why your extension is possible and significant. The write-up should include all relevant references. Try to be as thorough as possible. It is ok if the statistical techniques used in the literature relevant to the problem are difficult for you at this point, but the other parts of the problem (i.e., the impact of this project in your research area) should be absolutely clear to you. Caution: This project should be small enough to complete in 2 months' time and novel enough to be published! Feel free to use the wiki to discuss interests and develop project ideas.
- [independent work] Install R on your computer. You will get an introduction to R after this assignment is due, but to get you started, use R to compute the exact probability in the problem from class where you measure the amount of water in 100 bottles sampled from a manufacturing process that fills water bottles with mean ml water and standard deviation ml. Use the commands help() and help.search(), along with the R website, to figure out a solution. In your write-up for this problem, show the commands you used and the output you obtained.
- [independent work] Write a real-life problem involving one of the distributions encountered during the first two weeks of class. Use plausible assumptions. For example, in the sample problem below, it is reasonable to model bus arrivals as Poisson because buses arrive infrequently at a bus stop.
The random variable $X$ counting the number of CyRide buses arriving at a bus stop between 12pm and 1pm follows a Poisson(1) distribution.
- What is the probability that exactly 5 buses arrive between 12pm and 1pm on a random day?
- If 2 buses arrive between 12 and 12:30, what is the chance that none will arrive between 12:30 and 1?
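As a rough sketch of how the two sample questions can be answered, write $X$ for the hourly count and $N$ for the count between 12:30 and 1 ($N$ is a label introduced here). The sketch assumes that arrivals form a Poisson process with a constant rate of 1 bus per hour, so that the counts in the two half-hours are independent:

```latex
\begin{align*}
P(X = 5) &= \frac{e^{-1}\, 1^{5}}{5!} = \frac{e^{-1}}{120} \approx 0.0031, \\
N &\sim \text{Poisson}(1/2), \\
P(N = 0 \mid \text{2 buses before 12:30}) &= P(N = 0) = e^{-1/2} \approx 0.607.
\end{align*}
```

The second answer ignores the 2 earlier arrivals only because of the independence assumption; a different arrival model could give a different answer.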
- Every day at a particular hour, a message is transmitted between Earth and a remote vehicle on Mars. If the message is received, the remote vehicle will send confirmation back to Earth. Suppose you record for $n$ days whether the inbound confirmation message arrives. Let $X$ be the number of messages received in $n$ days. What are the mean, variance, and distribution of the random variable $X$? If a confirmation message was received on each of the first 5 days, what is the probability that a message is received on days 6 and 7?
- Suppose you once wrote a program to simulate message delivery on a network. It's been a while since you wrote the program, but you at least remember (because there is a program option to set this parameter) that new messages arrived on the network at random with an average of one every 2 time steps. What you don't remember are the details of how you modeled the arrival process.
- How could you check whether your program models arrivals as a Poisson process (so that the number of arrivals in a fixed interval of time is Poisson distributed)? State a null hypothesis to test.
- If I tell you that $(n-1)s^2/\sigma^2$ has a chi-squared sampling distribution $\chi^2_{n-1}$, where $s^2$ is the sample variance, $n$ is the sample size, and $\sigma^2$ is the population variance, can you determine whether the following data on the number of arrivals simulated by your program during 30 independent runs of 100 time steps are consistent with the hypothesis of Poisson arrivals?
60 51 44 41 67 47 48 52 59 55 59 44 42 44 41 53 68 55 64 51 50 55 49 51 48 45 51 61 52 45