Homework 1 - Probability and Statistics Review
Due Friday, September 8, 11:59 PM on D2L
Homework Instructions and Requirements
- Homework 1 covers course material from Weeks 1 and 2.
- Please submit your work as a single PDF file to D2L > Assessments > Dropbox. Multiple files, or a file that is not in PDF format, are not allowed.
- In your homework, please number and answer questions in order. 
- It is your responsibility to make clear what you are trying to show. If you type your answers, make sure there are no typos. I grade your work based on what you show, not what you meant to show. If you choose to handwrite your answers, write them neatly. If I can't read sloppy handwriting, your answer will be judged as wrong.
- Relevant code should be attached. 
Programming and Computing
Please sharpen your coding skills using R or any language you prefer. No need to show your work on this part!
Probability and Statistics Review
- Suppose \(Y_1 \sim N(3, 8)\) and \(Y_2 \sim N(1, 4)\), where \(Y_1\) and \(Y_2\) are independent. What distribution does the variable \(2Y_1 + 3Y_2\) follow?
- Plot normal density curves with different choices of mean and standard deviation. 
- Install the R package ISLR2. Choose a continuous variable in the ISLR2::Boston data set. Use the sample() function to draw a simple random sample of size 20 from this population. Calculate the sample average.
- Repeat the sampling from the previous question several times, and plot the resulting sampling distribution of the sample mean.
- Suppose \(Y_i \stackrel{iid}{\sim} N(\mu, \sigma^2)\), \(i = 1, 2, \dots, n\), with unknown \(\mu\) and \(\sigma\). The \(100(1-\alpha)\%\) confidence interval (CI) for the population mean \(\mu\) is \(\left( \overline{y} - t_{\alpha/2, n-1}\frac{s}{\sqrt{n}}, \overline{y} + t_{\alpha/2, n-1}\frac{s}{\sqrt{n}} \right)\). Use simulation with \(\alpha = 0.1\), \(\mu = 4\) and \(\sigma = 2\) to verify that such CIs contain \(\mu\) about \(100(1-\alpha)\%\) of the time. Fill in the percentages in the following table, and comment on your results.
| Number of simulations | \(n=5\) | \(n=30\) | \(n=200\) |
|---|---|---|---|
| \(20\) | | | |
| \(1000\) | | | |
| \(20000\) | | | |
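
One possible way to set up the coverage simulation for a single cell of the table is sketched below. It is only a starting point, not a required approach: the helper name coverage_rate() and the seed are assumptions made for illustration, while \(\alpha\), \(\mu\) and \(\sigma\) are taken from the question.

```r
# Sketch: estimate how often the t-based CI contains the true mean.
# coverage_rate() is a hypothetical helper name, not a base R function.
coverage_rate <- function(n_sim, n, mu = 4, sigma = 2, alpha = 0.1) {
  # Upper alpha/2 critical value of the t distribution with n - 1 df
  t_crit <- qt(1 - alpha / 2, df = n - 1)
  covered <- replicate(n_sim, {
    y <- rnorm(n, mean = mu, sd = sigma)     # one simulated sample
    half_width <- t_crit * sd(y) / sqrt(n)   # CI half-width
    abs(mean(y) - mu) < half_width           # TRUE if the CI contains mu
  })
  mean(covered)                              # proportion of CIs covering mu
}

set.seed(2023)  # arbitrary seed, used only for reproducibility
rate <- coverage_rate(n_sim = 1000, n = 30)
print(rate)
```

Calling the same function with the other combinations of simulation count and \(n\) would fill the remaining cells; with few simulations the estimated proportion can vary noticeably from run to run.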
- If \(U_1\) and \(U_2\) are independent uniform random variables on the interval \([0, 1]\) (https://en.wikipedia.org/wiki/Continuous_uniform_distribution), then \(X_1\) and \(X_2\) defined by \[X_1 = \sqrt{-2\ln(U_1)}\cos(2\pi U_2), \quad X_2 = \sqrt{-2\ln(U_1)}\sin(2\pi U_2)\] are independent \(N(0, 1)\) variables. Draw 10,000 samples of \(U_1\) and \(U_2\) using the runif() function, and use the transformation to generate samples of \(X_1\) and \(X_2\). Verify
  - the standard normality of \(X_1\) and \(X_2\) by plotting their histograms with a superimposed standard normal density.
  - the independence of \(X_1\) and \(X_2\) by plotting the scatterplot of \(X_1\) and \(X_2\) and computing their correlation coefficient.
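
The transformation in the last question can be sketched in R roughly as follows. This is a minimal illustration under stated assumptions: the variable names, the seed, and the plotting options are arbitrary choices, and only the histogram for \(X_1\) is shown.

```r
# Sketch: generate normal samples from uniform samples via the
# transformation given in the question.
set.seed(1)                 # arbitrary seed, used only for reproducibility
n_draws <- 10000
u1 <- runif(n_draws)        # U_1 ~ Uniform(0, 1)
u2 <- runif(n_draws)        # U_2 ~ Uniform(0, 1)

radius <- sqrt(-2 * log(u1))
x1 <- radius * cos(2 * pi * u2)
x2 <- radius * sin(2 * pi * u2)

# Histogram on the density scale with the N(0, 1) density superimposed
hist(x1, breaks = 50, freq = FALSE, main = "X1", xlab = "x1")
curve(dnorm(x), add = TRUE, lwd = 2)

# Scatterplot and sample correlation of X1 and X2
plot(x1, x2, pch = ".", main = "X1 vs. X2")
print(cor(x1, x2))
```

A correlation near zero is consistent with independence but does not prove it on its own, which is why the scatterplot is examined as well.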