MATH 4780 / MSSC 5780 Regression Analysis
Without constant variance,
LSEs are still unbiased if linearity and independence of \(x\)s and errors hold.
Inference results are unreliable: distorted \(p\)-values and incorrect coverage of confidence intervals.
The \(b_j\)s will have larger standard errors (they are no longer minimum variance).
Is it harder or easier to reject \(H_0: \beta_j = 0\) when the standard error of \(b_j\) is larger?
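A minimal simulation sketch (entirely made-up data) illustrates the point: when the error standard deviation grows with \(x\), the Monte Carlo sampling standard deviation of \(b_1\) and the average standard error reported by ordinary least squares will generally not agree.

set.seed(1)
res <- replicate(2000, {
  x <- runif(100, 0, 10)
  y <- 1 + 2 * x + rnorm(100, sd = 0.5 * x)   # error SD grows with x
  s <- summary(lm(y ~ x))$coefficients
  c(b1 = s[2, 1], se = s[2, 2])
})
sd(res["b1", ])    # Monte Carlo (true) sampling SD of b1
mean(res["se", ])  # average standard error reported by OLS, for comparison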
Common pattern: the conditional variation of \(y\) increases with the level of \(y\).
It seems natural to plot \(e_i\) against \(y_i\), but \(e_i\) and \(y_i\) are correlated.
Use the R-student residual plot, \(t_i\) vs. \(\hat{y}_i\), because R-student residuals have constant variance and are uncorrelated with \(\hat{y}_i\).
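A minimal sketch of such a plot in R, using the built-in cars data set only as a stand-in for any fitted model:

fit <- lm(dist ~ speed, data = cars)   # cars is built-in; used purely for illustration
plot(fitted(fit), rstudent(fit),
     xlab = "Fitted values", ylab = "R-student residual")
abline(h = 0, lty = 2)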
Can you provide a distribution whose variance is functionally related to its mean?
| Relationship of \(\sigma^2\) to \(E(y)\) | Transformation |
|---|---|
| \(\sigma^2 \propto\) constant | \(y' = y\) (no transformation) |
| \(\sigma^2 \propto E(y)\) | \(y' = \sqrt{y}\) (square root; Poisson) |
| \(\sigma^2 \propto E(y)[1-E(y)]\) | \(y' = \sin^{-1}(\sqrt{y})\) (arcsin; binomial proportions \(0 \le y_i \le 1\)) |
| \(\sigma^2 \propto [E(y)]^2\) | \(y' = \ln(y)\) (natural log) |
| \(\sigma^2 \propto [E(y)]^3\) | \(y' = y^{-1/2}\) (reciprocal square root) |
| \(\sigma^2 \propto [E(y)]^4\) | \(y' = y^{-1}\) (reciprocal) |
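For example, a Poisson count has \(\text{Var}(y) = E(y)\), and the square-root transformation roughly stabilizes its variance. A quick simulation sketch (arbitrary means):

set.seed(1)
mu <- c(10, 40, 160)
sapply(mu, function(m) var(rpois(1e4, m)))         # variance grows with the mean
sapply(mu, function(m) var(sqrt(rpois(1e4, m))))   # roughly constant (close to 1/4)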
The spread of the residuals increases with the fitted values.
Nonlinear pattern.
Nonsense negative fitted values.
Variation around the central smooth seems approximately constant.
Nonlinearity is still apparent. (Linearity and lack of fit next week)
library(car)   # spreadLevelPlot() is from the car package
spreadLevelPlot(logciafit, smooth = FALSE)
Suggested power transformation: 1.1
Suppose that the other assumptions hold but that the error variances differ: \(\epsilon_i \stackrel{indep}{\sim} N(0, \sigma_i^2)\)
The resulting model isn’t estimable: with a separate \(\sigma_i^2\) for each observation, there are more parameters than data points.
If we know the pattern of unequal error variances, so that \(\sigma_i^2 = \sigma^2/w_i\), for known values \(w_i\), we can compute the coefficient estimates that minimize the Weighted Residual Sum of Squares.
In simple linear regression, \[S(\beta_0, \beta_1) = \sum_{i=1}^nw_i\epsilon_i^2 = \sum_{i=1}^n w_i \left(y_i - \beta_0 - \beta_1x_i \right)^2\]
Observations with large variances will have smaller weights.
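More generally, with \(\mathbf{W} = \text{diag}(w_1, \dots, w_n)\), minimizing the weighted sum of squares gives the weighted least squares estimator
\[\hat{\boldsymbol{\beta}}_{WLS} = (\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}\mathbf{X}'\mathbf{W}\mathbf{y}, \qquad \text{Var}\left(\hat{\boldsymbol{\beta}}_{WLS}\right) = \sigma^2(\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}.\]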
How can the weights \(w_i\) be chosen?
Prior knowledge or information.
Residual analysis.
Chosen inversely proportional to the variances of measurement errors.
# Food sales (y) vs. advertising expense (x); group marks sets of observations
# with similar expense levels (near neighbors)
x <- c(3000, 3150, 3085, 5225, 5350, 6090, 8925, 9015, 8885, 8950, 9000, 11345,
12275, 12400, 12525, 12310, 13700, 15000, 15175, 14995, 15050, 15200,
15150, 16800, 16500, 17830, 19500, 19200, 19000, 19350)
y <- c(81464, 72661, 72344, 90743, 98588, 96507, 126574, 114133, 115814, 123181,
131434, 140564, 151352, 146926, 130963, 144630, 147041, 179021, 166200, 180732,
178187, 185304, 155931, 172579, 188851, 192424, 203112, 192482, 218715, 214317)
group <- c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 7, 7, 7, 7, 7, 7,
8, 8, 9, 10, 10, 10, 10)
ex5_5 <- data.frame("expense" = x, "sales" = y, "group" = group)
par(mar = c(3.3, 3.3, 0, 0), mgp = c(2, 1, 0))
plot(ex5_5$expense, ex5_5$sales, xlab = "Advertising expense",
ylab = "Food sales", col = group, pch = 16, cex = 2, cex.lab = 1.5)
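The code that produced the output below is not shown; here is a sketch of one way such weights could be obtained, assuming the group variances of sales are modeled as a linear function of the group mean expense (the object names varfit and w are placeholders):

# Sketch (assumed procedure): estimate how the variance of sales changes with
# expense using the near-neighbor groups, then set each weight to 1 / fitted variance.
x_bar <- tapply(ex5_5$expense, ex5_5$group, mean)   # group mean expense
s2 <- tapply(ex5_5$sales, ex5_5$group, var)         # group variance of sales (NA for singleton groups)
varfit <- lm(s2 ~ x_bar)                            # singleton groups are dropped automatically
coef(varfit)
w <- 1 / (coef(varfit)[1] + coef(varfit)[2] * ex5_5$expense)   # weight for each observation
signif(w, 2)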
(Intercept) x_bar
-9226002 7782
[1] 7.1e-08 6.5e-08 6.8e-08 3.2e-08 3.1e-08 2.6e-08 1.7e-08 1.6e-08 1.7e-08
[10] 1.7e-08 1.6e-08 1.3e-08 1.2e-08 1.1e-08 1.1e-08 1.2e-08 1.0e-08 9.3e-09
[19] 9.2e-09 9.3e-09 9.3e-09 9.2e-09 9.2e-09 8.2e-09 8.4e-09 7.7e-09 7.0e-09
[28] 7.1e-09 7.2e-09 7.1e-09
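The weighted fit itself uses the weights argument of lm(); a sketch, assuming the w computed above:

wls_fit <- lm(ex5_5$sales ~ ex5_5$expense, weights = w)
wls_fit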
...
Coefficients:
(Intercept) ex5_5$expense
51024.81 7.92
...