Regression Diagnostics - Linearity

MATH 4780 / MSSC 5780 Regression Analysis

Dr. Cheng-Han Yu
Department of Mathematical and Statistical Sciences
Marquette University

Model Adequacy Checking and Correction

Non-normality

Non-constant Error Variance

Non-linearity and Lack of Fit

Assumptions of Linear Regression

\(Y_i= \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \dots + \beta_kX_{ik} + \epsilon_i\)

  • \(E(Y \mid X)\) and \(X\) are linearly related.
  • \(E(\epsilon_i) = 0\)
  • \(\mathrm{Var}(\epsilon_i) = \sigma^2\)
  • \(\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0\) for all \(i \ne j\).
  • \(\epsilon_i \stackrel{iid}{\sim} N(0, \sigma^2)\) (for statistical inference)

  • Assuming \(E(\epsilon) = 0\) implies that the regression surface captures the dependency of the conditional mean of \(y\) on the \(x\)s.
  • Violating linearity implies that the model fails to represent the relationship between the mean response and the regressors. (Lack of fit)
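
A toy simulation that satisfies every assumption above (a sketch only; the sample size, coefficients, and variable names are made up for illustration):

set.seed(4780)
n   <- 100
x1  <- runif(n)
x2  <- runif(n)
eps <- rnorm(n, mean = 0, sd = 0.5)  # iid N(0, sigma^2): mean zero, constant variance
y   <- 1 + 2 * x1 - 3 * x2 + eps     # E(Y | X) is exactly linear in x1 and x2
fit <- lm(y ~ x1 + x2)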

Detecting Nonlinearity (CIA Example)

  • A scatterplot of \(y\) against each \(x\) can be misleading! It shows only the marginal relationship between \(y\) and each \(x\), without controlling for the levels of the other regressors (see the sketch below).
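
For example, a marginal scatterplot matrix (a sketch, assuming the CIA data frame used in the R labs below):

# marginal relationships only; nothing is controlled for
car::scatterplotMatrix(~ log(infant) + gdp + health + gini, data = CIA)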

Detecting Nonlinearity: Residual Plots

  • We care about the partial relationship between \(y\) and each \(x\), with the effects of the other \(x\)s controlled.

  • Residual-based plots are more relevant for detecting departures from linearity; a sketch follows this list.

  • Residual plots cannot distinguish between monotone and non-monotone nonlinearity, but the distinction matters for choosing a remedy:
    • Monotone: a power transformation of \(x\), e.g., \(\log(x)\), often suffices.
    • Non-monotone: no power transformation of \(x\) can straighten the relationship; add a quadratic term \(x^2\) instead.
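
A minimal residual-plot sketch, fitting the same model as the R lab below:

logciafit <- lm(log(infant) ~ gdp + health + gini, data = CIA)
# Pearson residuals vs. each regressor and vs. fitted values, with curvature tests
car::residualPlots(logciafit)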

Detecting Nonlinearity: Partial Residual Plots

  • Partial residual plots (component-plus-residual plots) are designed for diagnosing nonlinearity.

  • The partial residuals for \(x_j\): \[e_i^{(j)} = b_jx_{ij} + e_i\]

    • \(b_j\) is the coefficient of \(x_j\) in the full multiple regression
    • \(e_i\)s are the residuals from the full multiple regression
  • The partial residual plot for \(x_j\) graphs \(e_i^{(j)}\) against \(x_{ij}\).
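
To see what crPlots() below draws, the partial residuals for gdp can be computed by hand (a sketch, reusing the logciafit fit from above):

b_gdp <- coef(logciafit)["gdp"]                  # b_j from the full regression
e_gdp <- b_gdp * CIA$gdp + residuals(logciafit)  # e_i^(j) = b_j x_ij + e_i
plot(CIA$gdp, e_gdp, xlab = "gdp", ylab = "partial residual")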

R Lab Partial Residual Plots

logciafit <- lm(log(infant) ~ gdp + health + gini, data = CIA)
# Component-plus-Residual Plot 
car::crPlots(logciafit, ylab = "partial residual", layout = c(1, 3), grid = FALSE, main = "") 

Transformation for Linearity

  • Monotone, simple: Power transformation on \(x\) and/or \(y\)
  • Monotone, not simple: Polynomial regression (next week) or regression splines (MSSC 6250)
  • Non-Monotone, simple: Quadratic regression \(y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon\)
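
In lm() syntax, the two simple remedies look like this (a sketch; dat, x, and y are placeholders, not the CIA data):

fit_power <- lm(y ~ log(x), data = dat)       # monotone, simple: power-transform x
fit_quad  <- lm(y ~ x + I(x^2), data = dat)   # non-monotone, simple: quadratic regression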

Bulging Rule for Simple Monotone Nonlinearity

The bulge points    Transformation (ladder of powers/roots)
left                \(x\) down the ladder, e.g., \(\log(x)\), \(\sqrt{x}\)
right               \(x\) up the ladder, e.g., \(x^2\)
down                \(y\) down the ladder, e.g., \(\log(y)\)
up                  \(y\) up the ladder, e.g., \(y^2\)
  • Prefer to transform an \(x\) rather than \(y\), unless we see a common pattern of nonlinearity in the partial relationships of \(y\) to many \(x\)s.
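
For instance, when the bulge points left we move \(x\) down the ladder and compare (a hypothetical sketch; x and y are placeholders):

plot(x, y)         # original scatterplot: bulge points left
plot(sqrt(x), y)   # one step down the ladder
plot(log(x), y)    # a further step down; stop when the relationship looks straight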

Transformation on \(x\)s

  • gdp to log(gdp)
  • health to health + health^2

R Lab Partial Residual Plots

logciafit2 <- update(logciafit, . ~ log(gdp) + poly(health, degree = 2, raw = TRUE) + gini)
car::crPlots(logciafit2, ylab = "partial residual", layout = c(1, 3), grid = FALSE, main = "") 

R Lab Improving Model Performance

car::brief(logciafit, digits = 2)
           (Intercept)     gdp health   gini
Estimate          3.02 -0.0439 -0.055 0.0216
Std. Error        0.29  0.0037  0.022 0.0061

 Residual SD = 0.59 on 130 df, R-squared = 0.71 
car::brief(logciafit2, digits = 2)
           (Intercept) log(gdp) poly(health, degree = 2, raw = TRUE)1
Estimate          4.65   -0.720                                -0.221
Std. Error        0.32    0.038                                 0.058
           poly(health, degree = 2, raw = TRUE)2   gini
Estimate                                  0.0096 0.0191
Std. Error                                0.0034 0.0044

 Residual SD = 0.44 on 129 df, R-squared = 0.84 

R Lab Plotting against Original Untransformed \(x\)

library(effects)
par(mar = c(2, 2, 0, 0))
plot(Effect("gdp", logciafit2, residuals = TRUE), 
     lines = list(col = c("blue", "black"), lty = 2), 
     axes = list(grid = TRUE), confint = FALSE, 
     partial.residuals = list(plot = TRUE, smooth.col = "magenta", 
                              lty = 1, 
                              span = 3/4), 
     xlab = "GDP per Capita", ylab = "Partial Residual", main = "", cex.lab = 2)

par(mar = c(2, 2, 0, 0))
plot(Effect("health", logciafit2, residuals = TRUE), 
     lines = list(col = c("blue", "black"), lty = 2), 
     axes = list(grid = TRUE), confint = FALSE, 
     partial.residuals = list(plot = TRUE, smooth.col = "magenta", 
                              lty = 1, 
                              span = 3/4),
     xlab = "Health Expenditures", ylab = "Partial Residual", main = "", cex.lab = 2)

Transforming \(x\)s Analytically: Box and Tidwell (1962)

  • Box and Tidwell (1962) proposed a procedure for estimating \(\lambda_1, \lambda_2, \dots, \lambda_k\) in the model \[y = \beta_0 + \beta_1x_1^{\lambda_1} + \cdots + \beta_kx_k^{\lambda_k}+ \epsilon\]

  • All \(x_j\)s are positive.

  • \(\beta_0, \beta_1, \dots, \beta_k\) are estimated after and conditional on the transformations.

  • \(x_j^{\lambda_j} = \log_e(x_j)\) if \(\lambda_j = 0\).

R Lab Box and Tidwell (1962)

Consider the model \[\log(Infant) = \beta_0 + \beta_1 GDP^{\lambda_1} + \beta_2 Gini^{\lambda_2} + \beta_3 Health + \beta_4 Health^2 + \epsilon\]

car::boxTidwell(log(infant) ~ gdp + gini, 
                other.x = ~poly(health, 2, raw = TRUE), data = CIA)
...
     MLE of lambda Score Statistic (t) Pr(>|t|)    
gdp            0.2                10.6   <2e-16 ***
gini          -0.5                -0.4      0.7    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
...
  • The point estimates are \(\hat{\lambda}_1 = 0.2\) for \(GDP\) and \(\hat{\lambda}_2 = -0.5\) for \(Gini\).

  • The test is for \(H_0:\) No transformation is needed \((\lambda = 1)\).

    • Strong evidence to transform \(GDP\)
    • Little evidence of the need to transform the Gini coefficient
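
To act on these results, one could refit with the estimated power for gdp and leave gini untransformed (a sketch; logciafit3 is a hypothetical name, and since \(\hat{\lambda}_1 = 0.2\) is close to 0, the log(gdp) used earlier is a sensible rounded alternative):

logciafit3 <- lm(log(infant) ~ I(gdp^0.2) + poly(health, 2, raw = TRUE) + gini,
                 data = CIA)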

Other Methods for Dealing with Nonlinearity

  • Lack-of-fit test (LRA Sec 4.5, CMR Sec. 3.6): requires repeated observations at one or more \(x\) settings
  • Transform a nonlinear function into a linear one (LRA Sec 5.3)

Can the nonlinear model \(y = \beta_0e^{\beta_1x}\epsilon\) be transformed into a linear one (intrinsically linear)?
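
One way to see the answer (a sketch, assuming \(\epsilon > 0\)): taking logarithms of both sides gives \[\log y = \log \beta_0 + \beta_1 x + \log \epsilon,\] which is linear in \(x\) with intercept \(\log \beta_0\) and error \(\log \epsilon\), so the model is intrinsically linear.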

  • Polynomial Regression, Regression Splines or other nonparametric regression (MSSC 6250)
  • A (pure) nonlinear model may be needed if the model assumptions cannot be satisfied.