MATH 4780 / MSSC 5780 Regression Analysis
A second-order (degree) polynomial in one variable, or a quadratic model, is $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$.
A second-order polynomial in two variables is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \epsilon$.
The polynomial model is still linear in the coefficients $\beta_j$, so it is fit by ordinary least squares.
If we set $z_1 = x$ and $z_2 = x^2$, the quadratic model becomes a multiple linear regression model with the two regressors $z_1$ and $z_2$.
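A minimal sketch in R (with simulated data; the object names here are illustrative, not from the slides):

# A quadratic model is fit as an ordinary multiple regression on x and x^2.
set.seed(4780)
x <- runif(50, 0, 10)
y <- 2 + 1.5 * x - 0.3 * x ^ 2 + rnorm(50)
fit_raw  <- lm(y ~ x + I(x ^ 2))   # I() protects ^ inside a formula
fit_orth <- lm(y ~ poly(x, 2))     # same fit via orthogonal polynomials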
Keep the order of the model as low as possible.
Transform the data to keep the model first order (see the log-transform sketch after this list).
If that fails, try a second-order model.
Avoid higher-order polynomials unless they can be justified for reasons outside the data.
👉 Occam’s Razor: among competing models that predict equally well, choose the “simplest” one, i.e., a parsimonious model.
See Wilson and Izmailov (2020), “Bayesian Deep Learning and a Probabilistic Perspective of Generalization,” for a rationale for choosing a very high-order polynomial as the regression model.
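As one illustration of keeping the model first order (a hypothetical sketch, not from the slides): if $y$ grows multiplicatively in $x$, a log transformation linearizes the relationship.

# Hypothetical sketch: a log transform turns a multiplicative
# relationship into a first-order model.
x <- runif(50, 1, 10)
y <- exp(0.5 + 0.8 * x) * exp(rnorm(50, sd = 0.2))
fit_log <- lm(log(y) ~ x)   # first order on the log scale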
Model building strategy
👉 Forward selection: successively fit models of increasing order until the $t$-test for the highest-order term is nonsignificant (see the sketch after this list).
👉 Backward elimination: fit the highest-order model and then delete terms one at a time until the highest-order remaining term has a significant $t$ statistic.
👉 They do not necessarily lead to the same model.
👉 Restrict our attention to low-order polynomials.
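A sketch of forward selection for the polynomial order (simulated data; the 0.05 cutoff and all object names are illustrative):

set.seed(1)
x <- runif(60, 0, 10)
y <- 1 + 2 * x - 0.4 * x ^ 2 + rnorm(60)
for (k in 1:5) {
  fit <- lm(y ~ poly(x, k))                  # orthogonal polynomial of order k
  p <- summary(fit)$coefficients[k + 1, 4]   # p-value of the highest-order term
  cat("order", k, "highest-term p-value:", round(p, 4), "\n")
  if (p > 0.05) break                        # stop; keep order k - 1
}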
Extrapolation
Extrapolating a polynomial fit beyond the range of the data is hazardous: outside that range, the fitted curve can turn in directions the data never suggested.
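A quick simulated illustration (hypothetical data) of the danger:

# A quadratic fits square-root-shaped data well on [0, 10], but its
# predictions bend downward beyond the observed range while the truth rises.
set.seed(2)
x <- runif(40, 0, 10)
y <- 5 + 3 * sqrt(x) + rnorm(40, sd = 0.5)
fit <- lm(y ~ x + I(x ^ 2))
predict(fit, newdata = data.frame(x = c(5, 15, 25)))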
Ill-conditioning
As the order of the polynomial increases, $X'X$ becomes ill-conditioned because the columns $x, x^2, \dots$ are nearly collinear. Centering the regressor (subtracting its mean) removes this nonessential ill-conditioning:
conc_cen <- hardwood$conc - mean(hardwood$conc)  # center the regressor
lm(strength ~ conc_cen + I(conc_cen ^ 2), data = hardwood)
Call:
lm(formula = strength ~ conc_cen + I(conc_cen^2), data = hardwood)
Coefficients:
  (Intercept)       conc_cen  I(conc_cen^2)
       45.295          2.546         -0.635
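To see what centering buys us, compare the condition numbers of the two design matrices (a sketch using the same hardwood data as above):

# kappa() estimates the condition number of a matrix.
X_raw <- model.matrix(~ conc + I(conc ^ 2), data = hardwood)
X_cen <- model.matrix(~ conc_cen + I(conc_cen ^ 2))
kappa(X_raw)   # large: conc and conc^2 are nearly collinear
kappa(X_cen)   # much smaller after centering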
SOLUTION: 👉 piecewise polynomial regression, which fits separate polynomials over different regions of $x$.
Example: fit one cubic polynomial for $x < t$ and a different cubic for $x \ge t$, where $t$ is a chosen break point.
The points where the pieces join are called knots.
With $K$ knots, we fit $K + 1$ separate polynomials, one per region.
Any issue with piecewise polynomials? Without continuity constraints at the knots, the fitted function can jump at each knot.
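A sketch of an unconstrained piecewise cubic with a single knot (simulated data; the knot location $t = 5$ is illustrative). The two fitted pieces need not meet at the knot, which is exactly the discontinuity problem:

set.seed(3)
x <- runif(80, 0, 10)
y <- sin(x) + rnorm(80, sd = 0.2)
t <- 5
left  <- lm(y ~ x + I(x ^ 2) + I(x ^ 3), subset = x <  t)
right <- lm(y ~ x + I(x ^ 2) + I(x ^ 3), subset = x >= t)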
Splines of degree $d$ are piecewise polynomials of degree $d$ with continuous derivatives up to order $d - 1$ at each knot.
In R, the spline basis is generated by the bs() function in the splines package.
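A minimal sketch of a cubic spline fit (simulated data; the interior knots at 3 and 7 are illustrative):

library(splines)
set.seed(4)
x <- runif(100, 0, 10)
y <- sin(x) + rnorm(100, sd = 0.2)
# bs() builds a cubic B-spline basis that is continuous with continuous
# first and second derivatives at the knots; lm() fits it by least squares.
fit_bs <- lm(y ~ bs(x, knots = c(3, 7), degree = 3))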