Primer on Polynomial Regression in Python
We went through linear regression in the notebook here previously.
Many relationships are clearly not linear, so we need a different tool to model them.
Let's run through how to undertake polynomial regression in this post.
First, what is polynomial regression?
This is simply linear regression run on higher-order (polynomial) terms of the input features.
Why do we need it?
Sometimes a line just does not cut it, and you need a curve.
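To make that concrete, here is a minimal sketch (with made-up data, not the dataset used in this post) comparing a straight-line fit against a quadratic fit on data generated from a curve:

```python
import numpy as np

# Hypothetical curved data: y = x^2 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(scale=0.1, size=x.size)

# Fit a degree-1 (line) and a degree-2 (parabola) polynomial by least squares
line_coeffs = np.polyfit(x, y, deg=1)
quad_coeffs = np.polyfit(x, y, deg=2)

line_mse = np.mean((np.polyval(line_coeffs, x) - y) ** 2)
quad_mse = np.mean((np.polyval(quad_coeffs, x) - y) ** 2)

print(line_mse, quad_mse)  # the quadratic fits the curved data far better
```

The line's error stays large no matter how it is placed, because no straight line can follow the bend in the data.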
We use the same data as in the last post on linear regression, so we won't go through it again.
First, we import and declare all the instances we need for linear, quadratic and cubic regression.
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
# Recall we set interaction_only to True in Basics - Exploring Interactions. Here we want all terms in so we turn it off
linear_regression = linear_model.LinearRegression(fit_intercept=True)
create_cubic = PolynomialFeatures(degree=3, interaction_only=False, include_bias=False)
create_quadratic = PolynomialFeatures(degree=2, interaction_only=False, include_bias=False)
# Putting these in a pipeline as before
linear_predictor = make_pipeline(linear_regression)
quadratic_predictor = make_pipeline(create_quadratic, linear_regression)
cubic_predictor = make_pipeline(create_cubic, linear_regression)
With these set up, regression is super simple.
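As a side note, it can help to see what PolynomialFeatures actually generates. A quick sketch with a toy two-column input (the values here are illustrative, not from the post's data):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])  # one sample with two features, a and b

# Same settings as in the post: all degree-2 terms, no constant column
quad = PolynomialFeatures(degree=2, interaction_only=False, include_bias=False)
print(quad.fit_transform(X))
# columns are a, b, a^2, a*b, b^2 -> [[2. 3. 4. 6. 9.]]
```

So the "quadratic regression" is still ordinary linear regression, just over these expanded columns.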
For linear regression, we just need to plug the variables into the linear_predictor we setup above.
regr_line = scatter.plot(xt, linear_predictor.fit(x, y).predict(xt), '-', color='red', linewidth=2)
Same for quadratic (power of 2) and cubic (power of 3) regression.
regr_line = scatter.plot(xt, quadratic_predictor.fit(x, y).predict(xt), '-', color='red', linewidth=2)
regr_line = scatter.plot(xt, cubic_predictor.fit(x, y).predict(xt), '-', color='red', linewidth=2)
We could go crazy and have the regression be done at any arbitrary power.
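Since the post's data lives in the notebook, here is a self-contained sketch of the same fit-and-predict pattern on synthetic data (the variable names mirror the post; the data itself is made up):

```python
import numpy as np
from sklearn import linear_model
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data from a known quadratic: y = 1.5x^2 - 2x, with noise
rng = np.random.default_rng(1)
x = np.linspace(0, 4, 40).reshape(-1, 1)
y = 1.5 * x.ravel() ** 2 - 2.0 * x.ravel() + rng.normal(scale=0.2, size=40)

quadratic_predictor = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    linear_model.LinearRegression(),
)

# xt is a grid of points to evaluate the fitted curve at, as in the post
xt = np.linspace(0, 4, 100).reshape(-1, 1)
yt = quadratic_predictor.fit(x, y).predict(xt)
print(yt.shape)  # (100,)
```

The pipeline expands x into [x, x^2] and fits ordinary least squares on those columns, so fit and predict read exactly like the linear case.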
create_ten = PolynomialFeatures(degree=10, interaction_only=False, include_bias=False)
ten_predictor = make_pipeline(create_ten, linear_regression)
scatter = merged.plot(kind='scatter', x=predictor, y='Resale Index', xlim=x_range, ylim=y_range)
regr_line = scatter.plot(xt, ten_predictor.fit(x, y).predict(xt), '-', color='red', linewidth=2)
It fits perfectly. But that also means it is going to be a horrible predictor once we apply the curve to any other dataset.
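To see why the perfect fit is a warning sign, a quick sketch on synthetic data (not the Resale Index data): raising the degree can only drive the in-sample error down, because the lower-degree model is a special case of the higher-degree one. The extra wiggles are chasing the noise, not the signal.

```python
import numpy as np
from sklearn import linear_model
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data from a simple quadratic with fairly heavy noise
rng = np.random.default_rng(2)
x = np.linspace(0, 3, 30).reshape(-1, 1)
y = x.ravel() ** 2 + rng.normal(scale=1.0, size=30)

def train_mse(degree):
    """Mean squared error on the training data for a given polynomial degree."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        linear_model.LinearRegression(),
    )
    model.fit(x, y)
    return np.mean((model.predict(x) - y) ** 2)

# The degree-10 error is never higher than the degree-2 error in-sample,
# even though the true relationship is only quadratic
print(train_mse(2), train_mse(10))
```

That shrinking training error is exactly the trap: the degree-10 curve is fitting this particular sample's noise, which is why it generalises badly to any other dataset.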
The full set of code is in the notebook here.
playgrd.com || facebook.com/playgrdstar || instagram.com/playgrdstar/

