What is linear regression? It is a predictive modeling technique that finds a relationship between independent variable(s) and dependent variable(s) (which is a continuous variable). The independent variable(iv)’s can be categorical(e.g. US, UK, 0/1) or continuous(1729, 3.141 etc), while dependent variable(dv)s are continuous. Underlying function mapping iv’s and dv’s can be linear, quadratic, polynomial or other non-linear functions(like sigmoid function in logistic regression), but this article is on linear technique.
All in all, we saw how linear regression can be implemented in a few lines of code using sklearn library.
Ordinary least squares method to get a best fit line works well for many cases and is pretty intuitive.
However, when data suffer from multicollinearity or heteroskedasticity, we need to employ regularization techniques to perform regression.
VIF is a metric that can be used to detect multicollinearity in predictor variables.
Ridge and LASSO work pretty well to perform regression modeling.
Happy learning :)