Introduction to Linear Regression and Polynomial Regression

Ayush Pant
Jan 13

Introduction

In this blog, we will discuss two important topics that form a base for Machine Learning: “Linear Regression” and “Polynomial Regression”.

What is Regression?

Regression analysis is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables.

The above definition is a bookish one. In simple terms, regression can be defined as “using the relationship between variables to find the best-fit line, or regression equation, that can be used to make predictions”.

Regression | Image: Wikipedia

There are many types of regression, such as ‘Linear Regression’, ‘Polynomial Regression’, ‘Logistic Regression’ and others, but in this blog we are going to study “Linear Regression” and “Polynomial Regression”.

Linear Regression

Linear regression is a basic and commonly used type of predictive analysis that usually works on continuous data.

We will try to understand linear regression through an example:

Aarav is trying to buy a house and is collecting housing data so that he can estimate the “cost” of a house from its “Living area” in square feet.

Housing data | Andrew Ng course

He plots the data as a scatter plot, observes it, and comes to the conclusion that the relationship is linear.

For his first scatter plot, Aarav uses two variables: ‘Living area’ and ‘Price’.

Scatter plot | Image: Andrew Ng course

As soon as he saw a pattern in the data, he planned to draw a regression line on the graph so that he could use it to predict the ‘price of the house’.

Using the training data, i.e. ‘Price’ and ‘Living area’, a regression line is obtained which gives the minimum error.

To do that he needs to make a line that is closest to as many points as possible.

This ‘linear equation’ is then used for any new data so that he is able to predict the required output.

Yᵢ = β₀ + β₁Xᵢ + εᵢ

Here, β₁ is the parameter (also called the weight) that multiplies the input, β₀ is the y-intercept, and εᵢ is the random error term, which accounts for the part of the output that the line does not explain.

This is the linear equation that needs to be obtained with the minimum error.

It is a simple “equation of a line”:

Y(predicted) = β₁*x + β₀

where ‘β₁’ is the slope and ‘β₀’ is the y-intercept, just as in the equation of a line. The actual output is this predicted value plus the error value.

The values ‘β₁’ and ‘β₀’ must be chosen so that they minimize the error.
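To make this concrete, here is a minimal sketch of fitting such a line and using it on new data. The numbers are made up for illustration and are not the housing data from the course, and the fit uses NumPy's least-squares `np.polyfit` rather than the gradient descent procedure discussed later in this blog. The helper `predict_price` is a hypothetical name.

```python
import numpy as np

# Made-up training data for illustration (not from the original post):
# living area in square feet and price in thousands of dollars.
living_area = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([200.0, 290.0, 410.0, 500.0, 590.0])

# With deg=1, np.polyfit returns the least-squares slope and intercept,
# i.e. the values that minimize the sum of squared errors for a line.
slope, intercept = np.polyfit(living_area, price, deg=1)


def predict_price(area):
    """Predict the price for a new living area using the fitted line."""
    return slope * area + intercept


new_area = 1800.0
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"predicted price for {new_area:.0f} sq ft: {predict_price(new_area):.1f}")
```

Here `slope` plays the role of the weight on the living area and `intercept` is the y-intercept of the line.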

To check the error we have to calculate the sum of squared error and tune the parameters to try to reduce the error.

Error = Σ (actual output - predicted output)²

Cost function

Key:
1. Y(predicted) is also called the hypothesis function.
2. J(θ) is the cost function, which can also be called the error function. Our main goal is to minimize the value of the cost.
3. y(i) is the actual output for the i-th training example.
4. hθ(x(i)) is the hypothesis function evaluated at x(i), which is basically the Y(predicted) value for that example.
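The quantities in the key can be sketched in a few lines of Python. This is only an illustration on made-up data: the function names are hypothetical, and the 1/(2m) scaling used in `cost` is an assumption (a common convention), since the exact form of the cost function is not spelled out in the text here.

```python
import numpy as np


def hypothesis(theta0, theta1, x):
    """h_theta(x): the predicted output for input x."""
    return theta0 + theta1 * x


def sum_squared_error(theta0, theta1, x, y):
    """Sum over all examples of (actual output - predicted output)^2."""
    residuals = y - hypothesis(theta0, theta1, x)
    return float(np.sum(residuals ** 2))


def cost(theta0, theta1, x, y):
    """J(theta), assuming the common 1/(2m) scaling of the squared error."""
    m = len(x)
    return sum_squared_error(theta0, theta1, x, y) / (2 * m)


# Made-up data that lies exactly on the line y = 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

print(sum_squared_error(0.0, 2.0, x, y))  # the true line: no error
print(sum_squared_error(0.0, 1.0, x, y))  # a worse line: larger error
print(cost(0.0, 1.0, x, y))
```

A line that passes through every point has zero error, and the error grows as the parameters move away from those values.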

Now the question arises: how do we reduce the error value?

Well, this can be done by using gradient descent.

The main goal of gradient descent is to minimize the cost value, i.e. min J(θ₀, θ₁).

Gradient Descent Visualization | Gif: mi-academy.com

Gradient descent has an analogy: imagine being stranded and blindfolded at the top of a mountain, with the objective of reaching the bottom of the valley.

Feeling the slope of the terrain around you is what anyone would do.

This action is analogous to calculating the gradient, and taking a step is analogous to one iteration of the update to the parameters.

Gradient Descent Analogy | Image: Andrew Ng course

Choosing a good learning rate is very important, because it determines how large a step we take downhill during each iteration.

If we take too large a step, we may step over the minimum.

However, if we take very small steps, it will require many iterations to arrive at the minimum.
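The effect of the learning rate can be seen in a small sketch of gradient descent for a straight line. This is an illustrative implementation, not the exact procedure from the course: the data is made up, the function name `gradient_descent` is hypothetical, and the gradients assume a cost of the form J = (1/2m) Σ (prediction - actual)².

```python
import numpy as np


def gradient_descent(x, y, learning_rate, n_iterations):
    """Fit y ≈ theta0 + theta1 * x by batch gradient descent.

    Assumes the cost J = (1/2m) * sum((prediction - actual)^2).
    Returns the final parameters and the cost before each update.
    """
    theta0, theta1 = 0.0, 0.0
    m = len(x)
    history = []
    for _ in range(n_iterations):
        error = (theta0 + theta1 * x) - y
        history.append(float(np.sum(error ** 2) / (2 * m)))
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = np.sum(error) / m
        grad1 = np.sum(error * x) / m
        # One step downhill, scaled by the learning rate.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1, history


# Made-up data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

theta0, theta1, history = gradient_descent(x, y, 0.1, 2000)
_, _, slow_history = gradient_descent(x, y, 0.001, 2000)
_, _, diverged_history = gradient_descent(x, y, 0.5, 20)

print("reasonable step:", theta0, theta1, history[-1])
print("tiny step, cost after the same number of iterations:", slow_history[-1])
print("step too large, first and last cost:", diverged_history[0], diverged_history[-1])
```

With the moderate learning rate the parameters settle near the true intercept and slope. With the tiny one the cost is still noticeably above zero after the same number of iterations, and with the large one the cost grows instead of shrinking because each step overshoots the minimum.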

Linear Regression

Polynomial Linear Regression

In the last section, we saw that two variables in our data set were correlated. But what happens if we know that our data is correlated, but the relationship doesn’t look linear?

Depending on what the data looks like, we can do a polynomial regression to fit a polynomial equation to it.

Left: Linear Regression, Right: Polynomial regression | GIF: Towards Data Science

If we try to use simple linear regression on the data in the above graph, the regression line won’t fit very well.

It is very difficult to fit a straight line to this data with a low value of error.

Hence we can use polynomial regression to fit a polynomial curve and achieve a lower error, that is, a lower value of the cost function.

The equation of the polynomial regression for the above graph data would be:

y = θ₀ + θ₁x₁ + θ₂x₁²

The general equation of a polynomial regression is:

Y = θ₀ + θ₁X + θ₂X² + … + θₘXᵐ + residual error

Advantages of using Polynomial Regression:

A polynomial can provide a good approximation of the relationship between the dependent and independent variable.

A broad range of functions can be fit under it.

A polynomial fits a wide range of curvature.
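As a rough illustration of a second-degree polynomial fitting curved data better than a straight line, the sketch below fits both to the same points with `np.polyfit`. The data is made up and noise-free (generated from y = 2 + 0.5x + 1.5x²), so it stands in for, and is not, the data shown in the graph above.

```python
import numpy as np

# Made-up curved data generated from y = 2 + 0.5x + 1.5x^2 (no noise).
x = np.linspace(-3.0, 3.0, 13)
y = 2.0 + 0.5 * x + 1.5 * x ** 2

# np.polyfit returns coefficients from the highest power down to the constant.
linear_coeffs = np.polyfit(x, y, deg=1)
quadratic_coeffs = np.polyfit(x, y, deg=2)


def sum_squared_error(coeffs):
    """Sum of squared differences between the data and a fitted polynomial."""
    predictions = np.polyval(coeffs, x)
    return float(np.sum((y - predictions) ** 2))


print("linear fit error:   ", sum_squared_error(linear_coeffs))
print("quadratic fit error:", sum_squared_error(quadratic_coeffs))
print("quadratic coefficients (x^2 term, x term, constant):", quadratic_coeffs)
```

The straight line leaves a large error on this data, while the quadratic recovers the generating coefficients and its error is essentially zero.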

Disadvantages of using Polynomial Regression:

The presence of one or two outliers in the data can seriously affect the results of the nonlinear analysis.

Polynomial fits are very sensitive to outliers.

In addition, there are unfortunately fewer model validation tools for the detection of outliers in nonlinear regression than there are for linear regression.
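The sensitivity to outliers can be demonstrated with a small experiment. This is a sketch on made-up data: the points lie exactly on y = x², and a single point at the edge of the range is then corrupted by an arbitrary amount. The same corrupted data is also fitted with a straight line for comparison.

```python
import numpy as np

# Made-up data lying exactly on y = x^2.
x = np.arange(11, dtype=float)
y_clean = x ** 2

# Corrupt a single point at the edge of the range to act as an outlier.
y_outlier = y_clean.copy()
y_outlier[-1] += 100.0


def fitted_shift(degree):
    """Largest change in the fitted values caused by the single outlier."""
    clean_fit = np.polyfit(x, y_clean, deg=degree)
    outlier_fit = np.polyfit(x, y_outlier, deg=degree)
    shift = np.abs(np.polyval(outlier_fit, x) - np.polyval(clean_fit, x))
    return float(shift.max())


line_shift = fitted_shift(1)
quadratic_shift = fitted_shift(2)

print("largest change in the straight-line fit:", line_shift)
print("largest change in the quadratic fit:   ", quadratic_shift)
```

On this data the one corrupted point moves the quadratic curve more than it moves the straight line, because the more flexible model bends further toward the outlier.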

Conclusion

In this blog, I have presented the basic concepts of Linear Regression and Polynomial Regression.
