The only difference is that the first file uses Gradient Descent and the second one the Normal Equation to compute the value of theta. After we obtain the theta values in the "Create Linear Regression Model" section, we will be able to plot the equation (line) and check whether it fits the data. If you have any questions, feel free to reach out to me.

Create Linear Regression Model

Now we need to obtain the theta values for the equation that best fits the data we visualized in the previous step. I will explain two options for computing the theta values:

- Computing the Normal Equation (a one-step algorithm)
- Using the Gradient Descent algorithm

Normal Equation (First Option)

This is the analytical solution for Linear Regression. We compute the theta vector with the closed-form formula theta = (X'X)^(-1) X'y:

```matlab
m = length(y);
% Add ones column
X = [ones(m, 1) X];
% Normal Equation
theta = (pinv(X'*X))*X'*y
```

After computing theta we get the following values:

theta = [-173280098.50770, 90894.36059]

which represent the linear equation

h = 90894.36059 * x + (-173280098.50770)

To verify that this equation fits the data well, we can plot it on the same graph as the data, where we can see that it fits well.

Image 3: Linear Regression Normal Equation

Gradient Descent (Second Option)

Using the Gradient Descent algorithm requires feature normalization in some cases. Matrices are used to compute the theta values. Next we use this function in the main_gradient_descent.m file to get the theta values:

```matlab
% Choose some alpha value
alpha = 0.1;
% Set number of iterations
num_iters = 400;
% Initialize theta
theta = zeros(2, 1);
% ...
% Init Theta and Run Gradient Descent
theta = gradientDescent(X_norm, y, theta, alpha, num_iters)
```

What we get for theta is

theta = [9599355.00000, 301462.59019]

which represents the linear equation

h = 301462.59019 * x + 9599355

We plot this equation on the graph along with our data:

```matlab
% Plot linear regression line
plot(X, X_norm*theta, '-')
```

Looking at the graph,
we can see that the blue line fits our data well.

Image 5: Linear Equation Gradient Descent

Predict Using Linear Regression Model

Now that we have the theta values for the equation, we can predict the population for some of the coming years. So let's calculate the expected number of people living in Sweden in 2020.

Using the theta values that we got from the Normal Equation, we can predict the expected population by multiplying with the theta values, as shown in the code below:

```matlab
% Predict population for 2020
pred_year = 2020;
pred_year_val = [1 2020];
% Calculate predicted value
pred_value = pred_year_val * theta;
```

What we get is an estimate of 10 326 510 people in 2020.

In the case of Gradient Descent, since we have done feature normalization, we must not forget to normalize the feature before we do the prediction, using the mu and sigma variables:

```matlab
% Predict population for 2020
pred_year = 2020;
% Don't forget to normalize the feature before prediction
pred_year_val = (pred_year .- mu) ./ sigma;
% Add first column
pred_year_norm = [1 pred_year_val];
% Calculate predicted value
pred_value = pred_year_norm * theta;
```

Here the predicted population in 2020 is also 10 326 510 people.

At the end we plot this value on the graph, marked with a blue cross.

Image 6: Final Graph

Conclusion

This model is very simple to build and use, so it can be applied to many other project ideas: predicting stock prices, average kilometers or miles per liter/gallon of petrol, a football player's salary and scores per game, and so on. I'm looking forward to hearing about your ideas and projects.
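As an appendix: the gradientDescent function and the featureNormalize step (which produces the mu and sigma variables) are called above but their definitions are not shown in this excerpt. The following is a minimal sketch of what they might look like, assuming the signatures implied by the calls above and the standard vectorized batch gradient descent update; the actual implementations in the repository may differ.

```matlab
function [X_norm, mu, sigma] = featureNormalize(X)
  % Zero-mean, unit-variance scaling of each feature column.
  % mu and sigma must be kept and reused when normalizing new
  % inputs at prediction time (as done in the 2020 prediction above).
  mu = mean(X);
  sigma = std(X);
  X_norm = (X .- mu) ./ sigma;
end

function theta = gradientDescent(X, y, theta, alpha, num_iters)
  % Batch gradient descent for linear regression.
  % X is the m x 2 design matrix (ones column + normalized feature),
  % and each iteration applies the vectorized update
  %   theta := theta - (alpha / m) * X' * (X * theta - y)
  m = length(y);
  for iter = 1:num_iters
    theta = theta - (alpha / m) * (X' * (X * theta - y));
  end
end
```

With alpha = 0.1 and num_iters = 400 as set above, this update converges on the normalized data and yields theta values equivalent to the Normal Equation result once the normalization is accounted for, which is why both predictions for 2020 agree.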