After implementing this, you would get a curve similar to the one shown above for high bias. For high variance, you would get a curve similar to the one shown below. Note: there is a gap between the validation error and the training error, unlike in the high-bias case. Increasing the number of training examples or increasing the regularization parameter can help reduce high variance. By now you can recognize whether you are facing high bias or high variance, which is a head start in debugging your model.

Tuning the regularization parameter (lambda):

One last thing I want to tell you about is the regularization parameter lambda. In particular, a model without regularization (λ = 0) fits the training set well but does not generalize. Conversely, a model with too much regularization fits neither the training set nor the test set well. A good choice of λ can provide a good fit to the data. You can easily find the optimum value of lambda for your model if you've followed along. We do the same thing we did for learning curves, except that instead of plotting the error against the number of training examples, we vary the parameter lambda and plot the corresponding errors on the training and validation sets. The region where both errors, training and validation, are low gives us the optimum value of lambda. Let's see the code behind this concept.

```matlab
for i = 1:length(lambda_vec)
    lambda = lambda_vec(i);
    % X and y are the full training set matrices
    theta = trainLinearReg(X, y, lambda);
    % Evaluate both errors with lambda = 0 so they measure the pure fit
    error_train(i) = linearRegCostFunction(X, y, theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end
```

The vector lambda_vec contains the candidate values of lambda; we loop over it to find the corresponding errors on the training set and the validation set. Note that both errors are computed with the regularization term set to 0 (the last argument), since we want the measured error to reflect the fit alone. Let's look at a graph of this concept and try to understand how it works. It is evident from the graph that a value of lambda around 100 could be a good choice to train the
model because at this point both the cross-validation error and the training error are low. So far, we have seen a number of problems that can make our algorithm perform poorly, and also how we can eliminate them. If you wish to see a complete implementation of these concepts, be sure to check out the link given below: an exercise that merges all these concepts so you can build your own model.

Conclusion

Thus, learning curves can help you diagnose high bias and high variance, and validation curves can help you tune the regularization parameter of your model, which saves a lot of time.
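To make the lambda-tuning loop concrete end to end, here is a minimal Python/NumPy re-sketch of the same idea. The course's trainLinearReg, linearRegCostFunction, and data are not available here, so this uses hypothetical stand-ins: synthetic polynomial data and a ridge normal-equation solver. It is a sketch of the technique, not the exercise's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the exercise's X, y, Xval, yval (assumption:
# the real data comes from the course exercise and is not shown here).
def make_data(n):
    x = rng.uniform(-1, 1, size=n)
    y = 2 * x ** 3 - x + rng.normal(0, 0.3, size=n)
    X = np.column_stack([x ** d for d in range(1, 9)])  # polynomial features
    return X, y

X, y = make_data(40)
Xval, yval = make_data(40)

def train_linear_reg(X, y, lam):
    """Ridge-regularized normal equation: theta = (X'X + lam*I)^-1 X'y."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend bias column
    reg = lam * np.eye(Xb.shape[1])
    reg[0, 0] = 0.0                             # do not regularize the bias term
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)

def cost(X, y, theta):
    """Unregularized half mean squared error (the lambda = 0 evaluation)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.mean((Xb @ theta - y) ** 2) / 2

# Candidate lambdas; for each, train with regularization but score without it.
lambda_vec = [0.0, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
error_train, error_val = [], []
for lam in lambda_vec:
    theta = train_linear_reg(X, y, lam)
    error_train.append(cost(X, y, theta))
    error_val.append(cost(Xval, yval, theta))

best_lambda = lambda_vec[int(np.argmin(error_val))]
print("lambda with lowest validation error:", best_lambda)
```

Plotting error_train and error_val against lambda_vec reproduces the validation-curve graph discussed above: the training error rises as lambda grows, while the validation error typically dips at some intermediate lambda, which is the value to pick.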