How to build your first Neural Network to predict house prices with Keras

Congratulations! Summary: Coding up our first neural network required only a few lines of code:

- We specify the architecture with the Keras Sequential model.
- We specify some of our settings (optimizer, loss function, metrics to track) with model.compile.
- We train our model (find the best parameters for our architecture) on the training data with model.fit.
- We evaluate our model on the test set with model.evaluate. (All four steps are sketched together below.)
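Putting those four steps together, here is a minimal end-to-end sketch of the workflow the summary describes. The layer sizes and optimizer here are illustrative assumptions rather than a verbatim copy of the model from the earlier section, and the data variables (X_train, Y_train, X_val, Y_val, X_test, Y_test) are assumed to come from the data-processing step:

```python
from keras.models import Sequential
from keras.layers import Dense

# 1. Specify the architecture with the Sequential model
model = Sequential([
    Dense(32, activation='relu', input_shape=(10,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])

# 2. Specify optimizer, loss function, and metrics with model.compile
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# 3. Train on the training data with model.fit
hist = model.fit(X_train, Y_train, batch_size=32, epochs=100,
                 validation_data=(X_val, Y_val))

# 4. Evaluate on the test set with model.evaluate
model.evaluate(X_test, Y_test)
```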

Visualizing Loss and Accuracy

In Intuitive Deep Learning Part 1b, we talked about overfitting and some regularization techniques.

How do we know if our model is currently overfitting? One thing we can do is plot the training loss and the validation loss against the number of epochs that have passed.

To display some nice graphs, we will use the package matplotlib.

As usual, we have to import the code we wish to use:

```python
import matplotlib.pyplot as plt
```

Then, we want to visualize the training loss and the validation loss. To do so, run this snippet of code:

```python
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper right')
plt.show()
```

We’ll explain each line of the above code snippet.

The first two lines say that we want to plot the loss and the val_loss.

The third line specifies the title of this graph, “Model loss”.

The fourth and fifth lines tell us what the y-axis and x-axis should be labelled, respectively.

The sixth line includes a legend for our graph, and the location of the legend will be in the upper right.

And the seventh line tells Jupyter notebook to display the graph.

Your Jupyter notebook should look something like this:

[Image: A graph of model loss that you should see in your Jupyter notebook]

We can do the same to plot our training accuracy and validation accuracy with the code below:

```python
plt.plot(hist.history['acc'])
plt.plot(hist.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
```

You should get a graph that looks a bit like this:

[Image: Plot of model accuracy for training and validation set]

Since the improvements in our model on the training set are roughly matched by improvements on the validation set, overfitting doesn’t seem to be a big problem with this model.
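One caveat in case you are following along with a newer setup: with the Keras bundled into TensorFlow 2.x, the history dictionary uses the keys 'accuracy' and 'val_accuracy' rather than 'acc' and 'val_acc'. If the snippet above raises a KeyError, you can check which names your version logged:

```python
# Inspect the metric names your Keras version actually recorded
print(hist.history.keys())
```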

Summary: We use matplotlib to visualize the training and validation loss / accuracy over time to see if there’s overfitting in our model.
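Since we plot the same pair of curves several times in this post, you could also wrap the snippet in a small helper; this is just a convenience sketch (assuming matplotlib is imported as plt, as above), not part of the original code:

```python
def plot_history(hist, metric='loss', legend_loc='upper right'):
    # Plot a training metric and its validation counterpart over epochs
    plt.plot(hist.history[metric])
    plt.plot(hist.history['val_' + metric])
    plt.title('Model ' + metric)
    plt.ylabel(metric.capitalize())
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Val'], loc=legend_loc)
    plt.show()

plot_history(hist, 'loss')
plot_history(hist, 'acc', legend_loc='lower right')
```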

Adding Regularization to our Neural Network

For the sake of introducing regularization to our neural network, let’s build a neural network that will badly overfit on our training set.

We’ll call this Model 2.

```python
model_2 = Sequential([
    Dense(1000, activation='relu', input_shape=(10,)),
    Dense(1000, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1000, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model_2.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=['accuracy'])
hist_2 = model_2.fit(X_train, Y_train,
                     batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val))
```

Here, we’ve made a much larger model and we’ve used the Adam optimizer.

Adam is one of the most common optimizers in use; it adds some tweaks to stochastic gradient descent so that the model reaches a low loss faster.
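For intuition, here is a simplified sketch of what a single Adam update does to one parameter array. Keras implements all of this internally, so you never write it yourself; the default hyperparameter values below are the commonly used ones:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Running average of gradients (momentum) and of squared gradients
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction so early steps (t starts at 1) aren't underestimated
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter adaptive step size
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```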

If we run this code and plot the loss graphs for hist_2 using the code below (note that the code is the same, except that we use hist_2 instead of hist):

```python
plt.plot(hist_2.history['loss'])
plt.plot(hist_2.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper right')
plt.show()
```

We get a plot like this:

[Image: Loss curves for the overfitting model]

This is a clear sign of overfitting.

The training loss is decreasing, but the validation loss sits far above the training loss and keeps increasing past the inflection point around Epoch 20.

If we plot accuracy using the code below:

```python
plt.plot(hist_2.history['acc'])
plt.plot(hist_2.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
```

we can see a clearer divergence between train and validation accuracy as well:

[Image: Training and validation accuracy for our overfitting model]

Now, let’s try out some of our strategies to reduce overfitting (apart from changing our architecture back to our first model).

Remember from Intuitive Deep Learning Part 1b that we introduced three strategies to reduce overfitting.

Of the three, we’ll incorporate L2 regularization and dropout here.

The reason we don’t add early stopping here is that after we’ve used the first two strategies, the validation loss no longer takes the U-shape we see above, so early stopping would not be as effective.
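For completeness: if you did want to experiment with early stopping, Keras offers it as a callback passed to model.fit. This is a sketch of typical usage rather than something we run in this post:

```python
from keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 5 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=5)
hist_2 = model_2.fit(X_train, Y_train, batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val),
                     callbacks=[early_stop])
```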

First, let’s import the code that we need for L2 regularization and dropout:

```python
from keras.layers import Dropout
from keras import regularizers
```

We then specify our third model like this:

```python
model_3 = Sequential([
    Dense(1000, activation='relu',
          kernel_regularizer=regularizers.l2(0.01),
          input_shape=(10,)),
    Dropout(0.3),
    Dense(1000, activation='relu',
          kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1000, activation='relu',
          kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1000, activation='relu',
          kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1, activation='sigmoid',
          kernel_regularizer=regularizers.l2(0.01)),
])
```

Can you spot the differences between Model 3 and Model 2? There are two main differences:

Difference 1: To add L2 regularization, notice that we’ve added a bit of extra code in each of our dense layers, like this:

```python
kernel_regularizer=regularizers.l2(0.01)
```

This tells Keras to include the squared values of those parameters in our overall loss function, weighted by 0.01.
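In other words, the quantity being minimized is no longer just the prediction error. As a rough sketch of the extra term (Keras computes this internally; the function below is purely illustrative):

```python
import numpy as np

def l2_penalty(kernel_weights, weight=0.01):
    # The term L2 regularization adds to the loss:
    # weight * (sum of squared parameters), summed over regularized layers
    return sum(weight * np.sum(W ** 2) for W in kernel_weights)

# Conceptually: total_loss = binary_crossentropy + l2_penalty(layer_kernels)
```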

Difference 2: To add dropout, we added a new layer like this:

```python
Dropout(0.3),
```

This means that each neuron in the previous layer has a probability of 0.3 of dropping out during training.
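To make that concrete, here is a rough numpy sketch of what a dropout layer does to its input during training. Keras also rescales the surviving activations so the expected output stays the same, and at test time the layer passes its input through unchanged:

```python
import numpy as np

def dropout_forward(activations, rate=0.3, training=True):
    if not training:
        return activations  # dropout is disabled at test time
    # Zero out each activation with probability `rate`, then scale the
    # survivors by 1 / (1 - rate) (inverted dropout)
    mask = np.random.rand(*activations.shape) >= rate
    return activations * mask / (1.0 - rate)
```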

Let’s compile it and run it with the same parameters as our Model 2 (the overfitting one):

```python
model_3.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=['accuracy'])
hist_3 = model_3.fit(X_train, Y_train,
                     batch_size=32, epochs=100,
                     validation_data=(X_val, Y_val))
```

And now, let’s plot the loss and accuracy graphs.

You’ll notice that the loss is a lot higher at the start, and that’s because our loss function now includes the L2 regularization penalty on top of the prediction error.

To zoom the plot window in between 0 and 1.2 on the loss axis, we add an additional line of code (plt.ylim) when plotting:

```python
plt.plot(hist_3.history['loss'])
plt.plot(hist_3.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper right')
plt.ylim(top=1.2, bottom=0)
plt.show()
```

We’ll get a loss graph that looks like this:

[Image: Loss curves for the regularized model]

You can see that the validation loss much more closely matches our training loss.

Let’s plot the accuracy with a similar code snippet:

```python
plt.plot(hist_3.history['acc'])
plt.plot(hist_3.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='lower right')
plt.show()
```

And we will get a plot like this:

[Image: Training and validation accuracy for the regularized model]

Compared to Model 2, we’ve reduced overfitting substantially! And that’s how we apply regularization techniques to reduce overfitting on the training set.

Summary: To deal with overfitting, we can code the following strategies into our model, each with about one line of code:

- L2 Regularization
- Dropout

If we visualize the training / validation loss and accuracy, we can see that these additions have helped deal with overfitting!

Consolidated Summary: In this post, we’ve written Python code to:

- Explore and Process the Data
- Build and Train our Neural Network
- Visualize Loss and Accuracy
- Add Regularization to our Neural Network

We’ve been through a lot, but we haven’t written too many lines of code! Building and training our neural network took only about 4 to 5 lines of code, and experimenting with different model architectures is just a simple matter of swapping in different layers or changing hyperparameters.

Keras has indeed made it a lot easier to build our neural networks, and we’ll continue to use it for more advanced applications in Computer Vision and Natural Language Processing.

What’s Next: In our next Coding Companion Part 2, we will explore how to code up our own Convolutional Neural Networks (CNNs) to do image recognition: Build your first Convolutional Neural Network to recognize images. Be sure to first get an intuitive understanding of CNNs here: Intuitive Deep Learning Part 2: CNNs for Computer Vision.

About the author: Hi there, I’m Joseph! I recently graduated from Stanford University, where I worked with Andrew Ng in the Stanford Machine Learning Group.

I want to make deep learning concepts as intuitive and as easily understandable as possible for everyone, which has motivated my publication, Intuitive Deep Learning.
