Deep Learning Models for Multi-Output Regression

Last Updated on August 28, 2020Multi-output regression involves predicting two or more numerical variables.

Unlike normal regression where a single value is predicted for each sample, multi-output regression requires specialized machine learning algorithms that support outputting multiple variables for each prediction.

Deep learning neural networks are an example of an algorithm that natively supports multi-output regression problems.

Neural network models for multi-output regression tasks can be easily defined and evaluated using the Keras deep learning library.

In this tutorial, you will discover how to develop deep learning models for multi-output regression.

After completing this tutorial, you will know:Let’s get started.

Deep Learning Models for Multi-Output RegressionPhoto by Christian Collins, some rights reserved.

This tutorial is divided into three parts; they are:Regression is a predictive modeling task that involves predicting a numerical output given some input.

It is different from classification tasks that involve predicting a class label.

Typically, a regression task involves predicting a single numeric value.

Although, some tasks require predicting more than one numeric value.

These tasks are referred to as multiple-output regression, or multi-output regression for short.

In multi-output regression, two or more outputs are required for each input sample, and the outputs are required simultaneously.

The assumption is that the outputs are a function of the inputs.

We can create a synthetic multi-output regression dataset using the make_regression() function in the scikit-learn library.

Our dataset will have 1,000 samples with 10 input features, five of which will be relevant to the output and five of which will be redundant.

The dataset will have three numeric outputs for each sample.

The complete example of creating and summarizing the synthetic multi-output regression dataset is listed below.

Running the example creates the dataset and summarizes the shape of the input and output elements.

We can see that, as expected, there are 1,000 samples, each with 10 input features and three output features.

Next, let’s look at how we can develop neural network models for multiple-output regression tasks.

Many machine learning algorithms support multi-output regression natively.

Popular examples are decision trees and ensembles of decision trees.

A limitation of decision trees for multi-output regression is that the relationships between inputs and outputs can be blocky or highly structured based on the training data.

Neural network models also support multi-output regression and have the benefit of learning a continuous function that can model a more graceful relationship between changes in input and output.

Multi-output regression can be supported directly by neural networks simply by specifying the number of target variables there are in the problem as the number of nodes in the output layer.

For example, a task that has three output variables will require a neural network output layer with three nodes in the output layer, each with the linear (default) activation function.

We can demonstrate this using the Keras deep learning library.

We will define a multilayer perceptron (MLP) model for the multi-output regression task defined in the previous section.

Each sample has 10 inputs and three outputs, therefore, the network requires an input layer that expects 10 inputs specified via the “input_dim” argument in the first hidden layer and three nodes in the output layer.

We will use the popular ReLU activation function in the hidden layer.

The hidden layer has 20 nodes, which were chosen after some trial and error.

We will fit the model using mean absolute error (MAE) loss and the Adam version of stochastic gradient descent.

The definition of the network for the multi-output regression task is listed below.

You may want to adapt this model for your own multi-output regression task, therefore, we can create a function to define and return the model where the number of input and number of output variables are provided as arguments.

Now that we are familiar with how to define an MLP for multi-output regression, let’s explore how this model can be evaluated.

If the dataset is small, it is good practice to evaluate neural network models repeatedly on the same dataset and report the mean performance across the repeats.

This is because of the stochastic nature of the learning algorithm.

Additionally, it is good practice to use k-fold cross-validation instead of train/test splits of a dataset to get an unbiased estimate of model performance when making predictions on new data.

Again, only if there is not too much data and the process can be completed in a reasonable time.

Taking this into account, we will evaluate the MLP model on the multi-output regression task using repeated k-fold cross-validation with 10 folds and three repeats.

Each fold the model is defined, fit, and evaluated.

The scores are collected and can be summarized by reporting the mean and standard deviation.

The evaluate_model() function below takes the dataset, evaluates the model, and returns a list of evaluation scores, in this case, MAE scores.

We can then load our dataset and evaluate the model and report the mean performance.

Tying this together, the complete example is listed below.

Running the example reports the MAE for each fold and each repeat, to give an idea of the evaluation progress.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision.

Consider running the example a few times and compare the average outcome.

At the end, the mean and standard deviation MAE is reported.

In this case, the model is shown to achieve a MAE of about 8.

184.

You can use this code as a template for evaluating MLP models on your own multi-output regression tasks.

The number of nodes and layers in the model can easily be adapted and tailored to the complexity of your dataset.

Once a model configuration is chosen, we can use it to fit a final model on all available data and make a prediction for new data.

The example below demonstrates this by first fitting the MLP model on the entire multi-output regression dataset, then calling the predict() function on the saved model in order to make a prediction for a new row of data.

Running the example fits the model and makes a prediction for a new row.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision.

Consider running the example a few times and compare the average outcome.

As expected, the prediction contains three output variables required for the multi-output regression task.

This section provides more resources on the topic if you are looking to go deeper.

In this tutorial, you discovered how to develop deep learning models for multi-output regression.

Specifically, you learned:Do you have any questions? Ask your questions in the comments below and I will do my best to answer.

with just a few lines of PythonDiscover how in my new Ebook: Deep Learning With PythonIt covers end-to-end projects on topics like: Multilayer Perceptrons, Convolutional Nets and Recurrent Neural Nets, and more.

Skip the Academics.

Just Results.

.

Leave a Reply