Custom TensorFlow Loss Functions for Advanced Machine Learning

And a few-shot transfer learning example

Haihan Lan
Jan 2

In this article, we'll look at:

- The use of custom loss functions in advanced ML applications
- Defining a custom loss function and integrating it into a basic TensorFlow neural net model
- A brief example of knowledge distillation applied to a few-shot learning problem, using a Gaussian process as the reference model

Links to my other articles:

- Random forests
- Softmax classification
- Climate analysis
- Hockey riots and extreme values

Introduction

The beaten path of machine learning involves journeying to familiar landmarks and scenic locales. One set of familiar landmarks is the predefined loss functions that give you a suitable loss value for the problem you are trying to optimize over. We're familiar with the cross-entropy loss for classification and the mean squared error (MSE) or root mean squared error (RMSE) for regression problems. Popular ML packages, including front-ends such as Keras and back-ends such as TensorFlow, include a set of basic loss functions for most classification and regression tasks. But off the beaten path there exist custom loss functions you may need to solve a certain problem, constrained only by valid tensor operations.

In Keras you can technically create your own loss function, but its form is limited to some_loss(y_true, y_pred) and only that. If you try to add additional parameters to the loss, in the form some_loss_1(y_true, y_pred, **kwargs), Keras will throw a runtime exception, and you lose the compute time that went into aggregating the datasets. There are hacks to work around this, but in general we want a scalable way to write a loss function that accepts any valid arguments we pass to it and operates on our tensors in a standard, expected way. We'll see how to use TensorFlow directly to write a neural network from scratch and build a custom loss function to train
it.

TensorFlow

TensorFlow (TF) is a symbolic and numeric computation engine that allows us to string tensors* together into computational graphs and do backpropagation over them. Keras is an API, or front-end, running on top of TensorFlow that conveniently packages standard constructs built using TensorFlow (such as various predefined neural net layers) and abstracts many of the low-level mechanics of TF from the programmer. However, in the process of making these constructs off-the-shelf, granular control and the power to do very specific things are lost.

*For simplicity's sake, tensors are multi-dimensional arrays with a shape tuple like (feature_dim, n_features).

One example is the ability to define custom loss functions that accept an arbitrary number of parameters and can compute losses over arbitrary tensors, both tensors internal to the network and input tensors external to it. Strictly speaking, a loss function in TF does not even need to be a Python function, only a valid combination of operations on TF tensor objects. This point is important because the power of custom losses comes from the ability to compute losses over arbitrary tensors, not strictly just your supervised target tensor and the network output tensor in the form (y_true, y_pred).

Before we get to custom losses, let's briefly review a basic 2-layer dense net (MLP) and see how it's defined and trained in TF.
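To make this concrete, here is a minimal sketch of such a 2-layer dense net written directly in eager TensorFlow, trained with a custom loss that takes arguments beyond the fixed (y_true, y_pred) pair. This is not the article's original code: the layer sizes, the penalty term, and the toy data are all illustrative assumptions.

```python
import tensorflow as tf

tf.random.set_seed(0)

def custom_loss(y_true, y_pred, hidden, alpha=0.01):
    # Ordinary MSE between targets and network output...
    mse = tf.reduce_mean(tf.square(y_true - y_pred))
    # ...plus a penalty on the hidden activations: a term computed over
    # a tensor *internal* to the network, which a (y_true, y_pred)-only
    # loss signature cannot express.
    return mse + alpha * tf.reduce_mean(tf.square(hidden))

# Weights and biases for a 2-layer dense net (4 -> 8 -> 1)
W1 = tf.Variable(tf.random.normal([4, 8], stddev=0.1))
b1 = tf.Variable(tf.zeros([8]))
W2 = tf.Variable(tf.random.normal([8, 1], stddev=0.1))
b2 = tf.Variable(tf.zeros([1]))
params = [W1, b1, W2, b2]

def forward(x):
    hidden = tf.nn.relu(x @ W1 + b1)   # layer 1: dense + ReLU
    return hidden, hidden @ W2 + b2    # layer 2: linear output

x = tf.random.normal([32, 4])          # toy inputs
y = tf.random.normal([32, 1])          # toy regression targets

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
losses = []
for step in range(100):
    with tf.GradientTape() as tape:
        hidden, y_pred = forward(x)
        loss = custom_loss(y, y_pred, hidden)
    grads = tape.gradient(loss, params)
    optimizer.apply_gradients(zip(grads, params))
    losses.append(float(loss))
```

Because the loss is just a composition of TF ops, we are free to pass it the hidden-layer tensor, a hyperparameter alpha, or anything else the graph can compute with.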
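Returning briefly to the Keras signature restriction from the introduction: the usual workaround ("hack") is a factory function whose closure captures the extra parameters and returns a loss with the required (y_true, y_pred) form. A sketch of the pattern, with a hypothetical name and weighting scheme:

```python
import tensorflow as tf

def make_weighted_mse(weight):
    # 'weight' is captured by the closure, so the returned function
    # still has the two-argument signature Keras expects.
    def weighted_mse(y_true, y_pred):
        return weight * tf.reduce_mean(tf.square(y_true - y_pred))
    return weighted_mse

# The returned function can then be passed to model.compile(loss=...)
loss_fn = make_weighted_mse(weight=2.0)
```

This works for fixed hyperparameters, but it still cannot reach tensors internal to the network, which is why we drop down to TF directly.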