Automatic White Balance Adjustment of Selfies

Let’s reframe the problem — we want to drive all of those different skin tones to a single skin tone.

By the way, for the purposes of testing our approach, we’ll assume that the correct ‘good’ answer is the mean of all corrected skin tones using Solution 1.

Solution 2

Machine learning 101.

Get a large set of training data to train a model.


Our training data starts with the ‘good’ models shown previously.

Each one of those skin tones is at a specific mean RGB value.

And it has a white balance correction factor of 1.0, 1.0, 1.0.

That is, no white balance correction is required to correct it to a good skin tone.

What we need is a training set that covers the range of possible ‘bad’ skin tones — and the corresponding white balance correction factor to move it back to a ‘good’ skin tone.

What we’ll do is generate synthetic ‘bad’ skin tones, each made by applying a different white balance factor to a ‘good’ skin tone.

For example, if we want to create a reasonable ‘bad’ skin tone we can multiply a ‘good’ skin tone by, say, 1.1, 0.9, 1.0.

The white balance used to get back to ‘good’ is the inverse of that, or 1/1.1, 1/0.9, 1/1.0.


Easy peasy.
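The gain-and-inverse arithmetic can be sketched in a few lines of NumPy. This is only an illustration: the skin tone value and the function names `corrupt` and `correction_for` are assumptions made up for the example, and the third gain is assumed to be 1.0.

```python
import numpy as np

def corrupt(good_rgb, gains):
    """Simulate a bad white balance by scaling each RGB channel."""
    return np.asarray(good_rgb, dtype=float) * np.asarray(gains, dtype=float)

def correction_for(gains):
    """The correction that undoes `gains` is its per-channel inverse."""
    return 1.0 / np.asarray(gains, dtype=float)

# Hypothetical mean skin tone on a 0-255 scale (illustrative value only).
good = np.array([200.0, 150.0, 125.0])
gains = np.array([1.1, 0.9, 1.0])

bad = corrupt(good, gains)          # the synthetic 'bad' skin tone
correction = correction_for(gains)  # the training target for this sample
restored = bad * correction         # applying the correction recovers 'good'
```

Note that this ignores clipping: on a real 8-bit image a gain above 1.0 can saturate a channel at 255, and the inverse would then no longer recover the original exactly.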

Synthetic Training Set

There are different ways to create a large data set.

One way is to use a Gaussian distribution over the possible range.

Another way is a triangular distribution (numpy.random.normal / numpy.random.triangular).

What we end up with is a large set of training data.
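As a rough sketch of how such a set might be generated, the snippet below samples random gains around 1.0 and pairs each corrupted tone with the correction that undoes it. The helper name `make_training_set`, the ‘good’ tone values, the spread, and the clipping range are all assumptions for illustration. It uses NumPy's `Generator` API, whose `normal` and `triangular` methods correspond to the older `numpy.random.normal` and `numpy.random.triangular` functions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def make_training_set(good_tones, n_per_tone=1000, spread=0.1, low=0.7, high=1.3):
    """Return (X, y): corrupted mean RGB values and the corrections that undo them."""
    good_tones = np.asarray(good_tones, dtype=float)
    X, y = [], []
    for tone in good_tones:
        # Gaussian gains centred on 1.0 (no corruption), clipped to an assumed range.
        gains = rng.normal(loc=1.0, scale=spread, size=(n_per_tone, 3))
        gains = np.clip(gains, low, high)
        X.append(tone * gains)  # 'bad' skin tones
        y.append(1.0 / gains)   # correction factors back to 'good'
    return np.vstack(X), np.vstack(y)

# Alternative sampler: a triangular distribution peaked at 1.0.
tri_gains = rng.triangular(left=0.7, mode=1.0, right=1.3, size=(5, 3))

# Two hypothetical 'good' tones (illustrative values only).
good_tones = [[200.0, 150.0, 125.0], [120.0, 80.0, 60.0]]
X, y = make_training_set(good_tones)
```

The `low` and `high` bounds are where the sampled range would be tightened or widened to match the corruption actually expected in practice.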

A valuable optimization here is to create the training set so as to cover the expected ‘bad’ skin tones, but no more.

This allows us to concentrate our training data points where they do good.

(That’s a technical term, I think.)

From here let’s jump to Python.

    # Classifier is a placeholder for any estimator with fit/predict methods
    cls = Classifier()
    cls.fit(X_train, y_train)
    predictions = cls.predict(X_test)

What we need is to train one of the many possible classifiers.

These come from the scikit-learn library as well as some custom neural network approaches.
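To make the fit/predict pattern concrete, here is a minimal sketch using scikit-learn's `LinearRegression` on synthetic data. The ‘good’ tone, the gain range, and the split sizes are assumptions chosen for the example, not the settings behind the results reported in this post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=0)

# Synthetic data: one illustrative 'good' tone corrupted by random gains.
good = np.array([200.0, 150.0, 125.0])
gains = rng.uniform(0.8, 1.2, size=(2000, 3))
X = good * gains  # inputs: 'bad' mean RGB values
y = 1.0 / gains   # targets: per-channel correction factors

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()  # fits all three output channels at once
model.fit(X_train, y_train)
predictions = model.predict(X_test)

mse = mean_squared_error(y_test, predictions)
corrected = X_test * predictions  # should land near `good`
```

Because the true correction is the reciprocal of the gain, a linear model can only approximate it. That is one reason a more flexible model might do better on the same data.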

Here are the results of various approaches:

The columns above mean:

Approach: the actual classification method used. For example, Linear Regression refers to sklearn.linear_model.LinearRegression. The last three are explained below.

‘Same Model’ MSE to mean: from our earlier definition of the correct answer, this shows the Mean Squared Error of our predicted result to that mean. The lower the better.

Clusters: remember that our goal was not just to get a ‘good’ skin tone, but rather to get one ‘good’ skin tone. Turns out that it’s not possible, but 3 is better than 10, and we can still address that in a bit. Btw, we use the elbow method to determine the number of clusters; ‘No elbow’ means that there wasn’t a clear answer.
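Both metrics can be sketched roughly as follows. The helper names `mse_to_mean` and `inertia_curve` and the sample tones are assumptions for illustration, and the sample mean stands in for the Solution 1 reference mean. The elbow method is shown here through the k-means inertia from scikit-learn's `KMeans`, which is one common way to apply it and not necessarily the exact procedure used for the results.

```python
import numpy as np
from sklearn.cluster import KMeans

def mse_to_mean(corrected_tones, reference_mean):
    """Mean squared error between corrected tones and the reference 'good' mean."""
    diff = np.asarray(corrected_tones, dtype=float) - np.asarray(reference_mean, dtype=float)
    return float(np.mean(diff ** 2))

def inertia_curve(tones, max_k=6, seed=0):
    """K-means inertia for k = 1..max_k; the 'elbow' is where the drop levels off."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(tones).inertia_
        for k in range(1, max_k + 1)
    ]

# Illustrative corrected tones that fall into three tight groups.
rng = np.random.default_rng(seed=0)
centers = np.array([[200.0, 150.0, 125.0], [160.0, 110.0, 90.0], [120.0, 80.0, 60.0]])
tones = np.vstack([c + rng.normal(scale=2.0, size=(50, 3)) for c in centers])

reference = tones.mean(axis=0)  # stand-in for the reference mean
score = mse_to_mean(tones, reference)
curve = inertia_curve(tones)
```

With data like this, the inertia falls steeply up to k = 3 and then flattens, which is the elbow. When the curve declines smoothly with no obvious bend, that corresponds to the ‘No elbow’ case.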

The last three classifiers were custom-built neural networks using Keras.

DNN-mean colors only: this is a dense neural network with the inputs being simply the skin tone RGB and the output being the white balance correction.

CNN-image only: this starts with a CNN input that takes an image surrounding the eye, including an ample amount of flesh, with the output being the white balance correction. The thought is that, heck, let the CNN do its magic and maybe incorporate some of the eye colors (sclera and iris) as well.

CNN + Scalar Injection: this is a combination of the two methods above. Each has its corresponding inputs, but they’re joined in a dense neural network to produce the white balance correction outputs.

As you can see from the chart, the DNN-mean colors only approach outperforms the rest of the approaches.

So we still have 3 clusters, we wanted 1…

Okay, so it doesn’t fully solve our problem.

We don’t get one truth for the correct skin tone, but rather 3 potential truths in this case.

I’m just going to claim that not every possible corrupted image of the same person can be corrected to one skin tone without more knowledge than that provided by a single image.

So I dragged two more tricks out of the hat:

Trained a CNN to detect really bad images. This is based on the full selfie image; I classified images as either unacceptable or acceptable. Reasons to be unacceptable were elements such as bright lighting, dark lighting, and glare. Less related to white balancing, but important for the broader goal (of personalized, optimal color palette creation for the individual) were unacceptable elements such as glasses and shadows. This CNN does a great job of filtering out obviously bad images, so we can ask the user to take another one.

Manually limited the number of ‘good’ skin tones to a number that fosters ‘one’ answer without overly constraining all humans to look similar.

Not Bad!

Altogether this is a powerful approach for automatic white balance correction.

Future directions are really just the machine learning 101 directions, i.e., tuning of the models and more data points.

Sample Before/After Selfies
