Image Generator – Drawing Cartoons with Generative Adversarial Networks

Generating Simpsons with DCGANs

Greg Surma, Feb 10

In today’s article, we are going to implement a machine learning model that can generate an unlimited number of similar image samples based on a given dataset.

In order to do so, we are going to demystify Generative Adversarial Networks (GANs) and feed one with a dataset containing characters from ‘The Simpsons’.

By the end of this article, you will be familiar with the basics behind GANs and you will be able to build a generative model on your own!

To get a better idea of GANs’ capabilities, take a look at the following example of Homer Simpson evolving during the training process.

Fascinating, right?

Let’s dive into some theory to get a better understanding of how it actually works.

Generative Adversarial Networks (GANs)

Let’s start our GAN journey by defining the problem that we are going to solve.

We would like to provide a set of images as an input, and generate samples based on them as an output.

Input Images -> GAN -> Output Samples

With this problem definition, GANs fall into the Unsupervised Learning bucket because we are not going to feed the model any expert knowledge (for example, labels in a classification task).

The idea of generating samples based on a given dataset without any human supervision sounds very promising.

Let’s find out how it is possible with GANs!

The underlying idea behind a GAN is that it contains two neural networks, a Generator and a Discriminator, that compete against each other in a zero-sum game framework.

Generator

The Generator takes random noise as an input and generates samples as an output.

Its goal is to generate samples that fool the Discriminator into thinking it is seeing real images while it is actually seeing fakes.

We can think of the Generator as a counterfeiter.

Discriminator

The Discriminator takes both real images from the input dataset and fake images from the Generator and outputs a verdict on whether a given image is legit or not.

We can think of the Discriminator as a policeman trying to catch the bad guys while letting the good guys free.

Minimax Representation

If we think once again about the Discriminator’s and the Generator’s goals, we can see that they oppose each other.

The Discriminator’s success is the Generator’s failure and vice versa.

That is why we can represent the GAN framework as a Minimax game rather than as a plain optimization problem.

(source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture13.pdf)

GANs are designed to reach a Nash equilibrium at which each player cannot reduce their cost without changing the other players’ parameters.
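
For reference, the Minimax objective from the original GAN formulation (Goodfellow et al., 2014) is usually written as shown below. The notation follows the common convention rather than anything in this article’s code: D(x) is the Discriminator’s estimated probability that x is real, G(z) is the Generator’s output for a noise vector z, and p_data and p_z are the data and noise distributions.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The Discriminator tries to maximize V by pushing D(x) towards 1 for real images and D(G(z)) towards 0 for fakes, while the Generator tries to minimize V by pushing D(G(z)) towards 1.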

For those of you who are familiar with game theory and the Minimax algorithm, this idea will seem more comprehensible.

For those who are not, I recommend checking my previous article that covers the Minimax basics.

Tic Tac Toe – Creating Unbeatable AI: Introduction to Minimax Algorithm (towardsdatascience.com)

Data Flow and Backpropagation

While the Minimax representation of two adversarial networks competing with each other seems reasonable, we still don’t know how to make them improve so that they ultimately transform random noise into a realistic-looking image.

Let’s start with the Discriminator.

It gets both real images and fake ones and tries to tell whether they are legit or not.

We, as the system designers, know whether they came from the dataset (reals) or from the Generator (fakes).

We can use this information to label them accordingly and perform classic backpropagation, allowing the Discriminator to learn over time and get better at distinguishing images.

If the Discriminator correctly classifies fakes as fakes and reals as reals, we can reward it with positive feedback in the form of a loss gradient.

If it fails at its job, it gets negative feedback.

This mechanism allows it to learn and get better.
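
The Discriminator’s feedback described above can be sketched as a binary cross-entropy loss, with real images labeled 1 and fakes labeled 0. This is a minimal pure-Python illustration, not the notebook’s TensorFlow code; the function names and the example probabilities are assumptions made for the sketch.

```python
import math

def binary_cross_entropy(prediction, label, eps=1e-7):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def discriminator_loss(real_predictions, fake_predictions):
    """Mean loss on real samples (label 1) plus mean loss on fakes (label 0)."""
    real_losses = [binary_cross_entropy(p, 1.0) for p in real_predictions]
    fake_losses = [binary_cross_entropy(p, 0.0) for p in fake_predictions]
    return (sum(real_losses) / len(real_losses)
            + sum(fake_losses) / len(fake_losses))

# A Discriminator that classifies correctly gets a small loss ...
good = discriminator_loss(real_predictions=[0.9, 0.95], fake_predictions=[0.1, 0.05])
# ... while one that is fooled gets a large loss.
bad = discriminator_loss(real_predictions=[0.2, 0.1], fake_predictions=[0.8, 0.9])
print(good < bad)
```

Minimizing this loss with backpropagation is what pushes the Discriminator towards correct verdicts.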

Now let’s move on to the Generator.

It takes random noise as input and produces a sample intended to fool the Discriminator into treating it as a real image.

Once the Generator’s output goes through the Discriminator, we know the Discriminator’s verdict whether it thinks that it was a real image or a fake one.

We can use this information to feed the Generator and perform backpropagation again.

If the Discriminator identifies the Generator’s output as real, it means that the Generator did a good job and it should be rewarded.

On the other hand, if the Discriminator recognized that it was given a fake, it means that the Generator failed and it should be punished with negative feedback.
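
One common way to turn this reward and punishment into a number is the so-called non-saturating Generator loss, -log(D(G(z))). The sketch below assumes that variant; the article does not state which form the notebook uses, and the function name and example values are illustrative only.

```python
import math

def generator_loss(fake_predictions, eps=1e-7):
    """Mean of -log(D(G(z))): small when the Discriminator says 'real'."""
    losses = []
    for p in fake_predictions:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-math.log(p))
    return sum(losses) / len(losses)

fooled = generator_loss([0.9, 0.8])  # Discriminator believed the fakes
caught = generator_loss([0.1, 0.2])  # Discriminator spotted the fakes
print(fooled < caught)
```

The only input is the Discriminator’s verdict on generated samples, which is why the Generator can learn without ever seeing a real image directly.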

If you think about it for a while, you’ll realize that with the above approach we’ve tackled the Unsupervised Learning problem by combining Game Theory, Supervised Learning and a bit of Reinforcement Learning.

GAN data flow can be represented as in the following diagram.

(source: https://www.oreilly.com/ideas/deep-convolutional-generative-adversarial-networks-with-tensorflow)

And with some underlying math.

(source: https://medium.com/@jonathan_hui/gan-whats-generative-adversarial-networks-and-its-application-f39ed278ef09)

I hope you are not scared by the above equations; they will definitely get more comprehensible as we move on to the actual GAN implementation.

Image Generator (DCGAN)

As always, you can find the full codebase for the Image Generator project on GitHub.

Everything is contained in a single Jupyter notebook that you can run on a platform of your choice.

For more info about the dataset, check simspons_dataset.txt.

I encourage you to check it and follow along.

gsurma/image_generator: DCGAN image generator (github.com)

Since we are going to deal with image data, we have to find a way to represent it effectively.

It can be achieved with Deep Convolutional Neural Networks, thus the name – DCGAN.

Model

In our project, we are going to use a well-tested model architecture by Radford et al., 2015, which can be seen below.

You can find my TensorFlow implementation of this model here in the discriminator and generator functions.

As you can see in the above visualization, the Generator and the Discriminator have almost the same architecture, but mirrored.
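
To make the mirroring concrete, here is a small sketch of how the spatial size of the feature maps could change layer by layer. It assumes a 4x4 starting size and stride-2 layers in the spirit of the DCGAN paper, each changing the size by exactly the stride, and a 128x128 output matching the IMAGE_SIZE used in this article. The actual layer count in the notebook is not stated here, so treat these numbers as an assumption.

```python
def generator_sizes(start_size, image_size, stride=2):
    """Spatial sizes as the Generator upsamples from start_size to image_size."""
    sizes = [start_size]
    while sizes[-1] < image_size:
        sizes.append(sizes[-1] * stride)
    return sizes

def discriminator_sizes(image_size, end_size, stride=2):
    """Spatial sizes as the Discriminator downsamples from image_size to end_size."""
    sizes = [image_size]
    while sizes[-1] > end_size:
        sizes.append(sizes[-1] // stride)
    return sizes

up = generator_sizes(start_size=4, image_size=128)
down = discriminator_sizes(image_size=128, end_size=4)
print(up)    # [4, 8, 16, 32, 64, 128]
print(down)  # [128, 64, 32, 16, 8, 4]
```

The Discriminator’s sequence is simply the Generator’s sequence reversed.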

We won’t dive deeper into the CNN aspect of this topic but if you are more curious about the underlying aspects, feel free to check the following article.

Image Classifier – Cats vs Dogs: Leveraging Convolutional Neural Networks (CNNs) and Google Colab’s Free GPU (towardsdatascience.com)

Loss Functions

In order for our Discriminator and Generator to learn over time, we need to provide loss functions that will allow backpropagation to take place.

While the above loss declarations are consistent with the theoretical explanations from the previous chapter, you may notice two extra things:

1. Gaussian noise added to the real input in line 4.
2. One-sided label smoothing for the real images recognized by the Discriminator in line 12.

Training GANs is notoriously hard because there are two loss functions (one for the Generator and one for the Discriminator), and getting the balance between them right is key to good results.

Because it’s very common for the Discriminator to become too strong relative to the Generator, we sometimes need to weaken the Discriminator, and we are doing it with the above modifications.

We’ll cover other techniques of achieving the balance later.
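
The two Discriminator-weakening modifications can be sketched as follows. This is a pure-Python illustration rather than the notebook’s TensorFlow code, and the noise standard deviation (0.1) and smoothing amount (0.1) are hypothetical values chosen for the example, not the author’s settings.

```python
import random

def add_gaussian_noise(pixels, stddev, rng):
    """Return a copy of the pixel values with zero-mean Gaussian noise added."""
    return [value + rng.gauss(0.0, stddev) for value in pixels]

def smoothed_real_labels(batch_size, smoothing=0.1):
    """One-sided smoothing: real targets become 1 - smoothing; fakes stay 0."""
    return [1.0 - smoothing] * batch_size

rng = random.Random(0)               # fixed seed so the sketch is repeatable
real_pixels = [0.0, 0.5, -0.5, 1.0]  # stand-in for one flattened real image
noisy_pixels = add_gaussian_noise(real_pixels, stddev=0.1, rng=rng)
real_targets = smoothed_real_labels(batch_size=4)
fake_targets = [0.0] * 4             # fake labels are left untouched
print(real_targets, fake_targets)
```

Both tricks make the Discriminator’s job slightly harder: the noise blurs the real inputs, and the softened targets discourage it from becoming overconfident about real images.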

Optimizers

We are going to optimize our models with the following Adam optimizers.

Similarly to the declarations of the loss functions, we can also balance the Discriminator and the Generator with appropriate learning rates.

LR_D = 0.00004
LR_G = 0.0004
BETA1 = 0.5

As the above hyperparameters are very use-case specific, don’t hesitate to tweak them, but also remember that GANs are very sensitive to learning rate modifications, so tune them carefully.
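
To show what these values control, here is a minimal single-parameter Adam update in pure Python. It is a sketch of the standard Adam rule, not the notebook’s optimizer setup; beta2 = 0.999 and eps = 1e-8 are the commonly used defaults and are assumed here, since the article only lists the learning rates and BETA1.

```python
import math

LR_D = 0.00004
LR_G = 0.0004
BETA1 = 0.5

def adam_step(param, grad, m, v, t, lr, beta1=BETA1, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m and v are running averages of the gradient and the squared gradient;
    t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Same gradient, different learning rates: the Generator moves about 10x further.
d_param, _, _ = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1, lr=LR_D)
g_param, _, _ = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1, lr=LR_G)
print(1.0 - d_param, 1.0 - g_param)
```

With the Discriminator’s learning rate ten times smaller than the Generator’s, each Discriminator update is correspondingly smaller, which is one more way of keeping it from overpowering the Generator.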

Training

Finally, we can begin training.

The above function contains a standard machine learning training procedure.

We are dividing our dataset into batches of a specific size and performing training for a given number of epochs.

The core training part is in lines 20–23, where we train the Discriminator and the Generator.

As with the loss functions and learning rates, this is another place where we can balance the Discriminator and the Generator.

Some researchers found that modifying the ratio between Discriminator and Generator training runs may benefit the results.

In my case, a 1:1 ratio performed the best, but feel free to play with it as well.
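
The batching and the Discriminator-to-Generator ratio can be sketched as a plain loop. The callbacks train_discriminator and train_generator are hypothetical placeholders for one optimizer step each; they are not functions from the notebook, and here they default to no-ops so the sketch runs on its own.

```python
def iterate_batches(samples, batch_size):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

def run_training(samples, batch_size, epochs, d_steps=1, g_steps=1,
                 train_discriminator=None, train_generator=None):
    """Alternate Discriminator and Generator updates on every batch.

    Returns how many times each (hypothetical) training callback was invoked.
    """
    train_discriminator = train_discriminator or (lambda batch: None)
    train_generator = train_generator or (lambda batch: None)
    counts = {"discriminator": 0, "generator": 0}
    for epoch in range(epochs):
        for batch in iterate_batches(samples, batch_size):
            for _ in range(d_steps):
                train_discriminator(batch)
                counts["discriminator"] += 1
            for _ in range(g_steps):
                train_generator(batch)
                counts["generator"] += 1
    return counts

# 10 stand-in samples, batches of 4 -> 3 batches per epoch, 2 epochs.
counts = run_training(list(range(10)), batch_size=4, epochs=2)
print(counts)  # {'discriminator': 6, 'generator': 6}
```

Changing d_steps or g_steps is all it takes to try a ratio other than 1:1, for example d_steps=2 to give the Discriminator two updates per Generator update.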

Moreover, I have used the following hyperparameters, but they are not set in stone, so don’t hesitate to modify them.

IMAGE_SIZE = 128
NOISE_SIZE = 100
BATCH_SIZE = 64
EPOCHS = 300

It’s very important to regularly monitor the model’s loss functions and its performance.

I recommend doing it every epoch, like in the code snippet above.

Let’s see some samples that were generated during training.

We can clearly see that our model gets better and learns how to generate more real-looking Simpsons.

Let’s focus on the main character, the man of the house, Homer Simpson.

Homer Simpson evolving over time

Final Results

Ultimately, after 300 epochs of training that took about 8 hours on an NVIDIA P100 (Google Cloud), we can see that our artificially generated Simpsons actually started looking like the real ones!

Take a look at the following cherry-picked samples.

As expected, there were some funny-looking malformed faces as well.

What’s next?

While GAN image generation proved to be very successful, it’s not the only possible application of Generative Adversarial Networks.

For example, take a look at the following Image-to-Image translation with CycleGAN.

(source: https://junyanz.github.io/CycleGAN/)

Amazing, right?

I encourage you to dive deeper into the GANs field as there is still more to explore!

Don’t forget to check the project’s GitHub page.

gsurma/image_generator: DCGAN image generator (github.com)

Questions? Comments? Feel free to leave your feedback in the comments section or contact me directly at https://gsurma.github.io.
