4 Steps to Start Machine Learning with Computer Vision

If so, you will not get into trouble using Keras, because this framework makes designing deep learning models incredibly simple, even simpler than putting Lego blocks together to build a castle.

Of course, the model architecture depends on the complexity of the problem, the amount of data, and other parameters, but there’s no need to code convolutions or to compute recursive chains of derivatives for backpropagation with Keras.

Just compose your network layer by layer and feed it with data.
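To give a feel for what "layer by layer" means, here is a minimal sketch of a small classifier in Keras. The 4 input features and 3 output classes are assumptions picked to match the shape of the Iris dataset; the layer widths, the optimizer, and the function name `build_classifier` are hypothetical choices for illustration, not a recommended architecture.

```python
def build_classifier(num_features=4, num_classes=3):
    """Stack a few dense layers into a small classifier (illustrative sizes)."""
    # Imported inside the function because TensorFlow is a heavy dependency.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(num_features,)),
        layers.Dense(16, activation="relu"),   # hidden layer, width is arbitrary
        layers.Dense(16, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # class probabilities
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Feeding it with data then comes down to calling `model.fit(x_train, y_train)` with an epoch count and batch size of your choosing, where `x_train` has shape `(n_samples, 4)` and `y_train` holds integer labels.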

Once you grasp the main idea of training neural networks, the next items to learn might include how to properly pick hyperparameters such as learning rate or batch size, which activation function to choose, what Nesterov momentum is, and how the learning rate can be optimized.

To this end, you can create a categorical classification model trained on the Iris dataset, then use a convolutional neural network to train a handwritten digit recognizer on the MNIST dataset.

Don’t be afraid if something goes wrong: training neural networks usually fails on the first attempts.
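If Nesterov momentum sounds abstract, a toy example may help. The sketch below uses the common "look-ahead" formulation, where the gradient is evaluated at the point the current velocity is about to carry you to. The one-dimensional loss, the function names, and the hyperparameter values are all assumptions made for illustration.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


def nesterov_descent(w0, learning_rate=0.1, momentum=0.9, steps=200):
    """Minimize the toy loss with Nesterov momentum, starting from w0."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # Look ahead to where the current velocity would take us...
        lookahead = w + momentum * velocity
        # ...and use the gradient at that point to correct the velocity.
        velocity = momentum * velocity - learning_rate * grad(lookahead)
        w = w + velocity
    return w


w_final = nesterov_descent(w0=0.0)  # should end up very close to 3.0
```

Changing `learning_rate` or `momentum` here is a cheap way to see how these hyperparameters affect convergence before trying them on a real network.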

If everything above seems boring, feel free to create your own dataset.

For example, shoot a few hundred photos of your pets and train your own pet classification model based on VGG-16, Inception, or ResNet architectures, or simply jump to the next step.

2. Create a Convolutional Neural Network from scratch with NumPy

The goal here is to get your hands dirty by coding a convolutional neural network without deep learning frameworks.

This experience is really important, as debugging deep learning models without any understanding of what is inside is similar to playing Russian roulette with your model.

That’s why it is crucial to understand how convolutions work and what backpropagation is, and generally to develop a deeper understanding of deep learning.

Try to manually code all of the techniques that were just parameters in the previous step, and you will appreciate the difference between the high-level Keras framework and low-level programming with NumPy.
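As a starting point, the forward pass of a single convolution can be sketched in a few lines of NumPy. This is a deliberately naive version assuming one channel, stride 1, and no padding; the function name `conv2d_valid` and the example values are hypothetical. Like most deep learning frameworks, it computes cross-correlation, meaning the kernel is not flipped.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` with stride 1 and no padding."""
    image_h, image_w = image.shape
    kernel_h, kernel_w = kernel.shape
    out_h = image_h - kernel_h + 1
    out_w = image_w - kernel_w + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            # Element-wise product of the kernel with the patch under it.
            patch = image[row:row + kernel_h, col:col + kernel_w]
            output[row, col] = np.sum(patch * kernel)
    return output


image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])  # difference along the diagonal
feature_map = conv2d_valid(image, kernel)     # 3x3 map, every entry is -5.0
```

Writing the matching backward pass by hand, which this sketch leaves out, is where most of the learning happens.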

3. Train an object detection model with TensorFlow Object Detection API

Once you’ve filled in the gaps in your knowledge of image classification, we can go deeper into computer vision.

Now we are going to train a Faster R-CNN object detection model based on ResNet, using the TensorFlow Object Detection API on our own dataset of 3 to 10 classes, though the number of classes can be different.

Training an object detector is an interesting task. To start, just clone the TensorFlow models repository from GitHub and follow the installation instructions.

Now it is time to create a dataset (or at least download an existing one).

To start creating a dataset, download a few hundred images for each desired category and annotate all the images manually.

Fortunately, there is no shortage of tools to make this process simple, though labeling images can take up to 80% of the time.

Aside from labeling images, the dataset creation process also requires splitting the data into training and validation subsets and using a script to generate the *.record files that serve as input for both sets.
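The split into training and validation subsets can be as simple as a reproducible shuffle. The sketch below assumes your images are identified by file name; the 80/20 ratio, the seed, the function name `split_dataset`, and the file names are hypothetical. Generating the record files from each subset is a separate step that is not shown here.

```python
import random


def split_dataset(filenames, validation_fraction=0.2, seed=42):
    """Shuffle file names reproducibly and split them into two lists."""
    shuffled = sorted(filenames)            # fixed starting order
    random.Random(seed).shuffle(shuffled)   # same seed gives the same split
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


image_files = [f"image_{i:03d}.jpg" for i in range(100)]  # hypothetical names
train_files, validation_files = split_dataset(image_files)
```

Keeping the seed fixed means the validation set stays the same between runs, so results remain comparable while you tune the model.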

This may sound a little complicated, but there are plenty of step-by-step tutorials to help you get by, and the TF models documentation is really helpful.

We are not going to train the object detection model from scratch because even a tiny model with randomly initialized weights requires a couple of days of training on a GPU.

Instead, we download a pre-trained Faster R-CNN ResNet-101 model from the model zoo and fine-tune it on our data.

If your machine is powerful enough, it is worth running an evaluation in conjunction with training and launching TensorBoard to visualize the training process.

4. Learn how to work with OpenCV

OpenCV is considered to be a universal tool for computer vision problems.

It includes many algorithms for image and video processing.

The source code is written in C++, which makes it run very fast.

Moreover, it has a Python API that makes OpenCV very handy and easy to use.

A good first challenge for understanding OpenCV better is to detect objects in a video using the model from the previous step.
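A rough outline of that video loop might look like the sketch below. It assumes you already have a `detect` callable that wraps your trained model and returns bounding boxes as integer `(x1, y1, x2, y2)` tuples; that callable, along with the function names `iter_frames` and `run_detection`, is hypothetical. The OpenCV calls themselves (`VideoCapture`, `rectangle`, `imshow`, `waitKey`) are standard.

```python
def iter_frames(capture):
    """Yield frames from any object that follows VideoCapture.read()."""
    while True:
        ok, frame = capture.read()
        if not ok:          # end of the video or a read error
            break
        yield frame


def run_detection(video_path, detect):
    """Draw the boxes returned by `detect` on every frame of a video."""
    import cv2  # imported here so iter_frames stays usable without OpenCV

    capture = cv2.VideoCapture(video_path)
    try:
        for frame in iter_frames(capture):
            # `detect` is a hypothetical wrapper around your trained model.
            for (x1, y1, x2, y2) in detect(frame):
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.imshow("detections", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to stop early
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Passing `0` instead of a file path to `cv2.VideoCapture` opens the default camera, which is handy for the live experiments suggested in this step.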

If this was easy, try digging into the OpenCV documentation, where you will find interesting algorithms such as the YOLO v3 object detection model trained on the COCO dataset.

Or use a camera to build a solution that checks whether there is any food in the fridge.

As soon as you finish creating your project, pay attention to its performance.

Don’t forget about refactoring and consider applying threading.

The application should run faster with multiple threads, though some frames may be lost.
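One way to picture this trade-off is a reader thread that pushes frames into a small queue while a worker thread processes them. When the worker falls behind, the oldest frame is dropped so that processing stays close to real time. The sketch below uses plain integers as stand-in frames; the queue size and the function names are assumptions, and a real application would read from a camera and run detection instead.

```python
import queue
import threading


def put_latest(frame_queue, item):
    """Add `item`, dropping the oldest queued entry if the queue is full."""
    try:
        frame_queue.put_nowait(item)
    except queue.Full:
        try:
            frame_queue.get_nowait()      # this is where a frame gets lost
        except queue.Empty:
            pass                          # the worker emptied it meanwhile
        frame_queue.put_nowait(item)      # safe: only one thread adds items


def read_frames(source, frame_queue):
    """Reader thread: forward every frame, then a None end marker."""
    for frame in source:
        put_latest(frame_queue, frame)
    put_latest(frame_queue, None)


def process_frames(frame_queue, results):
    """Worker thread: handle frames until the end marker arrives."""
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        results.append(frame)             # stand-in for the real processing


frame_queue = queue.Queue(maxsize=2)
processed = []
reader = threading.Thread(target=read_frames, args=(range(10), frame_queue))
worker = threading.Thread(target=process_frames, args=(frame_queue, processed))
reader.start()
worker.start()
reader.join()
worker.join()
```

How many frames end up in `processed` varies from run to run, which is exactly the "some frames may be lost" behavior: the order is preserved and the newest frame always gets through, but older ones may be skipped.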
