Which Model and How Much Data?

Let’s start with the selection of the correct deep learning model.

Selecting a Baseline Model

The first thing to figure out when exploring an artificial intelligence (AI) problem is whether it is a deep learning problem at all. Many AI scenarios are perfectly addressable using basic machine learning algorithms. However, if the problem falls into the category of “AI-Complete” scenarios such as vision analysis, speech translation, natural language processing or others of a similar nature, then we need to start thinking about how to select the right deep learning model.

Identifying the correct baseline model for a deep learning problem is a complex task that can be segmented into two main parts:

I) Select the core learning algorithm.
II) Select the optimization algorithm that complements the algorithm selected in step I.

The choice of core learning algorithm depends largely on the structure of the training dataset. There is no silver bullet for selecting the right algorithm for a deep learning problem, but the following design guidelines should help in the decision:

a) If the input dataset is based on images or similar topological structures, then the problem can be tackled using convolutional neural networks (CNNs) (see my previous articles about CNNs).
b) If the input is a fixed-size vector, we should be thinking of using a feed-forward network with fully connected layers.
c) If the input is sequential in nature, then we have a problem better suited for recurrent or recursive neural networks.

Those principles are mostly applicable to supervised deep learning algorithms. However, there are plenty of deep learning scenarios that can benefit from unsupervised deep learning models. In scenarios such as natural language processing or image analysis, using unsupervised learning models can be a useful technique to determine relevant characteristics of the input dataset and structure it accordingly.

In terms of the optimization algorithm, you can rarely go wrong using stochastic gradient descent (SGD). Variations of SGD, such as the ones using momentum or learning rate decay, are very popular in the deep learning space. Adam is, arguably, the most popular alternative to plain SGD, especially when combined with CNNs.

Now we have an idea of how to select the right deep learning algorithm for a specific scenario. The next step is to validate the correct structure of the training dataset. We will discuss that in the next part of this article.

Gathering the Right Training Dataset

Structuring a proper training dataset is an essential aspect of building effective deep learning models, but one that is particularly hard to get right. Part of the challenge comes from the intrinsic relationship between a model and the corresponding training dataset. If the performance of a model is below expectations, it is often hard to determine whether the causes are related to the model itself or to the composition of the training dataset. While there is no magic formula for creating the perfect training dataset, there are some patterns that can help.

When confronted with a deep learning model with poor performance, data scientists should determine whether the optimization efforts should focus on the model itself or on the training data. In most real-world scenarios, optimizing a model is far cheaper than gathering additional clean data and retraining the algorithms. From that perspective, data scientists should make sure that the model has been properly optimized and regularized before considering collecting additional data.

Typically, the first rule to consider when a deep learning algorithm is underperforming is to evaluate whether it’s using the entire training dataset.
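
One way to make the model-versus-data decision above concrete is to compare the error on the training set with the error on a held-out validation set. The sketch below is only an illustration of that idea, not a definitive procedure: the function name `suggest_next_step`, the returned labels and the single `target_error` threshold are all assumptions introduced here.

```python
def suggest_next_step(train_error, val_error, target_error):
    """Rough triage for an underperforming model (hypothetical helper).

    All three arguments are error rates between 0 and 1. The returned
    labels are made up for this sketch.
    """
    if train_error > target_error:
        # The model does not even fit the data it already has, so more
        # data is unlikely to help: work on the model and its optimization.
        return "improve_model"
    if val_error > target_error:
        # The model fits the training data but does not generalize:
        # regularize first, and only then consider gathering more data.
        return "regularize_then_consider_more_data"
    # Both errors are within the target.
    return "meets_target"


if __name__ == "__main__":
    print(suggest_next_step(train_error=0.30, val_error=0.35, target_error=0.10))
    print(suggest_next_step(train_error=0.05, val_error=0.25, target_error=0.10))
    print(suggest_next_step(train_error=0.05, val_error=0.08, target_error=0.10))
```

The ordering of the checks reflects the point made in the text: the cheaper fixes on the model side are tried before the expensive step of collecting additional data.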
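
For the optimizer choice discussed under Selecting a Baseline Model, it can help to see the update rules side by side. The following is a minimal sketch, assuming a toy one-dimensional loss f(w) = (w - 3)^2 whose exact gradient stands in for a mini-batch gradient. The function names, hyperparameter values and the inverse-time decay schedule are illustrative assumptions, not recommendations from the article.

```python
import math


def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)


def sgd_momentum(w, steps=300, lr=0.1, momentum=0.9, decay=0.01):
    """Gradient descent with momentum and inverse-time learning rate decay."""
    velocity = 0.0
    for t in range(steps):
        step_lr = lr / (1.0 + decay * t)  # learning rate shrinks over time
        velocity = momentum * velocity - step_lr * grad(w)
        w += velocity
    return w


def adam(w, steps=2000, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam update rule with bias-corrected moment estimates."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


if __name__ == "__main__":
    print("SGD with momentum:", sgd_momentum(0.0))
    print("Adam:", adam(0.0))
```

Both routines start from w = 0 and should end up close to the minimum at w = 3. In a real project these rules would normally come from the optimizer implementations of a deep learning framework rather than being written by hand.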
