Deep Learning Model Training Loop

Implementing a simple neural network training loop with Python, PyTorch, and TorchVision.

Several months ago I started exploring PyTorch, a fantastic and easy-to-use Deep Learning framework. This time I would like to focus on a topic essential to any Machine Learning pipeline: the training loop. The PyTorch framework provides all the fundamental tools to build a machine learning model. Nevertheless, I believe that every once in a while most software engineers feel a strong desire to implement things "from scratch" to better understand the underlying processes and to gain skills that do not depend on a particular implementation or high-level library. In the next sections, I am going to show how one can implement a simple but useful training loop using the pytorch and torchvision packages.

TL;DR: Please follow this link to get right into the repository where you can find the source code discussed in this post. Also, here is a link to the notebook that contains the whole implementation in a single place, as well as additional information not included in the post to keep it concise.

Out-of-the-Box Solutions

As noted, there are high-level wrappers built on top of the framework that simplify the model training process a lot. In order of increasing complexity, from minimalistic to very involved:

- Ignite: an official high-level interface for PyTorch
- Torchsample: a Keras-like wrapper with callbacks, augmentation, and handy utils
- Skorch: a scikit-learn compatible neural network library
- fastai: a powerful end-to-end solution to train Deep Learning models of various complexity with high accuracy and computation speed

The main benefit of high-level libraries is that instead of writing custom utils and wrappers to read and prepare the data, one can focus on the data exploration process itself. There is no need to hunt for bugs in your own code: hard-working maintainers improve the library and are ready to help if you run into issues. There is no need to implement custom data augmentation tools or training parameter scheduling; everything is already there.

Using a well-maintained library is a no-doubt choice if you are developing production-ready code, or participating in a data science competition where you need to search for the best model rather than sit with a debugger trying to figure out where a memory error comes from. If that is not your case, let's proceed to the next section and start reinventing the wheel!

The Core Implementation

A very basic implementation of the training loop is not that difficult. To keep the post concise, we are going to describe only a couple of its components here and move the rest into a Jupyter notebook.

Loss

The very first thing that comes to mind when talking about Machine Learning model training is a loss function.

Scheduler

Modern neural network training algorithms do not use fixed learning rates. Recent papers (one, two, and three) show an educated approach to tuning the training parameters of Deep Learning models. The idea of this schedule is to use a single cycle of learning rate increase and decrease over the whole training process, as the following picture shows.

[Image: One-cycle policy scheduler]

At the very beginning of the training process the model weights are not optimal yet, so we can allow ourselves larger update steps (i.e., higher learning rates) without the risk of skipping over optimal values. After a few training epochs the weights become better and better tailored to our dataset, so we slow down the learning pace and explore the loss surface more carefully. The One-Cycle Policy has a quite straightforward implementation if we use the previously shown class; you can find a fully functional version in the aforementioned Jupyter notebook.

Stream Logger

The last thing we would like to add is some logging to see how well our model performs during the training process.
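To make the core loop concrete, here is a minimal sketch of a PyTorch training loop of the kind discussed above. It is an illustration built on standard PyTorch conventions, not the repository's exact code; the `fit` name and its parameter list are my own.

```python
import torch


def fit(model, optimizer, loss_fn, train_loader, epochs=1, device="cpu"):
    """Minimal training loop: one optimization step per batch."""
    model.to(device)
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()             # reset gradients from the previous step
            outputs = model(inputs)           # forward pass
            loss = loss_fn(outputs, targets)  # compare predictions with targets
            loss.backward()                   # backpropagate
            optimizer.step()                  # update the weights
            running_loss += loss.item()
        print(f"epoch {epoch + 1}: loss={running_loss / len(train_loader):.4f}")
```

A real loop would also track validation metrics and invoke callbacks (scheduler, logger) at the right moments; those pieces are discussed below.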
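As a concrete example of the loss step for an image-classification setting, here is standard PyTorch cross-entropy usage on a batch of class scores (an illustrative snippet, not code from the notebook):

```python
import torch
import torch.nn.functional as F

# Raw, unnormalized class scores (logits) for a batch of 3 samples, 5 classes
logits = torch.randn(3, 5)
targets = torch.tensor([1, 0, 4])  # ground-truth class indices

# cross_entropy combines log_softmax and the negative log-likelihood loss,
# averaging over the batch by default
loss = F.cross_entropy(logits, targets)
```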
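The one-cycle schedule described above can be sketched as a small scheduler class. This is a simplified linear warm-up/cool-down version; the `OneCyclePolicy` name and interface are my own assumptions, not the exact class from the notebook. Recent PyTorch releases also ship `torch.optim.lr_scheduler.OneCycleLR`, which implements the full policy including momentum cycling.

```python
class OneCyclePolicy:
    """Linearly increases the learning rate to lr_max over the first half of
    the cycle, then linearly decreases it back to lr_min over the second half."""

    def __init__(self, optimizer, lr_max, total_steps, lr_min=None):
        self.optimizer = optimizer
        self.lr_max = lr_max
        self.lr_min = lr_min if lr_min is not None else lr_max / 10
        self.total_steps = total_steps
        self.step_num = 0

    def lr_at(self, step):
        half = self.total_steps / 2
        if step <= half:
            t = step / half                       # warm-up phase: 0 -> 1
        else:
            t = (self.total_steps - step) / half  # cool-down phase: 1 -> 0
        return self.lr_min + (self.lr_max - self.lr_min) * t

    def step(self):
        """Call once per optimization step to update the optimizer's lr."""
        self.step_num += 1
        lr = self.lr_at(self.step_num)
        for group in self.optimizer.param_groups:
            group["lr"] = lr
```

Calling `step()` after every optimizer update makes the learning rate trace out the single triangular cycle shown in the figure.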
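A stream logger of the sort mentioned can be as simple as a class that formats metrics and writes them to an output stream. The `StreamLogger` interface below is hypothetical and shown only to illustrate the idea:

```python
import sys


class StreamLogger:
    """Writes formatted training metrics to a stream (stdout by default)."""

    def __init__(self, stream=None,
                 fmt="epoch {epoch:3d} | loss {loss:.4f} | acc {acc:.2%}"):
        self.stream = stream if stream is not None else sys.stdout
        self.fmt = fmt

    def log(self, **metrics):
        # Format the metrics and flush so output appears during training
        self.stream.write(self.fmt.format(**metrics) + "\n")
        self.stream.flush()
```

For example, `StreamLogger().log(epoch=1, loss=0.532, acc=0.81)` prints one formatted line per epoch; passing a file object instead of stdout redirects the log to disk.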
