Dog Breed Identification

The initial red layers get a small learning rate because we don't want to disturb them much, the middle blue layers get a higher learning rate than the initial layers, and the final green layers get the highest learning rate, the one found to be optimal. How much smaller the learning rates for the initial and middle layers should be depends on how closely the data behind the pre-trained model resembles the data for our task. For example, if the task is to build a dog/cat classifier and the pre-trained model is already good at recognizing cats, we can use learning rates of smaller magnitude. If instead the task involves satellite or medical imagery, we would use learning rates of slightly higher magnitude.

Test Time Augmentation (TTA):

During the testing phase, all test images are automatically cropped to a square as they are passed through the model. The reason is a minor technical detail: the GPU cannot process images efficiently when they have different dimensions. This may be fixed in the future, but for now that is the state of the technology. The cropping can cut off characteristics of the image that are crucial for an accurate prediction.

In Test Time Augmentation, we take 4 random data augmentations of each image as well as the un-augmented original (center-cropped). We then calculate predictions for all these images, take the average, and make that our final prediction. This makes it much more likely that the 5 different squares together capture the whole picture.

Result

Using the above practices, I achieved 92.6% accuracy on the validation set, i.e. the pictures that the model had not seen before.

References

fast.ai
Leslie N. Smith. Cyclical Learning Rates for Training Neural Networks. arXiv preprint arXiv:1506.01186
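
The two ideas described in this post, per-layer-group learning rates and Test Time Augmentation, can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used for the result above: the function names are hypothetical, the factor of 10 between layer groups is an assumed value that the post does not specify, and random crops with an optional horizontal flip stand in for whatever augmentations the real pipeline applies. Images are nested lists of pixel values, and `model` is any callable that returns a list of class probabilities.

```python
import random


def discriminative_lrs(base_lr, n_groups=3, factor=10.0):
    """Return one learning rate per layer group, earliest layers first.

    The last group gets base_lr; each earlier group gets a rate `factor`
    times smaller (factor=10 is an assumption for illustration).
    """
    return [base_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]


def center_crop(img, size):
    """Crop a size x size square from the middle of a 2D image."""
    h, w = len(img), len(img[0])
    if size > min(h, w):
        raise ValueError("crop size is larger than the image")
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def random_crop(img, size, rng):
    """Crop a size x size square at a random position, flipped half the time."""
    h, w = len(img), len(img[0])
    if size > min(h, w):
        raise ValueError("crop size is larger than the image")
    top = rng.randint(0, h - size)
    left = rng.randint(0, w - size)
    crop = [row[left:left + size] for row in img[top:top + size]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]  # horizontal flip
    return crop


def tta_predict(model, img, size, n_aug=4, seed=0):
    """Average predictions over the center crop plus n_aug random views."""
    rng = random.Random(seed)
    views = [center_crop(img, size)]
    views += [random_crop(img, size, rng) for _ in range(n_aug)]
    preds = [model(view) for view in views]
    n_classes = len(preds[0])
    return [sum(p[c] for p in preds) / len(preds) for c in range(n_classes)]
```

With the defaults, `discriminative_lrs(1e-2)` gives three increasing rates for the initial, middle, and final layer groups, and `tta_predict` evaluates 5 square views per image (1 center crop plus 4 random ones) before averaging.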
