Image Classification for E-Commerce — Part I

No, it’s still not there! One of the most crucial procedures for getting the sought-after results follows.

Data Cleaning

We want to prepare a homogeneous input data set so that the model does not become biased towards the images of certain micro category IDs at training time and, as a result, always predict those micro categories.

We devised a shell script which converts all the images to .jpeg (both in name and format, excluding .png, .gif, etc.), removes erroneous downloaded files, deletes duplicate files and scales all the images’ dimensions down to 224 x 224 px (if the width or height is greater than 224 px).
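The duplicate removal and size cap described above can be sketched in Python. This is not our shell script, only a minimal illustration under stated assumptions: duplicates are taken to mean byte-identical files, and the resize is assumed to preserve the aspect ratio. The function names remove_duplicates and fit_within are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def remove_duplicates(folder: Path) -> list[Path]:
    """Delete byte-identical files in `folder`, keeping the first by name.

    Assumption: "duplicate" means identical file contents.
    Returns the paths that were deleted.
    """
    seen: dict[str, Path] = {}
    removed: list[Path] = []
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            path.unlink()
            removed.append(path)
        else:
            seen[digest] = path
    return removed


def fit_within(width: int, height: int, limit: int = 224) -> tuple[int, int]:
    """Scale (width, height) down so that neither side exceeds `limit`.

    Assumption: the aspect ratio is preserved. Images that already fit
    are returned unchanged.
    """
    if width <= limit and height <= limit:
        return width, height
    scale = limit / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Hashing file contents rather than comparing file names catches the same image downloaded twice under different names, but it will not catch re-encoded copies of the same picture.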

We chose the image size 224 x 224 px because the top1 and top5 errors are observed to be lower for this setting.

For more details check this out.

In top1 score, you check if the top class predicted (the one having the highest probability) is the same as the target label.

In the case of top5 score, you check if the target label is one of your top 5 predictions (the 5 ones with the highest probabilities).
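To make the two scores concrete, here is a small Python sketch. It assumes each prediction is a plain list of class probabilities and each target is a class index; the helper names are ours and are not part of any library.

```python
def topk_hit(probabilities, target, k):
    """True if `target` is among the k classes with the highest probability."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    return target in ranked[:k]


def topk_error(batch_probabilities, targets, k):
    """Fraction of samples whose target is NOT among the top k predictions."""
    misses = sum(
        not topk_hit(p, t, k) for p, t in zip(batch_probabilities, targets)
    )
    return misses / len(targets)


# Two samples over three classes (made-up numbers).
probs = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3]]
targets = [1, 2]
print(topk_error(probs, targets, k=1))  # 0.5: the second sample misses
print(topk_error(probs, targets, k=2))  # 0.0: class 2 is its second choice
```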

Label Loading

As the name suggests, it’s a process for adjusting the amount of data behind each label so that the image counts of all labels reach a comparable level.

In order to feed the model with the data covering various kinds of product images received from users every day, we performed image augmentations on the existing data set.

Here, the input image has been augmented in various forms (by introducing noise and inclination) to enlarge the scarce training data set.

A Python script was used to mirror each image, rotate it clockwise and anti-clockwise, and create a grainy copy of it.
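The original script is not reproduced here. As a rough illustration of the four augmentations, the sketch below works on a grayscale image held as a list of pixel rows (values 0 to 255). Two simplifications are assumptions on our part: rotations are by 90 degrees, whereas an inclination by a smaller angle would need an imaging library, and the grain is Gaussian noise with a hypothetical sigma parameter.

```python
import random


def mirror(img):
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def rotate_clockwise(img):
    """Rotate by 90 degrees clockwise (simplified stand-in for an inclination)."""
    return [list(col) for col in zip(*img[::-1])]


def rotate_anticlockwise(img):
    """Rotate by 90 degrees anti-clockwise."""
    return [list(col) for col in zip(*img)][::-1]


def add_grain(img, sigma=10.0, seed=None):
    """Add Gaussian noise to every pixel and clip the result to 0..255."""
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


def augment(img, seed=None):
    """Return the four augmented variants of one image."""
    return [
        mirror(img),
        rotate_clockwise(img),
        rotate_anticlockwise(img),
        add_grain(img, seed=seed),
    ]
```

Each source image then contributes four extra training samples in addition to the original.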

We also have to ensure that some Micro Category IDs do not end up with more weight than others and bias the model into predicting incorrect results. This can happen when one micro category has more listed products (with images) than another. For instance, in the Macro Category ‘Inflatable Furniture’, the Micro Category ‘Inflatable Sofa’ has more products with images than the Micro Category ‘Inflatable Couch’.

We used a Python script which multiplies the images in the Micro Category ID folders under the train folder.

Currently, we take the 98th percentile as the multiplication factor (i.e. that one image count, out of all the individual folders’ image counts, at which 98% of the total count is covered).
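One way to read this rule is sketched below. It is an interpretation, not our production script: the percentile is assumed to be the nearest-rank 98th percentile of the per-folder image counts, every folder is assumed to hold at least one image, and the folder names and counts in the example are made up.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def multiplication_factors(folder_counts, pct=98):
    """For each folder, how many times its images must be multiplied so
    that its count reaches the percentile target (never less than 1)."""
    target = percentile_nearest_rank(folder_counts.values(), pct)
    return {
        name: max(1, math.ceil(target / count))
        for name, count in folder_counts.items()
    }


# Hypothetical image counts per Micro Category folder.
counts = {"inflatable_sofa": 400, "inflatable_couch": 100, "inflatable_chair": 250}
print(multiplication_factors(counts))
# {'inflatable_sofa': 1, 'inflatable_couch': 4, 'inflatable_chair': 2}
```

Using a high percentile rather than the maximum keeps a single unusually large folder from dictating the target for everyone else.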

Good to go for the training, finally!

Training the model

Use the main.lua file under the folder downloaded from GitHub to train the model’s layers with our own data set.

th ~/fb.



lua -nClasses 122 -nEpochs 100 -data ~/imageclassification/train/ -save ~/Desktop/imageclassification_c122e100b30t4g1 -batchSize 30 -nThreads 4 -nGPU 1

-nClasses is our number of labels (i.e. Micro Category IDs)
-data is the path of the train folder
-save is the folder path where all the models (i.e. .t7 files) will be saved. A model is generated at the end of each epoch in this new folder, i.e. ~/Desktop/imageclassification_c122e100b30t4g1, and each model builds on the one from the preceding epoch.
-batchSize is the number of images processed together in each training step (one mini-batch)
-nEpochs is the number of full passes over the training data we want our model to run for. We also get the top1 and top5 error at the end of each epoch, which is used to pick the best model.
-nThreads is the number of data-loading threads
-nGPU is the number of GPUs we are going to use for the training

Another parameter is -depth (not used here); hence, by default, we have a ResNet-34 model.

Had it been 50, we would have a ResNet-50 model.

Various other parameters can be used at the time of training as per your convenience and resource availability.

They can be explored by:

th main.lua --help

OKAY! Let’s fire the command!

DONE!

Depending upon your training data set size and system speed, you should be patient and give your student ample time to learn well :)

Training… Training…… Training!!!

main.lua automatically generates two extra models alongside, i.e. model_best.t7 and latest.t7, in the -save folder. model_best.t7 is a copy of the model created at the epoch which had the lowest top1 and top5 errors. latest.t7 is the fully trained model, i.e. the model created by the last epoch.

These two models are not always the same.

In our case, the best model was generated at epoch 99, but the latest model was that from epoch 100. So, we used model_best.t7 for testing.
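The difference between the best and the latest model can be shown with a few lines of Python. This is only an illustration of the bookkeeping, not the code of main.lua; the selection rule (lowest top1 error, ties broken by top5 error) is an assumption, and the error values are invented.

```python
def pick_best_epoch(errors):
    """Return the 1-based epoch with the lowest (top1, top5) error pair.

    `errors` holds one (top1_error, top5_error) tuple per epoch, in order.
    Tuples compare on top1 first, then on top5; the earliest epoch wins ties.
    """
    best_index = min(range(len(errors)), key=lambda i: errors[i])
    return best_index + 1


# Invented per-epoch errors in percent.
errors = [(30.0, 10.0), (12.5, 3.0), (12.5, 2.0), (13.0, 2.5)]
best_epoch = pick_best_epoch(errors)
latest_epoch = len(errors)
print(best_epoch, latest_epoch)  # 3 4
```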

Testing the model

The classify.lua script (in the pretrained/ subfolder of the folder downloaded from GitHub) is used to get the top 5 predictions from model_best.t7 for all our test images. An important thing to note here is that classify.lua derives the prediction labels from imagenet.lua (in the same folder). So, we substitute the old labels in imagenet.lua (names of birds, animals, etc. from the ImageNet database) with our own label values, i.e. Micro Category IDs.

The left image shows the original labels in the imagenet.lua file. The Micro Category IDs which are used as our labels are substituted in the same file, as shown in the right picture.

Now, let us test our best model!

We picked some labels from ‘train’ and copied them into ‘val’ to test the accuracy of our trained model on the same data it was trained on.

The testing command below outputs the top 5 predicted results for each image in ‘val’ along with their predicted probabilities:

for f in ~/imageclassification/val/* ;do ( [ -d $f ] && cd "$f" && echo Entering into $f && th ~/fb.



lua ~/Desktop/imageclassification_c122e100b30t4g1/model_best.t7 $f/* >> ~/Desktop/imageclassification_c122e100b30t4g1.txt); done

The Aftermath

The text file created was analyzed by converting it into an Excel file.

The testing file created by running classify.lua

The Excel file has the above predictions separated into columns (using an R script) for the Original Micro Category ID and Product ID (from the images’ local path) and also for Predicted Micro Category 1 and Predicted Micro Category 1’s probability.

By putting a match check on the predicted and original Micro Category, we observed the predictions to be true in 99.28% of the cases.

These are those cases on which the model was trained.
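The match check can be sketched in Python (we used an R script and Excel, so this is an equivalent illustration rather than our code). It assumes each image path has the form .../val/<Micro Category ID>/<Product ID>.jpeg; the file-name part of that layout is an assumption, and the IDs in the example are made up.

```python
from pathlib import PurePosixPath


def parse_image_path(path):
    """Split an image path into (micro_category_id, product_id).

    Assumes the parent folder is the Micro Category ID and the file name
    without its extension is the Product ID.
    """
    p = PurePosixPath(path)
    return p.parent.name, p.stem


def match_rate(rows):
    """Share of rows whose top prediction equals the original Micro Category.

    `rows` is an iterable of (image_path, predicted_micro_category_id).
    """
    rows = list(rows)
    hits = sum(
        parse_image_path(path)[0] == str(predicted) for path, predicted in rows
    )
    return hits / len(rows)


# Made-up rows: one correct and one incorrect top prediction.
rows = [
    ("/home/user/imageclassification/val/1234/987654.jpeg", "1234"),
    ("/home/user/imageclassification/val/1234/111222.jpeg", "5678"),
]
print(match_rate(rows))  # 0.5
```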

Later, we repeated the testing activity for 70 newly added products (which the trained model had therefore not seen earlier).

In this case, above a threshold of 50% model confidence (probability), which covered 80% of the data, we observed the accuracy as 95.


The Precision is observed as 1.00 and the Sensitivity/Recall as 0.95 for the model.

The confusion matrix for this result is given as:

The matrix shows results for 65 of the 70 cases, excluding 5 cases where the user-uploaded product image was not clear or was irrelevant to this macro category.
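The relation between the confidence threshold, the share of data it covers and the accuracy on that share can be sketched as follows. The record layout (true label, predicted label, confidence) and the sample values are assumptions made for illustration; they are not our test results.

```python
def evaluate_above_threshold(records, threshold=0.5):
    """Return (coverage, accuracy) for predictions above a confidence threshold.

    `records` is a list of (true_label, predicted_label, confidence).
    Coverage is the share of records kept; accuracy is measured on the
    kept records only.
    """
    kept = [r for r in records if r[2] > threshold]
    coverage = len(kept) / len(records)
    accuracy = sum(t == p for t, p, _ in kept) / len(kept) if kept else 0.0
    return coverage, accuracy


# Made-up records: four of five pass the threshold, three of those are right.
records = [
    ("sofa", "sofa", 0.97),
    ("sofa", "sofa", 0.88),
    ("couch", "couch", 0.71),
    ("couch", "sofa", 0.64),
    ("chair", "sofa", 0.35),
]
print(evaluate_above_threshold(records))  # (0.8, 0.75)
```

Raising the threshold usually trades coverage for accuracy, so both numbers should be reported together.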

What follows ahead…

We are working on analyzing the prediction accuracy at the Macro Category level as well.

We intend to employ feature extraction to find similar images among different Micro Categories.

A useful Lua script for this purpose is provided in the pretrained folder of the folder downloaded from GitHub.

We wish to leverage Image classification in identifying banned content on the platform.

Further posts in this series will follow soon to highlight the implementation of the above goals.

I hope this piece helped you expand your knowledge about using ResNet for custom business cases. If you have any questions or comments, I would love to hear from you.

You can reach me at jain.


com

Thanks to Vikram Varshney, Ayush Gupta, Ashutosh Singh and Mukesh Kumar for the extended support.

