Review: ResNeXt — 1st Runner Up in ILSVRC 2016 (Image Classification)

ResNeXt became the 1st Runner Up in the ILSVRC 2016 classification task.

ILSVRC 2016 Classification Ranking: http://image-net.org/challenges/LSVRC/2016/results#loc

Residual Block in ResNet (Left), A Block of ResNeXt with Cardinality = 32 (Right)

Compared with ResNet (the winner of ILSVRC 2015, 3.57%) and PolyNet (2nd Runner Up, 3.04%, team name CU-DeepLink), ResNeXt achieved a 3.03% top-5 error rate, which is a large relative improvement of about 15% over ResNet! It was published in CVPR 2017 and had already received over 500 citations while I was writing this story.

Aggregated Transformations

A Block of ResNeXt with Cardinality = 32 (Left), and Its Generic Equation (Right)

In contrast to “Network-in-Network”, it is “Network-in-Neuron”, which expands along a new dimension.

Relationship with Inception-ResNet and ResNet

For comparison, the above 3 blocks have the SAME INTERNAL DIMENSIONS within each block.

ResNet Block (Right)

Conv1×1–Conv3×3–Conv1×1 are done on the convolution path, which is the bottleneck design suggested in ResNet. Thus, the convolution path learns the residual representation.

Inception-ResNet Block (Middle)

This is suggested in Inception-v4 to combine the Inception module and the ResNet block. Conv1×1 is used to restore the dimensions from 128 to 256. Finally, the output is added to the skip connection path.

ResNeXt Block (Left)

Conv1×1–Conv3×3–Conv1×1 are done on each convolution path. If we sum up the dimensions of all the Conv3×3 layers (i.e. d×C=4×32), the total is also 128. On each path the dimension is increased directly from 4 to 256; the path outputs are then added together and also added to the skip connection path. Compared with Inception-ResNet, which needs to increase the dimension from 4 to 128 and then to 256, ResNeXt requires minimal extra effort in designing each path. Compared with ResNet, ResNeXt is a wider but sparsely connected module.

Importance of Cardinality

Ablation Study for Different Settings of 2× Complexity Models

ResNet-200: 21.7% top-1 and 5.8% top-5 error rates.

ResNet-101, wider: only obtains 21.3% top-1 and 5.7% top-5 error rates, which means that only making the network wider does not help much.

ResNeXt-101 (2×64d): By just making C=2 (i.e. two convolution paths within the ResNeXt block), an obvious improvement is already obtained, with 20.7% top-1 and 5.5% top-5 error rates.

ResNeXt-101 (64×4d): By making C=64 (i.e. sixty-four convolution paths within the ResNeXt block), an even better improvement is obtained, with 20.4% top-1 and 5.3% top-5 error rates.

These results are on the dataset that is also used for the ILSVRC classification task.

With standard-size images used for single-crop testing, ResNeXt-101 obtains 20.4% top-1 and 5.3% top-5 error rates.

With larger-size images used for single-crop testing, ResNeXt-101 obtains 19.1% top-1 and 4.4% top-5 error rates, which is better than all state-of-the-art approaches: ResNet, Pre-Activation ResNet, Inception-v3, Inception-v4 and Inception-ResNet-v2.
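As a rough check on the block comparison in the "Relationship with Inception-ResNet and ResNet" part, the weight counts of the two blocks can be worked out by hand. The sketch below is only an illustration: the function names are made up for this example, biases and batch-norm parameters are ignored, and the 64-channel width of the ResNet bottleneck is an assumption taken from the ResNeXt paper's figure rather than from the text above.

```python
def bottleneck_path_params(in_ch, width, out_ch, kernel=3):
    """Weights in one Conv1x1 -> Conv3x3 -> Conv1x1 path (no biases, no BatchNorm)."""
    reduce_1x1 = in_ch * width                    # Conv1x1: in_ch -> width
    spatial_3x3 = kernel * kernel * width * width # Conv3x3: width -> width
    expand_1x1 = width * out_ch                   # Conv1x1: width -> out_ch
    return reduce_1x1 + spatial_3x3 + expand_1x1


def resnext_block_params(in_ch, d, cardinality, out_ch):
    """C parallel paths of width d; their outputs are summed, so the weights simply add up."""
    return cardinality * bottleneck_path_params(in_ch, d, out_ch)


# ResNet bottleneck: 256 -> 64 -> 64 -> 256 (width 64 is an assumption, see lead-in).
resnet_params = bottleneck_path_params(256, 64, 256)

# ResNeXt block from the text: C = 32 paths, each of width d = 4.
resnext_params = resnext_block_params(256, d=4, cardinality=32, out_ch=256)

# Total Conv3x3 width across all paths, d x C = 4 x 32.
total_width = 4 * 32

print("ResNet bottleneck weights:", resnet_params)
print("ResNeXt (32x4d) block weights:", resnext_params)
print("Summed Conv3x3 width:", total_width)
```

Under these assumptions both blocks come out at roughly 70k weights, so the ResNeXt block spends a similar budget on many narrow paths instead of one wide path.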
