Review: DSSD — Deconvolutional Single Shot Detector (Object Detection)

{1.6, 2.0, 3.0} are used.

Based on all of the above, an ablation study is performed on PASCAL VOC 2007:

[Table: Results on PASCAL VOC 2007]

SSD 321: Original SSD with 321×321 input, 76.4% mAP.

SSD 321 + PM(c): Original SSD using Prediction Module (c), 77.1% mAP, which is better than the variants using PM(b) and PM(d).

SSD 321 + PM(c) + DM(Eltw-prod): DM means Deconvolutional Module, so this is DSSD using PM(c) with element-wise product for feature combination, 78.6% mAP. It slightly outperforms the variant using element-wise addition.

SSD 321 + PM(c) + DM(Eltw-prod) + Stage 2: With two-stage training, the performance decreases.

5. Results

5.1. PASCAL VOC 2007

SSD and DSSD are trained on the union of 2007 trainval and 2012 trainval.

SSD300* and SSD512* (* means the new data augmentation trick is used): With *, the original SSD already outperforms the other state-of-the-art approaches except R-FCN.

SSD 321 and SSD 513: With ResNet as the backbone, the performance is already similar to that of SSD300* and SSD512*.

DSSD 321 and DSSD 513: With the deconvolutional path, they outperform SSD 321 and SSD 513 respectively. In particular, DSSD 513 outperforms R-FCN.

5.2. PASCAL VOC 2012

VOC2007 trainval+test and 2012 trainval are used for training. Since two-stage training was found not to help, one-stage training is used here.

DSSD 513 outperforms the others, with 80.0% mAP, and it is trained without using extra training data from the COCO dataset.

5.3. MS COCO

Again, no two-stage training is used.

SSD300* is already better than Faster R-CNN and ION.

DSSD321 has better AP on small objects: 7.4%, compared with only 6.2% for SSD321.

For the larger model, DSSD513 obtains 33.2% mAP, which is better than R-FCN at 29.9% mAP. It is also already competitive with Faster R-CNN+++. (+++ means it also used VOC2007 and VOC2012 for training.)

5.4. Inference Time

To simplify the network during testing, BN is removed and merged into the preceding conv layer. In brief, the BN effect is folded into the conv layer's weight and bias calculation so that the network is simplified. This improves speed by 1.2 to 1.5 times and reduces memory use by up to three times.

SSD 513 has speed (8.7 FPS) and accuracy similar to R-FCN (9 FPS).
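The BN merging described in 5.4 can be illustrated with a minimal NumPy sketch. It is not the authors' code: it assumes the standard batch-norm form y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta, and the names `fold_bn_into_conv`, `gamma`, `beta`, `mean`, `var`, the value of `eps`, and the 1×1 convolution helper are all illustrative assumptions.

```python
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN statistics into a conv layer's weight and bias.

    w: (out_channels, in_channels, kh, kw), b: (out_channels,)
    gamma, beta, mean, var: (out_channels,)  (hypothetical parameter names)
    """
    factor = gamma / np.sqrt(var + eps)          # per-output-channel scale
    w_folded = w * factor[:, None, None, None]   # scale every filter
    b_folded = (b - mean) * factor + beta        # shift the bias accordingly
    return w_folded, b_folded

def conv1x1(x, w, b):
    # Minimal 1x1 convolution used only to check the fold.
    # x: (in_channels, H, W), w: (out_channels, in_channels, 1, 1)
    return np.einsum("oi,ihw->ohw", w[:, :, 0, 0], x) + b[:, None, None]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    # Inference-time BN with fixed (running) statistics.
    scale = gamma / np.sqrt(var + eps)
    return (y - mean[:, None, None]) * scale[:, None, None] + beta[:, None, None]
```

After folding, a single convolution with `w_folded` and `b_folded` should reproduce the output of the conv followed by BN, which is why the separate BN layer can be dropped at test time.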
