# SSD: Single Shot MultiBox Detector

## Summary

**SSD** is an adaptation of **YOLO** to support prior boxes. Prior boxes (called default boxes in the paper) are pre-calculated boxes with different aspect ratios and scales.
At prediction time, **SSD** predicts, for each prior box, the class scores and the offsets that adjust the box to the object. SSD also uses multiple feature maps of different resolutions to achieve better performance.

Model | Mean average precision (%) | FPS | Input size |
---|---|---|---|
Faster R-CNN | 73.2 | 7 | 1000x600 |
YOLO (VGG-16) | 66.4 | 21 | 448x448 |
SSD512 | 76.8 | 22 | 512x512 |
SSD300 | 74.3 | 59 | 300x300 |
Fast-YOLO | 52.7 | 155 | 448x448 |

##### Faster R-CNN works on any input size

## Model

The model uses **VGG-16** as its base network. It then produces its predictions from several feature maps of decreasing resolution.
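To give a sense of how much output the multiple feature maps produce, here is a small sketch that counts the default boxes of SSD300. The feature-map sizes and the number of boxes per location are the configuration reported in the paper, not something stated in these notes; the function names are made up for illustration.

```python
# (feature-map side, default boxes per location) for SSD300, as reported
# in the paper: conv4_3, fc7, conv8_2, conv9_2, conv10_2, conv11_2.
SSD300_FEATURE_MAPS = [
    (38, 4),
    (19, 6),
    (10, 6),
    (5, 6),
    (3, 4),
    (1, 4),
]


def count_default_boxes(feature_maps):
    """Total number of default boxes over all feature maps."""
    return sum(side * side * boxes for side, boxes in feature_maps)


def outputs_per_location(boxes_per_location, num_classes):
    """Each box predicts `num_classes` scores plus 4 box offsets."""
    return boxes_per_location * (num_classes + 4)


total = count_default_boxes(SSD300_FEATURE_MAPS)
print(total)  # 8732
```

With 21 classes (20 object classes plus background, an assumption matching Pascal VOC), a location with 6 boxes produces `outputs_per_location(6, 21)` = 150 values.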

It’s a more complex model than **YOLO**, but SSD300 is faster because its input size is smaller.

Using atrous convolution speeds up the model by about 20%.

### Default boxes

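Default boxes are similar to the anchor boxes from Faster R-CNN. In the paper they are not learned from the training set: they are tiled over every location of each feature map, with a scale that grows linearly from the shallowest to the deepest feature map and a fixed set of aspect ratios. They help the network predict boxes with the right aspect ratio.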
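A minimal sketch of how default boxes can be generated for one feature map, assuming the scheme described in the paper: scales linearly spaced between `s_min = 0.2` and `s_max = 0.9`, one box per aspect ratio, plus one extra square box at scale `sqrt(s_k * s_{k+1})`. The function names and the choice of aspect ratios here are illustrative, not taken from a reference implementation.

```python
import math


def scale_for_layer(k, m, s_min=0.2, s_max=0.9):
    """Scale of the k-th feature map (1-indexed) out of m, linearly spaced."""
    return s_min + (s_max - s_min) * (k - 1) / (m - 1)


def default_boxes(fmap_size, scale, next_scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) boxes in coordinates relative to the image."""
    # One shape per aspect ratio a: width = s * sqrt(a), height = s / sqrt(a).
    shapes = [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]
    # Extra square box at the geometric mean of this scale and the next one.
    extra = math.sqrt(scale * next_scale)
    shapes.append((extra, extra))

    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            # Box centers sit in the middle of each feature-map cell.
            cx = (j + 0.5) / fmap_size
            cy = (i + 0.5) / fmap_size
            for w, h in shapes:
                boxes.append((cx, cy, w, h))
    return boxes


m = 6  # number of feature maps used for prediction (assumed)
s1 = scale_for_layer(1, m)
s2 = scale_for_layer(2, m)
boxes = default_boxes(fmap_size=3, scale=s1, next_scale=s2)
print(len(boxes))  # 3 * 3 * 4 = 36
```

Boxes near the border can extend past the image; whether to clip them is a design choice left out of this sketch.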

## Loss

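The loss function is similar to **YOLO**’s loss function: a weighted sum of a localization term and a confidence term. Instead of a fixed number of detections per cell, the network predicts one box per prior box. Every prior box whose Jaccard overlap (IoU) with a ground-truth box is greater than 0.5 counts as a positive match and contributes to the localization loss. Because several prior boxes can match the same object, this allows multiple predictions per cell.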
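The matching rule can be sketched as follows. This is an illustration, not the authors' code: boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples, `match_priors` is a hypothetical helper, and the second step (each ground-truth box always keeps its best prior, even below the threshold) follows the paper's description rather than these notes.

```python
def jaccard(box_a, box_b):
    """Jaccard overlap (IoU) of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_priors(priors, ground_truths, threshold=0.5):
    """For each prior, return the index of its matched ground truth or -1."""
    matches = []
    # Step 1: a prior is positive if its best overlap exceeds the threshold.
    for prior in priors:
        best_gt, best_iou = -1, threshold
        for g, gt in enumerate(ground_truths):
            iou = jaccard(prior, gt)
            if iou > best_iou:
                best_gt, best_iou = g, iou
        matches.append(best_gt)
    # Step 2: every ground truth keeps its single best prior.
    if priors:
        for g, gt in enumerate(ground_truths):
            best_p = max(range(len(priors)), key=lambda p: jaccard(priors[p], gt))
            matches[best_p] = g
    return matches


priors = [(0.0, 0.0, 0.5, 0.5), (0.1, 0.1, 0.6, 0.6), (0.5, 0.5, 1.0, 1.0)]
ground_truths = [(0.05, 0.05, 0.55, 0.55)]
matches = match_priors(priors, ground_truths)
print(matches)  # [0, 0, -1]
```

The first two priors both overlap the object by more than 0.5, so both are positives for the same ground truth; the third is a negative and would only contribute to the confidence term.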