Generalization of Inception-ResNet modules. A PolyInception module is represented as a polynomial composition of Inception blocks, which may share parameters inside the same module.
- \(I\) is the identity operator (it passes the module's input through unchanged)
- \(F, G, H\) are inception “blocks”
- Same letter = Shared parameters
- poly-2: \(I + F + F^2\)
- mpoly-2: \(I + F + GF\)
- 2-way: \(I + F + G\)
NOTE: “mpoly-2” and “2-way” modules possess stronger expressive power than “poly-2” but increase parameter size, since \(F\) and \(G\) do not share parameters.
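To make the operator notation concrete, here is a minimal sketch of the three second-order modules, reading \((I + F + F^2)\,x\) as \(x + F(x) + F(F(x))\). The Inception blocks are stood in for by toy scalar functions; the names `poly2`, `mpoly2`, `two_way`, `f` and `g` are illustrative assumptions, not code from the paper.

```python
from typing import Callable

# A "block" here is any function from an input to an output of the same shape.
# Real Inception blocks operate on feature maps; scalars keep the sketch short.
Block = Callable[[float], float]


def poly2(f: Block) -> Block:
    """poly-2: (I + F + F^2) x = x + F(x) + F(F(x)). One block applied twice."""
    def module(x: float) -> float:
        fx = f(x)
        return x + fx + f(fx)
    return module


def mpoly2(f: Block, g: Block) -> Block:
    """mpoly-2: (I + F + GF) x = x + F(x) + G(F(x)). Two distinct blocks, cascaded."""
    def module(x: float) -> float:
        fx = f(x)
        return x + fx + g(fx)
    return module


def two_way(f: Block, g: Block) -> Block:
    """2-way: (I + F + G) x = x + F(x) + G(x). Two distinct blocks, in parallel."""
    def module(x: float) -> float:
        return x + f(x) + g(x)
    return module


# Toy stand-ins for Inception blocks (assumptions for illustration only).
f: Block = lambda x: 2.0 * x
g: Block = lambda x: x + 1.0

print(poly2(f)(1.0))       # 1 + 2 + 4 = 7.0
print(mpoly2(f, g)(1.0))   # 1 + 2 + 3 = 6.0
print(two_way(f, g)(1.0))  # 1 + 2 + 2 = 5.0
```

Note that `poly2` takes a single block, which mirrors the parameter sharing implied by the repeated letter, while `mpoly2` and `two_way` each need two separately parameterized blocks.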
Could go even further:
- poly-3: \(I + F + F^2 + F^3\)
- mpoly-3: \(I + F + GF + HGF\)
- 3-way: \(I + F + G + H\)
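The third-order variants follow the same two patterns, so they can be sketched with two generic helpers: a cascaded form that covers poly-n and mpoly-n, and a parallel form that covers n-way. The helper names `cascaded` and `parallel` and the toy blocks are assumptions for illustration, not part of the paper.

```python
from typing import Callable, Sequence

Block = Callable[[float], float]


def cascaded(blocks: Sequence[Block]) -> Block:
    """I + B1 + B2*B1 + B3*B2*B1 + ...: each block consumes the previous block's output.

    Passing the same block n times gives poly-n; passing n distinct blocks gives mpoly-n.
    """
    def module(x: float) -> float:
        out, h = x, x
        for block in blocks:
            h = block(h)   # next term of the polynomial
            out = out + h  # accumulate it onto the identity path
        return out
    return module


def parallel(blocks: Sequence[Block]) -> Block:
    """I + B1 + B2 + ...: every block sees the module input directly (n-way)."""
    def module(x: float) -> float:
        return x + sum(block(x) for block in blocks)
    return module


# Toy stand-ins for Inception blocks (assumptions for illustration only).
f: Block = lambda x: 2.0 * x
g: Block = lambda x: x + 1.0
h: Block = lambda x: 0.5 * x

poly3 = cascaded([f, f, f])      # I + F + F^2 + F^3
mpoly3 = cascaded([f, g, h])     # I + F + GF + HGF
three_way = parallel([f, g, h])  # I + F + G + H

print(poly3(1.0))      # 1 + 2 + 4 + 8 = 15.0
print(mpoly3(1.0))     # 1 + 2 + 3 + 1.5 = 7.5
print(three_way(1.0))  # 1 + 2 + 2 + 0.5 = 5.5
```

In this sketch the cascaded form reuses the intermediate result `h`, so an order-n module costs n block evaluations rather than recomputing each power from scratch.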
Inception-ResNet vs PolyNet structure
Inception-ResNet is composed of 3 “stages” (A-B-C) that operate on different spatial resolutions.
Experiments and Results
Dataset: ILSVRC (ImageNet - 1000 classes)
Replacing stage B of Inception-ResNet-v2 with PolyInception modules yields larger performance gains than replacing either of the other two stages.
Using mixed PolyInception modules provides the best results.
PolyNet performance scales better with depth than ResNet/Inception-ResNet.
Performance gains over the state of the art (SotA) are on the order of 1% or more.
Take home message: Structural diversity is recommended when deepening a CNN.