CNN translation/scaling invariance



Press the "→" or "↓" button to run a horizontal or vertical scan (the scan step is one pixel). Press the "↔" button to test scaling invariance. You can set a new number of test points in the "it" field.
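A scan is just a sequence of regions, each passed to the network in turn. A minimal sketch of how such sequences could be generated is shown below. The names Box, translation_scan and scaling_scan are hypothetical, and growing the box around a fixed centre is an assumption; the demo may scale the region differently.

```python
from typing import Iterator, NamedTuple


class Box(NamedTuple):
    """Hypothetical region description: top-left corner plus size, in pixels."""
    x: int
    y: int
    w: int
    h: int


def translation_scan(box: Box, steps: int, dx: int = 1, dy: int = 0) -> Iterator[Box]:
    """Yield `steps` boxes, the i-th one shifted by (i*dx, i*dy) pixels."""
    for i in range(steps):
        yield Box(box.x + i * dx, box.y + i * dy, box.w, box.h)


def scaling_scan(box: Box, steps: int, grow: int = 1) -> Iterator[Box]:
    """Yield `steps` boxes, the i-th one enlarged by i*grow pixels on every side.

    Assumption: the centre of the box stays fixed while it grows.
    """
    for i in range(steps):
        d = i * grow
        yield Box(box.x - d, box.y - d, box.w + 2 * d, box.h + 2 * d)
```

With dx=1, dy=0 this corresponds to the "→" scan, with dx=0, dy=1 to the "↓" scan, and `steps` plays the role of the "it" value.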

Drag the mouse (from the top to the right) to mark a new region for inference. Use the A, S, D, W keys to translate the bounding box and J, K, L, I to enlarge it (hold the Shift key to move 10x faster). This region of the source image (see its position, width and height in the console) is scaled to a 224x224 image and shown on the right.
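The scaling of the selected region to the 224x224 network input can be sketched as follows. This is only an illustration: it uses nearest-neighbour sampling on a nested list of pixel values, while the demo most likely relies on the browser or TensorFlow.js for (probably smoother) resampling. The function name crop_and_resize here is hypothetical.

```python
def crop_and_resize(image, x, y, w, h, out_size=224):
    """Resample the region (x, y, w, h) of `image` to out_size x out_size.

    `image` is a list of rows, each row a list of pixel values.
    Nearest-neighbour sampling is an assumption made to keep the sketch short.
    """
    out = []
    for row in range(out_size):
        src_y = y + (row * h) // out_size  # source row for this output row
        out.append([image[src_y][x + (col * w) // out_size]
                    for col in range(out_size)])
    return out


# Tiny usage example: a 4x4 "image" whose pixel value encodes its position.
image = [[10 * r + c for c in range(4)] for r in range(4)]
patch = crop_and_resize(image, x=1, y=1, w=2, h=2, out_size=4)
```

Shifting the region by one source pixel changes which source pixels land on which output pixels, which is one reason the network input (and hence the probabilities) can change noticeably between neighbouring scan positions.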

Open a new browser window (or tab) and copy and paste (Ctrl+V) images from image.google.com, unsplash.com, ... You can also copy images from your PC (after flipping, converting to grayscale and much more), e.g. with Photos or Paint.

Test results

In principle convolution filters give translation invariance, but the predicted probabilities are noisy and can show long oscillations.
"→" translation scan
The scaling scan shows large oscillations and noise too.
"↔" MobileNet v2 1 scaling scan

For this drawing the order of the Top-5 classes is much more stable. Reducing the noise requires averaging over a large amount of data.
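One simple way to average is a running mean over neighbouring scan positions. The sketch below is an assumption about what "averaging" could look like, not the method used for the plots; the window size and the function name moving_average are hypothetical.

```python
def moving_average(values, window):
    """Trailing mean: each output is the mean of the last `window` inputs.

    For the first few points, where fewer than `window` inputs exist,
    the mean is taken over the inputs seen so far.
    """
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the value that left the window
        out.append(total / min(i + 1, window))
    return out


# Probability of one class along a scan (made-up numbers for illustration).
probs = [0.2, 0.8, 0.3, 0.7, 0.2, 0.8]
smooth = moving_average(probs, window=2)
```

A wider window suppresses more noise but also smears out genuine changes along the scan, so the window has to stay short compared with the oscillations one wants to see.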

Size does matter

Smaller nets are noisier. The scan below shows that ResNet50 really is scale invariant.
"↔" ResNet50 scaling scan

"Making convolutional networks shift-invariant again"

R. Zhang, arXiv preprint arXiv:1904.11486, 2019

"Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling..."
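The effect described in the quote can be reproduced on a tiny 1D signal. The sketch below compares plain max-pooling (window 2, stride 2) with a dense max followed by a [1, 2, 1]/4 low-pass filter and subsampling. Circular boundaries and this particular filter are assumptions chosen to keep the toy example short; it is not the paper's implementation.

```python
def shift(x, k=1):
    """Circularly shift the signal left by k samples."""
    return x[k:] + x[:k]


def dense_max(x):
    """Max over each pair of neighbours, evaluated at every position (stride 1)."""
    n = len(x)
    return [max(x[i], x[(i + 1) % n]) for i in range(n)]


def max_pool(x):
    """Ordinary max-pooling: window 2, stride 2."""
    return dense_max(x)[::2]


def blur_pool(x):
    """Dense max, then [1, 2, 1]/4 low-pass filter, then subsample by 2."""
    d = dense_max(x)
    n = len(d)
    blurred = [(d[i - 1] + 2 * d[i] + d[(i + 1) % n]) / 4 for i in range(n)]
    return blurred[::2]


signal = [0, 0, 1, 1, 0, 0, 1, 1]
shifted = shift(signal)

print(max_pool(signal), max_pool(shifted))    # [0, 1, 0, 1] [1, 1, 1, 1]
print(blur_pool(signal), blur_pool(shifted))  # [0.5, 1.0, 0.5, 1.0] [0.75, 0.75, 0.75, 0.75]
```

A one-sample shift changes the max-pooled output by up to 1.0, while the low-pass filtered version changes by at most 0.25. The output is still not exactly shift-invariant, but the jump is much smaller.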

More tests and testers are needed. I think "I'll be back" here in a while.


TFjs notes     updated 12 Nov 2019