Data-Free Quantization. Comments

The Qualcomm team gets amazing results in [1] (MobileNetV2 on ImageNet classification, MobileNetV2 SSD-Lite on the Pascal VOC object detection challenge).
   
"3.1. Weight tensor channel ranges
The fact that per-channel quantization yields much better performance on MobileNetV2 than per-tensor quantization suggests that, in some layers, the weight distributions differ so strongly between output channels that the same set of quantization parameters cannot be used to quantize the full weight tensor effectively"
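
To see why, here is a minimal NumPy sketch (the helper name quant_dequant is mine, not from the paper): with one scale for the whole tensor, channels whose ranges are orders of magnitude smaller than the largest one get rounded away, while per-channel scales preserve them.

```python
import numpy as np

def quant_dequant(w, num_bits=8):
    # Symmetric uniform quantization to num_bits, then dequantize.
    scale = np.max(np.abs(w)) / (2 ** (num_bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
# Toy weight tensor: 4 output channels with wildly different ranges,
# mimicking the skewed per-channel distributions reported in [1].
W = rng.standard_normal((4, 64)) * np.array([[0.01], [0.1], [1.0], [10.0]])

# Per-tensor: one scale for everything -> the small channels collapse to 0.
err_per_tensor = np.abs(W - quant_dequant(W)).mean(axis=1)
# Per-channel: each output channel gets its own scale.
err_per_channel = np.array([np.abs(w - quant_dequant(w)).mean() for w in W])

print("per-tensor MAE by channel: ", err_per_tensor)
print("per-channel MAE by channel:", err_per_channel)
```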

"approach relies on equalizing the weight ranges in the network by making use of a scale-equivariance property of activation functions"

So it should work for int8 as well as fp32 (or fp16 ;) the math is precision-agnostic. They also do some "magic" with the biases (bias absorption and bias correction), which is not quite clear to me yet.
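
As far as I can tell, the bias "magic" is mostly bias correction: the weight quantization error eps = Q(W) - W shifts the expected layer output by eps @ E[x], and that shift can be subtracted from the layer bias. The paper estimates E[x] analytically from the preceding BatchNorm parameters; in the toy sketch below I simply assume E[x] is given (bias_correct and quant_dequant are my names):

```python
import numpy as np

def quant_dequant(w, num_bits=8):
    # Symmetric uniform quantization to num_bits, then dequantize.
    scale = np.max(np.abs(w)) / (2 ** (num_bits - 1) - 1)
    return np.round(w / scale) * scale

def bias_correct(W, b, mean_x, num_bits=8):
    # Quantize W, then fold the expected output error eps @ E[x]
    # into the bias so the quantized layer is unbiased on average.
    Wq = quant_dequant(W, num_bits)
    eps = Wq - W
    return Wq, b - eps @ mean_x

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 8))
b = rng.standard_normal(4)
mean_x = rng.standard_normal(8)  # in [1] this comes from BN statistics

Wq, bc = bias_correct(W, b, mean_x)
# At x = E[x] the corrected layer matches the fp32 layer exactly:
print(np.allclose(Wq @ mean_x + bc, W @ mean_x + b))  # True
```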

Is it possible:
1. to just get these quantized models (since "Pixel" phones use Qualcomm chips)?
2. to reproduce these results in TF + Cloud + ImageNet/Pascal VOC (maybe the TF Lite team can help)?

I'll try a few experiments in TFjs + "daisy tests" (I like option 2 too).

Newest links from the top of Google Scholar:

[1] "Data-Free Quantization Through Weight Equalization and Bias Correction"
Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling
https://arxiv.org/abs/1906.04721

[2] "Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization"
Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman
https://arxiv.org/abs/1902.01917

[3] "A Quantization-Friendly Separable Convolution for MobileNets"
Tao Sheng, Chen Feng, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Mickey Aleksic
https://arxiv.org/abs/1803.08607

[4] "Understanding straight-through estimator in training activation quantized neural nets"
P Yin, J Lyu, S Zhang, S Osher, Y Qi, J Xin

TFjs notes (updated 4 Dec 2019)