So it should work for int8 or fp32 (fp16 ;) math. They also use some "magic" with the biases (not quite clear to me yet).
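My reading of the bias "magic" (it looks like the bias correction of [1]): quantizing the weights shifts the expected pre-activation output, and that shift can be computed and folded back into the bias. A minimal NumPy sketch, assuming a fully connected layer y = Wx + b and a known E[x] (the paper estimates it from batch-norm statistics); `fake_quant` and `correct_bias` are hypothetical names, not from the paper's code:

```python
import numpy as np

def fake_quant(w, num_bits=8):
    # Symmetric per-tensor quantizer, purely for illustration.
    scale = np.max(np.abs(w)) / (2 ** (num_bits - 1) - 1)
    return np.round(w / scale) * scale

def correct_bias(W, b, expected_input):
    # Weight quantization error: eps = Q(W) - W.
    eps = fake_quant(W) - W
    # For y = W x + b, quantizing W shifts the expected output by
    # eps @ E[x]; subtracting that shift from the bias cancels it.
    return b - eps @ expected_input
```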
Is it possible to:
1. just get these quantized models? (as "Pixels" use Qualcomm chips)
2. reproduce these results in TF + Cloud + ImageNet/Pascal VOC? (maybe the TF Lite team can help; a conversion sketch follows below)
I'll try a few experiments in TFjs + "daisy tests" (I like option 2 too).
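For option 2, TF Lite's post-training quantization path could be a starting point. A minimal sketch, assuming TF 2.x, a MobileNet SavedModel at a hypothetical path, and stand-in calibration data (a real run would feed preprocessed ImageNet/Pascal VOC images):

```python
import numpy as np
import tensorflow as tf

# Stand-in calibration batch; replace with real preprocessed images.
calibration_images = np.random.rand(100, 1, 224, 224, 3).astype(np.float32)

def representative_data_gen():
    # A few hundred samples are typically enough for the converter
    # to calibrate activation ranges.
    for image in calibration_images:
        yield [image]

# "mobilenet_saved_model" is a hypothetical path to a float MobileNet.
converter = tf.lite.TFLiteConverter.from_saved_model("mobilenet_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

with open("mobilenet_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite model could then be scored against the float baseline on the same validation set to measure the quantization gap.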
Newest links from the top of Google Scholar:

[1] "Data-Free Quantization Through Weight Equalization and Bias Correction", Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling. https://arxiv.org/abs/1906.04721
[2] "Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization", Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman. https://arxiv.org/abs/1902.01917
[3] "A Quantization-Friendly Separable Convolution for MobileNets", Tao Sheng, Chen Feng, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Mickey Aleksic. https://arxiv.org/abs/1803.08607
[4] "Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets", Penghang Yin, Jiancheng Lyu, Shuai Zhang, Stanley Osher, Yingyong Qi, Jack Xin. https://arxiv.org/abs/1903.05662