A Summary of Deep Network Model Quantization
2022-06-30 01:31:00 【Martin の Blog】
Abstract
For work reasons, I have been doing model quantization for about half a year, and today I want to write a small summary of that work.
I will not explain the principles and calculation methods of quantization here; there is plenty of material about them online. (Much of it promotes SenseTime's quantization tooling ~~)
What follows is my own summary of the quantization work, shared with everyone.
Why models should be quantized
Normally, when we run a model's forward and backward passes, we use devices that support 32-bit computation.
When we save a model, the parameters we get are also stored as 32-bit floating-point values.
A 32-bit model can be very accurate, but it is often very large, and computing over its parameters is comparatively slow.
In the chip industry, however, very few chips support 32-bit computation (as far as I know); at present, chips supporting 8-bit and 16-bit arithmetic are far more common.
To make chips more AI-ready, a model must be able to run inference normally with its parameters in an 8-bit or 16-bit range, while still maintaining acceptable accuracy.
So quantization work is necessary.
The essence of quantization
In essence, quantization is:
a controlled reduction of precision (it can be understood as the precision loss of converting floating point to fixed point);
the process of approximating the continuous-valued floating-point weights of a model, and the tensors flowing through it, by a finite set of discrete values;
a mapping of the 32-bit data range onto data types with fewer bits, such as 8-bit or 16-bit.
Note that the inputs and outputs of the model are still floating-point types (the parameters are quantized and then dequantized).
This reduces the model size, lowers its memory consumption, and accelerates inference.
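To make the mapping described above concrete, here is a minimal NumPy sketch of asymmetric (affine) int8 quantization and dequantization. The function names and the simple min/max calibration are my own illustrative choices, not something taken from this post or any particular toolkit:

```python
import numpy as np

def quantize(x, num_bits=8):
    """Map a float tensor onto the signed integer range of `num_bits`.

    Assumes x is not constant, so the scale is strictly positive.
    """
    qmin = -(2 ** (num_bits - 1))       # e.g. -128 for int8
    qmax = 2 ** (num_bits - 1) - 1      # e.g. +127 for int8
    scale = (x.max() - x.min()) / (qmax - qmin)   # float step per integer step
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return q.astype(np.int8), scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized tensor."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, 0.0, 1.0, 2.0], dtype=np.float32)
q, scale, zp = quantize(x)
x_hat = dequantize(q, scale, zp)
# reconstruction error is bounded by half a quantization step (scale / 2)
```

This is exactly the "quantize then dequantize" round trip mentioned above: the stored tensor is int8, but what the surrounding floating-point layers see is `x_hat`, a float approximation whose error per element is at most half the quantization step.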