I-BERT
2022-07-06 08:57:00 【cyz0202】
Background
This post covers the ICML 2021 paper I-BERT: Integer-only BERT Quantization.
The paper aims to quantize BERT more thoroughly, so that inference uses integer arithmetic only.
The authors argue that earlier quantization schemes do not quantize nonlinear operations such as GELU and Softmax (see Figure 1) and keep them in floating point. This hurts computational efficiency and also prevents deployment on chips that only support integer arithmetic.

The quantization scheme the authors adopt is 8-bit symmetric quantization.
Existing schemes and their shortcomings
The authors mainly address the quantization of two kinds of nonlinear layers, GELU and Softmax.
First, the expression for GELU, where erf is the error function:
$$\mathrm{GELU}(x) := x \cdot \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$$
where
$$\mathrm{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\,dt$$
GELU is hard to quantize directly, and forcing it causes a large loss of accuracy.
Linear layers (matrix multiplication, the piecewise-linear ReLU, and so on) are different: linearity lets the integer result be dequantized back to the floating-point result. The authors' example is MatMul(Sq) = S·MatMul(q), where x = Sq, S is the scale, and q is the quantized value of x. GELU has no such property.
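The linearity argument can be checked with a few lines of plain Python. This is only a sketch under assumptions: `quantize` and `dot` are hypothetical helpers written for this post, and because both operands are quantized here, the final rescale uses the product of the two scales.

```python
def quantize(values, num_bits=8):
    """Symmetric quantization: returns (integers, scale) with values ~= scale * integers."""
    q_max = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    scale = max(abs(v) for v in values) / q_max  # assumes at least one non-zero value
    return [round(v / scale) for v in values], scale


def dot(a, b):
    """Dot product; stays in integers when both inputs are integers."""
    return sum(x * y for x, y in zip(a, b))


x = [0.5, -1.25, 2.0, 0.75]
w = [1.0, 0.5, -0.25, 2.0]

q_x, s_x = quantize(x)
q_w, s_w = quantize(w)

int_result = dot(q_x, q_w)        # integer-only multiply-accumulate
approx = s_x * s_w * int_result   # a single rescale recovers the float result
exact = dot(x, w)

print(int_result, approx, exact)
```

The scales factor out of the sum, which is exactly what fails for GELU: GELU(Sq) cannot be written as a scale times GELU(q).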
Existing approximations of GELU include:
- The sigmoid approximation, $\mathrm{GELU}(x) \approx x\,\sigma(1.702x)$. It still relies on the nonlinear sigmoid, so it does not help integer-only computation.
- The ReLU6 approximation, also known as h-GELU: $\text{h-GELU}(x) := x\,\frac{\mathrm{ReLU6}(1.702x + 3)}{6}$. It can be computed with integers, but it approximates GELU poorly.
The left panel of Figure 2 shows the shortcomings of h-GELU.
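To get a feel for how far the two approximations are from the exact GELU, the sketch below measures the maximum absolute error of each on a grid. The grid range [-4, 4] and the function names are choices made for this illustration, not something taken from the paper.

```python
import math


def gelu(x):
    """Exact GELU using the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid_gelu(x):
    """Sigmoid approximation: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def h_gelu(x):
    """ReLU6 approximation (h-GELU): x * ReLU6(1.702 * x + 3) / 6."""
    relu6 = min(max(1.702 * x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0


# Evaluate on a grid over [-4, 4] with step 0.01 (an arbitrary choice).
grid = [i / 100.0 for i in range(-400, 401)]
max_err_sigmoid = max(abs(gelu(x) - sigmoid_gelu(x)) for x in grid)
max_err_h = max(abs(gelu(x) - h_gelu(x)) for x in grid)

print(f"sigmoid approximation: {max_err_sigmoid:.4f}")
print(f"h-GELU:                {max_err_h:.4f}")
```

On this grid h-GELU has the larger worst-case error of the two, which matches the point made about its shortcomings.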

Solution for GELU
The authors' analysis suggests approximating erf with a second-order polynomial, which in turn gives an approximation of GELU. The fitting problem is
$$\min_{a,b,c}\ \frac{1}{2}\left\| \mathrm{GELU}(x) - x \cdot \frac{1}{2}\left[1 + L\left(\frac{x}{\sqrt{2}}\right)\right] \right\|_2^2 \quad \text{s.t.} \quad L(x) = a(x+b)^2 + c$$
The idea rests on the fact that a function can be approximated by a polynomial fitted to it; such a polynomial is called an interpolating polynomial. See the original paper for details.
Optimizing the problem above directly does not give good results, because the domain of erf is the entire real line.
Note, however, that erf takes values in (-1, 1), saturating at ±1 for large |x|, and that erf is an odd function, namely
$$\mathrm{erf}(-x) = -\mathrm{erf}(x)$$
So the authors fit only the positive half of the real line and extend the result to the negative half, obtaining the following L(x):
$$L(x) = \mathrm{sgn}(x)\left[a\left(\mathrm{clip}(|x|, \max = -b) + b\right)^2 + 1\right]$$
where $a = -0.2888$ and $b = -1.769$.
The max in clip means that |x| is capped at -b.
Therefore $L(x) \in [-1, 1]$, and L is an odd function.
a and b are obtained by solving the fitting problem against GELU.
Putting this together,
$$\text{i-GELU}(x) := x \cdot \frac{1}{2}\left[1 + L\left(\frac{x}{\sqrt{2}}\right)\right]$$
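Before moving to integers, the polynomial approximation can be sanity-checked in floating point. The sketch below uses a = -0.2888 and b = -1.769, the coefficients reported in the I-BERT paper; the function names and the evaluation grid are assumptions made for illustration.

```python
import math

A, B = -0.2888, -1.769  # coefficients reported in the I-BERT paper


def erf_poly(x):
    """L(x) = sgn(x) * [a * (clip(|x|, max=-b) + b)^2 + 1]."""
    sign = (x > 0) - (x < 0)
    clipped = min(abs(x), -B)  # cap |x| at -b, where the parabola reaches its vertex
    return sign * (A * (clipped + B) ** 2 + 1.0)


def i_gelu_float(x):
    """i-GELU(x) = x * 1/2 * [1 + L(x / sqrt(2))], still in floating point."""
    return 0.5 * x * (1.0 + erf_poly(x / math.sqrt(2.0)))


def gelu(x):
    """Exact GELU for comparison."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


grid = [i / 100.0 for i in range(-400, 401)]
max_err = max(abs(gelu(x) - i_gelu_float(x)) for x in grid)
print(f"max |GELU - i-GELU| on [-4, 4]: {max_err:.4f}")
```

Clipping at -b stops the parabola at its vertex, where L equals 1, so the approximation saturates the same way erf does.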
Quantization scheme for i-GELU
With a polynomial expression for GELU in hand, we can start designing the quantization scheme.
L(x) is a polynomial, so we first need to know how to evaluate a polynomial with integers only.
The authors give an algorithm for this, I-POLY, which takes a quantized input (q, S) and returns (q_out, S_out):

One can verify that $q_{out} S_{out} \approx a(x+b)^2 + c$.
So any second-order polynomial can be quantized and dequantized with this algorithm.
(Note: to me this is quantization built for the sake of the computation. The arithmetic works out, but it feels deliberately constructed: q_out and S_out are not necessarily the quantized value and scale one would get by quantizing the polynomial's result directly.)
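As I read the paper, I-POLY precomputes q_b = floor(b/S) and q_c = floor(c/(aS²)), then returns q_out = (q + q_b)² + q_c with S_out = aS². The sketch below follows that reading; treat it as my reconstruction rather than the authors' reference code. The polynomial coefficients in the example are arbitrary illustration values.

```python
import math


def i_poly(q, scale, a, b, c):
    """Integer-only evaluation of a * (x + b)^2 + c for x = scale * q.

    Returns (q_out, scale_out) such that q_out * scale_out ~= a * (x + b)^2 + c.
    q_b and q_c depend only on constants, so they can be precomputed offline;
    at run time only an integer add, multiply and add are needed.
    """
    q_b = math.floor(b / scale)
    q_c = math.floor(c / (a * scale * scale))
    scale_out = a * scale * scale
    q_out = (q + q_b) ** 2 + q_c
    return q_out, scale_out


# Example: evaluate 0.5 * (x + 2)^2 + 1 at x = -0.3 with scale 0.01.
a, b, c = 0.5, 2.0, 1.0
scale = 0.01
q = -30  # x = scale * q = -0.3

q_out, scale_out = i_poly(q, scale, a, b, c)
approx = q_out * scale_out
exact = a * (scale * q + b) ** 2 + c
print(q_out, approx, exact)
```

This also illustrates the note above: S_out is fixed by a and S rather than chosen from the output range, so q_out is generally much wider than 8 bits.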
------
With polynomial quantization in place, the quantization scheme for I-GELU can be built on top of it. The computation is as follows:

The call stack is I-GELU -> I-ERF -> I-POLY.
Note some implementation details in the algorithm of Figure 4, such as the clipping step, which uses max = -b/S.
That max = -b/S may need to be changed to max = round(-b/S); otherwise q' is not guaranteed to be an integer.
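The I-GELU -> I-ERF -> I-POLY chain can be sketched as follows. This is my reconstruction from the paper's description, not the authors' code. Two choices are my own assumptions: the clip bound is made an integer with floor(-b/S), in the spirit of the rounding fix suggested above, and the test scale 4/127 simply maps [-4, 4] onto 8-bit integers.

```python
import math


def i_poly(q, scale, a, b, c):
    """Integer-only a * (x + b)^2 + c; returns (q_out, scale_out)."""
    q_b = math.floor(b / scale)
    q_c = math.floor(c / (a * scale * scale))
    return (q + q_b) ** 2 + q_c, a * scale * scale


def i_erf(q, scale):
    """Integer-only erf approximation via the clipped second-order polynomial."""
    a, b, c = -0.2888, -1.769, 1.0
    q_sgn = (q > 0) - (q < 0)
    q_max = math.floor(-b / scale)   # integer clip bound (assumption: floor)
    q_clipped = min(abs(q), q_max)
    q_l, scale_l = i_poly(q_clipped, scale, a, b, c)
    return q_sgn * q_l, scale_l


def i_gelu(q, scale):
    """Integer-only GELU: x * 1/2 * [1 + erf(x / sqrt(2))]."""
    q_erf, scale_erf = i_erf(q, scale / math.sqrt(2.0))
    q_one = math.floor(1.0 / scale_erf)   # the constant 1 expressed in units of scale_erf
    q_out = q * (q_erf + q_one)
    scale_out = scale * scale_erf / 2.0
    return q_out, scale_out


def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


scale = 4.0 / 127.0
errors = []
for q in range(-127, 128):
    q_out, scale_out = i_gelu(q, scale)
    errors.append(abs(q_out * scale_out - gelu(q * scale)))
max_err = max(errors)
print(f"max |I-GELU - GELU| over 8-bit inputs: {max_err:.4f}")
```

Because a is negative, scale_erf and scale_out are negative as well; the integer values carry the opposite sign, so each product comes out right.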
------
That completes the I-GELU implementation. Its effect is shown below:

Existing solutions for Softmax
- Approximation with higher-order polynomials, which limits the scenarios where it can be used.
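Quantization scheme for Softmax
For numerical stability, the authors first rewrite Softmax by subtracting the maximum:
$$\mathrm{Softmax}(\mathbf{x})_i := \frac{\exp(x_i)}{\sum_{j=1}^{k}\exp(x_j)} = \frac{\exp(x_i - x_{\max})}{\sum_{j=1}^{k}\exp(x_j - x_{\max})}, \quad x_{\max} = \max_i(x_i)$$
Note that $\tilde{x}_i = x_i - x_{\max}$ is always non-positive.
Any non-positive real number $\tilde{x}$ can be decomposed as
$$\tilde{x} = (-\ln 2)\,z + p$$
where z (the quotient) is a non-negative integer and p (the remainder) lies in $(-\ln 2, 0]$.
Then
$$\exp(\tilde{x}) = 2^{-z}\exp(p) = \exp(p) \gg z$$
Here >> denotes a right shift.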
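The decomposition is easy to check numerically. In the sketch below, `decompose` and `exp_via_shift` are names made up for this illustration; the factor 2^(-z) is applied in floating point here and corresponds to a right shift by z once the values are integers.

```python
import math

LN2 = math.log(2.0)


def decompose(x_tilde):
    """Split a non-positive x into (z, p) with x = -ln2 * z + p and p in (-ln2, 0]."""
    z = math.floor(-x_tilde / LN2)   # quotient, a non-negative integer
    p = x_tilde + z * LN2            # remainder
    return z, p


def exp_via_shift(x_tilde):
    """exp(x) = 2^(-z) * exp(p); the 2^(-z) factor is a right shift on integers."""
    z, p = decompose(x_tilde)
    return math.exp(p) * 2.0 ** (-z)


for x in [0.0, -0.3, -1.0, -5.0, -10.7]:
    z, p = decompose(x)
    print(x, z, round(p, 4), exp_via_shift(x), math.exp(x))

# On integers, multiplying by 2^(-z) is a right shift:
print(1000 >> 3, 1000 // 2 ** 3)
```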
Further, if $\exp(p)$ can be computed with integers, then so can $\exp(\tilde{x})$ for every $\tilde{x}$, and hence Softmax.
The range of p, $(-\ln 2, 0]$, is much smaller than that of x or $\tilde{x}$, so $\exp(p)$ can be approximated more accurately.
Recall that for GELU the authors approximate the nonlinear function with a second-order polynomial. The same can be done here.
The authors find the second-order polynomial by fitting it to $\exp(p)$ over the range $p \in (-\ln 2, 0]$. The result is
$$L(p) = 0.3585\,(p + 1.353)^2 + 0.344 \approx \exp(p)$$
so that
$$\text{i-exp}(\tilde{x}) := L(p) \gg z$$
where $z = \lfloor -\tilde{x} / \ln 2 \rfloor$ and $p = \tilde{x} + z \ln 2$.
The right panel of Figure 2 shows that this approximation works well.
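The quality of the fit can be checked in floating point before quantizing anything. The coefficients 0.3585, 1.353 and 0.344 are the ones reported in the I-BERT paper; the function names and the test grid over [-10, 0] are assumptions for this sketch.

```python
import math

LN2 = math.log(2.0)


def exp_poly(p):
    """Second-order fit of exp(p) on (-ln2, 0]."""
    return 0.3585 * (p + 1.353) ** 2 + 0.344


def i_exp_float(x_tilde):
    """i-exp(x) = L(p) >> z, written with a float multiply by 2^(-z)."""
    z = math.floor(-x_tilde / LN2)
    p = x_tilde + z * LN2
    return exp_poly(p) * 2.0 ** (-z)


# Non-positive inputs from -10 to 0 in steps of 0.01.
grid = [-i / 100.0 for i in range(0, 1001)]
max_err = max(abs(math.exp(x) - i_exp_float(x)) for x in grid)
print(f"max |exp - i-exp| on [-10, 0]: {max_err:.5f}")
```

The largest error occurs for z = 0, since every further halving also halves the absolute error of the polynomial.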
The integer polynomial evaluation I-POLY was introduced above, so the complete integer computation of Softmax is:

The basic idea is much the same as for I-GELU.
#TODO#: the last step seems to have a problem...
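A sketch of the whole integer Softmax pipeline, reconstructed from the paper's description, is given below. The final normalization is the part I am least sure about: dividing q_exp by its sum does not yield integers, so this sketch scales the numerator by 2^out_bits and uses integer division. That fixed-point step, the `out_bits` parameter, and the small input scale of 0.001 are my own assumptions, not the authors' method.

```python
import math


def i_poly(q, scale, a, b, c):
    """Integer-only a * (x + b)^2 + c; returns (q_out, scale_out)."""
    q_b = math.floor(b / scale)
    q_c = math.floor(c / (a * scale * scale))
    return (q + q_b) ** 2 + q_c, a * scale * scale


def i_exp(q, scale):
    """Integer-only exp for a non-positive quantized input q (requires scale < ln 2)."""
    a, b, c = 0.3585, 1.353, 0.344
    q_ln2 = math.floor(math.log(2.0) / scale)   # ln 2 in units of scale
    z = (-q) // q_ln2                           # non-negative integer quotient
    q_p = q + z * q_ln2                         # remainder in (-q_ln2, 0]
    q_l, scale_l = i_poly(q_p, scale, a, b, c)
    return q_l >> z, scale_l


def i_softmax(q_values, scale, out_bits=8):
    """Integer Softmax; returns (q_out, scale_out) with scale_out = 2^(-out_bits)."""
    q_max = max(q_values)
    q_exp = [i_exp(q - q_max, scale)[0] for q in q_values]
    total = sum(q_exp)
    # Assumption: fixed-point normalization so the division stays in integers.
    q_out = [(e << out_bits) // total for e in q_exp]
    return q_out, 1.0 / (1 << out_bits)


scale = 0.001
q_in = [1000, 2000, 3000, 500]   # represents x = [1.0, 2.0, 3.0, 0.5]
q_out, scale_out = i_softmax(q_in, scale)
probs = [q * scale_out for q in q_out]

xs = [q * scale for q in q_in]
denom = sum(math.exp(x - max(xs)) for x in xs)
reference = [math.exp(x - max(xs)) / denom for x in xs]
print(probs)
print(reference)
```

The scale of q_exp cancels in the ratio, so only the integers are needed for the normalization. A coarse input scale makes floor(ln 2 / S) a poor stand-in for ln 2, which is why the example uses a small one.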
Quantization scheme for LayerNorm
- To be continued
Analysis of the I-BERT implementation
- To be covered in another post
Summary
- This post introduced the improvements made by I-BERT and the integer-only implementations of GELU and Softmax.
- The main idea is to approximate each nonlinear function with a second-order polynomial, and then evaluate that polynomial with integer arithmetic.