Possible problems with long jumps in GaussDB
2022-06-12 16:35:00 【Hua Weiyun】
Problem description: in GaussDB coding practice, we found code that works correctly in the debug build (compiled without optimization) but misbehaves in the release build: some variables still hold their old values after being assigned. This article gives a brief analysis from two angles, the assembly view and the memory view.
What is a long jump?
In C, the goto statement implements short-range jumps within a function (local jumps), while the setjmp() and longjmp() functions implement long-range jumps across functions (nonlocal jumps, also called far jumps).
The two function signatures involved are:

    int setjmp(jmp_buf env);
    void longjmp(jmp_buf env, int value);

The usual understanding is that setjmp saves the context at the point where it is called into the jmp_buf, mainly the current stack position and the register state, and that longjmp jumps back to the context (snapshot) saved in the env buffer. Some have also suggested using the pair to implement coroutines.

I find the following description more convincing:
The setjmp() function saves the contents of most of the general purpose registers, in the same way as they would be saved on any function entry. It also saves the stack pointer and the return address. All these are placed in the buffer. It then arranges for the function to return zero.
Compiler optimization problems
The debug and release builds produce different results, and the main difference between them is the optimization the compiler performs during the build. One common compiler optimization is caching variables that live in memory in registers.
Because accessing a register is much faster than accessing memory, the compiler may load a variable into a register once and then take its value directly from the register on later uses. In some situations, however, this causes stale (dirty) data to be read, which seriously affects how the program behaves.
Solution: the C/C++ volatile keyword
The dictionary defines "volatile" as unstable or changeable. My understanding is that every assignment to a volatile variable must be written to memory instead of being kept only in a register. This avoids the case where a value that was never written back to memory is lost across a long jump or function jump, so that the assignment appears not to have happened (the variable still holds the old value). Such values end up only in registers because the compiler optimizes variables that are used several times, to avoid reading them from memory repeatedly.
Reproducing the problem
Example without optimization (debug build)
    #include <stdio.h>
    #include <stdlib.h>
    #include <setjmp.h>

    static jmp_buf env;

    static void
    doJump(int nvar, int rvar, int vvar)
    {
        printf("Inside doJump(): nvar=%d, rvar=%d, vvar=%d\n", nvar, rvar, vvar);
        // Dead code block
        int nvar0 = nvar;
        int rvar0 = rvar;
        int vvar0 = vvar;
        longjmp(env, 1);
    }

    int main(int argc, char** argv)
    {
        int nvar;
        register int rvar;
        volatile int vvar;

        nvar = 111;
        rvar = 222;
        vvar = 333;

        if (setjmp(env) == 0) {
            // Reassign all three variables, then jump back
            nvar = 777;
            rvar = 888;
            vvar = 999;
            doJump(nvar, rvar, vvar);
        } else {
            int nvar1 = nvar;
            int rvar1 = rvar;
            int vvar1 = vvar;
            printf("After longjmp(): nvar =%d, rvar=%d, vvar=%d\n", nvar, rvar, vvar);
        }
        exit(EXIT_SUCCESS);
    }

Program run results
Compile the program with gcc without any optimization and run the resulting binary. The output is as follows:

The register variable rvar does not reflect the later assignment and still holds the old value 222, which is not what we expected, while the ordinary int variable and the volatile variable hold the correct values. In other words, an assignment to a register variable made between setjmp() and longjmp() is easily lost after the long jump.
Assembly perspective
As the following figure shows, during the assignment rvar is held directly in the ESI register, and the 222 previously stored in memory is not overwritten. That is, 888 is written only to the register while memory still holds 222. The other values, 777 and 999, are written to memory.
When the custom function doJump is called, all three values are placed in registers and passed by value.

The figure below shows that when the jump returns, the real value of rvar (888, held in the register) has been lost: the register was overwritten with the value saved in the jump buffer. When the variable is printed afterwards, the old value is read from memory.

Memory perspective


The figure above shows the state after the assignments of 777, 888, and 999. The 888 was assigned to a register (as the assembly shows), and the 222 in memory was not overwritten.
Finally, after the jump returns, the values are read from memory and come out as 777, 222, and 999, which is not what the program intended. The following figure shows the values at the memory addresses; 222 is at address -0x28 + 0x7fffffffe160.

Example with -O2 optimization (release build)
Program run results
Compile with the -O2 optimization option and run the program. This time the values of both nvar and rvar are wrong: they are not the expected 777 and 888 but the old values, unchanged.
This is caused by compiler optimization: the values assigned to nvar and rvar before the jump are kept in registers, and after the jump the registers are overwritten, so the new values are lost. The value of vvar is stored in memory, so after the jump it can still be read through the pointer register.

Next, we trace the program's execution to analyze this result.
Assembly perspective

Running objdump -d volatile_og shows the disassembly of the compiled binary. We focus on the main function, which starts at 10c0. In the figure above it is divided into 3 blocks around the check of whether setjmp(env) returns 0, which makes it easier to read.
The assembly contains no call to doJump (doJump does not appear after any callq instruction), so presumably the compiler inlined the function. The initializations of nvar0, rvar0, and vvar0 inside that function are dead code and were also removed during optimization.
The following figure shows that only vvar, declared with the volatile keyword, has a value that can be found in stack memory; the other variables are not lvalues.

Memory perspective
By looking at the values in memory before and after the jump, we can see exactly what happens during the jump:
The first figure below shows the state before the jump: the values are in registers, and only 333 is in memory. The second figure shows that rvar and nvar cannot be accessed through a memory address.


After the jump, the value at memory address e15c has changed to 999.
After the jump, the stack memory looks as shown in the figure below:
As the figure below shows, only vvar supports the address-of operation at this point.
Appendix
References
- Linux assembly language development guide: Intel format and AT&T format
- Using setjmp and longjmp in C to implement exception handling and coroutines
Specific optimization options that may be involved
- -fforce-mem: forces memory operands to be copied into registers before arithmetic is performed on them. This makes all memory references potential common subexpressions, which produces more efficient code; when they are not common subexpressions, instruction combination eliminates the separate register load. For a variable involved in only a single instruction the benefit may be small, but for variables used by many instructions (those that need arithmetic) it is a significant optimization, because the processor accesses a value in a register much faster than a value in memory.
- -fregmove: the compiler tries to reassign the register numbers used in move instructions and as operands of other simple instructions, in order to maximize the amount of register tying. This optimization is especially helpful on machines with two-operand instructions.
- -fschedule-insns: the compiler tries to reorder instructions to eliminate stalls caused by waiting for data that is not yet available. This helps machines with slow floating-point or memory load instructions, because other instructions are allowed to execute until the result of the load or floating-point instruction is needed.
-fschedule-insns allows other instructions to complete first while data is still being loaded;
-fforce-mem may cause inconsistencies, similar to dirty data, between memory and registers. Logic that depends on the order of memory operations must be handled strictly before these optimizations are applied.
For example, use the volatile keyword to restrict how variables are accessed, or use a barrier to force the CPU to execute strictly in instruction order.
Memory barriers
The root cause of cache coherence problems is that multiple processors each have their own private cache, not the mere existence of multiple processors. Several conditions must hold together: multiple cores, private caches, and the cache write policy.
If any one of these conditions is not met, there is no cache coherence problem.
Regarding the CPU's multi-level caches and the consistency of memory reads and writes:
To speed up instruction execution, CPUs add two buffers: the store buffer and the invalidate queue.
Store buffer:
Benefit: when CPU0 and CPU1 read and write the same data, a store does not have to wait for data from the other CPU's cache, which increases speed.
Disadvantage (problem description): CPU0 has modified a value, but its "read invalidate" message arrives later than CPU1's actual read. CPU1 is one step too early and reads wrong data.
Ways to resolve the conflict:
- Hardware: store forwarding. If the local store buffer holds the data, reads are served from the store buffer first.
- Software: hardware designers provide memory barrier instructions so that software can tell the CPU about this kind of ordering relationship.
Invalidate queue:
The store buffer is usually very small, so it fills up after the CPU performs only a few store operations. The CPU must then wait for invalidation ACK messages to free space in the store buffer (after receiving an invalidation ACK, the data in the store buffer is written to the cache and then removed from the store buffer).
Benefit: CPU1 may be under heavy load, and processing a large number of invalidations immediately would add to that load. Queuing them speeds things up.
Disadvantage (problem description): a value may already be invalid, but the queue has not yet processed the invalidation (late again).
Solution: adding a memory barrier also solves this.