
Possible problems with long jumps in GaussDB

2022-06-12 16:35:00 Hua Weiyun


Problem description: while coding on GaussDB, we found that a debug build compiled without optimization behaved correctly, whereas in the release build some variables did not keep a newly assigned value and still held the old one. This article gives a brief analysis from two angles.

What is a long jump ?

In C, the goto statement implements short-range jumps within a function (local jumps), while the setjmp() and longjmp() functions implement jumps across functions (nonlocal jumps, also called far jumps).

Two function signatures are involved:

int setjmp(jmp_buf env);
void longjmp(jmp_buf env, int value);

A common way to understand them: setjmp() saves the execution context at the point of the call into a jmp_buf, mainly the current stack position and the register state. longjmp() jumps back into the context (snapshot) saved in its env argument. Some also point out that the details depend on the implementation.
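The control flow described above can be sketched as follows. This is a minimal illustration, not GaussDB code: the names recovery_point, failure_code, fail_with and run_guarded are hypothetical. Note that setjmp() is used directly as the controlling expression of the if, which is one of the few contexts the C standard allows for it.

```c
#include <setjmp.h>

static jmp_buf recovery_point;   /* hypothetical name: the saved context */
static int failure_code;         /* hypothetical: carries the reason across the jump */

/* Can be called from arbitrarily deep in the call chain; never returns. */
static void fail_with(int code)
{
    failure_code = code;
    longjmp(recovery_point, 1);  /* the setjmp() below now returns 1 */
}

/* Returns 0 on the normal path, or the failure code after a long jump. */
int run_guarded(int should_fail)
{
    if (setjmp(recovery_point) == 0) {
        /* First return from setjmp(): the context has just been saved. */
        if (should_fail)
            fail_with(42);
        return 0;
    }
    /* Second return from setjmp(): control arrived here via longjmp(). */
    return failure_code;
}
```

Calling run_guarded(0) takes only the first path, while run_guarded(1) leaves through longjmp() and comes back out of the same setjmp() call a second time.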

Changes to the data

I find the following description more convincing:
The setjmp() function saves the contents of most of the general purpose registers, in the same way as they would be saved on any function entry. It also saves the stack pointer and the return address. All these are placed in the buffer. It then arranges for the function to return zero.

Compiler optimization problems

The debug and release builds produce different results, and the main difference between them is the optimization the compiler performs during the build. One common optimization is to cache variables from memory in registers.

Accessing a register is much faster than accessing memory, so an optimizing compiler will sometimes load a variable into a register once and take its value directly from the register on later reads. In some situations this causes stale data to be read, which can seriously affect the program's behaviour.

Solution: the C/C++ volatile keyword

volatile: the dictionary gives "volatile; changeable". My understanding is that every assignment to such a variable must be written to memory rather than kept only in a register. This avoids a failed assignment (the variable still showing the old value) when a jump or function call happens before the value has reached memory, or when compiler optimization keeps the value in a register because it is used several times and repeated memory reads are to be avoided. The C standard backs this up: after longjmp(), non-volatile automatic variables of the function that called setjmp() have indeterminate values if they were modified between the setjmp() and longjmp() calls.
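As a minimal sketch of this rule (the function names jump_back and volatile_survives_longjmp are made up for illustration), a volatile local that is modified between setjmp() and longjmp() is guaranteed to hold the new value afterwards, regardless of the optimization level:

```c
#include <setjmp.h>

static jmp_buf env;

static void jump_back(void)
{
    longjmp(env, 1);
}

/* Returns the value of a volatile local as observed after longjmp(). */
int volatile_survives_longjmp(void)
{
    volatile int vvar = 333;

    if (setjmp(env) == 0) {
        vvar = 999;      /* volatile: this store must go to memory */
        jump_back();     /* does not return; setjmp() returns 1 instead */
    }
    /* Reached only after the long jump; vvar is read back from memory. */
    return vvar;
}
```

Without the volatile qualifier the returned value would be indeterminate according to the C standard, which is why a debug build and a release build may disagree.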

Reproducing the problem

Example without optimization (debug build)

#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

static jmp_buf env;

static void
doJump(int nvar, int rvar, int vvar)
{
    printf("Inside doJump(): nvar=%d, rvar=%d, vvar=%d\n",
           nvar, rvar, vvar);
    // Dead code block
    int nvar0 = nvar;
    int rvar0 = rvar;
    int vvar0 = vvar;
    longjmp(env, 1);
}

int main(int argc, char** argv)
{
    int nvar;
    register int rvar;
    volatile int vvar;

    nvar = 111;
    rvar = 222;
    vvar = 333;

    if (setjmp(env) == 0)
    {
        nvar = 777;
        rvar = 888;
        vvar = 999;
        doJump(nvar, rvar, vvar);
    }
    else
    {
        int nvar1 = nvar;
        int rvar1 = rvar;
        int vvar1 = vvar;
        printf("After longjmp(): nvar =%d, rvar=%d, vvar=%d\n", nvar, rvar, vvar);
    }
    exit(EXIT_SUCCESS);
}

Program run results

Compile the program with gcc without any optimization and run the resulting binary. It produces the following output:

[Figure 527441915224889.png: program output]

As the output shows, the register variable rvar does not reflect the later assignment: it still holds the old value 222, contrary to expectations, while the plain int and the volatile int hold the correct values. After a long jump, an assignment made to a register variable between setjmp() and longjmp() is easily lost.

Assembly perspective

As the figure below shows, the assignment places rvar directly in the ESI register without overwriting the 222 previously stored in memory. In other words, 888 is assigned to a register while memory still holds 222; the other two values, 777 and 999, are written to memory.
On entry to the user-defined function doJump(), all three variables are placed in registers so that they can be passed by value.

230362014246524.png

The figure below shows the state on return from the jump: the true value of rvar (888, held in a register) has been lost, because the registers were overwritten with the values cached in the jump buffer. When the variable is printed afterwards, the old value is read from memory.

188765214244036.png

Memory perspective

image.png

image.png
The figure above shows the state after the assignments of 777, 888 and 999. As the assembly showed, 888 was assigned to a register, and here we can see that the 222 in memory has not been overwritten.

Finally, after returning through the jump, the values are read back, this time from memory. The program reads 777, 222, 999, which is not what was expected. The figure below shows the values at their memory addresses; 222 is at address -0x28 + 0x7fffffffe160.

image.png

Example with -O2 optimization (release build)

Program run results

Compile with the -O2 optimization level and run the program again. This time the values of both nvar and rvar are wrong: they do not hold the expected 777 and 888 but the old values, unchanged.
This is a consequence of compiler optimization: the values written to nvar and rvar before the jump are kept in registers, and after the jump the registers are overwritten, which causes the problem. The value of vvar is written to memory, so after the jump it can still be reached through the stack pointer.
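Besides volatile, the C standard's rule only affects automatic (stack) variables of the function that calls setjmp(), so state kept in an object with static storage duration is not subject to it. The sketch below assumes the value can live outside the stack frame; saved_value, do_jump and static_survives_longjmp are hypothetical names, and this is an illustration rather than the fix used in GaussDB.

```c
#include <setjmp.h>

static jmp_buf env;
static int saved_value;   /* static storage duration: not part of any stack frame */

static void do_jump(void)
{
    longjmp(env, 1);
}

/* Returns the value of a static object as observed after longjmp(). */
int static_survives_longjmp(void)
{
    saved_value = 222;

    if (setjmp(env) == 0) {
        saved_value = 888;   /* modified between setjmp() and longjmp() */
        do_jump();           /* does not return */
    }
    /* Objects that are not automatic locals keep the value they had
       at the time longjmp() was called. */
    return saved_value;
}
```

Whether this or volatile is the better choice depends on the code: static storage is shared by every call of the function, so it is not suitable for reentrant or multi-threaded code without further care.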

image.png

Next, we step through the program's execution and analyze the results.

Assembly perspective

image.png

Running objdump -d volatile_og shows the disassembly of the compiled file. We focus on the main function, which starts at 10c0. Using the check of whether setjmp(env) returns 0 as the boundary (see the figure above), it can be divided into 3 blocks for easier reading.
The assembly contains no call to doJump (doJump does not appear after any callq instruction), so presumably the compiler inlined the function. The initializations of nvar0, rvar0 and vvar0 in that function are dead code and were also removed during optimization.

As the figure below illustrates, only vvar, which is declared with the volatile keyword, has a value that can be found in stack memory; the other variables are not lvalues.

image.png

Memory perspective

By comparing the values in memory before and after the jump, we can see exactly what happens during the jump:

The first figure below shows the state before the jump: the values are in registers, and only 333 is in memory. The second figure confirms that rvar and nvar cannot be reached through a memory address.

Values before the jump: 333 is stored at 0xC(%rsp), with %rsp = 0x7fffffffe150.

Only vvar can have its address taken (it is an lvalue); the rest are held in registers.

After the jump, the value at memory address e15c has changed to 999.
The stack memory after the jump is shown in the figure below:
image.png
As the figure below shows, at this point only vvar can have its address taken.
image.png

Appendix

Reference material

Optimization options that may be involved

  • -fforce-mem: force memory operands to be copied into registers before arithmetic is performed on them. This makes every memory reference a potential common subexpression and so produces more efficient code; when they are not common subexpressions, instruction combination eliminates the separate register load. For a variable involved in only a single instruction the gain may be small, but for variables used by many instructions (for example in mathematical operations) it is significant, because the processor accesses values in registers much faster than values in memory.
  • -fregmove: the compiler attempts to reassign register numbers in move instructions and as operands of other simple instructions, in order to maximize the amount of register tying. This is especially helpful on machines with two-operand instructions.
  • -fschedule-insns: the compiler attempts to reorder instructions to eliminate stalls caused by waiting for data that is not yet available. This helps machines with slow floating-point or memory-load instructions, because other instructions can be issued until the result of the load or floating-point instruction is required.

-fschedule-insns allows other instructions to complete while data is still being loaded;
-fforce-mem may lead to inconsistencies between memory and registers that look like dirty data. Logic that depends on the order of memory operations must be handled strictly before optimization is enabled,
for example by using the volatile keyword to restrict how a variable is accessed, or by using a barrier to force the CPU to execute strictly in instruction order.

Memory barriers

The root cause of cache coherence problems is that each of several processors has its own private cache, not merely that there are several processors. Three conditions must hold together: multiple cores, private caches, and a cache write policy.
If any one of these conditions is not met, there is no cache coherence problem.

Regarding the CPU's multi-level caches and the consistency of reads and writes:
To speed up instruction execution, CPUs add two buffers: the store buffer and the invalidate queue.
image.png

Store buffer
Benefit: when CPU0 and CPU1 read and write the same data, a store does not have to wait for a response from the other CPU's cache, which increases speed.
Disadvantage (the problem): CPU0 has modified a value, but its "read invalidate" message takes effect later than CPU1's actual read, so CPU1 is one step late and reads wrong data.
Ways to resolve the conflict:

  • Hardware: store forwarding. If the local store buffer holds the data, the CPU reads it from the store buffer first.
  • Software: hardware designers provide memory barrier instructions so that software can tell the CPU about this kind of ordering relationship.
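The classic store-buffer situation can be sketched with C11 atomics. This is a minimal sketch under stated assumptions: x, y, cpu0_body and cpu1_body are hypothetical names, the two functions are meant to run concurrently on two threads, and thread creation is left to the caller. With the sequentially consistent fence in both functions, at least one of them must return 1; without the fences, both may return 0, because each store can still be sitting in its CPU's store buffer when the other CPU performs its load.

```c
#include <stdatomic.h>

static atomic_int x, y;   /* hypothetical shared variables, both start at 0 */

/* Body for the thread on CPU0: store x, then load y. */
int cpu0_body(void)
{
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    /* Full barrier: keeps the store above from being reordered
       after the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&y, memory_order_relaxed);
}

/* Body for the thread on CPU1: store y, then load x. */
int cpu1_body(void)
{
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&x, memory_order_relaxed);
}
```

On x86-64 a compiler typically implements this fence with an mfence or a locked instruction; the exact instruction is an implementation detail and differs between architectures.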

Invalidate queue
The store buffer is usually very small, so a few store operations are enough to fill it. The CPU must then wait for invalidation ACK messages (once an ACK arrives, the data in the store buffer is written to the cache and removed from the store buffer) before store buffer space is released. The invalidate queue lets a CPU acknowledge an invalidate message immediately and process it later.
Benefit: CPU1 may be under heavy load, and processing a large number of invalidate messages on the spot would add to that load; queueing them speeds things up.
Disadvantage (the problem): a value may already have been invalidated while the queue has not yet processed that entry, so the CPU is late again and reads a stale value.
Solution: adding a barrier solves this as well.
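The pairing of a write-side and a read-side barrier can be sketched with C11 fences. The names payload, ready, writer and reader are hypothetical, and the mapping to hardware is only approximate: the release fence plays the role of a write barrier that orders the stores leaving the store buffer, and the acquire fence plays the role of a read barrier that makes the CPU process pending invalidations before the following load.

```c
#include <stdatomic.h>

static int payload;        /* hypothetical data, ordinary memory */
static atomic_int ready;   /* hypothetical flag, starts at 0 */

/* Runs on the writing CPU: publish the payload, then raise the flag. */
void writer(int value)
{
    payload = value;
    /* Write-side barrier: the payload store is ordered before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Runs on the reading CPU. Returns 1 and fills *out if the payload has
   been published, otherwise returns 0 and leaves *out untouched. */
int reader(int *out)
{
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        return 0;
    /* Read-side barrier: the payload load is ordered after the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

If either fence is removed, a reader that sees ready == 1 is no longer guaranteed to see the new payload, which is the "late again" case described above.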
