MIPS general purpose register + instruction

2022-06-12 17:57:00 _ kerneler


MIPS general-purpose registers

MIPS has 32 general-purpose registers ($0-$31). The table below lists each register's alias and its conventional use in assembly code.

REGISTER    NAME       USAGE
$0          $zero      Constant value 0
$1          $at        Reserved for the assembler
$2-$3       $v0-$v1    Function return values (results and expression evaluation)
$4-$7       $a0-$a3    Function call arguments
$8-$15      $t0-$t7    Temporaries (not preserved across calls)
$16-$23     $s0-$s7    Saved (the callee must save and restore them if it uses them)
$24-$25     $t8-$t9    Temporaries (not preserved across calls)
$26-$27     $k0-$k1    Reserved for the operating system / exception handling
$28         $gp        Global pointer
$29         $sp        Stack pointer
$30         $fp        Frame pointer
$31         $ra        Return address



Each register in more detail:
$0: $zero. This register always reads as zero, which gives the useful constant 0 a compact encoding. For example,
           move $t0,$t1
        is actually assembled as
           add $t0,$0,$t1
        Pseudo-instructions like this simplify the programmer's job: the assembler offers a richer instruction set than the hardware.
$1: $at. This register is reserved for the assembler. Because the immediate field of an I-type instruction is only 16 bits wide,
        the compiler or assembler has to split a large constant apart and reassemble it in a register. Loading a 32-bit
        immediate, for example, takes two instructions: lui (load upper immediate) followed by a second instruction such as
        ori or addi. Since the MIPS assembler does this splitting and reassembly, it needs a temporary register in which to
        build the constant, which is one of the reasons $at is reserved for it.
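The splitting of a large constant can be modelled in a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the low half is merged with ori (which zero-extends its immediate); a real assembler may choose a different sequence.

```python
def split_constant(value):
    """Split a 32-bit constant into the two 16-bit immediates for lui and ori."""
    value &= 0xFFFFFFFF
    upper = (value >> 16) & 0xFFFF   # goes into the lui immediate
    lower = value & 0xFFFF           # goes into the ori immediate
    return upper, lower


def expand_li(reg, value):
    """Hypothetical expansion of 'li reg, value' that builds the constant via $at."""
    upper, lower = split_constant(value)
    return [f"lui $at, 0x{upper:04x}", f"ori {reg}, $at, 0x{lower:04x}"]


def rebuild(upper, lower):
    """What the two instructions compute: (upper << 16) | lower."""
    return ((upper << 16) | lower) & 0xFFFFFFFF
```

For instance, expand_li("$t0", 0x12345678) yields a lui with 0x1234 followed by an ori with 0x5678.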
$2..$3:($v0-$v1) Hold the non-floating-point results or return values of a subroutine. MIPS has a set of conventions for how
        arguments are passed to subroutines and how results are returned: the contents of a few stack locations are loaded
        into CPU registers, and the corresponding memory locations are left undefined. When these two registers are not
        enough to hold the return value, the compiler returns it through memory.
$4..$7:($a0-$a3) Pass the first four arguments to a subroutine; any further arguments go on the stack. $a0-$a3, $v0-$v1 and $ra
        together support subroutine/procedure calls: they pass arguments, return results and hold the return address.
        When more registers are needed, the stack is used. MIPS compilers always reserve stack space for the
        arguments in case they need to be stored.
$8..$15:($t0-$t7) Temporary registers. A subroutine may use them without saving them.
$16..$23:($s0-$s7) Saved registers. Their values must be preserved across a procedure call (the callee saves and restores them;
        the same applies to $fp and $ra). MIPS provides both temporary and saved registers, which reduces register spilling
        (moving infrequently used variables to memory). When compiling a leaf procedure (one that does not call another
        procedure), the compiler uses saved registers only after the temporary registers have been allocated.
$24..$25:($t8-$t9) Same as $t0-$t7.
$26..$27:($k0,$k1) Reserved for the operating system / exception handling; at least one must be reserved. An exception (or interrupt)
        is a procedure that is invoked without an explicit call. MIPS has a register called the exception program counter
        (EPC), which belongs to CP0 and holds the address of the instruction that caused the exception. The only way to
        inspect a control register is to copy it into a general register: the mfc0 (move from system control) instruction
        copies the address in EPC into a general register, and a jump (jr) then lets the program return to the instruction
        that caused the exception and continue. MIPS programmers must leave $k0 and $k1 to the operating system.
        When an exception occurs, the values of these two registers are not restored, and compilers never use $k0 or $k1.
        The exception handler can put the return address in either of them and then use jr to jump back to the
        instruction that caused the exception.
$28:($gp) To simplify access to static data, MIPS software reserves a register: the global pointer ($gp). It points to an
        address in the static data area that is determined at run time. Data within 32KB either side of the value of $gp
        can be accessed with a single instruction that uses $gp as the base pointer. At compile time, such data must
        lie within the 64KB range based on $gp.
$29:($sp) MIPS hardware does not support a stack directly, so in principle you could use this register for other purposes. But to
        use other people's programs, or to let others use yours, you still have to follow this convention, even though
        it has nothing to do with the hardware.
$30:($fp) The GNU MIPS C compiler uses a frame pointer, whereas SGI's C compiler does not and instead treats this register
        as a saved register ($s8). This saves call and return overhead, but makes code generation more complex.
$31:($ra) Holds the return address. MIPS has a jal (jump-and-link) instruction that jumps to an address and stores the
        address of the next instruction in $ra. This supports subroutines: for example, the caller puts the arguments in
        $a0~$a3 and then uses jal X to jump to procedure X; the callee puts its results in $v0,$v1 and returns
        with jr $ra.


MIPS Instructions

 

Instruction  Example            Function
LB           LB R1, 0(R2)       Load a byte from memory into a register
LH           LH R1, 0(R2)       Load a halfword from memory into a register
LW           LW R1, 0(R2)       Load a word from memory into a register
LD           LD R1, 0(R2)       Load a doubleword from memory into a register
L.S          L.S F1, 0(R2)      Load a single-precision floating-point number from memory into a register
L.D          L.D F1, 0(R2)      Load a double-precision floating-point number from memory into a register
LBU          LBU R1, 0(R2)      Same as LB, but the loaded data is unsigned
LHU          LHU R1, 0(R2)      Same as LH, but the loaded data is unsigned
LWU          LWU R1, 0(R2)      Same as LW, but the loaded data is unsigned
SB           SB R1, 0(R2)       Store a byte from a register into memory
SH           SH R1, 0(R2)       Store a halfword from a register into memory
SW           SW R1, 0(R2)       Store a word from a register into memory
SD           SD R1, 0(R2)       Store a doubleword from a register into memory
S.S          S.S F1, 0(R2)      Store a single-precision floating-point number from a register into memory
S.D          S.D F1, 0(R2)      Store a double-precision floating-point number from a register into memory
DADD         DADD R1,R2,R3      Add the contents of two integer (fixed-point) registers
DADDI        DADDI R1,R2,#3     Add an immediate to the contents of a register
DADDU        DADDU R1,R2,R3     Unsigned addition
DADDIU       DADDIU R1,R2,#3    Unsigned addition of an immediate to the contents of a register
ADD.S        ADD.S F0,F1,F2     Add two single-precision floating-point numbers; the result is single precision
ADD.D        ADD.D F0,F1,F2     Add two double-precision floating-point numbers; the result is double precision
ADD.PS       ADD.PS F0,F1,F2    Add two pairs of single-precision floating-point numbers (paired single); the results are single precision
DSUB         DSUB R1,R2,R3      Subtract the contents of two integer (fixed-point) registers
DSUBU        DSUBU R1,R2,R3     Unsigned subtraction
SUB.S        SUB.S F1,F2,F3     Subtract two single-precision floating-point numbers; the result is single precision
SUB.D        SUB.D F1,F2,F3     Subtract two double-precision floating-point numbers; the result is double precision
SUB.PS       SUB.PS F1,F2,F3    Subtract two pairs of single-precision floating-point numbers (paired single)
DDIV         DDIV R1,R2,R3      Divide the contents of two integer (fixed-point) registers
DDIVU        DDIVU R1,R2,R3     Unsigned division
DIV.S        DIV.S F1,F2,F3     Divide two single-precision floating-point numbers; the result is single precision
DIV.D        DIV.D F1,F2,F3     Divide two double-precision floating-point numbers; the result is double precision
DIV.PS       DIV.PS F1,F2,F3    Divide two pairs of single-precision floating-point numbers (paired single); the results are single precision
DMUL         DMUL R1,R2,R3      Multiply the contents of two integer (fixed-point) registers
DMULU        DMULU R1,R2,R3     Unsigned multiplication
MUL.S        MUL.S F1,F2,F3     Multiply two single-precision floating-point numbers; the result is single precision
MUL.D        MUL.D F1,F2,F3     Multiply two double-precision floating-point numbers; the result is double precision
MUL.PS       MUL.PS F1,F2,F3    Multiply two pairs of single-precision floating-point numbers (paired single); the results are single precision
AND          AND R1,R2,R3       Bitwise AND of the contents of two registers
ANDI         ANDI R1,R2,#3      Bitwise AND of the contents of a register with an immediate
OR           OR R1,R2,R3        Bitwise OR of the contents of two registers
ORI          ORI R1,R2,#3       Bitwise OR of the contents of a register with an immediate
XOR          XOR R1,R2,R3       Bitwise exclusive OR of the contents of two registers
XORI         XORI R1,R2,#3      Bitwise exclusive OR of the contents of a register with an immediate
BEQZ         BEQZ R1,0          Conditional branch; taken when the register contains 0
BNEZ         BNEZ R1,0          Conditional branch; taken when the register does not contain 0
BEQ          BEQ R1,R2,name     Conditional branch; taken when the contents of the two registers are equal
BNE          BNE R1,R2,name     Conditional branch; taken when the contents of the two registers are not equal
J            J name             Direct jump; the target address is encoded in the instruction
JR           JR R1              Register jump; the target address is in the register
JAL          JAL name           Direct jump and link; the target address is encoded in the instruction, and the return address is stored in R31
JALR         JALR R1            Register jump and link; the target address is in the register, and the return address is stored in R31
MOV.S        MOV.S F0,F1        Copy a single-precision floating-point number from one floating-point register to another
MOV.D        MOV.D F0,F1        Copy a double-precision floating-point number from one floating-point register to another
MFC0         MFC0 R1,R2         Copy a value from a special (coprocessor 0) register to a general register
MTC0         MTC0 R1,R2         Copy a value from a general register to a special (coprocessor 0) register
MFC1         MFC1 R1,F1         Copy a value from a floating-point register to an integer (fixed-point) register
MTC1         MTC1 R1,F1         Copy a value from an integer (fixed-point) register to a floating-point register
LUI          LUI R1,#42         Load a 16-bit immediate into the upper 16 bits of a register and clear the lower 16 bits
DSLL         DSLL R1,R2,#2      Doubleword logical shift left
DSRL         DSRL R1,R2,#2      Doubleword logical shift right
DSRA         DSRA R1,R2,#2      Doubleword arithmetic shift right
DSLLV        DSLLV R1,R2,R3     Doubleword logical shift left by a variable amount (taken from a register)
DSRLV        DSRLV R1,R2,R3     Doubleword logical shift right by a variable amount (taken from a register)
DSRAV        DSRAV R1,R2,R3     Doubleword arithmetic shift right by a variable amount (taken from a register)
SLT          SLT R1,R2,R3       Set R1 to 1 if R2 is less than R3, otherwise set R1 to 0
SLTI         SLTI R1,R2,#23     Set R1 to 1 if R2 is less than the immediate, otherwise set R1 to 0
SLTU         SLTU R1,R2,R3      Same as SLT, but the comparison is unsigned
SLTIU        SLTIU R1,R2,#23    Same as SLTI, but the comparison is unsigned
MOVN         MOVN R1,R2,R3      Copy R2 to R1 if the third register (R3) is not 0
MOVZ         MOVZ R1,R2,R3      Copy R2 to R1 if the third register (R3) is 0
TRAP         -                  Transfer to supervisor (kernel) mode through the address vector
ERET         -                  Return to user mode from an exception
MADD.S       -                  Multiply two single-precision floating-point numbers and add; the result is single precision
MADD.D       -                  Multiply two double-precision floating-point numbers and add; the result is double precision
MADD.PS      -                  Multiply two pairs of single-precision floating-point numbers (paired single) and add; the results are single precision



MIPS instruction characteristics

1、 All instructions are encoded in 32 bits;
2、 Some instructions have 26 bits for encoding a target address; others have only 16 bits. So loading an arbitrary 32-bit value takes two load instructions. A 16-bit target field means the target of a branch or subroutine call must lie within a 64K range (32K either way);
3、 In principle, every operation must complete in 1 clock cycle, one operation per stage;
4、 There are 32 general-purpose registers, each 32 bits wide (on a 32-bit machine) or 64 bits wide (on a 64-bit machine);
5、 There is no flags register to assist operations; the equivalent functions are implemented by testing whether two registers are equal;
6、 All arithmetic operates on 32-bit values; there are no operations on bytes or halfwords (in MIPS, a word is defined as 32 bits and a halfword as 16 bits);
7、 There are no separate stack instructions; all stack operations use ordinary memory accesses, because push and pop are really compound operations made up of a memory write and a stack-pointer update;

8、 Because MIPS instructions are fixed-length, compiled binaries and their memory footprint are larger than on x86 (the average x86 instruction is a little over 3 bytes long, while a MIPS instruction is always 4 bytes);

9、 Addressing modes: there is only one memory addressing mode, a base register plus a 16-bit address offset;

10、 Data accesses in memory must be strictly aligned (words on at least a 4-byte boundary)

11、 A jump instruction has only a 26-bit target field; adding the 2 alignment bits gives 28 addressable bits, that is 256M. The upper 4 address bits are taken from the current PC, so in a C program a goto can only reach targets inside the 256M-aligned region that contains it (the source describes this as 128M before and 128M after, which is only an approximation)

12、 A conditional branch instruction has only a 16-bit target field; adding the 2 alignment bits gives 18 addressable bits, that is 256K. This means that in a C program an if statement can only branch to an address within 128K before it or 128K after it;
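The ranges in items 11 and 12 can be checked with a small calculation. The sketch below assumes the usual MIPS32 encoding: a branch offset is a signed 16-bit count of instructions relative to the instruction after the branch, and a jump replaces the low 28 bits of that same address. The function names are illustrative only.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def branch_target(pc, imm16):
    """Target of a conditional branch at address pc with a 16-bit offset field."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(pc, target26):
    """Target of a J/JAL at address pc: upper 4 bits come from pc + 4."""
    return ((pc + 4) & 0xF0000000) | ((target26 & 0x3FFFFFF) << 2)
```

The largest forward branch offset (0x7FFF) reaches just under 128K ahead, and the most negative one (0x8000) reaches 128K back, which matches the 256K window of item 12.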

13、 By default, MIPS does not push the return address of a subroutine (the address in the calling function to return to) onto the stack; it is kept in register $31. This is good for leaf functions. Nested calls are handled by another mechanism;

14、 Pipeline effects. Because the pipeline is deep, some of its effects are visible to the programmer and need attention. The two most important are the branch delay and the load delay.
    a The instruction that follows any branch or jump is said to be in the branch delay slot. When the program executes the branch, by the time the target address has just been written (to the program counter) and the branch itself has not yet completed, the instruction after the branch has already been executed. This is a consequence of pipelining: several instructions are in flight at once, each in a different stage. The book says that, strictly, half an instruction is executed ahead, which I don't understand. The branch delay slot is often used for useful work such as setting up arguments, rather than being wasted.
    b The load delay works like this. An instruction that loads data from memory first brings the data into the cache and then into the register, which is relatively slow. Before it finishes, several later instructions may already be executing in the pipeline. The instructions executed right after a load are said to be in the load delay slot. The problem is what happens if one of them needs the data being loaded. A common approach is to interlock the load internally: when a later instruction needs the data, it stalls (in the ALU stage) until the load has completed.

* The five-stage MIPS pipeline: every instruction goes through five execution stages.
Stage 1: fetch an instruction from the instruction cache. Takes one clock cycle;
Stage 2: read data from the registers named by the source register fields of the instruction (there may be two; each field is a number selecting one of $0~$31). Takes half a clock cycle;
Stage 3: perform an arithmetic or logic operation. Takes one clock cycle;
Stage 4: a load instruction reads a memory variable from the data cache. On average about 3/4 of instructions do nothing in this stage, but it guarantees that instructions stay in order (why it guarantees this, I haven't worked out yet). Takes one clock cycle;
Stage 5: store the result of the computation into the cache or memory. Takes half a clock cycle;
=> So an instruction takes four clock cycles in total;

15、 MIPS virtual address space map:
a  0x0000 0000 ~ 0x7fff ffff
kuseg. User-level space, 2GB. Addresses go through MMU (TLB) translation. Whether accesses are cached can be configured.

b 0x8000 0000 ~ 0x9fff ffff
kseg0. This is the region occupied by the operating system kernel, 512M in total. No address translation is used: the top bit is removed and the address maps linearly onto the low 512M of physical memory (if there is less memory, the top is simply cut off). Accesses do go through the cache.

c 0xa000 0000 ~ 0xbfff ffff
kseg1. This is the region used for system initialization, 512M in total. No address translation and no caching are used: the top 3 bits are removed and the address maps linearly onto the low 512M of physical memory (if there is less memory, the top is simply cut off).

d 0xc000 0000 ~ 0xffff ffff
kseg2. This is also a kernel-level region. Addresses go through address translation. Whether accesses are cached can be configured.
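The four regions above can be summarised as a small classifier. This is a sketch for 32-bit addresses only; the function name is made up, and None stands for "translated through the TLB, so no fixed physical address".

```python
def classify(vaddr):
    """Return (segment name, physical address), or None as the address if TLB-mapped."""
    vaddr &= 0xFFFFFFFF
    if vaddr < 0x80000000:
        return "kuseg", None                 # translated by the MMU (TLB)
    if vaddr < 0xA0000000:
        return "kseg0", vaddr & 0x7FFFFFFF   # drop the top bit, cached
    if vaddr < 0xC0000000:
        return "kseg1", vaddr & 0x1FFFFFFF   # drop the top 3 bits, uncached
    return "kseg2", None                     # translated by the MMU (TLB)
```

Note that a kseg0 address and a kseg1 address with the same low 29 bits refer to the same physical location, once through the cache and once around it.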

16、 MIPS coprocessors
CP0: this is the configuration unit of a MIPS chip. It is nominally called a coprocessor, but it is normally built into the chip. Most MIPS configuration, cache control, exception/interrupt control and memory management control live here, so it is indispensable to a complete system;

17、 MIPS caches
MIPS usually has two or three levels of cache. The first-level cache stores data and instructions separately. The advantage is that instructions and data can be accessed at the same time, which improves efficiency; the disadvantage is added complexity. The second-level and third-level (if any) caches are no longer split.

The unit of caching is called a cache line. Each line has a tag, followed by some flag bits and some data. The cache lines are arranged linearly in order and together make up the whole cache.

There is a complete mechanism for indexing and accessing cache lines.
18、 The MIPS exception mechanism
The concept of a precise exception: an exception that has no superfluous side effects on the running program. That is, when an exception occurs, the instructions before the victim instruction have been fully executed, while the victim instruction and those after it have not yet been executed. (Note: it is not strictly accurate to say that the victim instruction and those after it have done nothing; in fact the victim instruction has just completed the third stage of its instruction cycle, the ALU stage.) Precise exceptions help ensure that software design is not affected by the hardware implementation.

The EPC register in CP0 points to the place where execution was when the exception occurred, which is generally the address of the victim instruction. After the exception has been handled, execution returns to this address and continues. But if the victim instruction is in a branch delay slot, the hardware automatically makes EPC point back one instruction, to the branch. When the branch is re-executed, the instruction in the branch delay slot is executed again.

Implementing precise exceptions has some impact on the smooth flow of the pipeline; if there are too many exceptions, the execution efficiency of the system suffers.

* Exceptions can be divided into general exceptions and interrupts. General exceptions are usually software exceptions and interrupts are usually hardware exceptions; interrupts may be raised inside the chip or outside it.

When an exception occurs, the last instruction completed before the jump is the one whose MEM stage has just finished. The victim instruction is the one whose ALU stage has just finished.

When an exception occurs, the CPU jumps to an exception vector entry point. MIPS exception vectors are a bit special: there are usually only 2 or a few entry points, one for general exceptions and one for TLB miss exceptions (this saves the time needed to work out the exception type; with this mechanism the system can refill the TLB in only 13 clock cycles).

The CP0 registers contain a mode bit, SR(BEV). When it is set, the exception entry points are moved to the uncached address space (kseg1).

MIPS treats a reset as an exception that never returns.
Cold reset: the CPU hardware is completely reconfigured and the software is reloaded;
Warm reset: the software is completely reinitialized;

The MIPS philosophy of exception handling is to assign types to exceptions and let software define priorities for them. All of them enter the exception dispatcher through the same entry point, and the dispatcher decides which function to run according to type and priority. This mechanism also works when two or more exceptions occur at the same time.

Here is what a MIPS CPU does when an exception occurs:
a Sets EPC to point to the place to return to;
b Sets SR(EXL), which forces the CPU into kernel mode and disables all interrupts;
c Sets the Cause register so that software can find out the exception type; some other registers are also set for certain exceptions;
d Starts fetching instructions from the exception entry point; from then on everything is handled by software.

The k0 and k1 registers are used to hold the address of the exception handling function.
When the exception handling function finishes, control returns to the exception dispatcher. The dispatcher contains an eret instruction, which returns to the interrupted program so that it can continue; eret atomically re-enables interrupts (by clearing SR(EXL)), changes the privilege level from kernel to user, and returns to the original address.

19、 Interrupts
A MIPS CPU has 8 independent interrupt bits (in the Cause register): 6 are external interrupts and 2 are internal interrupts (accessible to software). Generally, the on-chip counter/timer is wired to one of the hardware bits.

The SR(IE) bit controls the global interrupt response; when it is 0, all interrupts are disabled;
If either of the SR(EXL) and SR(ERL) bits is set to 1, interrupts are disabled;
SR(IM) has 8 bits, one for each of the 8 interrupt sources; for an interrupt to be taken, the corresponding bit among these 8 must be set to 1;

Interrupt handlers also use the general exception entry point, although some newer CPUs differ.
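The three conditions on SR can be combined into a single check. The sketch below assumes the conventional MIPS32 bit layout (IE in bit 0, EXL in bit 1, ERL in bit 2, the interrupt mask in SR bits 8 to 15 and the pending bits in Cause bits 8 to 15); the constant and function names are chosen here for illustration.

```python
SR_IE = 1 << 0    # global interrupt enable
SR_EXL = 1 << 1   # exception level
SR_ERL = 1 << 2   # error level
IM_SHIFT = 8      # SR(IM) occupies bits 8..15
IP_SHIFT = 8      # Cause(IP) occupies bits 8..15


def interrupt_taken(sr, cause):
    """Return True if some pending interrupt would be accepted by the CPU."""
    if not sr & SR_IE:
        return False                      # globally disabled
    if sr & (SR_EXL | SR_ERL):
        return False                      # already in exception or error level
    mask = (sr >> IM_SHIFT) & 0xFF
    pending = (cause >> IP_SHIFT) & 0xFF
    return (mask & pending) != 0          # a pending source must also be unmasked
```

With this layout, a pending source whose mask bit is clear is simply ignored until software sets the bit.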

* A scheme for implementing interrupt priorities in software
a Assign a priority to each interrupt;
b The CPU always runs at some priority (that is, define a global variable for it);
c When an interrupt occurs, it is serviced only if its priority is equal to or higher than the CPU's priority (if the CPU is at the lowest priority, every interrupt can be serviced);
d When several interrupts occur at the same time, the handler with the highest priority runs first;
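Steps a to d can be sketched as a small dispatcher. The class name, the interrupt names and the convention that a larger number means a higher priority are all assumptions made for this example.

```python
class PriorityDispatcher:
    """Software interrupt priorities: the CPU priority is an ordinary variable."""

    def __init__(self, cpu_priority=0):
        self.cpu_priority = cpu_priority   # step b: current CPU priority

    def select(self, pending):
        """pending maps interrupt name -> priority (step a).

        Returns the name of the interrupt to service, or None if every
        pending interrupt is below the current CPU priority.
        """
        eligible = {name: prio for name, prio in pending.items()
                    if prio >= self.cpu_priority}      # step c
        if not eligible:
            return None
        return max(eligible, key=eligible.get)          # step d
```

A real handler would also raise cpu_priority while servicing the selected interrupt and restore it afterwards; that part is left out here.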

20、 Big-endian and little-endian issues
Endianness issues also exist in hardware. In serial communication, for example, a byte is sent bit by bit, starting with the least significant bit.
Graphics display is another example. When displaying a black-and-white image, each dot on the screen corresponds to one bit in video memory. In that case the dot in the top right corner of the screen corresponds to bit 7 of the first byte of video memory, that is, the most significant bit, and the 8th dot of the first row corresponds to bit 0 of the first byte.

21、 How Linux runs on MIPS

User mode and kernel mode: in user mode, a program cannot freely access kernel code and data; it can only access user space and those kernel pages that the kernel (through some mechanism) allows it to access. It cannot execute CP0 instructions. To use kernel services, user mode must make a system call (system_call); a system call ends with an eret instruction.

At any time Linux has at least one thread running, and Linux generally does not disable interrupts. When an interrupt occurs, its environment is borrowed from the interrupted thread.

An interrupt service routine (ISR) should be short.

The upper half of the address space in MIPS Linux can only be accessed at kernel privilege level. The kernel does not go through TLB address translation.

All threads share the same kernel address space, but only threads in the same group use the same user address space (they point to the same mm_struct structure).

If physical memory is larger than 512M, kseg0 and kseg1 cannot be used to map the part above 512M. Only kseg2 can map it, and kseg2 goes through the TLB.

In a sense, the kernel is a set of subroutines called by exception handlers. Within the kernel, the thread scheduler is one such small subroutine, called by each thread (according to the book, an exception handler can also be regarded as a special kind of thread).

MIPS Linux has an exception mode; x86 has no such concept.

Handle exceptions with care. Software locks alone cannot solve the problem.

21、 Atomic operations
To support atomic operations in the operating system, MIPS added a pair of instructions, ll/sc. They are used like this:

First write
atomic_block:
ll XX1, XXX2
….
sc XX1, XXX2
beq XX1, zero, atomic_block
….

Put the body of code you want to execute between ll and sc. This ensures that the body is executed atomically (it will not be preempted).

In fact, the ll/sc instructions themselves do not guarantee atomic execution; they play a trick:
A temporary register XX1 is used. Executing ll loads the value at XXX2 into XX1, sets a flag bit inside the CPU that we cannot see, and saves the address XXX2, which the CPU then monitors. While the code in between executes, if the contents of XXX2 are found to have changed (that is, another thread ran or an interrupt occurred), the CPU clears its internal flag to 0. Executing sc tries to store the contents of XX1 (possibly a new value) into XXX2 and returns a value in XX1: if the flag is still 1, the store takes place and 1 is returned; if the flag is 0, 0 is returned. A result of 1 means the code between the two instructions ran in one go without being interrupted, so the atomic operation succeeded. A result of 0 means the atomic operation did not succeed; the following beq then jumps back to the ll instruction and the sequence is executed again, until the atomic operation succeeds.

We therefore need to keep the code placed between ll and sc short.

In practice, the retry loop of an atomic operation rarely runs more than 3 times.
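The retry behaviour of ll/sc can be imitated with a toy memory model. This is not how the hardware is built; it is a sketch in which a single hidden "link" records the watched address, any ordinary store to that address breaks the link, and sc succeeds only while the link is intact. All class and function names are invented for the example.

```python
class Memory:
    """Toy memory with one hidden link flag, as set by ll and tested by sc."""

    def __init__(self):
        self.cells = {}
        self.link_addr = None              # address watched since the last ll

    def ll(self, addr):
        self.link_addr = addr              # start watching this address
        return self.cells.get(addr, 0)

    def store(self, addr, value):
        """An ordinary store, e.g. by another thread or an interrupt handler."""
        self.cells[addr] = value
        if self.link_addr == addr:
            self.link_addr = None          # the watched location changed

    def sc(self, addr, value):
        if self.link_addr == addr:
            self.cells[addr] = value       # link intact: store and report 1
            self.link_addr = None
            return 1
        return 0                           # link broken: nothing stored


def atomic_increment(mem, addr, interfere=None):
    """Retry loop equivalent to ll / add / sc / beq; returns the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        value = mem.ll(addr)
        if interfere is not None and attempts == 1:
            interfere(mem)                 # simulate a competing writer once
        if mem.sc(addr, value + 1):
            return attempts
```

When a competing store lands between ll and sc, the first sc fails and the loop runs a second time on the fresh value, which is the behaviour described in the text.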

22、 System calls (syscall)
A system call also enters the kernel through an exception entry point and selects the handler for exception code 8. Once inside the system call dispatcher, the call is dispatched again to a specific function according to the parameters passed in. System calls pass their parameters in registers.

The system call number is placed in v0 and the parameters in a0-a3. If there are too many parameters, another mechanism handles them. The return value of a system call is normally placed in v0. If the system call fails, an error is reported in a3.

23、 The exception entry points are located at the bottom of kseg0 and are fixed by the hardware.

24、 Note: the address 0x0000 0000 cannot be used; the first page or pages starting at 0 are not mapped.

25、 Paged memory mapping has the following advantages:
a It hides and protects data;
b It gives programs contiguous addresses;
c It extends the address space;
d It loads code and data on demand (by means of exceptions);
e It makes relocation easy;
f Code and data can be shared between threads, which makes exchanging data easy;

All threads are equal and each has its own memory management structures; threads in a group that run in the same address space share most of these data structures. Each thread has a page table for the pages already in use in its address space, which records the mapping between each used virtual page and the actual physical page;

26、 The ASID is used together with the high-order bits of the virtual page number. It distinguishes different threads in the TLB and the cache. It is only 8 bits wide, so at most 256 threads can run at the same time. This is generally enough. If the number is exceeded, the cache is flushed and reloaded. In this respect MIPS differs from x86.

27、 The memory-resident page table structure of MIPS Linux
It uses a two-level page table: a page directory and page tables. Each entry in a page table is an EntryLo0-1 value.
(This is similar to the x86 approach.) It does not use the original MIPS design.

28、 The TLB refill process: hardware part
a The CPU first generates a virtual address in order to read data (or an instruction) from, or write data to, the physical address that corresponds to it.
The low 13 bits are split off. The high 19 bits form VPN2, which together with the ASID of the current thread (taken from EntryHi(ASID)) is compared against the entries in the TLB. (The comparison is affected by PageMask and the G flag.)
b If an entry matches, it is selected. Bit 12 of the virtual address chooses between the left and right physical address fields of that entry.
The V and D flags are then checked. V indicates whether the page is valid, and D indicates whether the page is dirty (has already been written).
If V=0, or if D=0 and the access is a write, a translation exception is raised: BadVAddr holds the virtual address being processed, EntryHi is filled with the high-order bits of the virtual address, and Context is refilled as well.
The C flag is checked next. If C=1 the access goes through the cache; if C=0 it is uncached.
Once all these checks pass, the correct physical address has been found.
c If no entry matches, a TLB refill exception is raised and software takes over.
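The hardware lookup described in item 28 can be sketched roughly as follows. This is only an illustrative model, not the behavior of any specific CPU: it assumes 4 KB pages, ignores PageMask and the C flag, and assumes that a store to a page whose D flag is clear raises the "modified" exception. All class and function names are hypothetical.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumption: 4 KB pages, so one TLB entry covers an 8 KB pair


@dataclass
class EntryLo:
    pfn: int      # physical frame number
    valid: bool   # V flag
    dirty: bool   # D flag (writes allowed once set)


@dataclass
class TlbEntry:
    vpn2: int        # virtual address bits 31..13
    asid: int        # 8-bit address space identifier
    is_global: bool  # G flag: match regardless of ASID
    lo0: EntryLo     # even page of the pair
    lo1: EntryLo     # odd page of the pair


class TlbRefill(Exception):
    """No TLB entry matched; software must refill."""


class TlbInvalid(Exception):
    """A matching entry was found but V=0."""


class TlbModified(Exception):
    """A store hit an entry whose D flag is clear."""


def translate(tlb, vaddr, asid, is_write):
    """Return the physical address for vaddr, or raise a TLB exception."""
    vpn2 = vaddr >> (PAGE_SHIFT + 1)
    for entry in tlb:
        if entry.vpn2 == vpn2 and (entry.is_global or entry.asid == asid):
            # Bit 12 selects the even or odd half of the pair.
            lo = entry.lo1 if (vaddr >> PAGE_SHIFT) & 1 else entry.lo0
            if not lo.valid:
                raise TlbInvalid(hex(vaddr))
            if is_write and not lo.dirty:
                raise TlbModified(hex(vaddr))
            offset = vaddr & ((1 << PAGE_SHIFT) - 1)
            return (lo.pfn << PAGE_SHIFT) | offset
    raise TlbRefill(hex(vaddr))
```

The three exception classes stand in for the distinct exception causes that the software handler would see.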

29、 The TLB refill process: software part
a Determine whether the virtual address is a legitimate one, that is, whether the in-memory page table has a physical address for it. If not, call the address error handler;
b If the physical address is found in the in-memory page table, load it into a register;
c If the TLB is already full, use Random to pick an entry to discard;
d Copy the new entry into the TLB.
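The software steps above might be modeled like this. It is a simplified sketch: the 10/10/12 split of the virtual address, the dictionary-based page tables, and the names `refill` and `AddressError` are all assumptions made for illustration, and a real handler works on EntryLo pairs in assembly.

```python
import random

PAGE_SHIFT = 12   # assumption: 4 KB pages
PT_BITS = 10      # assumption: 10-bit directory index, 10-bit table index


class AddressError(Exception):
    """The virtual address has no mapping in the page table."""


def refill(page_dir, tlb, tlb_size, vaddr, rng=random):
    """Walk a two-level page table and install the mapping in the TLB."""
    dir_idx = vaddr >> (PAGE_SHIFT + PT_BITS)
    pt_idx = (vaddr >> PAGE_SHIFT) & ((1 << PT_BITS) - 1)

    # Step a: is there a physical page for this virtual address?
    table = page_dir.get(dir_idx)
    if table is None or pt_idx not in table:
        raise AddressError(hex(vaddr))

    # Step b: fetch the mapping (virtual page number, physical frame number).
    entry = (vaddr >> PAGE_SHIFT, table[pt_idx])

    # Step c: if the TLB is full, discard a randomly chosen entry.
    if len(tlb) >= tlb_size:
        tlb.pop(rng.randrange(len(tlb)))

    # Step d: copy the new entry into the TLB.
    tlb.append(entry)
    return entry
```

Random replacement keeps the handler short, which matters because the refill path runs on every TLB miss.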

30、 MIPS Linux marks memory pages as dirty differently from x86. It plays a trick:
a When a writable page is first loaded into memory (loaded from disk? A physical page is allocated at load time, a virtual page is allocated with it, and an entry is added to the in-memory page table), the D flag of that entry is cleared to 0;
b Later, when an instruction writes to this page, an exception is raised (the entry is loaded into the TLB first and checked there). The exception handler sets the D flag of the in-memory page table entry to 1, so that subsequent writes succeed. And because the flag was changed by this exception, the physical page is considered dirty.
c The entry that already exists in the TLB is copied and its D flag updated, so that the write can proceed.
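A toy model of this dirty-tracking trick is shown below. The `PageTableEntry` class and its field names are hypothetical, and the TLB copy of the entry is folded into the same object for brevity.

```python
class PageTableEntry:
    def __init__(self, pfn, writable):
        self.pfn = pfn
        self.writable = writable  # software permission: may this page be written?
        self.d = False            # hardware D flag, cleared when the page is loaded
        self.dirty = False        # software record that the page has been written


def handle_tlb_modified(pte):
    """Exception handler for the first write to a page whose D flag is clear."""
    if not pte.writable:
        raise PermissionError("write to a read-only page")
    pte.d = True       # later writes no longer trap
    pte.dirty = True   # the page must be written back before it is discarded


def store(pte):
    """Simulate one store; return how many exceptions it took."""
    faults = 0
    if not pte.d:
        handle_tlb_modified(pte)
        faults += 1
    return faults
```

Only the first store to each page pays for an exception; every later store runs at full speed.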

31、 The C parameter passing mechanism on MIPS?

32、 Stack structure and memory layout on MIPS?

 

Instruction length and number of registers
All MIPS instructions are 32 bits long and the instruction formats are simple. This is unlike x86, whose instructions vary in length: on the 80386, for example, an instruction can be anywhere from 1 byte (for example PUSH) to 17 bytes. The advantage of variable length is high code density, so MIPS binaries are about 20%~30% larger than their x86 counterparts. The advantage of fixed-length instructions and simple formats is that they are easy to decode and suit pipelined execution. Because the register fields sit at fixed positions in the instruction, decoding and reading the registers can proceed at the same time, which is known as fixed-field decoding.
There are 32 general-purpose registers. The number of registers is driven by the needs of the compiler. Register allocation is one of the most important compiler optimizations (perhaps the most important). Current register allocation algorithms are based on graph coloring. The basic idea is to build a graph that represents the candidate register assignments and then use it to allocate registers. Roughly speaking, a limited number of colors must be used so that adjacent nodes in the graph get different colors. Graph coloring in general takes time exponential in the size of the graph, but some heuristic algorithms run in nearly linear time. If global allocation has 16 general-purpose registers available for integer variables, plus additional registers for floating point, graph coloring works well. It does not work well when the number of registers is small.
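To make the graph coloring idea concrete, here is a minimal greedy sketch. It is not the algorithm any particular compiler uses: the interference graph is a plain dictionary, the "most neighbors first" ordering is just one common heuristic, and the function name `color_registers` is made up for this example.

```python
def color_registers(interference, num_regs):
    """Greedy graph coloring.

    interference maps each variable to the set of variables that are live
    at the same time. Returns (assignment, spilled): a register number for
    each colored variable, and the variables that must live in memory.
    """
    assignment, spilled = {}, []
    # Heuristic: handle the most constrained variables first.
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_regs) if r not in taken]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)  # no color left: spill to memory
    return assignment, spilled


graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
regs, spills = color_registers(graph, 3)
```

With three registers every variable in this small graph gets a register; with only two, one of the mutually interfering variables has to be spilled, which is the effect the paragraph describes for small register files.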
   Q: Since there cannot be fewer than 16, why not 64?
A: Using 64 or more registers not only takes more instruction bits to encode the register fields, it also increases the cost of a context switch. Apart from functions that are extremely large and complex, 32 registers are enough to hold the frequently used data. More registers are not necessary, and there is a principle of computer design that says "smaller is faster". That does not mean 31 registers would perform better than 32; 32 general-purpose registers is simply the popular choice.
Instruction formats
All MIPS instructions have the same length, 32 bits. To keep the formats a good fit, the designers made a compromise: all instructions have a fixed length, but different instructions use different formats. MIPS instructions come in three formats: R format, I format, and J format. Each format is made up of several fields, as follows:
I-type instruction
      6      5     5           16
   |------|-----|-----|------------------|
   |  op  | rs  | rt  |    immediate     |
   |------|-----|-----|------------------|
Load/store of bytes, halfwords, words, doublewords
Conditional branch, jump, jump and link register
R-type instruction
      6      5     5     5      5       6
   |------|-----|-----|-----|-------|-------|
   |  op  | rs  | rt  | rd  | shamt | funct |
   |------|-----|-----|-----|-------|-------|
Register-register ALU operations
Reading and writing special registers
J-type instruction
      6                 26
   |------|------------------------------|
   |  op  |         jump address         |
   |------|------------------------------|
Jump, jump and link
Trap and return from exception

Meaning of each field:
op: the basic operation of the instruction, called the opcode.
rs: the first source operand register.
rt: the second source operand register.
rd: the destination register that receives the result.
shamt: the shift amount.
funct: the function code, which selects a specific variant of the operation given by op.
Every instruction is encoded in one of the three formats, and fields shared by several formats sit at the same position in each.
    This fixed-length, simply formatted encoding is very regular, so the machine code is easy to read. For example:
add $t0,$s0,$s1
    means $t0=$s0+$s1: the contents of register 16 (s0) and register 17 (s1) are added and the result is placed in register 8 (t0).
    The decimal value of each field of the instruction is
   |------|-----|-----|-----|-----|------|
   |   0  |  16 |  17 |   8 |   0 |  32  |
   |------|-----|-----|-----|-----|------|
op=0 and funct=32 mean that this is an addition, 16=$s0 says the first source operand (rs) is in register 16, 17=$s1 says the second source operand (rt) is in register 17, and 8=$t0 says the destination operand (rd) is in register 8.
Written in binary, the fields are
   |------|-----|-----|-----|-----|------|
   |000000|10000|10001|01000|00000|100000|
   |------|-----|-----|-----|-----|------|
This is the machine code of the instruction above, and it is clearly very regular.
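The same encoding can be reproduced with a few shifts. The sketch below packs the six R-format fields into one 32-bit word; the helper name `encode_r` and the small register table are illustrative only.

```python
# Register numbers for the names used in this example (a small subset).
REG = {"$zero": 0, "$t0": 8, "$s0": 16, "$s1": 17}


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the R-format fields (6, 5, 5, 5, 5, 6 bits) into one word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0,$s0,$s1  ->  op=0, rs=$s0, rt=$s1, rd=$t0, shamt=0, funct=32
word = encode_r(0, REG["$s0"], REG["$s1"], REG["$t0"], 0, 32)
print(f"{word:032b}")   # 00000010000100010100000000100000
print(f"{word:#010x}")  # 0x02114020
```

Reading the binary string in groups of 6, 5, 5, 5, 5 and 6 bits gives back exactly the fields in the table above.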

General-purpose registers (GPR)
There are 32 general-purpose registers, $0 to $31:
$0: that is $zero. This register always reads as zero, giving the useful constant 0 a compact encoding. The MIPS compiler uses slt, beq, bne and similar instructions, together with the 0 obtained from $0, to generate every comparison condition: equal, not equal, less than, less than or equal, greater than, greater than or equal. The add instruction can also be used to build the move pseudo-instruction, so that
move $t0,$t1
is actually
add $t0,$0,$t1
Jiaolin mentioned that when he ported fpc the move instruction produced an error, and he switched to add instead.
   Pseudo-instructions simplify the programmer's task: the assembler offers a richer instruction set than the hardware does.
$1: that is $at. This register is reserved for the assembler. As just mentioned, pseudo-instructions simplify the task, but the price is one register reserved for the assembler, namely $at.
Because the immediate field of an I-type instruction is only 16 bits, the compiler or assembler has to take a large constant apart and then reassemble it in a register. Loading a 32-bit immediate, for example, takes two instructions: lui (load upper immediate) followed by addi (or, more commonly, ori). On MIPS the taking apart and reassembling of large constants is done by the assembler, which needs a temporary register to build the constant in. This is one of the reasons for reserving $at for the assembler.
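How a 32-bit constant is taken apart can be sketched as follows. Both helper names are hypothetical. The second variant assumes that the low half is sign-extended when it is added (as addi does), so the upper half has to be bumped by one whenever bit 15 of the constant is set.

```python
def split_for_lui_ori(value):
    """Halves for `lui reg, hi` followed by `ori reg, reg, lo` (zero-extended)."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def split_for_lui_addi(value):
    """Halves for `lui reg, hi` followed by an add of a sign-extended `lo`."""
    value &= 0xFFFFFFFF
    hi, lo = value >> 16, value & 0xFFFF
    if lo & 0x8000:
        # The low half will be treated as negative, so compensate in hi.
        hi = (hi + 1) & 0xFFFF
        lo -= 0x10000
    return hi, lo


hi, lo = split_for_lui_ori(0x12345678)
print(hex(hi), hex(lo))  # 0x1234 0x5678
```

In both cases `(hi << 16) + lo`, truncated to 32 bits, gives back the original constant, which is what the two-instruction sequence computes in the target register.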
$2..$3: ($v0-$v1) Hold the non-floating-point result or return value of a subroutine. MIPS has a set of conventions for how subroutines pass parameters and return: the contents of the first few stack locations are loaded into CPU registers, and the corresponding memory locations are left undefined. When these two registers are not enough to hold the return value, the compiler passes it through memory.
$4..$7: ($a0-$a3) Pass the first four arguments to a subroutine; any further arguments go on the stack. a0-a3, v0-v1 and ra together support subroutine/procedure calls: they pass the arguments, return the results, and hold the return address. When more registers are needed, the stack is used. The MIPS compiler always reserves stack space for the arguments in case they need to be stored.
$8..$15: ($t0-$t7) Temporary registers. A subroutine may use them without preserving them.
$16..$23: ($s0-$s7) Saved registers, which must be preserved (the callee saves and restores them, and the same applies to $fp and $ra). MIPS provides both temporary and saved registers, which reduces register spilling (the process of moving less frequently used variables to memory). When compiling a leaf procedure (one that calls no other procedure), the compiler always uses up the temporary registers before touching the saved ones.
$24..$25: ($t8-$t9) Same as ($t0-$t7).
$26..$27: ($k0,$k1) Reserved for the operating system and exception handling; at least one must be reserved. An exception (or interrupt) is a procedure that is not called explicitly by the program. MIPS has a register called the exception program counter (EPC), which belongs to CP0 and holds the address of the instruction that caused the exception. The only way to inspect a control register is to copy it into a general-purpose register. The mfc0 instruction (move from system control) copies the address in EPC into a general-purpose register, and a jump register instruction (jr) then lets the program return to the instruction that caused the exception and continue. A careful look reveals an interesting problem:
To inspect the control register EPC and jump back to the instruction that caused the exception (using jr), EPC must be copied into a general-purpose register. But then, when the program returns to the point of interruption, not every register can be restored to its original value. If all registers are restored first, the value copied from EPC is lost and jr cannot return to the point of interruption. If every register is restored except the one holding the return address copied from EPC, then one of the program's registers changes for no apparent reason after an exception, which is unacceptable. To escape this dilemma, MIPS programmers must set aside two registers, $k0 and $k1, for the operating system. Their values are not restored when an exception occurs, and compilers do not use k0 and k1. The exception handler can place the return address in either of them and then use jr to jump back to the instruction that caused the exception and continue.
$28: ($gp) C has two storage classes, automatic and static. Automatic variables are local to a procedure. Static variables persist across entry to and exit from procedures. To simplify access to static data, MIPS software reserves a register, the global pointer gp ($gp). Without a global pointer, loading a value from static data takes two instructions: one to form the significant bits of the 32-bit address constant computed by the compiler and linker, and one to actually load the data. The global pointer points to an address in the static data area that is fixed at run time. Data within 32KB above or below the value of gp can be accessed with a single instruction that uses gp as the base pointer. At compile time, the data must be known to lie within the 64KB range addressed from gp.
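The 32KB limit comes from the signed 16-bit offset field of load and store instructions. A small check, with a made-up function name and made-up addresses, might look like this:

```python
def reachable_from_gp(gp, addr):
    """True if addr can be reached with one load/store using gp as the base.

    The offset field is a signed 16-bit value, so it covers -32768..32767.
    """
    offset = addr - gp
    return -0x8000 <= offset <= 0x7FFF


gp = 0x10008000  # hypothetical value of $gp
print(reachable_from_gp(gp, 0x10000000))  # True  (offset -32768)
print(reachable_from_gp(gp, 0x1000FFFF))  # True  (offset +32767)
print(reachable_from_gp(gp, 0x10010000))  # False (offset +32768)
```

Data that fails this check needs the two-instruction sequence described above.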
$29: ($sp) MIPS hardware does not support a stack directly. For example, it has nothing like the x86 SS, SP and BP registers. Although MIPS defines $29 as the stack pointer, it is still a general-purpose register that is merely used for a special purpose. You could use it for something else, but in order to use other people's programs, or let others use yours, the convention must be followed; it has nothing to do with the hardware. x86 has dedicated PUSH and POP instructions and MIPS does not, but that does not stop MIPS from using a stack. When a procedure call occurs, the caller pushes onto the stack the registers it will need after the call, and the callee pushes the return address register $ra and the saved registers it uses. The stack pointer is adjusted at the same time. On return, the registers are restored from the stack and the stack pointer is adjusted again.
$30: ($fp) The GNU MIPS C compiler uses a frame pointer, whereas the SGI C compiler does not and uses this register as a saved register ($s8). This saves call and return overhead but makes code generation more complex.
$31: ($ra) Holds the return address. MIPS has a jal (jump-and-link) instruction which, when jumping to an address, puts the address of the next instruction into $ra. It supports subroutines: for example, the caller places the arguments in $a0~$a3, then jal X jumps to procedure X; when the called procedure finishes it places the results in $v0,$v1 and returns with jr $ra.
The registers that must be preserved across a call are $a0~$a3,$s0~$s7,$gp,$sp,$fp,$ra.
Jump range
The address field of a J-type instruction is 26 bits and gives the jump target. Instructions are stored in memory aligned on 4-byte boundaries, so the two least significant bits need not be stored. In MIPS, the lowest two bits of an address select a byte within a word, and the cache index does not use them. This gives 28 bits of byte addressing, so the reachable address space is 256M. The PC is 32 bits, so where do the other 4 bits come from? A MIPS jump instruction replaces only the low 28 bits of the PC and leaves the high 4 bits unchanged. The loader and linker must therefore avoid placing a program across a 256MB boundary. Within a 256M segment the jump address is treated as an absolute address, independent of the PC. To jump beyond 256M (out of the segment), the jump register instruction must be used.
Similarly, the 16-bit immediate of a conditional branch instruction is too small to hold a full address, so PC-relative addressing is used: the branch offset in the instruction is added to (PC+4) to form the branch target. Since typical loops and if statements span fewer than 2^16 words, this approach works well.
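Both address calculations fit in a few lines. The sketch below assumes that the upper 4 bits of a jump target are taken from PC+4 (the address of the instruction after the jump) and that the branch offset is a signed word count; the function names are illustrative.

```python
def jump_target(pc, addr26):
    """Target of a J-type instruction located at address pc."""
    upper = (pc + 4) & 0xF0000000          # high 4 bits are kept
    return upper | ((addr26 & 0x03FFFFFF) << 2)


def branch_target(pc, imm16):
    """Target of a conditional branch located at address pc."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:                     # sign-extend the 16-bit field
        imm16 -= 0x10000
    return (pc + 4 + (imm16 << 2)) & 0xFFFFFFFF


print(hex(jump_target(0x90000000, 1)))         # 0x90000004
print(hex(branch_target(0x00400000, 3)))       # 0x400010
print(hex(branch_target(0x00400000, 0xFFFF)))  # 0x400000
```

The last call shows an offset of -1 word, which branches back to the branch instruction itself.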

 

0      zero    Always returns the value 0
1      at      Used by the assembler as a temporary
2-3    v0, v1  Results returned from a subfunction call
4-7    a0-a3   Arguments of a subfunction call
8-15   t0-t7   Temporaries. A subfunction may use them without saving or restoring them
24-25  t8-t9   Same as t0-t7
16-23  s0-s7   Subfunction register variables. A subfunction must save the ones it uses and restore them before returning, so the calling function knows that the values of these registers have not changed.
26,27  k0,k1   Usually used by interrupt or exception handlers to hold some system parameters
28     gp      Global pointer. Some run-time systems maintain this pointer for easier access to "static" and "extern" variables.
29     sp      Stack pointer
30     s8/fp   The 9th register variable. A subfunction can use it as a frame pointer
31     ra      Return address of a subfunction

The usage of these registers follows a set of conventions. The conventions have nothing to do with the hardware, but if you want to use other people's code, compilers, and operating systems, you had best follow them.

Register name conventions and usage

*at: This register is used by some of the assembler's synthetic instructions. If you need to use it explicitly (for example when saving and restoring registers in an exception handler), there is an assembler directive that stops the assembler from using the at register after the directive (though some assembler macros then become unavailable).

*v0, v1: Hold the non-floating-point result or return value of a subroutine (function). If these two registers are not enough to hold the value to be returned, the compiler passes it through memory. See section 10.1 for details.

*a0-a3: Used to pass the first 4 non-floating-point arguments of a subfunction call. In some cases this does not hold. See section 10.1 for details.

* t0-t9: By convention, a subfunction may use these registers without saving them. They make very good temporaries when evaluating expressions. What the compiler or programmer must keep in mind is that the values in these registers may be destroyed when a subfunction is called.

*s0-s8: By convention, a subfunction must guarantee that these registers hold the same values on return as they did before the call. It either does not use them at all, or saves them on the stack and restores them on exit. This convention makes these registers well suited to register variables, or to values that must survive a function call.

* k0, k1: Used by the OS exception or interrupt handlers. Their original values are not restored after use, so they are rarely used anywhere else.

* gp: If a global pointer exists, it points to a location in the static data area that is fixed at run time. This means that data within 32K on either side of gp can be accessed with a single instruction that uses gp as the base pointer. Without a global pointer, accessing a value in the static data area takes two instructions: one to obtain the 32-bit address constant determined by the compiler and loader, and one to actually access the data. To use gp, the compiler must know at compile time whether a datum lies within the 64K range of gp. Usually that cannot be known and can only be guessed. The common practice is to place small global data (for example variables of 8 bytes or less) within the range covered by gp and have the linker issue a warning if the small global data still grows beyond the range reachable from gp.

Not every compilation and run-time system supports the use of gp.

*sp: Moving the stack pointer up or down has to be done explicitly with instructions. MIPS therefore usually adjusts the stack pointer only on entry to and exit from a subfunction, and the called subfunction does the adjusting. sp is normally set to the lowest stack address the called subfunction needs, so the compiler can address stack variables at offsets relative to sp. See section 10.1 on stack usage.

* fp: The other conventional name for fp is s8. If a subfunction wants to grow its stack dynamically at run time, it can use fp as a frame pointer to keep track of the stack. Some programming languages support this explicitly. Assembly programmers often use fp this way. The C library function alloca() uses fp to adjust the stack dynamically.

If the bottom of the stack cannot be determined at compile time, stack variables cannot be accessed through sp, so fp is initialized to a position at a constant offset within the function's stack frame. This usage is invisible to other functions.

* ra: Whenever a subfunction is called, the return address is placed in the ra register, so the last instruction of a subroutine is usually jr ra.

If the subfunction needs to call other subfunctions, it must save the value of ra, usually on the stack.

There is also a corresponding standard convention for the floating-point registers. At this point we have introduced the registers that MIPS provides.

Instruction examples:

1. load/store
  la $t0, val_1       // copy the address denoted by val_1 into $t0; note: val_1 is a label
  lw $t2, ($t0)       // use the value in $t0 as an address and copy the word starting there into $t2
  lw $t2, 4($t0)      // use the value in $t0 plus the offset 4 as an address and copy the word starting there into $t2
  sw $t2, ($t0)       // store the value of $t2 (1 word) into RAM at the address held in $t0
  sw $t2, -12($t0)    // store the value of $t2 (1 word) into RAM at the address held in $t0 minus the offset 12

2. Arithmetic instructions
   All operands of an arithmetic instruction are registers (addi also takes an immediate); RAM addresses and indirect addressing cannot be used directly.
   The operand size is one word (4 bytes).
   Instruction format and examples, with notes
  move $t5, $t1       // $t5 = $t1;
  add $t0, $t1, $t2   // $t0 = $t1 + $t2; signed add (traps on overflow)
  sub $t0, $t1, $t2   // $t0 = $t1 - $t2; signed subtract (traps on overflow)
  addi $t0, $t1, 5    // $t0 = $t1 + 5;
  addu $t0, $t1, $t2  // $t0 = $t1 + $t2; unsigned add (no overflow trap)
  subu $t0, $t1, $t2  // $t0 = $t1 - $t2; unsigned subtract (no overflow trap)
  mult $t3, $t4       // $t3 * $t4; the 64-bit product is stored in Hi and Lo, that is: (Hi, Lo) = $t3 * $t4;
  div $t5, $t6        // Lo = $t5 / $t6 (Lo is the integer quotient); Hi = $t5 mod $t6 (Hi is the remainder)
  mfhi $t0            // $t0 = Hi
  mflo $t1            // $t1 = Lo
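The way mult and div fill Hi and Lo can be modeled as below. This is a sketch under stated assumptions: operands are treated as signed 32-bit values, and division is assumed to truncate toward zero so that the remainder takes the sign of the dividend. Division by zero is not modeled (Python raises ZeroDivisionError here).

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mult(rs, rt):
    """Return (Hi, Lo) for a signed 32x32 -> 64-bit multiply."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & 0xFFFFFFFF


def div(rs, rt):
    """Return (Hi, Lo) = (remainder, quotient) for a signed divide."""
    a, b = to_signed32(rs), to_signed32(rt)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q                      # truncate toward zero
    r = a - q * b
    return r & 0xFFFFFFFF, q & 0xFFFFFFFF


print(mult(0x10000, 0x10000))  # (1, 0): the product 2**32 does not fit in Lo alone
print(div(7, 2))               # (1, 3): remainder in Hi, quotient in Lo
```

After either operation, mfhi and mflo copy the two halves into general-purpose registers.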

3. Branch instructions
  Branch instruction format and examples, with notes (blt, ble, bgt and bge are assembler pseudo-instructions)
  b target                   // unconditional branch to the label target
  beq $t0, $t1, target       // if $t0 == $t1, branch to the label target
  blt $t0, $t1, target       // if $t0 < $t1,  branch to the label target
  ble $t0, $t1, target       // if $t0 <= $t1, branch to the label target
  bgt $t0, $t1, target       // if $t0 > $t1,  branch to the label target
  bge $t0, $t1, target       // if $t0 >= $t1, branch to the label target
  bne $t0, $t1, target       // if $t0 != $t1, branch to the label target

4. Jump instructions
  Instruction format and examples, with notes
  j target          // unconditional jump to the label target
  jr $t3            // jump to the address held in register $t3 (Jump Register)

5. Subfunction call instruction
  Instruction format and examples, with notes
  jal sub_routine_label      // execution steps:
  a. Copy the return address into the $ra register. This is the address of the instruction after the call, to which the PC has already advanced, and it is where execution resumes once the subfunction has finished.
  b. The program jumps to the subroutine label sub_routine_label.
   Note: a subfunction returns with jr $ra
   If the subfunction calls another subfunction, the value of $ra should be saved on the stack, because $ra always holds the return address of the subfunction that is currently executing.




Copyright notice
This article was written by [_ kerneler]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/163/202206121707443752.html