当前位置:网站首页>[compilation principle] overview
[compilation principle] overview
2022-06-25 17:30:00 【No card, no change】
Chapter one summary
1.1 Compiler Overview
1.1.1 Basic concepts
translator : The software that can complete the semantic transformation from one language to another is called a translator , These two languages are called the source language and target language of the translator respectively .
compiler : It's a translator , Its characteristic is that the target language is lower than the source language .
compile : The process of translating a high-level language into assembly language or machine language .
A general high-level language is the source language in the compilation process , Assembly language and machine language are the target languages in the compilation process .
Compiler's position in language processing system :
![[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-bN0q8e7c-1656144887051)(C:\Users\23343\AppData\Roaming\Typora\typora-user-images\image-20220607192229927.png)]](/img/af/246750beb72842e83a19e42270c09e.png)
The preprocessor : Aggregate source programs stored in different files , Convert abbreviated statements called macros into original statements .
Relocatable : The starting position of the assembly code generated by the assembler in memory is not fixed .
loader : Modify relocatable address , Put the modified instructions and data in the proper place in memory .
The linker : Multiple relocatable machine code files ( Include library files ) Connect together , Solve the problem of external memory address .
External address : Code in one file may refer to data objects or procedures in another file , Then the addresses of these data objects and procedures are external addresses to the file .
Machine translation ( compile ) Process understanding :
To translate English sentences by analogy .
In the room, he broke a window with a hammer.
Overall speaking , For such a sentence , If we want to translate it into Chinese , Sentence analysis is required , First understand the meaning of the sentence , Then use Chinese to express the meaning .
First determine the part of speech of each word , such as ,in and with It's preposition ,the and a Is an article ,room、window and hammer It's a noun ,he It's a pronoun .( Corresponds to the lexical analysis phase of the compilation process )
According to the part of speech of each word , You can determine the phrases formed by the combination of different words , such as ,in the room and with a hammer It's a prepositional phrase ,he and a window It's a noun phrase ,broke Is a verb phrase . Divide the sentence components according to different phrases ,he It's the subject ,broke It's the predicate ,a window It's an object ,in the room It's an adverbial ,with a hammer It's a complement .( Corresponds to the parsing phase of the compilation process )
After dividing the sentence components, we can understand the meaning of the sentence .( Corresponds to the semantic analysis phase of the compilation process )
In fact, the process of understanding a sentence can be abstractly understood as presenting the meaning of the sentence in the mind in the form of thinking , This expression corresponds to the intermediate representation of the compilation process .
Finally, we present our understanding of sentences in the form of Chinese language ( The build target language corresponding to the compilation process )
Compiler structure :
![[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-C8SqE27a-1656144887053)(C:\Users\23343\AppData\Roaming\Typora\typora-user-images\image-20220607200221349.png)]](/img/d0/e1deeb06dea0addeff0ffbacf7ee62.png)
- Red Part is called the front end ( Different from software development “ front end ” Concept ): Related to source language .
- Blue Part is called the back end ( Different from software development “ Back end ” Concept ): Related to target language .
- green Partially language independent , Act as a bridge .
Be careful : The phases distinguished above are the logical organization of the compiler , In the process of implementation , Multiple phases may be combined :
- When actually implementing the compiler , The results of semantic analysis are usually directly expressed in the form of intermediate code , Therefore, semantic analysis and intermediate code generation are usually implemented together , It is called semantic translation ;
- Semantic translation while parsing , That is to say, grammar analysis and semantic translation are implemented together , This technique is called grammar guided translation .
Grammar - guided translation uses context - free language (CFG) To guide the translation of language , It is a grammar oriented translation technology .
1.1.2 Lexical analysis stage
Main task : Scan the characters of the source program line by line from left to right , Recognize each word , Determine the type of word , Convert the recognized words into a unified internal representation —— lexical unit ( t o k e n token token) form .
t o k e n : < Kind of other code , Belong to sex value > token:< Species code , Property value > token:< Kind of other code , Belong to sex value >
| Word type | differentiation by type | Species code | |
| 1 | keyword | program、if、else、then、... | One word one code : A keyword corresponds to a category code |
| 2 | identifier | Variable name 、 Array name 、 Record name 、 The process of 、... | Multiple words and one code : There are too many types of user-defined identifiers , Therefore, the identifier is represented by a uniform category code |
| 3 | Constant | integer 、 floating-point 、 Character 、 Boolean type 、... | One size, one type : All constants of the same type are represented by a variety of different codes , Constants of different class types use different codes |
| 4 | Operator | Count (+ - * / ++ --) Relationship (> < == != >= <=) Logic (& ! ~) | One word one code or One size, one type |
| 5 | Boundary sign | ; ( ) = { } ... | One word one code |
Not all t o k e n token token Property values of all save values . such as , keyword while Corresponding t o k e n token token The value of the attribute is not saved , And constant 100 Corresponding t o k e n token token The category code of is i n t e g e r integer integer, The property value is 100 100 100.
“XX analysis phase ” The main task of the description is to provide a complete overview of the corresponding analyzer functions , and “XX analyzer ” The analyzer function described in is just an enumeration of typical functions . Therefore, the functions described by the two do not conflict .
1.1.3 The stage of grammatical analysis
Main task : Output from lexical analyzer t o k e n token token Recognize all kinds of phrases in the sequence , And construct a syntax analysis tree .
The parsing tree describes the grammatical structure of a sentence .
Distinguish parsing tree from syntax tree
Parsing tree and syntax tree are not the same thing . Habitually , We call the former “ Concrete syntax tree ”, It can reflect the process of derivation ; The latter is called “ Abstract syntax tree ”, It does not reflect the process , Only care about the final result .
Definition of parsing tree :
about CFG G The sentence pattern of , An analysis tree is defined as a tree with the following properties :
- The root is marked by the start symbol ;
- Each leaf has a terminator 、 Non terminator or ε Mark ;
- Each internal node is a non terminal ;
- if A A A Is the internal tag of a node , And X 1 X_1 X1、 X 2 X_2 X2、…、 X n X_n Xn Is the mark of all children of the node from left to right . be : A → X 1 X 2 . . . X n A\rightarrow X_1X_2...X_n A→X1X2...Xn It's a generative . if A→ε, Then it is marked with A A A Can have only one node marked ε ε ε The children of . if A → ε A\rightarrow ε A→ε , Then it is marked with A A A Can have only one node marked ε ε ε The children of .
In short , Use production to deduce sentence patterns G The process of is represented in the form of a tree .
Definition of syntax tree :
about CFG G The sentence pattern of , A syntax tree is defined as a tree with the following properties :
- The root and inner nodes are marked by operators in the expression ;
- Leaves are marked by operands in expressions ;
- Parentheses used to change the priority and associativity of operations , Hidden in the structure of the syntax tree .
In short , Leaves are all operands , It's all operators inside , There are no nonterminal characters and no parentheses in the tree .
You can draw analysis tree and syntax tree respectively :
![[ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-2UUBsBL0-1656144887053)(C:\Users\23343\AppData\Roaming\Typora\typora-user-images\image-20220607210109734.png)]](/img/d3/89008fca4ae7649f9584f5c3b06556.png)
so , Each non leaf node of the analysis tree is an expression , The non leaf nodes of the syntax tree are all operators .
1.1.4 Semantic analysis stage
** Main task :** Collect attribute information for identifiers , Include : Species 、 type 、 Storage location 、 length 、 value 、 Scope, etc , For the process ( function ) It also includes parameter information and return value information , Save it in the symbol table ; Also, check for semantic errors , Include : Variables or procedures are used without declaration 、 Variable and procedure names are declared repeatedly 、 Operation component type mismatch 、 Type mismatch between operator and operand .
Symbol table is a data structure used to store attribute information of identifier .
The language symbols in the symbol table can be divided into Keyword symbols , Operator symbol And Identifier symbol .
The attribute items generally set for identifiers in symbol tables are :
- Symbol name
- Type of symbol
- Storage category of symbols
- Scope and visibility of symbols
- Storage allocation information of symbolic variables
- Other properties of symbols
The function of symbol table :
- Collect symbol attributes ( Lexical analysis )
- The basis for checking the legitimacy of context semantics ( Syntax analysis )
- As the basis for address allocation in the target code generation stage ( Semantic analysis )
1.1.5 Intermediate code generation phase
The commonly used intermediate form is three address code . Three address codes are composed of a sequence of instructions similar to assembly language. Each instruction has at most three operands .
The intermediate representation must have two properties : Easy to generate and translate into target programs .
1.1.6 Code optimization phase
Equivalent program transformation for improving code , Make it run faster 、 Less space , Or both .
Code optimization is divided into machine independent code optimization and machine related code optimization , The former is optimized at the intermediate code level , The latter is optimized at the object code level .
1.1.7 Code generation phase
Object code generation takes the intermediate representation of the source program as input , And map it to the target language .
An important task of object code generation is Allocate registers reasonably for variables used in the program .
边栏推荐
- Mobx学习之路----强大的“状态管理工具”
- STM32 hardware error hardfault_ Handler processing method
- 好胖子带你学Flink系列-Flink源码剖析第一集Standalone启动脚本分析
- Win10开启热点共享后断网怎么解决?
- 杰理之系统时钟设置出现复位或无效问题【篇】
- 【UVM实战 ===> Episode_2 】~ VIP、VIP的开发、VIP的发布
- Distinguishing seven kinds of facial expressions by deep separable convolution neural network
- 匯編語言(5)寄存器(內存訪問)
- 杰理之增加加密文件播放功能【篇】
- 旧手机变废为宝,充当服务器使用
猜你喜欢
随机推荐
杰理之定时器捕获(timer_cap.c)使用注意事项【篇】
启牛蜻蜓点金下载是可以开户吗?开户安全吗
Mathematical modeling -- integer programming
Comprehensive optimization of the six topics, Alibaba performance optimization booklet open source, leading you to the ultimate performance
Old mobile phones turn waste into treasure and serve as servers
"Podcast with relish" 386 Yuan Tang Hua Yuan Shi: who is not a "Mr. White character"?
【Matlab】曲线拟合
杰理之定时器使用注意事项【篇】
FreeRTOS内核时钟不对的问题解决
cgi通过odbc连接数据库
n-queens problem
Redis distributed lock collation
软考中的嵌入式系统设计师为什么考的人少?
国泰君安证券靠谱吗?是否合法?开股票账户安全吗?
Treasure and niche Chinese painting 3D texture material website sharing
Sword finger offer II 010 Subarray prefix sum difference with sum K
【Matlab】数值微积分与方程求解
tasklet api使用
Website arrangement of super all metal PBR multi-channel mapping materials
Can I open an account? Is it safe to open an account



![[UVM practice== > episode_1] ~ MCDF design update, AMBA standard interface, UVM verification environment update](/img/22/0c13e98e634a99d1680dd4bb12eaef.png)




