Compiler introduction
2022-07-25 22:31:00 【You came back to your home】
1 Language processors
In short, a compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language); see Figure 1-1. An important task of the compiler is to report any errors it detects in the source program during translation.
If the target program is an executable machine-language program, the user can then call it to process input and produce output; see Figure 1-2.

An interpreter is another common kind of language processor. It does not produce a target program through translation. Instead, from the user's point of view, it directly executes the operations specified in the source program on the input supplied by the user; see Figure 1-3.
The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
Java language processors combine compilation and interpretation, as shown in Figure 1-4.
A Java source program is first compiled into an intermediate form called bytecode. A virtual machine then interprets the bytecode. One benefit of this arrangement is that bytecode compiled on one machine can be interpreted on another machine, and it can be moved between machines over a network.
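The compile-then-interpret arrangement is not specific to Java. As a rough analogy only, the CPython implementation of Python also compiles source text to bytecode, which its virtual machine then interprets. The sketch below uses the built-in `compile` and `exec` functions and the standard `dis` module; the variable names and the statement being compiled are illustrative assumptions.

```python
import dis

source = "result = rate * 60 + initial"

# Step 1: compile the source text into a code object that holds bytecode.
code = compile(source, "<example>", "exec")

# Step 2: the virtual machine interprets that bytecode against some input.
namespace = {"rate": 2, "initial": 10}
exec(code, namespace)
print(namespace["result"])  # 130

# The bytecode instructions can be inspected; the exact listing
# differs between Python versions.
dis.dis(code)
```

The code object plays the role of the intermediate program: it can be produced once and then executed with different inputs.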
To process inputs to outputs faster, some Java compilers, called just-in-time (JIT) compilers, translate the bytecode into machine language immediately before running the intermediate program on the input.
As Figure 1-5 shows, several other programs besides the compiler may be needed to create an executable target program.
A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes handed to a separate program called a preprocessor. The preprocessor may also expand shorthands, called macros, into source-language statements.
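Macro expansion can be illustrated with a toy preprocessor. This is a minimal sketch, assuming a hypothetical `#define NAME replacement` directive and plain textual substitution of whole identifiers; it is not a model of any real preprocessor.

```python
import re

def preprocess(lines):
    """Expand simple object-like macros (toy model, hypothetical syntax)."""
    macros = {}
    output = []

    def expand(match):
        # Replace an identifier only if a macro with that name exists.
        word = match.group(0)
        return macros.get(word, word)

    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "#define":
            # Record the macro and drop the directive from the output.
            macros[parts[1]] = parts[2]
            continue
        output.append(re.sub(r"[A-Za-z_]\w*", expand, line))
    return output

program = [
    "#define SECONDS 60",
    "position = initial + rate * SECONDS",
]
print(preprocess(program))  # ['position = initial + rate * 60']
```

The output contains only source-language statements, which is what the compiler proper receives.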
The preprocessed source program is then fed to a compiler. The compiler may produce an assembly-language program as its output, because assembly language is easier to produce as output and easier to debug. The assembly-language program is then processed by a program called an assembler, which produces relocatable machine code.
Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files to form the code that actually runs on the machine. The code in one file may refer to a location in another file; the linker resolves these external memory addresses. Finally, the loader puts all of the executable object files into memory for execution.
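How a linker resolves external addresses can be sketched with a toy model. The object-file layout below (a dictionary with `name`, `size`, `defs` and `refs` keys) is invented for illustration and does not correspond to any real object-file format.

```python
def link(objects):
    """Lay out toy object files one after another and resolve symbol references."""
    # Pass 1: assign each object file a base address in the final image.
    base = {}
    address = 0
    for obj in objects:
        base[obj["name"]] = address
        address += obj["size"]

    # Pass 2: turn file-relative symbol offsets into absolute addresses.
    symbols = {}
    for obj in objects:
        for symbol, offset in obj["defs"].items():
            if symbol in symbols:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbols[symbol] = base[obj["name"]] + offset

    # Pass 3: resolve every external reference against the symbol map.
    resolved = {}
    for obj in objects:
        for symbol in obj["refs"]:
            if symbol not in symbols:
                raise KeyError(f"undefined symbol: {symbol}")
            resolved[(obj["name"], symbol)] = symbols[symbol]
    return symbols, resolved

main_o = {"name": "main.o", "size": 32, "defs": {"main": 0}, "refs": ["square"]}
math_o = {"name": "math.o", "size": 16, "defs": {"square": 4}, "refs": []}

symbols, resolved = link([main_o, math_o])
print(symbols)   # {'main': 0, 'square': 36}
print(resolved)  # {('main.o', 'square'): 36}
```

Because each file is compiled separately, `main.o` cannot know where `square` will end up; only the linker, which sees every file, can fill in that address.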
To be added 18 Exercises
2 The structure of a compiler
Up to this point we have treated a compiler as a black box that maps a source program into a semantically equivalent target program. If we open the box a little, we see that the mapping has two parts: analysis and synthesis.
The analysis part breaks the source program into its constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program. If the analysis part detects that the source program is syntactically ill formed or semantically inconsistent, it must provide informative messages so that the user can correct the program. The analysis part also collects information about the source program and stores it in a data structure called a symbol table. The symbol table is passed along with the intermediate representation to the synthesis part.
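A symbol table can be as simple as a list of records indexed by name. The sketch below is one possible layout, not a prescribed design; the class name, the 1-based entry numbers and the `type` attribute are assumptions made for illustration.

```python
class SymbolTable:
    """Toy symbol table: one record per distinct name, numbered from 1."""

    def __init__(self):
        self._index = {}    # name -> entry number
        self._entries = []  # entry records, in order of first appearance

    def insert(self, name, **attributes):
        """Add a name, or update its attributes if it already exists; return its entry number."""
        if name not in self._index:
            self._entries.append({"name": name})
            self._index[name] = len(self._entries)
        self._entries[self._index[name] - 1].update(attributes)
        return self._index[name]

    def lookup(self, entry_number):
        """Return the record stored under the given entry number."""
        return self._entries[entry_number - 1]

table = SymbolTable()
print(table.insert("position"))            # 1
print(table.insert("initial"))             # 2
print(table.insert("rate"))                # 3
print(table.insert("rate", type="float"))  # 3 (existing entry, attribute added)
print(table.lookup(3))                     # {'name': 'rate', 'type': 'float'}
```

Later parts of the compiler can add attributes to an entry without changing its number, so references to the entry stay valid.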
The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table. The analysis part is often called the front end of the compiler; the synthesis part is called the back end.
If we examine the compilation process in more detail, we see that it operates as a sequence of steps (phases), each of which transforms one representation of the source program into another. A typical decomposition of a compiler into phases is shown in Figure 1-6. In practice, several phases may be grouped together, and the intermediate representations between the grouped phases need not be constructed explicitly. The symbol table, which stores information about the entire source program, is used by all phases of the compiler.
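The idea that each phase turns one representation into the next can be sketched as a chain of functions. The three phases and the tiny addition-only language below are assumptions chosen to keep the example short; a real compiler has more phases and validates its input.

```python
def lexical_analysis(source):
    """Characters -> list of lexemes (here simply split on whitespace)."""
    return source.split()

def syntax_analysis(tokens):
    """Lexemes -> left-associative tree of additions; operators are not validated."""
    tree = tokens[0]
    for operand in tokens[2::2]:
        tree = ("+", tree, operand)
    return tree

def code_generation(tree):
    """Tree -> instructions for a hypothetical stack machine."""
    if isinstance(tree, str):
        return [f"PUSH {tree}"]
    _, left, right = tree
    return code_generation(left) + code_generation(right) + ["ADD"]

PHASES = [lexical_analysis, syntax_analysis, code_generation]

def compile_source(source):
    """Feed the output of each phase into the next one."""
    representation = source
    for phase in PHASES:
        representation = phase(representation)
    return representation

print(compile_source("a + b + c"))
# ['PUSH a', 'PUSH b', 'ADD', 'PUSH c', 'ADD']
```

Grouping phases corresponds to merging two of these functions so that the value passed between them is never built as a separate object.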

Some compilers have a machine-independent optimization phase between the front end and the back end. Its purpose is to transform the intermediate representation so that the back end can produce a better target program than it would produce from an unoptimized intermediate representation. Because optimization is optional, the two optimization phases shown in Figure 1-6 may be omitted.
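Constant folding is one common example of a machine-independent transformation on an intermediate representation. The sketch below assumes a hypothetical tree-shaped representation in which an operation is a tuple `(operator, left, right)`, a variable is a string and a constant is a number.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(node):
    """Replace every operation on two constants with its computed value."""
    if not isinstance(node, tuple):
        return node  # a variable name or a constant: nothing to fold
    op, left, right = node
    left = fold_constants(left)
    right = fold_constants(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPERATIONS[op](left, right)
    return (op, left, right)

tree = ("+", "initial", ("*", "rate", ("*", 6, 10)))
print(fold_constants(tree))  # ('+', 'initial', ('*', 'rate', 60))
```

The transformation does not depend on the target machine, yet the back end now has one multiplication fewer to generate code for.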
2.1 Lexical analysis
The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads the stream of characters that makes up the source program and groups them into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as output a token of the form:
< token-name, attribute-value >
The token is passed on to the next phase, syntax analysis (parsing). In the token, the first component, token-name, is an abstract symbol used during syntax analysis, and the second component, attribute-value, points to the entry for this token in the symbol table. Information from the symbol-table entry is needed for semantic analysis and code generation.
For example, suppose a source program contains the following assignment statement:
position = initial + rate * 60 (1.1)
The characters in this assignment statement can be grouped into lexemes and mapped into tokens, which are then passed on to the syntax analysis phase.
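The grouping of statement (1.1) into tokens of the form <token-name, attribute-value> can be sketched with a small scanner. The token names `id` and `number`, the use of `None` for operators that need no attribute, and the 1-based symbol-table positions are assumptions made for this example.

```python
import re

# One alternative per kind of lexeme; leading whitespace is skipped.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+\-*/]))"
)

def tokenize(source):
    """Return (tokens, symbol_table) for a single statement."""
    symbol_table = []  # identifier names, in order of first appearance
    tokens = []
    source = source.rstrip()
    position = 0
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"cannot tokenize input at position {position}")
        position = match.end()
        if match.lastgroup == "id":
            name = match.group("id")
            if name not in symbol_table:
                symbol_table.append(name)
            # The attribute value is the identifier's symbol-table entry number.
            tokens.append(("id", symbol_table.index(name) + 1))
        elif match.lastgroup == "number":
            tokens.append(("number", int(match.group("number"))))
        else:
            tokens.append((match.group("op"), None))
    return tokens, symbol_table

tokens, symbol_table = tokenize("position = initial + rate * 60")
print(tokens)
# [('id', 1), ('=', None), ('id', 2), ('+', None), ('id', 3), ('*', None), ('number', 60)]
print(symbol_table)
# ['position', 'initial', 'rate']
```

The blanks between lexemes are discarded by the scanner and never reach the parser.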
