Compiler Principles Study Notes 1 (Overview of Compiler Principles and Lexical Analysis)
2022-07-28 17:55:00 【jsBeSelf】
1 Overview
1.1 The compilation process
The compilation process has five stages, as shown in Figure 1 (image source: Principles of Compilation, 2nd edition):
lexical analysis, syntax analysis, semantic analysis + intermediate code generation, optimization, and object code generation. These notes cover only the first two stages; the last three are not studied in depth.
The five stages are very similar to translating English, which makes them easy to memorize by analogy: lexical analysis -> recognizing English words, syntax analysis -> analyzing sentence structure, semantic analysis + intermediate code generation -> preliminary translation, optimization -> polishing the translation, object code generation -> writing the final translation.
The output of each stage is the input of the next. The whole compilation process can also be divided into two parts, namely the analysis part (stages 1-3) and the synthesis part (stages 4-5), also called the front end and the back end; these notes cover only the front end.
1.2 Tools and other processing
Besides the five stages, there are also some analysis tools:
1) Lexical analysis: e.g. finite automata, used to describe word-formation rules;
2) Syntax analysis: e.g. context-free grammars, used to describe grammar rules.
In addition, symbol-table management and error handling run through the whole compilation process.
2 Lexical analysis
2.1 Main work
Lexical analysis here has two meanings:
1) specifying the rules by which words are formed, i.e. the word-formation rules;
2) recognizing the input sequence according to those rules, i.e. lexical analysis proper.
The lexical analyzer is the only part of the compiler that deals directly with the source program.
Its main tasks are:
1) filtering out comments, spaces and other useless parts of the source program;
2) recognizing tokens and handing them to the parser;
3) calling the symbol-table manager or the error handler (to handle lexical errors).
In short, the lexical analyzer recognizes each token in the source program, forms a token stream, and passes it to the parser.
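As an illustration, here is a minimal sketch of such a lexical analyzer in Python, assuming a toy language with only identifiers, integers, a few operators and `#` line comments (the token categories and names are my own choices for the example, not from the text):

```python
import re

# Token patterns for a toy language (assumed for illustration only).
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),            # integer literals
    ("IDENT",   r"[A-Za-z_]\w*"),   # identifiers
    ("OP",      r"[+\-*/=]"),       # operators
    ("SKIP",    r"[ \t]+|#[^\n]*"), # spaces and '#' comments: filtered out
    ("NEWLINE", r"\n"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Scan the source text and yield (kind, lexeme) pairs: the token stream."""
    for m in TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind in ("SKIP", "NEWLINE"):   # comments and whitespace are dropped
            continue
        yield kind, m.group()

# Example: produces IDENT/OP/NUMBER tokens and drops the comment.
print(list(tokenize("count = count + 1  # increment")))
```

In a real compiler the token stream produced by `tokenize` would be consumed by the parser rather than printed.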
2.2 Strings
The operations on sets of strings are: union, intersection, concatenation, difference, closure, positive closure, etc.
Sometimes a set of strings cannot be enumerated but follows a certain pattern; it can then be described by a regular expression, which denotes a whole class of strings (a regular set).
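A small worked illustration of these set operations in Python (the sets and the bound on the closure are made up for the example; a true closure is infinite, so only strings built from at most three pieces are generated):

```python
A = {"a", "ab"}
B = {"ab", "b"}

print(A | B)   # union:        {'a', 'ab', 'b'}
print(A & B)   # intersection: {'ab'}
print(A - B)   # difference:   {'a'}

# Concatenation AB = every string of A followed by every string of B.
concat = {x + y for x in A for y in B}
print(concat)  # {'aab', 'ab', 'abab', 'abb'}

# Closure A* is infinite; here we only enumerate A^0 .. A^3.
closure, layer = {""}, {""}
for _ in range(3):
    layer = {x + y for x in layer for y in A}
    closure |= layer
# Positive closure A+ is the same without the empty string.
positive = closure - {""}
```

A regular expression such as letter(letter|digit)* describes exactly this kind of non-enumerable set: here, the infinite set of all identifiers.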
2.3 Token recognition
Tokens can be recognized by finite automata.
An NFA (nondeterministic finite automaton) is a quintuple M = (S, Σ, move, s0, F), whose components are: the finite set of states, the finite input alphabet, the state-transition function, the unique initial state, and the set of final states.
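As a concrete (purely illustrative) representation, the quintuple can be written down directly in Python; the NFA below, which recognizes strings over {a, b} ending in "ab", is my own example and is not taken from the book's figure:

```python
# M = (S, Sigma, move, s0, F) written out explicitly.
S     = {0, 1, 2}                 # finite set of states
Sigma = {"a", "b"}                # finite input alphabet
move  = {                         # transition function: (state, char) -> set of next states
    (0, "a"): {0, 1},             # nondeterministic: two possible next states
    (0, "b"): {0},
    (1, "b"): {2},
}
s0 = 0                            # unique initial state
F  = {2}                          # set of final states
```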
A finite automaton can be described in three ways: a textual (formal) definition, a transition diagram, or a transition matrix, as shown in the figure below (source: Principles of Compilation, 2nd edition).


An NFA is characterized by nondeterminism: for the same state and input character there may be more than one possible next state, i.e. the move function is multi-valued.
The corresponding DFA (deterministic finite automaton) is a special case of the NFA in which the transition diagram has no ε-labeled edges, and for each state s and each character a there is at most one next state, i.e. the move function is single-valued.
Recognizing an input sequence with a DFA requires no backtracking, which improves recognition efficiency, so a DFA is preferred whenever possible.
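A minimal sketch of running a DFA, assuming the single-valued move function is stored as a dictionary keyed by (state, character); the particular DFA below, again recognizing strings ending in "ab", is my own example:

```python
# Deterministic transition function: each (state, char) has at most one next state.
dfa_move = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
start, finals = 0, {2}

def accepts(word):
    """Run the DFA over the word; no backtracking is ever needed."""
    state = start
    for ch in word:
        state = dfa_move.get((state, ch))
        if state is None:          # no transition defined: reject immediately
            return False
    return state in finals

print(accepts("aab"))   # True  ('aab' ends in 'ab')
print(accepts("aba"))   # False
```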
If two finite automata recognize the same regular set, they are equivalent; this lays the groundwork for converting an NFA into a DFA.
2.4 From regular expressions to a lexical analyzer (key point)
The whole process is:
1) describe the patterns with regular expressions;
2) construct an NFA for each regular expression;
3) convert the constructed NFA into an equivalent DFA;
4) optimize the DFA, i.e. minimize it;
5) build the lexical analyzer from the minimal DFA (write the code).
Step 1) has no fixed method; it is a matter of finding the patterns.
Step 2), from a regular expression to an NFA, can be done in two ways: the Thompson algorithm, or decomposition.
Thompson algorithm (source: Principles of Compilation, 2nd edition):
The decomposition method is relatively simple, as shown below (drawn by myself):
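Here is a minimal sketch of Thompson's construction in Python, assuming the regular expression has already been converted to postfix form with '.' for concatenation, '|' for union and '*' for closure (the representation and helper names are my own, not the book's):

```python
# Label used for epsilon edges.
EPSILON = None

class State:
    """An NFA state: a list of (label, target) edges."""
    def __init__(self):
        self.edges = []
    def add_edge(self, label, target):
        self.edges.append((label, target))

def thompson(postfix):
    """Build an NFA fragment (start, accept) for a postfix regular expression."""
    stack = []
    for ch in postfix:
        if ch == ".":                      # concatenation: accept1 --eps--> start2
            s2, a2 = stack.pop()
            s1, a1 = stack.pop()
            a1.add_edge(EPSILON, s2)
            stack.append((s1, a2))
        elif ch == "|":                    # union: new start/accept around both fragments
            s2, a2 = stack.pop()
            s1, a1 = stack.pop()
            start, accept = State(), State()
            start.add_edge(EPSILON, s1)
            start.add_edge(EPSILON, s2)
            a1.add_edge(EPSILON, accept)
            a2.add_edge(EPSILON, accept)
            stack.append((start, accept))
        elif ch == "*":                    # closure: skip edge plus loop back
            s1, a1 = stack.pop()
            start, accept = State(), State()
            start.add_edge(EPSILON, s1)
            start.add_edge(EPSILON, accept)
            a1.add_edge(EPSILON, s1)
            a1.add_edge(EPSILON, accept)
            stack.append((start, accept))
        else:                              # single character: two states, one labelled edge
            start, accept = State(), State()
            start.add_edge(ch, accept)
            stack.append((start, accept))
    return stack.pop()

# Example: (a|b)*abb written in this postfix notation is "ab|*a.b.b."
start, accept = thompson("ab|*a.b.b.")
```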
Step 3), from NFA to DFA.
The basic idea is to make the nondeterministic next states deterministic. The method is the subset construction, shown in the figure below (source: Principles of Compilation, 2nd edition):
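A minimal sketch of the subset construction, assuming the NFA is stored as a dictionary nfa[state][char] -> set of next states, with the empty string used as the label of ε-edges (this representation is my own choice for illustration):

```python
EPS = ""  # label used for epsilon edges in this sketch

def eps_closure(nfa, states):
    """All NFA states reachable from `states` using only epsilon edges."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def move(nfa, states, a):
    """All NFA states reachable from `states` on one input character a."""
    result = set()
    for s in states:
        result |= nfa.get(s, {}).get(a, set())
    return result

def subset_construction(nfa, start, alphabet):
    """Each DFA state is a frozenset of NFA states; returns (start state, transition table)."""
    d_start = eps_closure(nfa, {start})
    dfa, worklist = {}, [d_start]
    while worklist:
        T = worklist.pop()
        if T in dfa:
            continue
        dfa[T] = {}
        for a in alphabet:
            U = eps_closure(nfa, move(nfa, T, a))
            if U:
                dfa[T][a] = U
                worklist.append(U)
    return d_start, dfa
```

A DFA state built this way is final exactly when its frozenset contains at least one NFA final state.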
Step 4), DFA minimization.
Intuition: if a DFA is reversed, the result may be an NFA, which indicates that some of its states are redundant; if that reversed NFA is then determinized into a DFA again (and the same is done in the forward direction), what remains is the minimal DFA.
In practice: first split the states into two sets, all non-final states and all final states; then, for each set, check where each of its elements transitions to on every input character. If every element of a set transitions into the same set, there is no need to split it further; otherwise, separate out the elements that behave differently. Repeat until no set can be split any more.
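A minimal sketch of this partition-refinement procedure, assuming the DFA is stored as a dictionary dfa[state][char] -> next state and `finals` is the set of final states (names and representation are my own):

```python
def minimize(dfa, states, finals, alphabet):
    """Return the groups of equivalent DFA states (Moore-style partition refinement)."""
    # Initial split: all non-final states in one group, all final states in another.
    partition = [g for g in (set(states) - set(finals), set(finals)) if g]

    changed = True
    while changed:
        changed = False
        new_partition = []
        for group in partition:
            # States stay together only if, for every character, they move
            # into the same group of the current partition.
            buckets = {}
            for s in group:
                key = tuple(
                    next((i for i, g in enumerate(partition)
                          if dfa.get(s, {}).get(a) in g), None)
                    for a in alphabet
                )
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
            if len(buckets) > 1:
                changed = True
        partition = new_partition
    return partition
```

Each resulting group then becomes a single state of the minimal DFA.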
Miscellaneous
1) How to judge whether two sets are equal: show that every element of set A belongs to set B, and at the same time every element of set B belongs to set A.
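In Python this mutual-inclusion check is just two subset tests (the sets below are made up for the example):

```python
A = {"a", "ab", "abb"}
B = {"abb", "ab", "a"}

# A == B exactly when A is a subset of B and B is a subset of A.
assert A <= B and B <= A
assert A == B  # equivalent built-in check
```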