当前位置:网站首页>I Regular expression to finite state automata: regular expression to NFA
I Regular expression to finite state automata: regular expression to NFA
2022-06-12 09:19:00 【symop】
original text :https://study.163.com/course/courseMain.htm?courseId=1002830012
One . Classification of finite state automata
Finite state automata , In fact, it can be divided into two categories . The first category is the one we gave above , It is called deterministic finite state automata : Deterministic finite automaton abbreviation DFA. A deterministic state machine has one feature , Given the current status and input characters , Then the next state can be uniquely determined . For example, based on the above figure , In state 1 when , Received character 0-9, The next state must be 1, If a character is received . , Then the next state , It must be 2. More seriously , DFA It is such an automaton , Every edge out of a given state corresponds to a certain character , meanwhile , Two sides out of one state , Their corresponding characters must be different .
Corresponding to DFA, Another state machine is called non deterministic finite state machine : Nondeterministic finite automaton, namely NFA. In practice , To successfully convert regular expressions to automata , need NFA With the help of the .NFA Is characterized by , Two sides out of one state , Can have the same corresponding characters . Or its edge can correspond to a special character called ” empty ” character , The symbol corresponding to this character is : ℇ. This edge representation , No input required , You can move from the current state to the next state .
for instance , expression (and | any) , Its corresponding DFA as follows :
From the beginning , Split into two sides , The characters corresponding to the two sides are the same 
perhaps :
From the initial state, divide two edges corresponding to empty characters , Then enter two corresponding state machines respectively .
The second kind NFA It is easy to realize in program design , therefore , In the next section of the code , We will use the second NFA Implementation pattern .
NFA One obvious weakness is , In code design , It's hard to represent it in a data structure . especially , When corresponding to an input character ,NFA You can jump to multiple states , that , If you want to make use of NFA It is difficult to recognize the input string . generally speaking , Use NFA All procedures need to go through the next two steps : Convert a regular expression to NFA, take NFA Convert to DFA. In the following discussion , We will show these two transformations in code .
Thompson Construction :
Convert a regular expression to NFA The algorithm was developed by Bell Laboratories Ken Thompson Given , This guy and Dennis . Richie co developed Unix, And he developed C The forerunner of language B Language .
Algorithm is as follows :
The simplest regular expression is single character matching , for example a Match input characters ”a”, Then the expression's NFA The structure is as follows :
that , A join expression composed of two such regular expressions ab It can be expressed as follows :
actually , It constructs two expressions respectively NFA, Then pass a ℇ edge , Put two NFA End to end .
So let's see , Two expressions do OR During operation | ,NFA How to construct , The structure diagram is as follows :
To construct the or operation of two expressions : exp1 | exp2, According to the diagram , First, construct two expressions respectively exp1 , exp2 Respective NFA: NFA1( Top dashed box ), NFA2( Dotted box at the bottom ), Then construct two states , The initial state ( Beginning circle node ), And end status ( End circle node ), There are two lines where the initial state is extended ℇ edge , Point to respectively NFA1 and NFA2 The beginning of , then NFA1 and NFA2 At the end of each of them ℇ edge , They all point to the end state .
Let's see a | b Of NFA chart :
The principle is the same as that described above . The dashed box above is the expression a Of NFA, The lower dashed box is the expression b Of NFA. Two NFA The connection of is exactly the same as that described above
If the expression is ( (a|b) | cd) Well , The same goes for algorithms , First construct a | b Of NFA chart , And then construct cd Of NFA chart . Finally, according to the method mentioned above , Two more NFA Connect :
The big dashed box above is (a|d) Of NFA, The dotted line of the long plaque at the bottom is cd Of NFA. Then, two state nodes and ε The edges are connected .
You can see , Thompson Constructing an algorithm is actually a self recursive process
Let's take a look at the construction process of the corresponding closure operation :
exp* Of NFA:
If it is self - Recovery 0 Time , Then go directly from the lower side to the end node .
exp+( Repeat at least once ) Of NFA:
exp? ( repeat 0 or 1 Time ) Of NFA:
Any complex regular expression has its NFA The structure of is a combination of the above structures , For example, an expression
(D*.D | D.D*)
The construction algorithm is as follows :
1. structure D Of NFA:
2. structure D*( It corresponds to the figure on the left ):
3. structure D*.D ( because . Special characters in regular expressions , If you just want to express its symbolic content , To escape, add a backslash to the front ):
. The first part of the number is D*, The latter part is D Of NFA.
4. structure D.D*, Of the expression NFA In fact, the above figure . Move the back part to the beginning .
5. according to OR The construction of , Construct the entire expression (D*.D | D.D*) Of NFA:
Above is D*.D Of NFA, Below is D.D* Of NFA
No matter how complex the expression is NFA Construction , It is a repeated combination of several basic structures .
边栏推荐
- Occupied occupied occupied occupied occupied
- Jupyter notebook sets the default browser to open with an error syntaxerror: (Unicode error) 'UTF-8' codec can't decode byte 0xd4
- Swagger documentation details
- Introduction to applet cloud development -- questionnaire evaluation applet practice (7)
- (JS) three digits are separated by commas, and two decimal places are reserved (or rounded)
- (十五) TweenRunner
- Share the basic knowledge of software testing and write something you don't know
- Minimum transfer times
- Hotspot Metaspace
- Microservice gateway
猜你喜欢
功能测试面试常见的技术问题,总结好了你不看?

Multi table operation instance

Do you know the meaning behind these questions?

Technology cloud report: how will the industrial Internet rebuild the security boundary in 2022?

Xshell startup encountered "unable to continue code execution because mfc110.dll cannot be found"

Hotspot Metaspace

2022 melting welding and thermal cutting test questions and answers
APP测试面试题汇总,面试必考一定要看
软件测试基础知识分享,写点你不知道的
数据库常见面试题都给你准备好了
随机推荐
EIP-1559
(js)三位用逗号隔开,保留两位小数(or 四舍五入取整)
L1-019 who goes first
Exists usage in SQL
(十三)文本渲染Text
Database common interview questions are ready for you
Top command meaning
MySQL - Import / export operation
Distributed transactions - Theoretical Overview
Quick sort
Offer:[day 8 dynamic planning (simple)] --- > maximum profit of stock
剑指 Offer II 016. 不含重复字符的最长子字符串-滑动窗口
Grab screen and ground glass effect
L1-002 print Hourglass (20 points)
Financial test interview questions to help you get the offer
Four steps for sending rockertmq producer messages
解决当打开Unity时 提示项目已经打开,而自己之前并没有打开过(可能之前异常关闭)的问题
Distributed transaction solution 2: message queue to achieve final consistency
Source code and scheme for target recognition, detection and 6D attitude estimation (the most advanced method and data set)
Leetcode689 wrong greedy thinking