当前位置:网站首页>Write a jison parser from scratch (4/10): detailed explanation of the syntax format of the jison parser generator
Write a jison parser from scratch (4/10): detailed explanation of the syntax format of the jison parser generator
2022-07-04 09:28:00 【Hu Zhenghui】
Write one from scratch Jison Parser (4/10):Jison Parser generator syntax format details
Write one from scratch
JisonParser (4/10):JisonParser generator syntax format details
List of articles
- Write one from scratch Jison Parser (4/10):Jison Parser generator syntax format details
- Parser parsed by line
- `Jison` Syntax structure of parser generator
- `Jison` Lexical analysis in grammar `lexical analysis` part
- `Jison` Semantic analysis in grammar `semantic analyse` part
- Contains multiple rules `rule` Grammatical rules `grammar rules`
- Recursive rule `recursive rule`
- `Jison` Summary of grammatical structure
Parser parsed by line
The parser parsed by line has been successfully run previously , This article takes this as an example to explain in detail Jison Syntax format of parser generator .
%lex
%%
[^\n\r]+ {
return 'LINE'; }
[\n\r]+ {
return 'EOL'; }
<<EOF>> {
return 'EOF'; }
/lex
%%
p
: ll EOF
{
console.log($1); }
;
ll
: ll l
{
$$ = $1 + $2; }
| l
{
$$ = $1; }
;
l
: LINE EOL
{
$$ = $1 + $2; }
;
Jison Syntax structure of parser generator
Jison Is a widely used parser generator Bison Of JavaScript transplant , except Bison The semantics of static types in and are specific to C Outside the concept of language ,Jison and Bison The concepts in are basically the same .
although Jison Its name is Bison The first letter of B Change it to JavaScript Of J, However, it is not a simple transplant , It also includes Bison And used together Flex The function of .
Use Flex and Bison The process of parsing includes two parts : Marker interpretation tokenizing And parsing parsing, Marker interpretation tokenizing Use Flex, This process is also called lexical analysis lexical analysis, and Bison Responsible for parsing parsing, This process is also called semantic analysis semantic analyse.
Describe parsing parsing The way is called formal grammar formal grammar, Follow formal grammar formal grammar The language of is called formal language formal language, Formal grammar formal grammar According to Chomsky pedigree, there are four kinds :
- Unrestricted grammar
unrestricted grammar - Context sensitive grammar
context-sensitive grammar - Context-free grammar
context-free grammar - Regular grammar
regular grammar
Jison Syntactic analysis of parsing And Bison Context free grammar is also used context-free grammar.
Jison The grammar file of can analyze the morphology lexical analysis And semantic analysis semantic analyse Divided into two files , You can also include these two parts in one file . When using a file , Use %lex and /lex Mark distinguishing lexical analysis lexical analysis Part and semantic analysis semantic analyse part , %lex and /lex In the tag lex That is to say lexical Abbreviation .
Refer to the previous grammar , from %lex Start to /lex It's lexical analysis lexical analysis,/lex To the end is parsing parsing.
Jison Lexical analysis in grammar lexical analysis part
Lexical analysis lexical analysis Use in %% In two parts ,%% Before that, lexical analysis lexical analysis The definition of definitions part ,%% Then lexical analysis lexical analysis The rules of rules part , In this example, only rules rules part , Definition definitions There is no content in this part . The rules rules The section contains a series of rules in the following form rules
pattern action
among pattern For this rule rules Matching patterns ,action Action performed when matching patterns .pattern Based on regular expressions , And add some extended functions . Lexical analysis in this example lexical analysis The first two rules of rules The pattern in [^\n\r]+ and [\n\r]+ Regular expression .
[^\n\r]+ { return 'LINE'; }
[\n\r]+ { return 'EOL'; }
Lexical analysis in this example lexical analysis The third rule of rules Medium <<EOF>> That is, the pattern at the end of the matching file .
<<EOF>> { return 'EOF'; }
Three rules in this example rules Action in action Are all direct return tags token The name of , Respectively LINE、EOL、EOF.Jison Lexical analysis in lexical analysis The action of action Use JavaScript grammar .
Jison Semantic analysis in grammar semantic analyse part
Semantic analysis semantic analyse Also used in %% In two parts ,%% Before that, semantic analysis semantic analyse Statement of declarations part ,%% Then semantic analysis semantic analyse Grammatical rules grammar rules part , In this example, only syntax rules are included grammar rules part , Statement declarations There is no content in this part .
Semantic analysis semantic analyse Grammar rules in grammar rules Compare lexical analysis lexical analysis The rules in the rules More complicated , The form of foundation is
result : components...
;
among components For grammar rules grammar rules Component part ,result To match grammar rules grammar rules Result . The third syntax rule in this example grammar rules This is the form .
l
: LINE EOL
;
among LINE EOL Indicates the syntax rule grammar rules You need to match the two components LINE and EOL, When matching, you can get the result l, Or say LINE and EOL Combine into l.
LINE and EOL The space in the middle is used to split LINE and EOL, Does not participate in matching , Therefore, you can add spaces as needed .
Rule of grammar grammar rules You can also set the action to be performed when matching action, action action Put it in the component component after , The third syntax rule in this example grammar rules The action of action by { $$ = $1 + $2; }.
l
: LINE EOL
{
$$ = $1 + $2; }
;
Jison Grammar rules in grammar rules The action of action Use JavaScript grammar . In this case $1 The value of the variable is match LINE Value , That is, lexical analysis lexical analysis Back in LINE The pattern in the rule pattern Defined [^\n\r]+ Matching string . In this case $2 The value of the variable is match EOL Value , That is, lexical analysis lexical analysis Back in EOL The pattern in the rule pattern Defined [\n\r]+ Matching string . Assign a value to a variable $$ Namely LINE and EOL Combine into l Value , In this case $1 and $2 String simple connection , In practice, it can be handled on demand , There are more complex examples in the following articles .
Contains multiple rules rule Grammatical rules grammar rules
Rule of grammar grammar rules You can also match multiple situations to get the same result , Pipe symbols are used between multiple rules | Connect , In the form of
result : rule1-components...
| rule2-components...
...
;
The second syntax rule in this example grammar rules This is the form .
ll
: ll l
| l
;
The first rule ll l Indicates the syntax rule grammar rules Need to match two parts ll and l, The second rule l Indicates the syntax rule grammar rules Need to match a part l. among l The third grammatical rule explained above grammar rules Definition , From this grammatical rule grammar rules It can be seen that , The components can not only be lexical analysis lexical analysis Tags in token, It can also be grammatical rules grammar rules Rules defined in .
Recursive rule recursive rule
The components are grammatical rules grammar rules Rules defined in , It can not only be other grammatical rules grammar rules Result , For example, the second rule l, It can also be itself , For example, the first rule is ll and ·l, This rule contains its own grammatical rules grammar rules There is a special name ———— Recursive rule recursive rule.
The recursive rule in this example recursive rule It's the simplest form , First match the second rule l, The result is ll, then ll And the next l Match the first rule ll l, The result is still ll, At this time ll It's equivalent to two l, And so on .
When grammar rules grammar rules When there are multiple rules in , Each rule can set actions action, action action Then the corresponding rules , In this example, the two rules set actions { $$ = $1 + $2; } and { $$ = $1; }.
ll
: ll l
{
$$ = $1 + $2; }
| l
{
$$ = $1; }
;
These two actions action The syntax of is the same as before , Because it is a recursive rule recursive rule, action action Also corresponding recursive execution , For example, match the second rule at the beginning l, Executive action { $$ = $1; }, then ll And the next l Match the first rule ll l, Executive action { $$ = $1 + $2; }, Among them $1 That is to match the second rule at the beginning l Value , It is equivalent to two matching l The corresponding string connection , And so on . The result of the two actions in this example is to combine strings in the matching order .
action action The code in can be written freely , For example, the action of the first rule { $$ = $1 + $2; } It is amended as follows { $$ = $2 + $1; } That is, connect the strings in reverse order , There are more complex examples in the later articles .
Because of recursive rules recursive rule The result can match itself , So we are using recursive rules recursive rule when , To avoid ambiguity in matching , It's best to use recursion rules recursive rule The results of are combined into mismatched recursive rules recursive rule Reuse later , For example, the first syntax rule in this example grammar rules Lieutenant general ll and EOF Combine into p, ll and EOF The combination of does not match ll, Use this way p You can avoid using it directly ll Resulting ambiguity .
p
: ll EOF
;
because p It's a grammar rule grammar rules The last combination in , In this case, directly in p Output in the action of .
p
: ll EOF
{
console.log($1); }
;
action action The grammar of is javascript, Although it matches here EOF, But in action action Not used $2, Nor is it assigned to $$. action action The program in can be written freely , The following articles will explain the design points in detail when explaining more complex examples .
Jison Summary of grammatical structure
Combine the above ,Jison The structure of the file is
/* Lexical analysis lexical analysis Start */
%lex
/* Lexical analysis lexical analysis Defined in the definition */
%%
/* Lexical analysis lexical analysis The rules in the rule */
/* Every rule rule Include patterns pattern And the action action */
pattern1 action1
pattern2 action2
pattern3 action3
/* Lexical analysis lexical analysis end , Semantic analysis semantic analyse Start */
/lex
/* Semantic analysis semantic analyse In a statement declarations */
%%
/* Semantic analysis semantic analyse Grammar rules in grammar rules */
/* Every grammatical rule grammar rule You can define one or more rules , Each rule has several components , Each rule can define corresponding actions action */
result1 : rule11-components...
| rule12-components...
...
;
result2 : rule21-components...
| rule22-components...
...
;
result3 : rule31-components...
| rule32-components...
...
;
边栏推荐
- Daughter love in lunch box
- China electronic grade sulfur trioxide Market Forecast and investment strategy report (2022 Edition)
- C语言-入门-基础-语法-[变量,常亮,作用域](五)
- There are 100 people eating 100 apples, one adult eating 4 apples, and four children eating 1 apple. How can they eat exactly 100 apples? Use any high-level language you are familiar with
- Analysis report on the production and marketing demand and investment forecast of tellurium dioxide in the world and China Ⓣ 2022 ~ 2027
- How to ensure the uniqueness of ID in distributed environment
- HMS core helps baby bus show high-quality children's digital content to global developers
- What exactly is DAAS data as a service? Don't be misled by other DAAS concepts
- 《网络是怎么样连接的》读书笔记 - Tcp/IP连接(二)
- C language - Introduction - Foundation - syntax - data type (4)
猜你喜欢

If you can quickly generate a dictionary from two lists

2022-2028 global gasket metal plate heat exchanger industry research and trend analysis report

2022-2028 research and trend analysis report on the global edible essence industry

Solve the problem of "Chinese garbled MySQL fields"

Sword finger offer 30 contains the stack of Min function

2022-2028 global industrial gasket plate heat exchanger industry research and trend analysis report
![C language - Introduction - Foundation - syntax - [identifier, keyword, semicolon, space, comment, input and output] (III)](/img/89/0f5f4ba07c637b09abe873016cba2d.png)
C language - Introduction - Foundation - syntax - [identifier, keyword, semicolon, space, comment, input and output] (III)

What exactly is DAAS data as a service? Don't be misled by other DAAS concepts

2022-2028 global special starch industry research and trend analysis report

Daughter love: frequency spectrum analysis of a piece of music
随机推荐
Langage C - démarrer - base - syntaxe - [opérateur, conversion de type] (vi)
2022-2028 global protein confectionery industry research and trend analysis report
Horizon sunrise X3 PI (I) first boot details
2022-2028 global optical transparency industry research and trend analysis report
Global and Chinese market of bipolar generators 2022-2028: Research Report on technology, participants, trends, market size and share
【leetcode】29. Divide two numbers
UML sequence diagram [easy to understand]
2022-2028 global edible probiotic raw material industry research and trend analysis report
Solution to null JSON after serialization in golang
The child container margin top acts on the parent container
C语言-入门-基础-语法-数据类型(四)
Awk from entry to earth (12) awk can also write scripts to replace the shell
Global and Chinese market of planar waveguide optical splitter 2022-2028: Research Report on technology, participants, trends, market size and share
2022-2028 global small batch batch batch furnace industry research and trend analysis report
PMP registration process and precautions
20220701 barbarat lemma proof
1211 or chicken and rabbit in the same cage
"How to connect the network" reading notes - Web server request and response (4)
Opencv environment construction (I)
C language - Introduction - Foundation - syntax - [main function, header file] (II)