Write a Jison parser from scratch (1/10): Jison, not JSON

2022-07-04 09:27:00 Hu Zhenghui

Yes, the title is correct: Jison, not JSON.

JSON (JavaScript Object Notation) is a lightweight data-interchange format derived from JavaScript. It is easy for humans to read and write, and easy for machines to parse and generate. JSON is based on a subset of Standard ECMA-262 3rd Edition (December 1999), which is why its name contains JavaScript. However, JSON is a text format that is completely language independent, while using conventions familiar to programmers of the C family of languages (including C, C++, C#, Java, JavaScript, Perl, Python, and others). These properties make JSON an ideal data-interchange language.

Jison, on the other hand, is a parser generator for JavaScript.

What is a parser?

"Parser generator" may sound a little awkward. To understand why Jison is called a parser generator, let's start with JSON. Here is a JSON example.

{
  "object" : {
    "number" : 3.1415926,
    "string" : "This is a string in an object."
  },
  "array" : [
    "First value of array",
    "Second value of array",
    "Third value of array"
  ]
}

In JavaScript, the same literal can be passed directly to console.log, which prints it as an object.

console.log({
  "object" : {
    "number" : 3.1415926,
    "string" : "This is a string in an object."
  },
  "array" : [
    "First value of array",
    "Second value of array",
    "Third value of array"
  ]
});

If the same content is wrapped in a string, console.log can only print it as a string.

console.log(`{ "object" : { "number" : 3.1415926, "string" : "This is a string in an object." }, "array" : [ "First value of array", "Second value of array", "Third value of array" ] }`);

You can use the JSON.parse function to parse the string into a JavaScript object.

console.log(JSON.parse(`{ "object" : { "number" : 3.1415926, "string" : "This is a string in an object." }, "array" : [ "First value of array", "Second value of array", "Third value of array" ] }`));

The JSON.parse function is a parser, exactly as the function name suggests.
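One quick way to see what the parser does is to compare the value before and after parsing. This is a minimal sketch; the variable names `text` and `value` are only illustrative.

```javascript
// The input is plain text; JSON.parse turns it into a JavaScript value.
const text = '{ "object" : { "number" : 3.1415926 }, "array" : [ "First value of array" ] }';
const value = JSON.parse(text);

console.log(typeof text);          // string
console.log(typeof value);         // object
console.log(value.object.number);  // 3.1415926
console.log(value.array[0]);       // First value of array
```

Before parsing there is only a sequence of characters; after parsing, the structure can be navigated with ordinary property access.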

So how do a parser and a parser generator differ, and how are they related?

JSON.parse can only parse strings that are valid JSON. If the string deviates from JSON even slightly, parsing fails. For example, suppose we add a comment to the JSON, as we would when writing a program.

console.log(JSON.parse(`{
  //this is a comment line
  "object" : {
    "number" : 3.1415926,
    "string" : "This is a string in an object."
  },
  "array" : [
    "First value of array",
    "Second value of array",
    "Third value of array"
  ]
}`));

This throws an error (the exact wording depends on the JavaScript engine version):

SyntaxError: Unexpected token / in JSON at position 4
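In a program, this failure surfaces as an exception that can be caught. The sketch below wraps JSON.parse in a hypothetical helper named `tryParse` (not part of any library) to show that only the commented input is rejected.

```javascript
// Hypothetical helper: returns the parsed value, or the error name if parsing fails.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // JSON.parse throws a SyntaxError on input that is not valid JSON.
    return { ok: false, error: err.name };
  }
}

const plain = tryParse('{ "number" : 3.1415926 }');
const commented = tryParse('{\n  //this is a comment line\n  "number" : 3.1415926\n}');

console.log(plain.ok);         // true
console.log(commented.ok);     // false
console.log(commented.error);  // SyntaxError
```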

This is because JSON.parse does not recognize the newly added //this is a comment line. To parse input like this, you need JSON5.

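// Requires the json5 package (npm install json5).
const JSON5 = require('json5');
// The comment must end at a line break; otherwise it would swallow the rest of the input.
console.log(JSON5.parse(`{
  //this is a comment line
  "object" : {
    "number" : 3.1415926,
    "string" : "This is a string in an object."
  },
  "array" : [
    "First value of array",
    "Second value of array",
    "Third value of array"
  ]
}`));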

The JSON5.parse function used here is the parser provided by JSON5, and it in turn only recognizes strings that conform to the JSON5 specification. Hjson and HOCON are similar: each is a JSON variant with its own parser, and each parser can only parse the syntax defined by its own format specification.

As mentioned earlier, JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. Other human-readable (non-binary) data-interchange formats include RDF (Resource Description Framework), XML (eXtensible Markup Language), Atom (based on XML), YAML (a recursive acronym for "YAML Ain't Markup Language"), EDN (Extensible Data Notation), Property list, TOML (Tom's Obvious, Minimal Language), Rebol (Relative Expression Based Object Language), and Gellish (Generic Engineering Language).

Each of these formats has corresponding parsers, and, as with the JSON comment example above, each has its own shortcomings and limitations. For the mature formats listed above, ready-made parsers are available in almost every common programming language, sometimes in several implementations. But enjoying that convenience also means accepting all the shortcomings of the format. For example, the specification for YAML 1.2, the third version of YAML, runs to 84 pages in its PDF version, while a typical YAML file is only a few dozen lines long. Reading an 84-page manual in order to write a few dozen lines of YAML correctly is hardly worth the effort. Yet when processing YAML in a program, parsing it yourself instead of using a mature package makes mistakes almost inevitable, because the 84-page manual describes so many details. Likewise, as the JSON variants show, when a mature format needs an improvement that the original format does not support, you have to write your own parser, and doing so still requires a thorough understanding of the original specification. Going further, if you want to define a custom format, you also need to write a dedicated parser for it.
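To make the risk of ad hoc parsing concrete, here is a deliberately naive sketch of "adding comment support" to JSON by deleting everything from // to the end of each line before calling JSON.parse. The function name `parseWithComments` is hypothetical, and this is not how JSON5 works; it only illustrates how easily a hand-rolled shortcut goes wrong.

```javascript
// Naive approach (for illustration only): strip "//..." from every line,
// then hand the result to JSON.parse.
function parseWithComments(text) {
  const stripped = text.replace(/\/\/.*$/gm, '');
  return JSON.parse(stripped);
}

// Works for the simple case.
const ok = parseWithComments('{\n  //this is a comment line\n  "number" : 3.1415926\n}');
console.log(ok.number); // 3.1415926

// Breaks when a string value happens to contain "//", such as a URL:
// the regular expression cuts the string in half and JSON.parse throws.
let failed = false;
try {
  parseWithComments('{\n  "url" : "https://example.com"\n}');
} catch (err) {
  failed = true;
}
console.log(failed); // true
```

Handling this correctly requires knowing where strings begin and end, which means understanding the grammar of the format rather than patching the text.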

What is a parser generator?

The previous section described several situations in which you have to write a parser yourself, and the challenges involved. So is there a way to develop a parser both quickly and well? There is a mature class of tools built specifically for generating parsers: parser generators.
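The relationship between a parser and a parser generator can be sketched in a few lines of plain JavaScript. The toy below is not Jison and does not use Jison's API or grammar format; `generateListParser` and its options are hypothetical names chosen for illustration. It takes a description of a simple list format and returns a parser for that format.

```javascript
// Toy "parser generator": given a description of a list-of-numbers format,
// return a parser function for that format.
function generateListParser({ open, close, separator }) {
  return function parse(text) {
    const trimmed = text.trim();
    if (!trimmed.startsWith(open) || !trimmed.endsWith(close)) {
      throw new SyntaxError(`Expected ${open}...${close}`);
    }
    const body = trimmed.slice(open.length, trimmed.length - close.length).trim();
    if (body === '') return [];
    return body.split(separator).map((item) => {
      const n = Number(item.trim());
      if (item.trim() === '' || Number.isNaN(n)) {
        throw new SyntaxError(`Not a number: "${item.trim()}"`);
      }
      return n;
    });
  };
}

// Two different formats, two generated parsers, one generator.
const parseBracketList = generateListParser({ open: '[', close: ']', separator: ',' });
const parseParenList = generateListParser({ open: '(', close: ')', separator: ';' });

console.log(parseBracketList('[1, 2, 3]'));  // [ 1, 2, 3 ]
console.log(parseParenList('(4; 5.5; 6)'));  // [ 4, 5.5, 6 ]
```

The generator is written once; each new format only needs a new description. A real parser generator follows the same idea, but accepts a full grammar instead of three delimiter strings.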
