A brief introduction to the lexical analysis of PostgreSQL

2022-06-24 14:27:00 happytree001

1. Structure of a flex input file

A flex input file (.l) is divided by %% lines into three sections:

  • Definitions

  • Rules

  • C code

In the definitions section, anything enclosed between %{ and %} is copied verbatim into the generated C code.

The last section is also copied verbatim into the generated C code.

In the middle rules section, each rule consists of a pattern plus an action. Patterns are written as regular expressions, and the action is the C code to run when the pattern matches.
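To make the "pattern + action" idea concrete, here is a small hand-written C sketch of the kind of dispatch a rules section describes. It is not what flex generates (flex builds a table-driven automaton), and the names tok_kind, token and next_token are made up for this illustration. Each branch stands in for one rule.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch (not PostgreSQL's). */
enum tok_kind { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_OTHER };

struct token {
    enum tok_kind kind;
    size_t start;   /* byte offset of the first character */
    size_t len;     /* number of bytes matched */
};

/*
 * Return the next token at or after *pos and advance *pos past it.
 * Each branch plays the role of one "pattern { action }" rule.
 */
static struct token next_token(const char *s, size_t *pos)
{
    struct token t;
    size_t i = *pos;

    /* Rule: whitespace  { ignore } */
    while (s[i] != '\0' && isspace((unsigned char) s[i]))
        i++;

    t.start = i;
    if (s[i] == '\0') {
        t.kind = TOK_EOF;
    } else if (isalpha((unsigned char) s[i]) || s[i] == '_') {
        /* Rule: letter or underscore, then letters, digits, underscores */
        while (isalnum((unsigned char) s[i]) || s[i] == '_')
            i++;
        t.kind = TOK_IDENT;
    } else if (isdigit((unsigned char) s[i])) {
        /* Rule: one or more digits */
        while (isdigit((unsigned char) s[i]))
            i++;
        t.kind = TOK_NUMBER;
    } else {
        /* Rule: any other single character */
        i++;
        t.kind = TOK_OTHER;
    }
    t.len = i - t.start;
    *pos = i;
    return t;
}
```

Calling next_token repeatedly on "  abc 42;" with a position that starts at 0 should yield an identifier, a number, a single-character token, and then TOK_EOF.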

2. The lexical analysis file in PostgreSQL

2.1 Definitions section

src/backend/parser/scan.l


%{

/* LCOV_EXCL_START */

/* Avoid exit() on fatal scanner errors (a bit ugly -- see yy_fatal_error) */
#undef fprintf
#define fprintf(file, fmt, msg)  fprintf_to_ereport(fmt, msg)

static void
fprintf_to_ereport(const char *fmt, const char *msg)
{
	ereport(ERROR, (errmsg_internal("%s", msg)));
}

/*
 * GUC variables.  This is a DIRECT violation of the warning given at the
 * head of gram.y, ie flex/bison code must not depend on any GUC variables;
 * as such, changing their values can induce very unintuitive behavior.
 * But we shall have to live with it until we can remove these variables.
 */
int			backslash_quote = BACKSLASH_QUOTE_SAFE_ENCODING;
bool		escape_string_warning = true;
bool		standard_conforming_strings = true;
...

%}

...
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option noyyalloc
%option noyyrealloc
%option noyyfree
%option warn
%option prefix="core_yy"

...

%%
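The fprintf redirection at the top of the definitions section can be tried in isolation. The sketch below assumes, as the source comment about yy_fatal_error suggests, that the generated code reports fatal errors through a three-argument fprintf call. report_error, last_error and fatal_error are hypothetical stand-ins. Unlike ereport(ERROR), report_error simply returns.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ereport(): remember the last message. */
static char last_error[128];

static void report_error(const char *fmt, const char *msg)
{
    (void) fmt;     /* the format is ignored, as in fprintf_to_ereport */
    snprintf(last_error, sizeof(last_error), "%s", msg);
}

/* From here on, a three-argument fprintf() call goes to report_error(). */
#undef fprintf
#define fprintf(file, fmt, msg) report_error(fmt, msg)

/* Mimics generated code that prints a fatal message to stderr. */
static void fatal_error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);   /* expands to report_error("%s\n", msg) */
}

/* Restore the real fprintf for the rest of this file (scan.l does not do this). */
#undef fprintf
```

After fatal_error("out of dynamic memory"), last_error holds the message and nothing is written to stderr. In scan.l the replacement raises an error through ereport instead of returning, so the exit() mentioned in the source comment is presumably never reached.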

2.2 Rules section

{whitespace}	{
					/* ignore */
				}

{xcstart}		{
					/* Set location in case of syntax error in comment */
					SET_YYLLOC();
					yyextra->xcdepth = 0;
					BEGIN(xc);
					/* Put back any characters past slash-star; see above */
					yyless(2);
				}

<xc>{
{xcstart}		{
					(yyextra->xcdepth)++;
					/* Put back any characters past slash-star; see above */
					yyless(2);
				}
...

%%

/* LCOV_EXCL_STOP */
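The xc rules above count nesting depth in yyextra->xcdepth, and yyless(2) keeps only the two characters of the comment opener, returning anything else that matched to the input. The closing side is elided in the excerpt, so the following C sketch assumes the natural counterpart: a closing marker at depth 0 ends the comment, otherwise it decrements the depth. nested_comment_length is a hypothetical helper, not a PostgreSQL function.

```c
#include <stddef.h>

/*
 * If s starts with a slash-star comment, return its length in bytes,
 * treating nested comments as part of it.  Return -1 if s does not
 * start with a comment or the comment is never closed.
 */
static long nested_comment_length(const char *s)
{
    size_t i;
    int depth = 0;      /* plays the role of yyextra->xcdepth */

    if (s[0] != '/' || s[1] != '*')
        return -1;
    i = 2;              /* consume only the two opening characters */

    while (s[i] != '\0') {
        if (s[i] == '/' && s[i + 1] == '*') {
            depth++;    /* nested opener */
            i += 2;
        } else if (s[i] == '*' && s[i + 1] == '/') {
            i += 2;
            if (depth == 0)
                return (long) i;    /* outermost comment closed */
            depth--;
        } else {
            i++;
        }
    }
    return -1;          /* ran off the end: unterminated comment */
}
```

For the input "/* a /* b */ c */ SELECT" the helper skips the whole 17-byte comment, including the inner one, while "/* open /* inner */" is reported as unterminated.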

2.3 C code section

/*
 * Arrange access to yyextra for subroutines of the main yylex() function.
 * We expect each subroutine to have a yyscanner parameter.  Rather than
 * use the yyget_xxx functions, which might or might not get inlined by the
 * compiler, we cheat just a bit and cast yyscanner to the right type.
 */
#undef yyextra
#define yyextra (((struct yyguts_t *) yyscanner)->yyextra_r)

/* Likewise for a couple of other things we need. */
#undef yylloc
#define yylloc (((struct yyguts_t *) yyscanner)->yylloc_r)
#undef yyleng
#define yyleng (((struct yyguts_t *) yyscanner)->yyleng_r)
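The casting trick used by these macros can be sketched on a toy struct. Everything here is hypothetical (struct guts, scan_t, my_extra, my_leng); it only illustrates the pattern of reaching into an opaque handle through a macro that expects a parameter with a fixed name, and it is not flex's actual yyguts_t layout.

```c
/* Hypothetical miniature of a scanner state struct (not flex's yyguts_t). */
struct guts {
    void *extra_r;      /* user data attached to the scanner */
    int   leng_r;       /* length of the current token */
};

typedef void *scan_t;   /* opaque handle handed to subroutines */

/* Field access by casting the handle; needs a variable named "scanner" in scope. */
#define my_extra (((struct guts *) scanner)->extra_r)
#define my_leng  (((struct guts *) scanner)->leng_r)

/* A subroutine that only receives the opaque handle. */
static int token_length_plus(scan_t scanner, int n)
{
    return my_leng + n;
}

/* The macros expand to lvalues, so they can also be assigned to. */
static void set_extra(scan_t scanner, void *p)
{
    my_extra = p;
}
```

The trade-off is the one the original comment admits to: the macros depend on the struct layout and on every subroutine naming its parameter consistently, in exchange for not relying on accessor functions being inlined.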


/*
 * scanner_errposition
 *		Report a lexer or grammar error cursor position, if possible.
 *
 * This is expected to be used within an ereport() call, or via an error
 * callback such as setup_scanner_errposition_callback().  The return value
 * is a dummy (always 0, in fact).
 *
 * Note that this can only be used for messages emitted during raw parsing
 * (essentially, scan.l, parser.c, and gram.y), since it requires the
 * yyscanner struct to still be available.
 */
int
scanner_errposition(int location, core_yyscan_t yyscanner)
{
	int			pos;

	if (location < 0)
		return 0;				/* no-op if location is unknown */

	/* Convert byte offset to character number */
	pos = pg_mbstrlen_with_len(yyextra->scanbuf, location) + 1;
	/* And pass it to the ereport mechanism */
	return errposition(pos);
}
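The key step in scanner_errposition is turning a byte offset into a 1-based character number, which matters when the query contains multibyte characters. Below is a minimal C sketch of that conversion, assuming the text is valid UTF-8. utf8_chars_in_prefix and cursor_position are hypothetical names, and unlike the original, which returns a dummy value, the sketch returns the computed position so it can be inspected.

```c
#include <stddef.h>

/* Count the UTF-8 characters in the first len bytes of s (assumes valid UTF-8). */
static int utf8_chars_in_prefix(const char *s, size_t len)
{
    int n = 0;

    for (size_t i = 0; i < len; i++) {
        /* Continuation bytes look like 10xxxxxx; any other byte starts a character. */
        if (((unsigned char) s[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Map a 0-based byte offset to a 1-based character position; 0 means unknown. */
static int cursor_position(const char *query, int location)
{
    if (location < 0)
        return 0;               /* no-op if location is unknown */
    return utf8_chars_in_prefix(query, (size_t) location) + 1;
}
```

For the string "sélect", where é takes two bytes, byte offset 3 points at the letter l, and cursor_position reports character position 3 rather than 4.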

...

2.4 The generated lexical analyzer

src/backend/parser/scan.c
The flex tool processes scan.l and generates scan.c, which is then used for lexical analysis of SQL.

Copyright notice
This article was written by [happytree001]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/175/202206241241168339.html