Backtracking parser generator, for context free grammar (written in PHP). It generates generic parsers written in PHP.

dfar143/Cheope-php-parser-generator

Cheope-php-parser-generator
===========================

BACKTRACKING PARSER GENERATOR FOR PHP.

The parser generator is in the 'grammar_rule_gen' directory.
The main module is 'parser_gen_exec.php'.

Usage:
php parser_gen_exec.php |prjName| -l |logFileName|

where |prjName| is the project name; it must match the name of the |prjName|.xml
grammar definition file.

Example:
php parser_gen_exec.php example_1 -l log.txt
or (the -l flag is optional)
php parser_gen_exec.php example_1

The module can also be called from a browser; the default log file name
is 'log.txt'.
In this case, to specify the project name, you must edit the
'grammar_rules_gen_def.php' file and set the PRJ_NAME constant to the
project name.

The grammar definition file can contain any number of grammars in the
'grammar_rules' section.
Each grammar must have a 'grammar_rule' main section with a non-empty 'name'
attribute.
The main section contains 3 sub-sections:
'tokens_def', which contains the regular expression definitions of the tokens ('token' tags);
'tokens_attributes', which contains the token attribute definitions ('Attr' tags);
'productions', which contains all the productions of the grammar ('production' tags).
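
The layout described above can be checked with a few lines of PHP. This is only a sketch: the sample file, the nesting of the elements, the text form of the 'production' tag and the function name 'checkDefinition' are my assumptions, not taken from 'example_1.xml'. Only the section and attribute names come from the description above.

```php
<?php
// Hypothetical definition file. The element nesting and the text form of
// the 'production' tag are assumptions; only the names come from the README.
$xml = <<<'XML'
<grammar_rules>
  <grammar_rule name="sum">
    <tokens_def>
      <token type="ws" val="\s+"/>
      <token type="num" val="[0-9]+"/>
      <token type="plus" val="\+"/>
    </tokens_def>
    <tokens_attributes/>
    <productions>
      <production>SUM = num SPACE plus SPACE SUM | num</production>
    </productions>
  </grammar_rule>
</grammar_rules>
XML;

// Returns a list of structural problems; an empty list means the text
// has the sections and attributes described above.
function checkDefinition($xml)
{
    libxml_use_internal_errors(true);   // collect parse errors instead of printing warnings
    $root = simplexml_load_string($xml);
    if ($root === false) {
        return array('not well-formed XML');
    }
    $errors = array();
    $index = 0;
    foreach ($root->grammar_rule as $rule) {
        $index++;
        if (trim((string) $rule['name']) === '') {
            $errors[] = "grammar_rule #$index: empty or missing 'name'";
        }
        foreach (array('tokens_def', 'tokens_attributes', 'productions') as $section) {
            if (!isset($rule->$section)) {
                $errors[] = "grammar_rule #$index: missing '$section'";
            }
        }
        if (isset($rule->tokens_def)) {
            foreach ($rule->tokens_def->token as $token) {
                if (!isset($token['type'], $token['val'])) {
                    $errors[] = "grammar_rule #$index: token without 'type' or 'val'";
                }
            }
        }
    }
    if ($index === 0) {
        $errors[] = "no 'grammar_rule' section found";
    }
    return $errors;
}
```

Calling checkDefinition($xml) on the sample returns an empty array. If the real files wrap 'grammar_rules' in another root element, the loop must start one level deeper.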

A generic production has a very simple syntax: it is composed of
nonterminals (in uppercase), terminals (in lowercase), the equals sign (=), which
separates the left side of the production from the right side,
and the logical OR symbol (|).
The terminal 'epsilon' is the always-true terminal (the empty string).
The terminal 'ws' denotes white space; its exact definition
(a regular expression) must be included in the tokens definition section.
The code for the nonterminal SPACE is generated automatically, and the associated production
(implicitly present) is SPACE = ws | epsilon. The nonterminal SPACE can be used freely
in the other productions.
The right side is typically composed of sequences of terminals and nonterminals
separated by logical OR symbols; each sequence is a logical AND of its symbols.
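
To show how such productions can drive a backtracking recognizer, here is a small table-driven sketch. It is not the code that the generator writes: the LIST grammar, the token expressions and the function names 'parseSymbol' and 'accepts' are hypothetical. It also uses a simple form of backtracking, where the alternatives are tried in order from the same position and the first one that succeeds is kept.

```php
<?php
// Hypothetical terminals; \G anchors each expression at the current position.
$tokens = array('ws' => '/\G\s+/', 'num' => '/\G[0-9]+/', 'comma' => '/\G,/');

// Each nonterminal maps to its alternatives (OR); each alternative is a
// sequence of symbols that must all match (AND).
$productions = array(
    'LIST'  => array(array('num', 'SPACE', 'comma', 'SPACE', 'LIST'), array('num')),
    'SPACE' => array(array('ws'), array('epsilon')),
);

// Returns the position after the symbol, or false if it does not match at $pos.
function parseSymbol($symbol, $text, $pos, $productions, $tokens)
{
    if ($symbol === 'epsilon') {
        return $pos;                       // always true, consumes nothing
    }
    if (isset($tokens[$symbol])) {         // terminal: match its expression at $pos
        if (preg_match($tokens[$symbol], $text, $m, 0, $pos)) {
            return $pos + strlen($m[0]);
        }
        return false;
    }
    foreach ($productions[$symbol] as $sequence) {
        $cur = $pos;                       // backtrack: every alternative restarts at $pos
        foreach ($sequence as $part) {
            $cur = parseSymbol($part, $text, $cur, $productions, $tokens);
            if ($cur === false) {
                break;                     // this sequence failed, try the next one
            }
        }
        if ($cur !== false) {
            return $cur;
        }
    }
    return false;
}

// The text is accepted only if the start symbol consumes all of it.
function accepts($start, $text, $productions, $tokens)
{
    return parseSymbol($start, $text, 0, $productions, $tokens) === strlen($text);
}
```

With these tables, accepts('LIST', '1, 2,3', $productions, $tokens) succeeds because SPACE falls back to 'epsilon' where no white space is present, while '1,' is rejected because no number follows the comma.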

Each token is defined by a 'token' XML tag that has two mandatory attributes:
'type' and 'val'.
The precedence of the token definitions is important.
See 'example_1.xml' for an example of a grammar definition file.
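
The following sketch shows why the precedence of the token definitions can matter. It assumes that the definitions are tried in the order in which they are written and that the first match wins; this is a common convention and I have not confirmed it against the generated lexer. The token types and the function 'firstToken' are hypothetical.

```php
<?php
// Returns array(type, matched text) for the first definition that matches
// at the start of $text, or null if none does.
// Each definition is array(type, val), like the 'type' and 'val' attributes.
function firstToken($definitions, $text)
{
    foreach ($definitions as $def) {
        list($type, $val) = $def;
        // '~' is used as delimiter, so 'val' must not contain an unescaped '~'.
        if (preg_match('~^(?:' . $val . ')~', $text, $m)) {
            return array($type, $m[0]);
        }
    }
    return null;
}

// The keyword is listed before the more general identifier.
$keywordFirst = array(array('select', 'select'), array('ident', '[a-z]+'));

// Same definitions in the opposite order.
$identFirst = array(array('ident', '[a-z]+'), array('select', 'select'));
```

With $keywordFirst the text 'select x' starts with a 'select' token; with $identFirst the same characters are classified as 'ident', so the keyword is never recognized.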

Attributes are optional, but the 'tokens_attributes' section must be present.
See the 'Ric_sql' directory and the files 'ric_sql_def.php' and
'Ric_sql_parser_grammar_rules.php' for an example of their use.
 
The PHP module generates two other modules: |prjName|_def.php and
|prjName|_parser_grammar_rules.php.
The first contains constants and the definitions of the lex rule and grammar
rule object instances. The second contains the recursive parser
engine.
The output of each run is appended to the existing files, so to create a new
parser you must first empty or delete the old files.
These files must then be copied to the destination directory.
As an example of a destination directory you can use the 'parser_example'
directory, which contains the other parser files.
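
The two manual steps (removing the old output before a run, copying the new output after it) can be scripted. This is a sketch with hypothetical helper names; the directory in which the generator writes its files is an assumption and is therefore passed as a parameter.

```php
<?php
// Names of the two generated modules for a project, as listed above.
function generatedFiles($prjName)
{
    return array($prjName . '_def.php', $prjName . '_parser_grammar_rules.php');
}

// Deletes old generated files in $genDir, so that the next run does not
// append to them. Returns the names of the files actually removed.
function flushGenerated($genDir, $prjName)
{
    $removed = array();
    foreach (generatedFiles($prjName) as $file) {
        $path = $genDir . DIRECTORY_SEPARATOR . $file;
        if (is_file($path) && unlink($path)) {
            $removed[] = $file;
        }
    }
    return $removed;
}

// Copies the generated modules from $genDir to $destDir.
// Returns false as soon as one copy fails.
function installGenerated($genDir, $destDir, $prjName)
{
    foreach (generatedFiles($prjName) as $file) {
        $from = $genDir . DIRECTORY_SEPARATOR . $file;
        $to = $destDir . DIRECTORY_SEPARATOR . $file;
        if (!is_file($from) || !copy($from, $to)) {
            return false;
        }
    }
    return true;
}
```

A possible sequence is flushGenerated('grammar_rule_gen', 'example_1'), then a run of 'parser_gen_exec.php', then installGenerated('grammar_rule_gen', 'parser_example', 'example_1'), assuming the generator writes into its own directory.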
The generic parser engine is in the 'Parser.php' file, which contains the main PHP
Parser class.
The entry point of this class is the public method 'exec', which tries all the grammars
and returns true if the text to be parsed satisfies at least one of them.
It fills the '$results' array with the execution status of each grammar.
See the comments in the 'Parser.php' file.
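
The behaviour described for 'exec' (try every grammar, record a status for each, succeed if at least one matches) can be sketched as follows. This is not the Parser class: the function 'execAll' and the two toy grammars, written here as plain callbacks, are hypothetical and only illustrate the control flow.

```php
<?php
// Tries every grammar on $text. $results receives one status per grammar;
// the return value is true if at least one grammar accepted the text.
function execAll($grammars, $text, &$results)
{
    $results = array();
    $ok = false;
    foreach ($grammars as $name => $recognizer) {
        $results[$name] = (bool) call_user_func($recognizer, $text);
        $ok = $ok || $results[$name];
    }
    return $ok;
}

// Two toy "grammars", each a callback that accepts or rejects the whole text.
$grammars = array(
    'digits' => function ($t) { return preg_match('/\A[0-9]+\z/', $t) === 1; },
    'word'   => function ($t) { return preg_match('/\A[a-z]+\z/', $t) === 1; },
);
```

For the text '123' the call execAll($grammars, '123', $results) returns true and $results holds true for 'digits' and false for 'word'; for '12ab' both entries are false and the call returns false.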
 
In the 'Parser_example' directory there is the 'parser_exec.php' module that
calls the parser engine.

Usage:
php parser_exec.php |textToBeParsedFileName|

where |textToBeParsedFileName| is the file that contains the text to be parsed.

Example:
php parser_exec.php example.txt

The 'parser_exec.php' module acts as a validator and returns 'Ok.' if it succeeds
or the current error if not.
Furthermore, if it is called from a browser, it displays the symbol table.

The applications 'php_arrays' and 'ric_sql' are under development.

This application has been developed with PHP version 5.5.7, but I suppose it can run
with many earlier versions, since no particularly advanced techniques have been used, except the class
construct.
 

