before:
```sh
sqlparser-rs parsing benchmark/sqlparser::select
time: [693.20 ns 693.96 ns 694.73 ns]
change: [+7.4382% +7.6384% +7.8250%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
sqlparser-rs parsing benchmark/sqlparser::with_select
time: [2.5734 µs 2.5763 µs 2.5796 µs]
change: [+16.583% +16.809% +17.024%] (p = 0.00 < 0.05)
Performance has regressed.
sqlparser-rs parsing benchmark/keyword_token
time: [3.1919 µs 3.1983 µs 3.2047 µs]
change: [+944.74% +948.97% +952.91%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
```
after:
```sh
sqlparser-rs parsing benchmark/sqlparser::select
time: [637.09 ns 638.50 ns 640.15 ns]
change: [-1.8412% -1.5494% -1.2424%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
3 (3.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
sqlparser-rs parsing benchmark/sqlparser::with_select
time: [2.1896 µs 2.1919 µs 2.1942 µs]
change: [-0.6894% -0.3923% -0.1517%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) low severe
sqlparser-rs parsing benchmark/keyword_token
time: [298.99 ns 299.82 ns 300.72 ns]
change: [-1.4726% -1.0148% -0.5702%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
6 (6.00%) high mild
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #1939
LEMON parser generator modified to generate Rust code.
Lemon source and SQLite3 grammar were last synced as of July 2024.
Unsupported
Unsupported Grammar syntax
- `%token_destructor`: Code to execute to destroy token data
- `%default_destructor`: Code for the default non-terminal destructor
- `%destructor`: Code which executes whenever this symbol is popped from the stack during error processing
- https://www.codeproject.com/Articles/1056460/Generating-a-High-Speed-Parser-Part-Lemon
- https://www.sqlite.org/lemon.html
SQLite
SQLite lexer and SQLite parser have been ported from C to Rust. The parser generates an AST.
Lexer/Parser:
- Keep track of position (line, column).
- Streamable (stop at the end of statement).
- Resumable (restart after the end of statement).
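The three properties above can be illustrated with a minimal sketch. It is not this crate's API: `Position`, `StatementScanner`, and `next_statement` are hypothetical names, and the scanner only splits on `;` outside quotes (comments and real tokenization are ignored).

```rust
/// Hypothetical 1-based source position, counted in characters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position {
    pub line: usize,
    pub column: usize,
}

/// Hypothetical scanner: yields one statement per call and can be called again.
pub struct StatementScanner<'a> {
    input: &'a str,
    offset: usize, // byte offset of the next unread character
    pos: Position, // position of the next unread character
}

impl<'a> StatementScanner<'a> {
    pub fn new(input: &'a str) -> Self {
        Self { input, offset: 0, pos: Position { line: 1, column: 1 } }
    }

    fn peek(&self) -> Option<char> {
        self.input[self.offset..].chars().next()
    }

    /// Consumes `c` and keeps line/column in sync.
    fn advance(&mut self, c: char) {
        self.offset += c.len_utf8();
        if c == '\n' {
            self.pos.line += 1;
            self.pos.column = 1;
        } else {
            self.pos.column += 1;
        }
    }

    /// Streamable: stops right after the `;` that ends a statement.
    /// Resumable: the next call continues from where this one stopped.
    pub fn next_statement(&mut self) -> Option<(Position, &'a str)> {
        // Skip whitespace between statements.
        while let Some(c) = self.peek() {
            if !c.is_whitespace() {
                break;
            }
            self.advance(c);
        }
        let input: &'a str = self.input;
        let (start, start_pos) = (self.offset, self.pos);
        if start >= input.len() {
            return None;
        }
        let mut quote: Option<char> = None; // currently open quote, if any
        while let Some(c) = self.peek() {
            self.advance(c);
            match quote {
                Some(q) if c == q => quote = None,
                Some(_) => {}
                None if c == '\'' || c == '"' => quote = Some(c),
                None if c == ';' => {
                    // Exclude the one-byte `;` from the returned slice.
                    return Some((start_pos, &input[start..self.offset - 1]));
                }
                None => {}
            }
        }
        // Last statement without a trailing `;`.
        Some((start_pos, input[start..].trim_end()))
    }
}
```

Each call returns the position where the statement starts together with its text, so a caller can process a large script one statement at a time and report errors by line and column.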
Lexer and parser have been tested with the following scripts:
- https://github.com/bkiers/sqlite-parser/tree/master/src/test/resources
- https://github.com/codeschool/sqlite-parser/tree/master/test/sql/official-suite which can be updated with the script in https://github.com/codeschool/sqlite-parser/tree/master/test/misc
TODO:
- Check generated AST (reparse/reinject)
- If a keyword in double quotes is used in a context where it cannot be resolved to an identifier but where a string literal is allowed, then the token is understood to be a string literal instead of an identifier.
- Tests
- Do not panic while parsing
- CREATE VIRTUAL TABLE args
- Zero copy (at least tokens)
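The double-quote rule in the TODO list could be sketched as follows. This is only an illustration of the rule as stated, not code from this parser: `Resolved`, `resolve_double_quoted`, and the `known_names` slice (standing in for whatever name resolution the caller has) are assumptions.

```rust
/// Hypothetical outcome of resolving a double-quoted token.
#[derive(Debug, PartialEq, Eq)]
pub enum Resolved<'a> {
    Identifier(&'a str),
    StringLiteral(&'a str),
}

/// `text` is the content between the double quotes.
/// `known_names` lists the identifiers that can be resolved in this context.
/// `string_allowed` says whether a string literal is legal at this point.
pub fn resolve_double_quoted<'a>(
    text: &'a str,
    known_names: &[&str],
    string_allowed: bool,
) -> Resolved<'a> {
    let resolvable = known_names.iter().any(|n| n.eq_ignore_ascii_case(text));
    if !resolvable && string_allowed {
        // Fallback: treat the token as a string literal.
        Resolved::StringLiteral(text)
    } else {
        Resolved::Identifier(text)
    }
}
```

The fallback only triggers when both conditions hold, so a double-quoted name that matches a known column stays an identifier.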
Unsupported by Rust
- `#line` directive
API change
- No `ParseAlloc`/`ParseFree` anymore
Features not tested
- NDEBUG
- YYNOERRORRECOVERY
- YYERRORSYMBOL
To be fixed
- RHS are moved. Maybe it is not a problem if they are always used once. Just add a check in lemon...
- `%extra_argument` is not supported.
- Terminal symbols generated by lemon should be dumped in a specified file.
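To make the "RHS are moved" item concrete, here is a small sketch of what moving a right-hand-side value out of a stack slot looks like in Rust, and how a second use could be detected. `ParseStack` and `take_rhs` are hypothetical names and the slot type is simplified to `String`; the generated parser may be organized differently.

```rust
/// Hypothetical parse stack whose slots can be emptied by a reduce action.
#[derive(Debug, Default)]
pub struct ParseStack {
    slots: Vec<Option<String>>,
}

impl ParseStack {
    pub fn push(&mut self, value: String) {
        self.slots.push(Some(value));
    }

    /// Moves the value out of the slot `depth` entries below the top.
    /// Returns `None` if the slot was already moved or does not exist,
    /// which is the situation a check in the generator would have to rule out.
    pub fn take_rhs(&mut self, depth: usize) -> Option<String> {
        let index = self.slots.len().checked_sub(depth + 1)?;
        self.slots[index].take()
    }
}
```

As long as every RHS symbol is referenced at most once per action, each `take_rhs` call finds its value; a second reference to the same symbol would observe an empty slot.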
Raison d'être
- lemon_rust does the same thing but with an old version of `lemon`. It also does not seem possible to use `yystack` as a stack, because items may be accessed randomly and the `top+1` item can be used.
- lalrpop would be the perfect alternative, but it does not support fallback/streaming (see this issue) and compilation/generation is slow.
Minimum supported Rust version (MSRV)
Latest stable Rust version at the time of release. It might compile with older versions.