This PR is a drop-in replacement for the `Predicate` defined in the simulator. `Predicate` is basically the same as our `ast::Expr`, but it only supports a small subset of the SQL expression syntax. By creating a newtype that wraps `ast::Expr`, we can tap into our already mostly correct parser structs. This change will make it easy to add generation for more types of SQL queries. I also added an `ArbitraryFrom` impl for `ast::Expr` that can be used in a freestyle way (for now) for differential testing. This PR also aims to implement unary operator logic similar to the binary operator logic we have for predicates.

After this change we may need to adjust the logic for how some assertions are triggered.

<s>Sometimes the `Select-Select-Optimizer` property thinks that these two queries should return the same thing:

```sql
SELECT (twinkling_winstanley.sensible_federations > x'66616e7461737469625e0f37879823db' AND twinkling_winstanley.sincere_niemeyer < -7428368947470022783) FROM twinkling_winstanley WHERE 1;
SELECT * FROM twinkling_winstanley WHERE twinkling_winstanley.sensible_federations > x'66616e7461737469625e0f37879823db' AND twinkling_winstanley.sincere_niemeyer < -7428368947470022783;
```

However, after running the shrunk plan manually, the simulator was incorrect in asserting that. Maybe this is a bug in the generation of such a query? Not sure yet.</s>

<b>EDIT: The simulator was correctly catching a bug, and I thought I was the problem. The bug was in `exec_if`, and I fixed it in this PR.</b>

I still need to expand the unary operator generation to other types of predicates. For now, I have only implemented it for `SimplePredicate`, to avoid bloating this PR even more.
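To illustrate the newtype idea and the unary operator handling described above, here is a minimal sketch. It is not the code from this PR: `Expr`, `UnaryOp`, `BinaryOp`, `negate` and `eval` are simplified, hypothetical stand-ins (the real `ast::Expr` has many more variants), and SQL `NULL` three-valued logic is deliberately left out.

```rust
use std::ops::Deref;

/// Hypothetical, heavily reduced stand-in for the parser's `ast::Expr`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(i64),
    Column(String),
    Unary(UnaryOp, Box<Expr>),
    Binary(Box<Expr>, BinaryOp, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UnaryOp {
    Not,
    Negative,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinaryOp {
    And,
    Greater,
    Less,
}

/// Newtype: a predicate is just an expression, so it reuses the parser's AST.
#[derive(Debug, Clone, PartialEq)]
pub struct Predicate(pub Expr);

impl Deref for Predicate {
    type Target = Expr;
    fn deref(&self) -> &Expr {
        &self.0
    }
}

impl Predicate {
    /// Wrap the predicate in a logical NOT (a unary operator).
    pub fn negate(self) -> Predicate {
        Predicate(Expr::Unary(UnaryOp::Not, Box::new(self.0)))
    }

    /// Evaluate against a row of (column name, value) pairs.
    /// Integers act as booleans (0 is false); an unknown column yields `None`.
    pub fn eval(&self, row: &[(&str, i64)]) -> Option<i64> {
        eval_expr(&self.0, row)
    }
}

fn eval_expr(expr: &Expr, row: &[(&str, i64)]) -> Option<i64> {
    match expr {
        Expr::Literal(v) => Some(*v),
        Expr::Column(name) => row
            .iter()
            .find(|(n, _)| *n == name.as_str())
            .map(|(_, v)| *v),
        Expr::Unary(op, inner) => {
            let v = eval_expr(inner, row)?;
            Some(match op {
                UnaryOp::Not => (v == 0) as i64,
                UnaryOp::Negative => v.wrapping_neg(),
            })
        }
        Expr::Binary(lhs, op, rhs) => {
            let (l, r) = (eval_expr(lhs, row)?, eval_expr(rhs, row)?);
            Some(match op {
                BinaryOp::And => (l != 0 && r != 0) as i64,
                BinaryOp::Greater => (l > r) as i64,
                BinaryOp::Less => (l < r) as i64,
            })
        }
    }
}
```

Because `Predicate` derefs to the wrapped expression, any code that already walks the AST can accept a predicate without conversion, which is the main appeal of the wrapper.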
<b>EDIT: I decided to keep a single PR open for all the changes I'm making, to make my life a bit easier and to avoid merge conflicts with my own branches that I keep spawning for new code.</b>

PS: This should only be considered for merging after https://github.com/tursodatabase/limbo/pull/1619 is merged. Then I will remove the draft status from this PR.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1674
LEMON parser generator modified to generate Rust code.
Lemon source and SQLite3 grammar were last synced as of July 2024.
Unsupported
Unsupported Grammar syntax
- `%token_destructor`: Code to execute to destroy token data
- `%default_destructor`: Code for the default non-terminal destructor
- `%destructor`: Code which executes whenever this symbol is popped from the stack during error processing
- https://www.codeproject.com/Articles/1056460/Generating-a-High-Speed-Parser-Part-Lemon
- https://www.sqlite.org/lemon.html
SQLite
SQLite lexer and SQLite parser have been ported from C to Rust. The parser generates an AST.
Lexer/Parser:
- Keep track of position (line, column).
- Streamable (stop at the end of statement).
- Resumable (restart after the end of statement).
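The three properties above can be illustrated with a toy statement scanner. This is only a sketch under simplifying assumptions, not the crate's actual lexer or API: `StatementScanner`, `next_statement` and `position` are hypothetical names, and the scanner only understands single-quoted strings (no comments, no double-quoted identifiers).

```rust
/// Toy statement splitter (hypothetical; not the crate's actual lexer).
pub struct StatementScanner<'a> {
    input: &'a str,
    offset: usize, // byte offset of the next unread character
    line: usize,   // 1-based line of the next unread character
    column: usize, // 1-based column (in chars) of the next unread character
}

impl<'a> StatementScanner<'a> {
    pub fn new(input: &'a str) -> Self {
        Self { input, offset: 0, line: 1, column: 1 }
    }

    /// Position (line, column) of the next unread character.
    pub fn position(&self) -> (usize, usize) {
        (self.line, self.column)
    }

    /// Streamable: stops right after the `;` that ends a statement.
    /// Resumable: the next call continues from where the previous one stopped.
    pub fn next_statement(&mut self) -> Option<&'a str> {
        let input: &'a str = self.input;
        while self.offset < input.len() {
            let rest = &input[self.offset..];
            let mut in_string = false;
            let mut end = rest.len();
            let mut consumed = rest.len();
            for (i, c) in rest.char_indices() {
                // Keep track of position for every consumed character.
                if c == '\n' {
                    self.line += 1;
                    self.column = 1;
                } else {
                    self.column += 1;
                }
                if c == '\'' {
                    // An escaped quote ('') toggles twice, leaving the state unchanged.
                    in_string = !in_string;
                } else if c == ';' && !in_string {
                    end = i;
                    consumed = i + 1; // ';' is a single byte
                    break;
                }
            }
            self.offset += consumed;
            let stmt = rest[..end].trim();
            if !stmt.is_empty() {
                return Some(stmt);
            }
            // Empty statement (e.g. ";;" or trailing whitespace): keep scanning.
        }
        None
    }
}
```

Calling `next_statement` repeatedly on `"SELECT 1;\nSELECT 'a;b';\n"` yields `SELECT 1`, then `SELECT 'a;b'` (the semicolon inside the string does not end the statement), then `None`.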
Lexer and parser have been tested with the following scripts:
- https://github.com/bkiers/sqlite-parser/tree/master/src/test/resources
- https://github.com/codeschool/sqlite-parser/tree/master/test/sql/official-suite, which can be updated with the script in https://github.com/codeschool/sqlite-parser/tree/master/test/misc
TODO:
- Check generated AST (reparse/reinject)
- If a keyword in double quotes is used in a context where it cannot be resolved to an identifier but where a string literal is allowed, then the token is understood to be a string literal instead of an identifier.
- Tests
- Do not panic while parsing
- CREATE VIRTUAL TABLE args
- Zero copy (at least tokens)
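The double-quote fallback rule from the TODO list above could look roughly like the sketch below. This is an assumption about how such a rule might be expressed, not code from this crate: `Resolved`, `resolve_double_quoted`, `known_columns` and `string_allowed` are hypothetical names, and real name resolution needs far more context than a flat list of columns.

```rust
/// Outcome of resolving a double-quoted token (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Resolved<'a> {
    Identifier(&'a str),
    StringLiteral(&'a str),
    Unresolved(&'a str),
}

/// `quoted` is the text between the double quotes.
/// `known_columns` stands in for whatever names are resolvable in the current scope.
/// `string_allowed` says whether a string literal is valid at this point of the grammar.
pub fn resolve_double_quoted<'a>(
    quoted: &'a str,
    known_columns: &[&str],
    string_allowed: bool,
) -> Resolved<'a> {
    if known_columns.iter().any(|c| c.eq_ignore_ascii_case(quoted)) {
        // Prefer the identifier reading whenever the name resolves.
        Resolved::Identifier(quoted)
    } else if string_allowed {
        // Fallback: treat the token as a string literal.
        Resolved::StringLiteral(quoted)
    } else {
        Resolved::Unresolved(quoted)
    }
}
```

For example, `"name"` resolves to an identifier when a `name` column is in scope, while `"key"` with no matching column becomes a string literal only where the grammar accepts one.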
Unsupported by Rust
- `#line` directive
API change
- No `ParseAlloc`/`ParseFree` anymore
Features not tested
- NDEBUG
- YYNOERRORRECOVERY
- YYERRORSYMBOL
To be fixed
- RHS are moved. Maybe it is not a problem if they are always used once. Just add a check in lemon...
- `%extra_argument` is not supported.
- Terminal symbols generated by lemon should be dumped in a specified file.
Raison d'être
- lemon_rust does the same thing but with an old version of `lemon`. And it does not seem possible to use `yystack` as a stack, because items may be accessed randomly and the `top+1` item can be used.
- lalrpop would be the perfect alternative, but it does not support fallback/streaming (see this issue) and compilation/generation is slow.
Minimum supported Rust version (MSRV)
Latest stable Rust version at the time of release. It might compile with older versions.