Reader's guide to this PR:
The aim is a more structured and maintainable approach to generating bytecode from the query AST: each part of the query processing pipeline gets a clearer responsibility, which makes new functionality easier to develop. E.g.:
- If you want to implement join reordering -> you do it in `Optimizer`
- If you want to implement `GROUP BY` -> you change `QueryPlanNode::Aggregate` to include it, parse it in `Planner` and handle the code generation for it in `Emitter`
The pipeline is:
`SQL text -> Parser -> Planner -> Optimizer -> Emitter`
and this pipeline generates:
`SQL text -> AST -> Logical Plan -> Optimized Logical Plan -> SQLite Bytecode`
---
Module structure:
`plan.rs`: defines the `Operator` enum. An `Operator` is a tree of other `Operators`, e.g. an `Operator::Join` has `left` and `right` children, etc.
`planner.rs`: Parses an `ast::Select` into a `Plan`, which is mainly a wrapper for a root `Operator`.
`optimizer.rs`: Makes a new `Plan` from an input `Plan` - does predicate pushdown, constant elimination and turns `Scan` nodes into `SeekRowId` nodes where applicable
`emitter.rs`: Generates bytecode instructions from an input `Plan`.
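To make the module split above more concrete, here is a minimal sketch of an operator tree and one optimizer rewrite (a rowid-filtered `Scan` becoming a `SeekRowId`). Everything in it is a simplified assumption for illustration: the variant fields, the `rowid_eq` field, and the `optimize`/`optimize_operator` helpers are hypothetical and not the actual definitions in `plan.rs` or `optimizer.rs`.

```rust
// Hypothetical, simplified stand-ins for the types in `plan.rs`;
// the real `Operator` has more variants and different fields.
#[derive(Debug, Clone, PartialEq)]
pub enum Operator {
    // Full table scan, optionally constrained by `rowid = <value>`.
    Scan { table: String, rowid_eq: Option<i64> },
    // Direct lookup of a single row by its rowid.
    SeekRowId { table: String, rowid: i64 },
    // Children are boxed because the enum is recursive.
    Join { left: Box<Operator>, right: Box<Operator> },
}

// A plan is mainly a wrapper around the root operator.
#[derive(Debug, Clone, PartialEq)]
pub struct Plan {
    pub root: Operator,
}

// Walks the tree and turns rowid-constrained scans into seeks.
fn optimize_operator(op: Operator) -> Operator {
    match op {
        Operator::Scan { table, rowid_eq: Some(rowid) } => {
            Operator::SeekRowId { table, rowid }
        }
        Operator::Join { left, right } => Operator::Join {
            left: Box::new(optimize_operator(*left)),
            right: Box::new(optimize_operator(*right)),
        },
        // Unconstrained scans and existing seeks are left untouched.
        other => other,
    }
}

// Makes a new plan from an input plan.
pub fn optimize(plan: Plan) -> Plan {
    Plan { root: optimize_operator(plan.root) }
}
```

The pass consumes the input `Plan` and returns a new one, which mirrors the "makes a new `Plan` from an input `Plan`" description; predicate pushdown and constant elimination are left out of the sketch.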
---
Adds feature `EXPLAIN QUERY PLAN <stmt>` which shows the logical query plan instead of the bytecode plan
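One way a logical plan can be displayed is by walking the operator tree and printing one indented line per node. The sketch below assumes a hypothetical, cut-down `Operator` and an invented output format (`SCAN`, `FILTER`, `JOIN` labels); it is not the format this PR actually prints.

```rust
// Hypothetical, simplified operator tree; not the definition from `plan.rs`.
pub enum Operator {
    Scan { table: String },
    Filter { predicate: String, source: Box<Operator> },
    Join { left: Box<Operator>, right: Box<Operator> },
}

// Appends one line per operator to `out`, indenting children by two
// spaces per tree level. The labels are made up for this example.
pub fn explain(op: &Operator, depth: usize, out: &mut String) {
    let indent = "  ".repeat(depth);
    match op {
        Operator::Scan { table } => {
            out.push_str(&format!("{}SCAN {}\n", indent, table));
        }
        Operator::Filter { predicate, source } => {
            out.push_str(&format!("{}FILTER {}\n", indent, predicate));
            explain(source, depth + 1, out);
        }
        Operator::Join { left, right } => {
            out.push_str(&format!("{}JOIN\n", indent));
            explain(left, depth + 1, out);
            explain(right, depth + 1, out);
        }
    }
}
```

Calling `explain(&root, 0, &mut buffer)` on a plan's root fills `buffer` with the rendered tree, parents first and children indented beneath them.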
---
Other changes:
- Almost everything from `select.rs` removed; things like `translate_aggregation()` moved to `expr.rs`
- `where_clause.rs` removed, some things from it like `translate_condition_expr()` moved to `expr.rs`
- i.e.: there is nothing _new_ in `expr.rs`, stuff just moved there
---
Concerns:
- Perf impact: there's a lot more indirection than before (`Operator`s are very "traditional" trees where they refer to other operators via `Box`es etc)
Closes #281
Turns out the new cbindgen version generates a slightly different
`sqlite3.h`, so commit that to the tree. The version in `Cargo.lock` also
changed, so let's check that in too.
This PR adds support for `count(*)`. I did not find any functions other than `count` that expect a `*` argument yet, but I'm sure there are some?
Surprisingly (to me) I did not need to make any changes to `translate_expr`, only to `analyze_expr`.
```
limbo> SELECT count(id) FROM users;
2
limbo> SELECT count(*) FROM users;
2
limbo> SELECT count(*) FROM users where id = 1;
1
limbo> SELECT count(id) FROM users where id = 1;
1
```
Other aggregation functions such as `sum` fail, since they expect a specific column:
```
limbo> select sum(*) from users;
Parse error: sum bad number of arguments
```
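The behaviour shown above (a `*` argument accepted for `count` but rejected for `sum`) can be sketched as an argument check during analysis. This is only an illustration under assumptions: `FuncArgs` and `check_aggregate_args` are hypothetical names, only `count` and `sum` are modelled, and the real logic lives in `analyze_expr`, which is not reproduced here.

```rust
// Hypothetical argument shapes for a parsed function call.
#[derive(Debug, PartialEq)]
pub enum FuncArgs {
    // A bare `*`, as in `count(*)`.
    Star,
    // Ordinary expressions, as in `count(id)`; kept as strings for brevity.
    Exprs(Vec<String>),
}

// Accepts `*` only for `count`; both functions otherwise need one argument.
pub fn check_aggregate_args(name: &str, args: &FuncArgs) -> Result<(), String> {
    let lowered = name.to_lowercase();
    match (lowered.as_str(), args) {
        ("count", FuncArgs::Star) => Ok(()),
        ("count", FuncArgs::Exprs(e)) | ("sum", FuncArgs::Exprs(e)) if e.len() == 1 => Ok(()),
        // Anything else for a known aggregate is an arity error.
        ("count", _) | ("sum", _) => Err(format!("{} bad number of arguments", lowered)),
        _ => Err(format!("unknown aggregate function {}", lowered)),
    }
}
```

With this shape, supporting another function that takes `*` would mean adding one more match arm rather than touching the code generation path.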
Closes #285
This pull request introduces the initial setup for the Python bindings
(#248).
- Setup Configuration: Added the Python binding stack, including the
`pyo3` crates, `pyproject.toml`, `build.rs`, and other necessary
files.
- Database Class: Implemented the Database class with a constructor to
establish a connection and a query function to execute SQL queries.
- Testing: Created `database.db` with a sample users table and two
entries, as outlined in README.md, and added three pytest functions to
validate the Python output.
Closes #276