# (another) refactor of read path query processing logic

This PR rewrites our select query processing architecture by moving away from the stateful operator-based execution model, back to a more direct bytecode generation approach that, IMO, is easier to follow. A large part of the bytecode emission itself (`program.emit_insn(...)`) is copy-pasted from the old implementation (after all, it did _work_), just structured differently.

## Main Changes

1. Removed the `step()` state machine from operators. Previously, each operator had internal state tracking its execution progress, and parent operators would call `.step()` on their children until they needed to do something else. Reading the code and trying to follow the execution was not very easy, and the abstraction was also too general: there was a lot of unnecessary pattern matching and special casing to make query execution fit the model, when honestly the evaluation of a SELECT without any CTEs or subqueries etc. can only go a few different ways.
2. Because of the above change, the main codegen function `emit_program()` now contains a series of linear conditional steps instead of kicking off the state machines with `root_operator.step()`. These steps are just things like: "open the cursors", "open the loops", "emit a record into either the main output or a sorter", etc.
3. The `Plan` struct now (again) contains most of the familiar SELECT query components (WHERE clause, GROUP BY, ORDER BY, etc.) rather than having all of them embedded in a tree of operators. The operator tree now ONLY consists of operators that read from a source table in some way -- so it could just be called a join tree, I guess.
4. There's now `plan.result_columns`, which is _ALWAYS_ evaluated to get the final results of a SELECT. Previously the operator state machine thing had a hodgepodge of different ways of arriving at the result row.
5. Removed operators:
   - Removed Filter operator (even in the previous version the Filter operator -- which is really the where clause -- had its predicates pushed down to the table loops, and it didn't really ever exist in the bytecode emission phase anymore)
   - Removed Projection operator (now `plan.result_columns`)
   - Removed Limit operator (now `plan.limit`)
   - Removed Aggregate operator (now `plan.group_by` and `plan.aggregates`)
   - Removed Order operator (now `plan.order_by`)
6. Added `ast::Expr::Column` to the vendored sqlite3 parser -- column resolution is now done as early as possible. This eliminates repeated string comparisons during execution, i.e. there is no need for `resolve_ident_table()` etc. anymore.
7. Simplified expression result caching by removing the complex, and frankly weird, `ExpressionResultCache` apparatus. The refactored code handles this by tracking which cursor to read columns from at a given time, and copies values from existing registers if the expression is a computation that has already been done in a previous step of the execution. For example in:

   ```
   limbo> select concat(u.first_name, '-LOL'), sum(u.age) from users u group by concat(u.first_name, '-LOL') order by sum(u.age) desc limit 10;
   Michael-LOL|11204
   David-LOL|8758
   Robert-LOL|8109
   Jennifer-LOL|7700
   John-LOL|7299
   Christopher-LOL|6397
   James-LOL|5921
   Joseph-LOL|5711
   Brian-LOL|5059
   William-LOL|5047
   ```

   the query execution engine knows that `concat(u.first_name, '-LOL')` is the second column of the `ORDER_BY` sorter without any complex caching.

**HACK:** For deduplicating expressions in ORDER BY and the SELECT body, the code still relies on expression `==` equality to make those decisions, which sucks (e.g. `sum(x) != SUM(x)`). I've marked the parts where this is used with a TODO; we should have a custom expression equality comparison function instead. This is not a correctness-breaking thing, but still.
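
To illustrate the TODO in the HACK note, here is a minimal sketch of what a custom expression equality function could look like. It is not the code in this PR: the `Expr` enum below is a hypothetical, heavily simplified stand-in for the vendored parser's `ast::Expr`, and the function name `exprs_are_equivalent` is made up for this example. The only assumption about semantics is that function names should compare case-insensitively while literals stay exact.

```rust
// Hypothetical, simplified expression type for illustration only.
// The real `ast::Expr` in the vendored parser has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// A resolved column reference: (table index, column index).
    Column { table: usize, column: usize },
    /// A literal, kept as its source text.
    Literal(String),
    /// A function call such as `sum(x)` or `concat(a, b)`.
    FunctionCall { name: String, args: Vec<Expr> },
}

/// Structural comparison that ignores ASCII case in function names,
/// so that `sum(x)` and `SUM(x)` are treated as the same expression.
/// (Assumed name; not an existing function in the codebase.)
pub fn exprs_are_equivalent(a: &Expr, b: &Expr) -> bool {
    match (a, b) {
        (
            Expr::Column { table: t1, column: c1 },
            Expr::Column { table: t2, column: c2 },
        ) => t1 == t2 && c1 == c2,
        // Literals are compared verbatim: 'a' and 'A' are different values.
        (Expr::Literal(l1), Expr::Literal(l2)) => l1 == l2,
        (
            Expr::FunctionCall { name: n1, args: a1 },
            Expr::FunctionCall { name: n2, args: a2 },
        ) => {
            n1.eq_ignore_ascii_case(n2)
                && a1.len() == a2.len()
                && a1.iter().zip(a2).all(|(x, y)| exprs_are_equivalent(x, y))
        }
        // Different kinds of expressions are never equivalent.
        _ => false,
    }
}
```

With a function like this, the derived `==` would still report `sum(x) != SUM(x)`, while `exprs_are_equivalent` would let the ORDER BY and SELECT deduplication treat them as the same computation.
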
## In short

- No more state machines
- The operator tree is now only a "join tree", pretty much
- No weird general purpose `ExpressionResultCache`
- More direct mapping between SQL operations and generated bytecode -- there's really no harm in carrying the "group by" etc. concepts in the bytecode generation phase instead of burying them inside Operators
- When a ResultRow is emitted, it is _always_ done by evaluating `plan.result_columns`, instead of the special-casing and hacks that existed previously
- 600+ LOC removed

Closes #416
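
To make the summary above a bit more concrete, the linear shape of the code generation can be sketched as a toy model. Everything here is an assumption for illustration: the `Plan` and `Program` types are hypothetical stand-ins, expressions are plain strings, and the emitted "instructions" are descriptive labels rather than real bytecode. Only the names `emit_program`, `emit_insn`, `plan.result_columns`, `plan.group_by`, `plan.order_by` and `plan.limit` are taken from the description.

```rust
// Hypothetical stand-ins; not the real Limbo types.
#[derive(Debug, Default)]
pub struct Plan {
    pub source_tables: Vec<String>, // stands in for the join tree
    pub where_clause: Vec<String>,
    pub group_by: Option<Vec<String>>,
    pub order_by: Option<Vec<String>>,
    pub limit: Option<usize>,
    pub result_columns: Vec<String>, // always evaluated for the final row
}

#[derive(Debug, Default)]
pub struct Program {
    pub insns: Vec<String>, // labels instead of real instructions
}

impl Program {
    pub fn emit_insn(&mut self, insn: impl Into<String>) {
        self.insns.push(insn.into());
    }
}

// The single place where a result row is produced.
fn emit_result_columns(program: &mut Program, plan: &Plan, destination: &str) {
    for column in &plan.result_columns {
        program.emit_insn(format!("evaluate {}", column));
    }
    program.emit_insn(format!("emit row into {}", destination));
}

pub fn emit_program(plan: &Plan) -> Program {
    let mut program = Program::default();
    // Step 1: open the cursors.
    for table in &plan.source_tables {
        program.emit_insn(format!("open cursor on {}", table));
    }
    // Step 2: open the loops; WHERE predicates are checked inside them.
    for table in &plan.source_tables {
        program.emit_insn(format!("open loop over {}", table));
    }
    for predicate in &plan.where_clause {
        program.emit_insn(format!("skip row unless {}", predicate));
    }
    // Step 3: in the loop body, the row goes to exactly one destination.
    if plan.group_by.is_some() {
        program.emit_insn("insert row into GROUP BY sorter");
    } else if plan.order_by.is_some() {
        emit_result_columns(&mut program, plan, "ORDER BY sorter");
    } else {
        emit_result_columns(&mut program, plan, "main output");
    }
    // Step 4: close the loops in reverse order.
    for table in plan.source_tables.iter().rev() {
        program.emit_insn(format!("close loop over {}", table));
    }
    // Step 5: post-loop phases, again as plain conditionals.
    if plan.group_by.is_some() {
        let destination = if plan.order_by.is_some() { "ORDER BY sorter" } else { "main output" };
        program.emit_insn("read groups from GROUP BY sorter");
        emit_result_columns(&mut program, plan, destination);
    }
    if plan.order_by.is_some() {
        program.emit_insn("read sorted rows and emit them into main output");
    }
    // Simplification: a real program would check the limit where rows are emitted.
    if let Some(limit) = plan.limit {
        program.emit_insn(format!("stop after {} result rows", limit));
    }
    program
}
```

The point of the sketch is only the control flow: a fixed sequence of `if` checks over fields of the plan, with one helper that always evaluates `plan.result_columns`, instead of operators driving each other through `step()`.
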
Limbo
Limbo is a work-in-progress, in-process OLTP database management system, compatible with SQLite.
Features
- In-process OLTP database engine library
- Asynchronous I/O support with io_uring
- SQLite compatibility (status)
  - SQL dialect support
  - File format support
  - SQLite C API
- JavaScript/WebAssembly bindings (wip)
Getting Started
CLI
Install limbo with:
curl --proto '=https' --tlsv1.2 -LsSf \
https://github.com/penberg/limbo/releases/latest/download/limbo-installer.sh | sh
Then use the SQL shell to create and query a database:
$ limbo database.db
Limbo v0.0.6
Enter ".help" for usage hints.
limbo> CREATE TABLE users (id INT PRIMARY KEY, username TEXT);
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> INSERT INTO users VALUES (2, 'bob');
limbo> SELECT * FROM users;
1|alice
2|bob
JavaScript (wip)
Installation:
npm i limbo-wasm
Example usage:
import { Database } from 'limbo-wasm';
const db = new Database('sqlite.db');
const stmt = db.prepare('SELECT * FROM users');
const users = stmt.all();
console.log(users);
Python (wip)
pip install pylimbo
Example usage:
import limbo
con = limbo.connect("sqlite.db")
cur = con.cursor()
res = cur.execute("SELECT * FROM users")
print(res.fetchone())
Developing
Run tests:
cargo test
Test coverage report:
cargo tarpaulin -o html
Run benchmarks:
cargo bench
Run benchmarks and generate flamegraphs:
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
cargo bench --bench benchmark -- --profile-time=5
FAQ
How is Limbo different from libSQL?
Limbo is a research project to build a SQLite-compatible in-process database in Rust with native async support. The libSQL project, on the other hand, is an open source, open contribution fork of SQLite, with a focus on production features such as replication, backups, encryption, and so on. There is no hard dependency between the two projects. Of course, if Limbo becomes widely successful, we might consider merging with libSQL, but that is something that will be decided in the future.
Publications
- Pekka Enberg, Sasu Tarkoma, Jon Crowcroft, and Ashwin Rao (2024). Serverless Runtime / Database Co-Design With Asynchronous I/O. In EdgeSys ’24. [PDF]
- Pekka Enberg, Sasu Tarkoma, and Ashwin Rao (2023). Towards Database and Serverless Runtime Co-Design. In CoNEXT-SW ’23. [PDF] [Slides]
Contributing
We'd love to have you contribute to Limbo! Check out the contribution guide to get started.
License
This project is licensed under the MIT license.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Limbo by you, shall be licensed as MIT, without any additional terms or conditions.
