turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-31 15:04:19 +01:00

Author	SHA1	Message	Date
jussisaurio	3e9883bfbd	update COMPAT	2024-11-30 10:06:37 +02:00
jussisaurio	3f80e41e7a	support HAVING	2024-11-30 10:05:13 +02:00
jussisaurio	fceb9ac62b	Merge 'core: (another) refactor of read path query processing logic' from Jussi Saurio

# (another) refactor of read path query processing logic

This PR rewrites our select query processing architecture by moving away from the stateful operator-based execution model, back to a more direct bytecode generation approach that, IMO, is easier to follow. A large part of the bytecode emission itself (`program.emit_insn(...)`) is just copy-pasted from the old implementation (after all, it did _work_), but just structured differently.

## Main Changes

1. Removed the `step()` state machine from operators. Previously, each operator had internal state tracking its execution progress, and parent operators would call `.step()` on their children until they needed to do something else. Reading the code and trying to follow the execution was not very easy, and the abstraction was also too general: there was a lot of unnecessary pattern matching and special casing to make query execution fit the model, when honestly the evaluation of a SELECT without any CTEs or subqueries etc can only go a few different ways.
2. Because of the above change, the main codegen function `emit_program()` now contains a series of linear conditional steps instead of kicking off the state machines with `root_operator.step()`. These steps are just things like: "open the cursors", "open the loops", "emit a record into either the main output or a sorter", etc.
3. The `Plan` struct now (again) contains most of the familiar SELECT query components (WHERE clause, GROUP BY, ORDER BY, etc.) rather than having all of them embedded in a tree of operators. The operator tree now ONLY consists of operators that read from a source table in some way -- so it could just be called a join tree, I guess.
4. There's now `plan.result_columns` which is _ALWAYS_ evaluated to get the final results of a SELECT. Previously the operator state machine thing had a hodgepodge of different ways of arriving at the result row.
5. Removed operators:
   - Removed Filter operator (even in the previous version the Filter operator -- which is really the where clause -- had its predicates pushed down to the table loops, and it didn't really ever exist in the bytecode emission phase anymore)
   - Removed Projection operator (`plan.result_columns`)
   - Removed Limit operator (`plan.limit`)
   - Removed Aggregate operator (`plan.group_by` and `plan.aggregates`)
   - Removed Order operator (`plan.order_by`)
6. Added `ast::Expr::Column` to the vendored sqlite3 parser -- column resolution is now done as early as possible. This eliminates repeated string comparisons during execution. I.e. no need for `resolve_ident_table()` etc
7. Simplified expression result caching by removing the complex, and frankly weird, ExpressionResultCache apparatus. The refactored code handles this by tracking which cursor to read columns from at a given time, and copies values from existing registers if the expression is a computation that has already been done in a previous step of the execution. For example in:

```
limbo> select concat(u.first_name, '-LOL'), sum(u.age) from users u group by concat(u.first_name, '-LOL') order by sum(u.age) desc limit 10;
Michael-LOL|11204
David-LOL|8758
Robert-LOL|8109
Jennifer-LOL|7700
John-LOL|7299
Christopher-LOL|6397
James-LOL|5921
Joseph-LOL|5711
Brian-LOL|5059
William-LOL|5047
```

the query execution engine knows that `concat(u.first_name, '-LOL')` is the second column of the `ORDER_BY` sorter without any complex caching.

HACK: For deduplicating expressions in ORDER BY and the SELECT body, the code still relies on expression `==` equality to make those decisions which sucks (e.g. `sum(x) != SUM(x)` -- I've marked the parts where this is used with a TODO, we should have a custom expression equality comparison function instead...). This is not a correctness-breaking thing, but still.

## In short

- No more state machines
- The operator tree is now only a "join tree", pretty much
- No weird general purpose `ExpressionResultCache`
- More direct mapping between SQL operations and generated bytecode -- there's really no harm in carrying the "group by" etc concepts in the bytecode generation phase instead of burying them inside Operators
- When a ResultRow is emitted, it is _always_ done by evaluating `plan.result_columns`, instead of the special-casing and hacks that existed previously
- 600+ LOC removed

Closes #416	2024-11-30 10:03:58 +02:00
jussisaurio	84742b81fa	Obsolete comment	2024-11-27 22:43:36 +02:00
jussisaurio	da811dc403	add doc comments for members of Plan struct	2024-11-27 19:30:07 +02:00
jussisaurio	db462530f1	metadata instead of m	2024-11-27 19:27:36 +02:00
jussisaurio	7d569aee1f	fix stupid comment	2024-11-26 18:37:06 +02:00
jussisaurio	1b34698872	add comments and rename some misleading label variables	2024-11-26 18:28:19 +02:00
jussisaurio	7f04f8e88f	rename	2024-11-26 17:41:08 +02:00
jussisaurio	122546444f	extract function order_by_sorter_insert()	2024-11-26 17:40:49 +02:00
jussisaurio	3d27ef90f5	emitting result columns generally works the same way -> extract it	2024-11-26 17:31:51 +02:00
jussisaurio	c74981873e	Extract ORDER BY result column deduping into a function	2024-11-26 17:31:51 +02:00
jussisaurio	89569fa7a3	Remove redundant if-else after refactoring ResultSetColumn to struct	2024-11-26 17:31:51 +02:00
jussisaurio	ac12e9c7fd	No need for ResultSetColumn to be an enum	2024-11-26 17:31:51 +02:00
jussisaurio	bb8ba7fb01	add tests for arithmetic on two aggregates with no from clause	2024-11-26 17:31:51 +02:00
jussisaurio	7d5fa12bb7	fix allocating wrong number of registers upfront for aggregation results	2024-11-26 17:31:51 +02:00
jussisaurio	4636f71522	test ordering by aggregate not mentioned in select	2024-11-26 17:31:51 +02:00
jussisaurio	56b15193d0	resolve aggregates from orderby as well	2024-11-26 17:31:51 +02:00
jussisaurio	885b6ecd76	Remove 'cursor_hint': it is never needed	2024-11-26 17:31:51 +02:00
jussisaurio	008be10cfd	Add TODO about expression equality comparisons	2024-11-26 17:31:51 +02:00
jussisaurio	cfb7e79601	Function doc comments	2024-11-26 17:31:51 +02:00
jussisaurio	fc33c70481	remove many unnecessary fields from SortMetadata and GroupByMetadata	2024-11-26 17:31:51 +02:00
jussisaurio	ebce78bcd9	rename	2024-11-26 17:31:51 +02:00
jussisaurio	0510e150d3	fix comment	2024-11-26 17:31:51 +02:00
jussisaurio	1c37d8b24b	extract function sorter_insert()	2024-11-26 17:31:51 +02:00
jussisaurio	4f3da982c0	extract function emit_result_row()	2024-11-26 17:31:51 +02:00
jussisaurio	52beeabd45	tweaks	2024-11-26 17:31:51 +02:00
jussisaurio	120601f732	fix metadata comments	2024-11-26 17:31:51 +02:00
jussisaurio	97ba4a788e	remove sorts hashmap - only one sortmetadata struct is needed	2024-11-26 17:31:51 +02:00
jussisaurio	d2f84edd2e	fix accidentally removing push_scan_direction()	2024-11-26 17:31:51 +02:00
jussisaurio	7ecc252507	fix rest of the failing tests	2024-11-26 17:31:51 +02:00
jussisaurio	9a557516b8	Fixes for expressions with aggregate arguments + limit 0	2024-11-26 17:31:51 +02:00
jussisaurio	cc902ed25d	GROUP BY and ORDER BY mostly work	2024-11-26 17:31:51 +02:00
jussisaurio	3f9e60633f	select refactor: order by and basic agg kinda work	2024-11-26 17:31:51 +02:00
jussisaurio	d0466e1cae	introduce Column member of ast::Expr and bind idents to columns	2024-11-26 17:31:51 +02:00
Pekka Enberg	4c5f9eb73b	Merge 'contributing: Add note about testing against TPC-H databases' from Jussi Saurio Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com> Closes #419	2024-11-26 15:56:43 +02:00
jussisaurio	574f52ddbb	Add note about testing against TPC-H databases	2024-11-25 21:57:34 +02:00
jussisaurio	418ad40401	Merge 'Fix some Clippy warnings' from Lauri Virtanen Reviewed-by: Pere Diaz Bou <limeng.1@bytedance.com> Closes #417	2024-11-25 16:43:06 +02:00
jussisaurio	1651779e4c	Merge 'Improve maths support' from Lauri Virtanen

- Add support for division in SQL expressions
- Fix issues with subtraction
- Support multiplication of integers and floats
- Support aggregate functions in mathematical expressions
- Add compatibility tests for mathematical operations, also with aggregate functions

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #418	2024-11-25 16:36:45 +02:00
Lauri Virtanen	1b2835b316	Add math operator compatibility tests	2024-11-24 22:12:23 +02:00
Lauri Virtanen	70c4d6b360	Support multiplying combinations of different types	2024-11-24 22:11:37 +02:00
Lauri Virtanen	af9d407dee	Fix issues with subtraction of different type combinations	2024-11-24 22:10:23 +02:00
Lauri Virtanen	cafbf5499f	Support divide operator in expressions	2024-11-24 22:10:07 +02:00
Lauri Virtanen	afeb1cbe74	Clippy warning fixes	2024-11-24 20:24:47 +02:00
Lauri Virtanen	a7100d8e9b	Autofix clippy issues with `cargo fix --clippy`	2024-11-24 20:24:47 +02:00
Pekka Enberg	aa4dd5c8e7	Merge 'wal: checksums' from Pere Diaz Bou

Implemented checksums so that sqlite3 is able to read our WAL. This also helps with future work on proper recovery of WAL.

Create some frames with CREATE TABLE and kill the process so that there is no checkpoint.

```
Limbo v0.0.6
Enter ".help" for usage hints.
limbo> create table x(x);
limbo>
[1] 15910 killed cargo run xlimbo.db
```

Now sqlite3 is able to recover from this WAL created in limbo:

```
sqlite3 xlimbo.db
SQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE x (x);
```

Closes #413	2024-11-22 13:21:01 +02:00
jussisaurio	046c4933a5	Merge 'io macro' from Jussi Saurio just having some dang fun Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com> Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com> Closes #385	2024-11-21 20:36:07 +02:00
jussisaurio	c722074016	missing cursorresult handling	2024-11-21 20:29:46 +02:00
jussisaurio	f945795ae6	consistent naming	2024-11-21 20:25:51 +02:00
jussisaurio	d8eb4be424	better, less cool names	2024-11-21 20:23:53 +02:00
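
The merge message for fceb9ac62b in the list above notes that deduplicating expressions between ORDER BY and the SELECT body still relies on `==`, so `sum(x) != SUM(x)`, and that a custom comparison function would be better. A minimal sketch of what such a comparison could look like follows. The `Expr` enum and the name `exprs_are_equivalent` are hypothetical and heavily simplified; they are not the `ast::Expr` of the vendored sqlite3 parser nor anything taken from the repository.

```rust
// Hypothetical, simplified expression type for illustration only.
// It is NOT the `ast::Expr` from the vendored sqlite3 parser.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // A column already bound to a (table index, column index) pair.
    Column { table: usize, column: usize },
    // A function call such as sum(x); SQL function names are case-insensitive.
    FunctionCall { name: String, args: Vec<Expr> },
    // A literal, compared verbatim.
    Literal(String),
}

/// Structural equality that ignores ASCII case in function names.
/// Assumed helper name; the commit only says such a function is a TODO.
fn exprs_are_equivalent(a: &Expr, b: &Expr) -> bool {
    match (a, b) {
        (
            Expr::FunctionCall { name: n1, args: a1 },
            Expr::FunctionCall { name: n2, args: a2 },
        ) => {
            n1.eq_ignore_ascii_case(n2)
                && a1.len() == a2.len()
                && a1.iter().zip(a2).all(|(x, y)| exprs_are_equivalent(x, y))
        }
        // Columns and literals fall back to the derived `==`.
        _ => a == b,
    }
}

fn main() {
    let x = Expr::Column { table: 0, column: 3 };
    let lower = Expr::FunctionCall { name: "sum".to_string(), args: vec![x.clone()] };
    let upper = Expr::FunctionCall { name: "SUM".to_string(), args: vec![x] };
    let one = Expr::Literal("1".to_string());

    // Derived equality treats the two spellings as different expressions.
    assert!(lower != upper);
    // The custom comparison treats them as the same computation.
    assert!(exprs_are_equivalent(&lower, &upper));
    assert!(!exprs_are_equivalent(&lower, &one));
    println!("sum(x) ~ SUM(x): {}", exprs_are_equivalent(&lower, &upper));
}
```

With a comparison like this, an ORDER BY term written as `SUM(u.age)` could be matched to a `sum(u.age)` result column and its register reused, instead of being computed twice.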

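The 'wal: checksums' merge (aa4dd5c8e7) in the list above says only that checksums were implemented so that sqlite3 can read the WAL; it shows no code. For orientation, here is a sketch of the cumulative checksum as the SQLite file format documentation describes it: the input is read as pairs of 32-bit words, and the WAL header's magic number selects the byte order. The function name and signature are assumptions for illustration, not the repository's implementation.

```rust
/// Cumulative WAL-style checksum over `data`, whose length must be a
/// multiple of 8 bytes. `big_endian` selects how each 32-bit word is
/// decoded; `init` carries the running checksum from earlier data.
/// Hypothetical helper, sketched from the SQLite file format description.
fn wal_checksum(data: &[u8], big_endian: bool, init: (u32, u32)) -> (u32, u32) {
    assert!(data.len() % 8 == 0, "input must be a multiple of 8 bytes");
    let (mut s0, mut s1) = init;
    for w in data.chunks_exact(8) {
        let first = [w[0], w[1], w[2], w[3]];
        let second = [w[4], w[5], w[6], w[7]];
        let (x0, x1) = if big_endian {
            (u32::from_be_bytes(first), u32::from_be_bytes(second))
        } else {
            (u32::from_le_bytes(first), u32::from_le_bytes(second))
        };
        // s0 += x[i] + s1; s1 += x[i+1] + s0 (wrapping 32-bit arithmetic).
        s0 = s0.wrapping_add(x0).wrapping_add(s1);
        s1 = s1.wrapping_add(x1).wrapping_add(s0);
    }
    (s0, s1)
}

fn main() {
    let first = [0u8, 0, 0, 1, 0, 0, 0, 2];
    let second = [0u8, 0, 0, 3, 0, 0, 0, 4];

    // Checksumming in two steps, feeding the first result in as `init`,
    // matches checksumming the concatenated bytes in one call.
    let step1 = wal_checksum(&first, true, (0, 0));
    let chained = wal_checksum(&second, true, step1);

    let mut both = first.to_vec();
    both.extend_from_slice(&second);
    assert_eq!(chained, wal_checksum(&both, true, (0, 0)));
    println!("{:?}", chained);
}
```

Because the checksum is cumulative, each frame's checksum depends on everything written before it, which is what lets a reader detect where valid WAL content ends after a crash like the one shown in the merge message.
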

1213 Commits