turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-16 06:34:20 +01:00

Author	SHA1	Message	Date
meteorgan	6179d8de23	refactor compound select	2025-06-13 10:39:32 +03:00
Levy A.	de2ac89ad2	feat: complete ALTER TABLE implementation	2025-06-11 14:17:36 -03:00
pedrocarlo	bfc8cb6d4c	move display and to_sql_string impls to separate modules for plan	2025-06-04 12:06:43 -03:00
pedrocarlo	fa0dff9843	Fix rebase changes	2025-06-04 12:06:43 -03:00
pedrocarlo	a96577529e	impl ToSqlString for Update Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	d243d1015c	impl ToSqlString for Delete Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	ff5aa17769	impl ToSqlString for CompoundSelect Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	51014d01c3	impl ToSqlString for SelectPlan	2025-06-04 12:06:43 -03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	124b38a262	plan.rs: add new data structures - TableReferences struct, which holds both: - joined_tables, and - outer_query_refs - JoinedTable: - this is just a rename of the previous TableReference struct - OuterQueryReference - this is to distinguish from JoinedTable those cases where e.g. a subquery refers to an outer query's table, or a CTE refers to a previous CTE. Both JoinedTable and OuterQueryReference can be referred to by expressions, but only JoinedTables are considered for join ordering optimization and so forth. This commit does not compile.	2025-05-29 11:03:09 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: `program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (`Option<String>` -- in case of index cursors). This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second 't', so there is no clash.	2025-05-29 00:59:24 +03:00
Jussi Saurio	73e806ad84	Make WhereTerm::consumed a Cell<bool> Currently in the main translation logic after planning and optimization, we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except for the fact that virtual table constraint resolution is done ad-hoc in `init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed` which is a boolean indicating that the term has been "used up" by the optimizer and shouldn't be evaluated as a normal where clause condition anymore. In the upcoming branch for WHERE clause subqueries, I want to store immutable references to WHERE clause expressions in `Resolver`, but this is unfortunately not possible if we still use the aforementioned mutable references. Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed once the virtual table constraint resolution is moved to an earlier part of the query processing pipeline.	2025-05-28 11:02:39 +03:00
Jussi Saurio	07fa3a9668	Rename SelectQueryType to QueryDestination	2025-05-25 21:23:04 +03:00
Jussi Saurio	d893a55c55	UNION	2025-05-25 21:23:04 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for 
each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
Jussi Saurio	08bda9cc58	UNION ALL	2025-05-24 13:12:41 +03:00
Jussi Saurio	c18c6a00fa	refactor: use walk_expr() in resolving vtab constraints	2025-05-23 16:28:56 +03:00
Jussi Saurio	597020bc0c	Merge 'Support values statement and values in select' from meteorgan Close: #866 limbo output: ``` limbo> explain values(1, 2); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 5 0 0 Start at 5 1 Integer 1 1 0 0 r[1]=1 2 Integer 2 2 0 0 r[2]=2 3 ResultRow 1 2 0 0 output=r[1..2] 4 Halt 0 0 0 0 5 Goto 0 1 0 0 limbo> explain values(1, 2), (3, 4); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 2 0 0 r[2]=1 3 Integer 2 3 0 0 r[3]=2 4 Yield 1 0 0 0 5 Integer 3 2 0 0 r[2]=3 6 Integer 4 3 0 0 r[3]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 11 Copy 2 4 0 0 r[4]=r[2] 12 Copy 3 5 0 0 r[5]=r[3] 13 ResultRow 4 2 0 0 output=r[4..5] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 limbo> explain select * from (values(1, 2), (3, 4)); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 2 0 0 r[2]=1 3 Integer 2 3 0 0 r[3]=2 4 Yield 1 0 0 0 5 Integer 3 2 0 0 r[2]=3 6 Integer 4 3 0 0 r[3]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 11 Copy 2 4 0 0 r[4]=r[2] 12 Copy 3 5 0 0 r[5]=r[3] 13 ResultRow 4 2 0 0 output=r[4..5] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Transaction 0 0 0 0 write=false 17 Goto 0 1 0 0 ``` sqlite output: ``` sqlite> explain values(1, 2); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 5 0 0 Start at 5 1 Integer 1 1 0 0 r[1]=1 2 Integer 2 2 0 0 r[2]=2 3 ResultRow 1 2 0 0 output=r[1..2] 4 Halt 0 0 0 0 5 Goto 0 1 0 0 sqlite> explain values(1, 2), (3, 4); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 4 0 0 r[4]=1 3 
Integer 2 5 0 0 r[5]=2 4 Yield 1 0 0 0 5 Integer 3 4 0 0 r[4]=3 6 Integer 4 5 0 0 r[5]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 next row of 2-ROW VALUES CLAUSE 11 Copy 4 8 0 2 r[8]=r[4] 12 Copy 5 9 0 2 r[9]=r[5] 13 ResultRow 8 2 0 0 output=r[8..9] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 sqlite> explain select * from (values(1, 2), (3, 4)); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 4 0 0 r[4]=1 3 Integer 2 5 0 0 r[5]=2 4 Yield 1 0 0 0 5 Integer 3 4 0 0 r[4]=3 6 Integer 4 5 0 0 r[5]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 next row of 2-ROW VALUES CLAUSE 11 Copy 4 8 0 2 r[8]=r[4] 12 Copy 5 9 0 2 r[9]=r[5] 13 ResultRow 8 2 0 0 output=r[8..9] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 ``` Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1549	2025-05-23 13:56:31 +03:00
Jussi Saurio	128a406f8c	TableReference: fix stale comment	2025-05-23 10:06:24 +03:00
meteorgan	0467d7e11b	Support values statement and values in select	2025-05-23 00:29:54 +08:00
Jussi Saurio	6ed5412bde	extract method	2025-05-22 16:51:03 +03:00
Jussi Saurio	76227ec274	Rename to Distinctness + add distinctness information to SelectPlan	2025-05-22 16:51:03 +03:00
Jussi Saurio	14058357ad	Merge 'refactor: replace Operation::Subquery with Table::FromClauseSubquery' from Jussi Saurio Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. ``` SELECT * FROM (SELECT ...) sub; -- the subquery here was previously interpreted as Operation::Subquery on a Table::Pseudo, -- with a lot of special handling for Operation::Subquery in different code paths -- now it's an Operation::Scan on a Table::FromClauseSubquery ``` No functional changes (intended, at least!) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1529	2025-05-20 14:31:42 +03:00
Pekka Enberg	e102cd0be5	Merge 'Add support for DISTINCT aggregate functions' from Jussi Saurio Reviewable commit by commit. CI failures are not related. Adds support for e.g. `select first_name, sum(distinct age), count(distinct age), avg(distinct age) from users group by 1` Implementation details: - Creates an ephemeral index per distinct aggregate, and jumps over the accumulation step if a duplicate is found Closes #1507	2025-05-20 13:58:57 +03:00
Jussi Saurio	3121c6cdd3	Replace Operation::Subquery with Table::FromClauseSubquery Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. No functional changes (intended, at least!)	2025-05-20 12:56:30 +03:00
pedrocarlo	f8854f180a	Added collation to create table columns	2025-05-19 15:22:14 -03:00
Jussi Saurio	d584a1879b	Mark WHERE terms as consumed instead of deleting them We've run into trouble in multiple places due to the fact that we delete terms from the where clause (e.g. when a constant condition is removed, or the term becomes part of an index seek key). A simpler solution is to add a flag indicating that the term is consumed (used), so that it is not translated in the main loop anymore when WHERE clause terms are evaluated.	2025-05-17 15:44:12 +03:00
Jussi Saurio	368c45e025	Add distinctness information to Aggregate struct	2025-05-17 15:33:55 +03:00
Pekka Enberg	e3f71259d8	Rename OwnedValue -> Value We have not had enough merge conflicts for a while so let's do a tree-wide rename.	2025-05-15 09:59:46 +03:00
pedrocarlo	bb158a5433	add unique field to Column	2025-05-14 11:34:11 -03:00
Jussi Saurio	625cf005fd	Add some utilities to constraint related structs	2025-05-14 09:42:26 +03:00
Jussi Saurio	c02d3f8bcd	Do groupby/orderby sort elimination based on optimizer decision	2025-05-14 09:41:13 +03:00
PThorpe92	0593a99f0e	Remove insertCtx from parameters and replace fix with expr rewriting	2025-05-13 12:49:16 -04:00
pedrocarlo	9f726dbe62	simplify simple count detection	2025-05-10 22:36:43 -03:00
pedrocarlo	e9b1631d3c	fix is_simple_count detection	2025-05-10 22:23:01 -03:00
Pekka Enberg	97ad25c506	Merge 'Initial implementation of `ALTER TABLE RENAME`' from Levy A. - [x] `ALTER TABLE _ RENAME TO _` Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1456	2025-05-10 07:57:42 +03:00
Levy A.	023a116b0d	feat: initial implementation of `ALTER TABLE` only supporting renaming tables	2025-05-08 09:24:56 -03:00
Jussi Saurio	37097e01ae	GROUP BY: refactor logic to support cases where no sorting is needed	2025-05-08 12:39:26 +03:00
Jussi Saurio	330fedbc2f	Add notion of join ordering to plan + make determining where to eval expr dynamic always	2025-05-03 15:32:06 +03:00
Jussi Saurio	306e097950	Merge 'Fix bug: we cant remove order by terms from the head of the list' from Jussi Saurio we had an incorrect optimization in `eliminate_orderby_like_groupby()` where it could remove e.g. the first term of the ORDER BY if it matched the first GROUP BY term and the result set was naturally ordered by that term. this is invalid. see e.g.: ```sql main branch - BAD: removes the `ORDER BY id` term because the results are naturally ordered by id. However, this results in sorting the entire thing by last name only! limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌──────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├──────┼───────────┼───────────┤ │ 6235 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 8043 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 944 │ Zimmerman │ 1 │ └──────┴───────────┴───────────┘ after fix - GOOD: limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├────┼───────────┼───────────┤ │ 1 │ Foster │ 1 │ ├────┼───────────┼───────────┤ │ 2 │ Salazar │ 1 │ ├────┼───────────┼───────────┤ │ 3 │ Perry │ 1 │ └────┴───────────┴───────────┘ ``` I also refactored sorters to always use the ast `SortOrder` instead of boolean vectors, and use the `compare_immutable()` utility we use inside btrees too. Closes #1365	2025-05-03 12:48:08 +03:00
Pere Diaz Bou	64a12ed887	update index on indexed columns Previously columns that were indexed were updated only in the BtreeTable, but not on Index table. This commit basically enables updates on indexes too if they are needed.	2025-05-01 11:16:29 +03:00
Pere Diaz Bou	63a94e7c62	Merge 'Emit `IdxDelete` instruction and some fixes on seek after deletion' from Pere Diaz Bou Previously `DELETE FROM ...` only emitted deletes for main table, but this is incorrect as we want to remove entries from index tables as well. Closes #1383	2025-04-28 09:13:54 +03:00
Pere Diaz Bou	b7970a286d	implement IdxDelete clippy revert op_idx_ge changes fmt fmt again revert op_idx_gt changes	2025-04-24 16:23:34 +02:00
Jussi Saurio	3798b4aa8b	use SortOrder in sorters always	2025-04-24 10:34:06 +03:00
Pekka Enberg	beaccae664	Merge 'Create an automatic ephemeral index when a nested table scan would otherwise be selected' from Jussi Saurio Closes #747 - Creates an automatic ephemeral (in-memory) index on the right-side table of a join if otherwise a nested table scan would be selected. - This behavior is not hardcoded; instead this PR introduces a (quite dumb) cost estimator that naturally deincentivizes building ephemeral indexes where they don't make sense (e.g. the outermost table). I will probably build this estimator to be smarter in the future when working on join reordering optimizations ### Example bytecode plans and runtimes (note that this is debug mode) Example query with no persistent indexes to choose from. Without ephemeral index it's a nested scan: ```sql limbo> explain select * from t1 natural join t2; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 13 0 0 Start at 13 1 OpenRead 0 2 0 0 table=t1, root=2 2 OpenRead 1 3 0 0 table=t2, root=3 3 Rewind 0 12 0 0 Rewind t1 4 Rewind 1 11 0 0 Rewind t2 5 Column 0 0 2 0 r[2]=t1.a 6 Column 1 0 3 0 r[3]=t2.a 7 Ne 2 3 10 0 if r[2]!=r[3] goto 10 8 Column 0 0 1 0 r[1]=t1.a 9 ResultRow 1 1 0 0 output=r[1] 10 Next 1 5 0 0 11 Next 0 4 0 0 12 Halt 0 0 0 0 13 Transaction 0 0 0 0 write=false 14 Goto 0 1 0 0 limbo> .timer on limbo> select * from t1 natural join t2; ┌───┐ │ a │ ├───┤ └───┘ Command stats: ---------------------------- total: 953 ms (this includes parsing/coloring of cli app) ``` Same query with autoindexing enabled: ```sql limbo> explain select * from t1 natural join t2; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 22 0 0 Start at 22 1 OpenRead 0 2 0 0 table=t1, root=2 2 OpenRead 1 3 0 0 table=t2, root=3 3 Rewind 0 21 0 0 Rewind t1 4 Once 12 0 0 0 goto 12 # execute block 5-11 only once, on subsequent iters jump straight to 12 5 OpenAutoindex 3 0 0 0 cursor=3 6 Rewind 1 12 0 0 Rewind t2 # 
open source table for ephemeral index 7 Column 1 0 2 0 r[2]=t2.a 8 RowId 1 3 0 0 r[3]=t2.rowid 9 MakeRecord 2 2 4 0 r[4]=mkrec(r[2..3]) 10 IdxInsert 3 4 2 0 key=r[4] # insert stuff to ephemeral index 11 Next 1 7 0 0 12 Column 0 0 5 0 r[5]=t1.a 13 IsNull 5 20 0 0 if (r[5]==NULL) goto 20 14 SeekGE 3 20 5 0 key=[5..5] # perform seek on ephemeral index 15 IdxGT 3 20 5 0 key=[5..5] 16 DeferredSeek 3 1 0 0 17 Column 0 0 1 0 r[1]=t1.a 18 ResultRow 1 1 0 0 output=r[1] 19 Next 2 15 0 0 20 Next 0 4 0 0 21 Halt 0 0 0 0 22 Transaction 0 0 0 0 write=false 23 Goto 0 1 0 0 limbo> .timer on limbo> select * from t1 natural join t2; ┌───┐ │ a │ ├───┤ └───┘ Command stats: ---------------------------- total: 220 ms (this includes parsing/coloring of cli app) ``` Closes #1356	2025-04-22 13:00:06 +03:00
Timo Kösters	68d8b86bb7	fix: get name of rowid column	2025-04-22 08:46:37 +02:00
Jussi Saurio	af21f60887	translate/main_loop: create autoindex when index.ephemeral=true	2025-04-21 14:59:13 +03:00
Jussi Saurio	c1b2dfc32b	TableReference: add method column_is_used()	2025-04-21 14:59:13 +03:00
Jussi Saurio	40d880c3b0	TableReference: add resolve_cursors() method	2025-04-18 15:12:06 +03:00
Jussi Saurio	d5a6553e63	TableReference: add open_cursors()	2025-04-18 15:12:06 +03:00
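
Commit 77ce4780d9 above describes keying cursors by a `CursorKey` (a `TableInternalId` plus an optional index name) instead of by table name or alias. The following is a minimal sketch of that idea, not the repository's code: `CursorRegistry`, the field names, and the id-allocation scheme are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the types named in commit 77ce4780d9;
// the real definitions in the repository may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TableInternalId(usize);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CursorKey {
    table_id: TableInternalId,
    // Some(..) only for index cursors, None for table cursors.
    index_name: Option<String>,
}

// Assumed helper: maps each distinct key to one cursor id.
#[derive(Default)]
struct CursorRegistry {
    by_key: HashMap<CursorKey, usize>,
}

impl CursorRegistry {
    /// Returns the cursor id for `key`, allocating a fresh id on first use.
    fn alloc_cursor_id(&mut self, key: CursorKey) -> usize {
        let next = self.by_key.len();
        *self.by_key.entry(key).or_insert(next)
    }
}

fn main() {
    let mut cursors = CursorRegistry::default();

    // SELECT * FROM t WHERE EXISTS (SELECT * FROM t)
    // Both references are named `t`, but each has its own internal id.
    let outer_t = CursorKey { table_id: TableInternalId(1), index_name: None };
    let inner_t = CursorKey { table_id: TableInternalId(2), index_name: None };

    let outer = cursors.alloc_cursor_id(outer_t.clone());
    let inner = cursors.alloc_cursor_id(inner_t);

    // The two references no longer share a cursor.
    assert_ne!(outer, inner);
    // Asking again for the same key yields the same cursor.
    assert_eq!(outer, cursors.alloc_cursor_id(outer_t));
}
```

With a name-based key, both lookups would have used `"t"` and collided; with the composite key they stay distinct even though the SQL text is identical.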

121 Commits
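
Commit 7c07c09300 in the listing argues that column references should carry a stable `TableInternalId` rather than a position in `table_references`, so that they survive transformations such as subquery unnesting. A rough sketch of why that helps, assuming simplified hypothetical types (`ColumnRef`, `position_of`, and the concrete id values are illustrative and not taken from the repository):

```rust
// Hypothetical, simplified stand-ins for the structures discussed in
// commit 7c07c09300; the real types in the repository may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableInternalId(usize);

#[derive(Debug, Clone)]
struct TableReference {
    internal_id: TableInternalId,
    name: String,
}

/// A column reference that stores a stable id rather than a list position.
#[derive(Debug, Clone, Copy)]
struct ColumnRef {
    table: TableInternalId,
    column: usize,
}

/// Resolves a stable id to the table's current position in the list.
fn position_of(tables: &[TableReference], id: TableInternalId) -> Option<usize> {
    tables.iter().position(|t| t.internal_id == id)
}

fn main() {
    // Subquery table list: [orders]; o.amount is column 1 of `orders`.
    let orders = TableReference { internal_id: TableInternalId(7), name: "orders".into() };
    let o_amount = ColumnRef { table: orders.internal_id, column: 1 };
    let subquery_tables = vec![orders];
    assert_eq!(position_of(&subquery_tables, o_amount.table), Some(0));

    // Unnesting folds the subquery's tables into the parent list: [users, orders].
    let mut parent_tables = vec![TableReference {
        internal_id: TableInternalId(3),
        name: "users".into(),
    }];
    parent_tables.extend(subquery_tables);

    // The same ColumnRef still resolves to `orders`, now at position 1,
    // without rewriting any expression.
    assert_eq!(position_of(&parent_tables, o_amount.table), Some(1));
    println!("{} -> column {}", parent_tables[1].name, o_amount.column);
}
```

Had `ColumnRef` stored the positional index 0 instead, it would have pointed at `users` after the fold, which is the failure mode the commit message walks through.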