turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-06 17:54:20 +01:00

Author	SHA1	Message	Date
Levy A.	15e0cab8d8	refactor+fix: precompute default values from schema	2025-06-11 14:18:39 -03:00
Levy A.	7638b0dab7	fix: use default value on empty columns added via ALTER TABLE	2025-06-11 14:18:19 -03:00
krishvishal	5837f7329f	clean up	2025-06-11 00:33:47 +05:30
krishvishal	6c04c18f87	Add affinity flag to comparison opcodes	2025-06-11 00:33:47 +05:30
krishvishal	9130b25111	Add `jump_if_null` flag for rowid alias based seeks	2025-06-11 00:33:05 +05:30
Jussi Saurio	2bac140d73	Remove SeekOp::EQ and encode eq_only in LE&GE - needed for iteration direction aware equality seeks	2025-06-10 14:16:26 +03:00
Jussi Saurio	31b37332d5	all index cursors must be opened when DELETE does an index seek too	2025-06-03 15:18:45 +03:00
Jussi Saurio	06626f72eb	Fix cursors not being opened for indexes in DELETE	2025-06-03 14:45:01 +03:00
Jussi Saurio	819a6138d0	Merge 'Fix: aggregate regs must be initialized as NULL at the start' from Jussi Saurio Again found when fuzzing nested where clause subqueries: Aggregate registers need to be NULLed at the start because the same registers might be reused on another invocation of a subquery, and if they are not NULLed, the 2nd invocation of the same subquery will have values left over from the first invocation. Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1614	2025-05-30 09:39:37 +03:00
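The stale-register problem described in the merge above can be illustrated with a toy register file. This is only a sketch: `Value`, `run_sum_subquery`, and the `init_null` switch are hypothetical names invented for illustration, not part of the project's VDBE.

```rust
/// Hypothetical register value; `Null` stands in for SQL NULL.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Null,
    Integer(i64),
}

/// Runs a toy `SUM` "subquery" that accumulates into `regs[agg_reg]`.
/// When `init_null` is true, the aggregate register is reset to NULL first,
/// which is the behaviour the fix asks for.
fn run_sum_subquery(regs: &mut [Value], agg_reg: usize, rows: &[i64], init_null: bool) -> Value {
    if init_null {
        regs[agg_reg] = Value::Null;
    }
    for &row in rows {
        let current = regs[agg_reg];
        regs[agg_reg] = match current {
            Value::Null => Value::Integer(row),
            Value::Integer(acc) => Value::Integer(acc + row),
        };
    }
    regs[agg_reg]
}
```

Without the reset, a second invocation that reuses the same register starts from the first invocation's total instead of from NULL.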
Jussi Saurio	f8257df77b	Fix: aggregate regs must be initialized as NULL at the start	2025-05-29 18:44:53 +03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: `program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (`Option<String>`) -- in case of index cursors. This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second 't', so there is no clash.	2025-05-29 00:59:24 +03:00
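A minimal sketch of the keying scheme described in the commit above. The type names `CursorKey` and `TableInternalId` come from the commit message; their fields, the `CursorRegistry` container, and the method signatures are assumptions made for illustration and may differ from the actual `ProgramBuilder`.

```rust
use std::collections::HashMap;

/// Assumed shape: a plain numeric id that is unique per table reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TableInternalId(usize);

/// Key combining the table's internal id with an optional index name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CursorKey {
    table_id: TableInternalId,
    index_name: Option<String>,
}

/// Hypothetical stand-in for the cursor bookkeeping in the program builder.
#[derive(Default)]
struct CursorRegistry {
    next_id: usize,
    by_key: HashMap<CursorKey, usize>,
}

impl CursorRegistry {
    /// Allocates a fresh cursor id and records it under `key`.
    fn alloc_cursor_id(&mut self, key: CursorKey) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        self.by_key.insert(key, id);
        id
    }

    /// Looks up the cursor id previously allocated for `key`.
    fn resolve(&self, key: &CursorKey) -> Option<usize> {
        self.by_key.get(key).copied()
    }
}
```

With this keying, the outer and inner `t` in `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` resolve to different cursors as long as each table reference carries its own `TableInternalId`.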
Jussi Saurio	73e806ad84	Make WhereTerm::consumed a Cell<bool> Currently in the main translation logic after planning and optimization, we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except for the fact that virtual table constraint resolution is done ad-hoc in `init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed` which is a boolean indicating that the term has been "used up" by the optimizer and shouldn't be evaluated as a normal where clause condition anymore. In the upcoming branch for WHERE clause subqueries, I want to store immutable references to WHERE clause expressions in `Resolver`, but this is unfortunately not possible if we still use the aforementioned mutable references. Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed once the virtual table constraint resolution is moved to an earlier part of the query processing pipeline.	2025-05-28 11:02:39 +03:00
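The interior-mutability trick in the commit above can be sketched as follows. `WhereTerm::consumed` is named in the commit message; the `expr: String` field and the two helper functions are hypothetical simplifications.

```rust
use std::cell::Cell;

/// Simplified WHERE term: the real struct holds an expression tree, not a string.
struct WhereTerm {
    expr: String,
    consumed: Cell<bool>,
}

/// Marks every term containing `needle` as consumed.
/// Only a shared slice is needed because `Cell` allows mutation through `&`.
fn consume_matching(terms: &[WhereTerm], needle: &str) {
    for term in terms {
        if term.expr.contains(needle) {
            term.consumed.set(true);
        }
    }
}

/// Returns the terms that still need to be evaluated as ordinary conditions.
fn remaining(terms: &[WhereTerm]) -> Vec<&str> {
    terms
        .iter()
        .filter(|t| !t.consumed.get())
        .map(|t| t.expr.as_str())
        .collect()
}
```

Because `consume_matching` takes `&[WhereTerm]`, other code can hold immutable references to the same terms while the flag is flipped.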
Jussi Saurio	4e9d9a2470	Fix LIMIT handling Currently we have some usages of LIMIT where the actual limit counter is initialized next to the DecrJumpZero instruction, and then `program.mark_last_insn_constant()` is used to hoist the counter initialization to the beginning of the program. This is very fragile, and already FROM clause subquery handling works around this with a hack (removed in this PR), and (upcoming) WHERE clause subqueries would also run into problems because of this, because the LIMIT might need to be initialized once for every iteration of the subquery. This PR removes those usages for LIMIT, and LIMIT processing is now more intuitive: - limit counter is now initialized at the start of the query processing - a function init_limit() is extracted to do this for select/update/delete	2025-05-27 21:12:22 +03:00
Jussi Saurio	07fa3a9668	Rename SelectQueryType to QueryDestination	2025-05-25 21:23:04 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference	2025-05-25 20:26:17 +03:00
Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)
-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
  (SELECT user_id, SUM(amount) as total
   FROM orders o
   WHERE o.amount > 100
   GROUP BY o.user_id) sub
WHERE u.id = sub.user_id
-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:
SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;
-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.
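To make the benefit concrete, here is a small sketch of resolving a column reference by stable id instead of by list position. `TableInternalId` and `TableReference` are named in the commit; the fields, the `ColumnRef` struct, and `resolve` are hypothetical simplifications of the planner's types.

```rust
/// Assumed shape: an opaque id assigned once per table reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableInternalId(usize);

/// Simplified table reference carrying its stable id.
struct TableReference {
    internal_id: TableInternalId,
    name: String,
}

/// Simplified column reference: points at a table by id, not by list index.
struct ColumnRef {
    table: TableInternalId,
    column: usize,
}

/// Finds the referenced table regardless of where it sits in the list.
fn resolve<'a>(tables: &'a [TableReference], col: &ColumnRef) -> Option<&'a TableReference> {
    tables.iter().find(|t| t.internal_id == col.table)
}
```

If `orders` moves from position 0 in the subquery's list to position 1 in the parent's list, a `ColumnRef` holding its `TableInternalId` still resolves to `orders` without any rewriting.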
PThorpe92	a2f8b2dfea	Fix add check for invalid argv index for vtab constraints in main loop	2025-05-24 14:49:58 -04:00
Jussi Saurio	f6443ae742	Support LIMIT with UNION ALL	2025-05-24 13:12:41 +03:00
Jussi Saurio	c18c6a00fa	refactor: use walk_expr() in resolving vtab constraints	2025-05-23 16:28:56 +03:00
Jussi Saurio	0c4c451d2a	rename	2025-05-22 16:51:03 +03:00
Jussi Saurio	df8a19767f	Fixes to account for collation	2025-05-22 16:51:03 +03:00
Jussi Saurio	f3ea9a603a	add support for SELECT DISTINCT	2025-05-22 16:51:03 +03:00
Jussi Saurio	b0c3483e94	Allocate ephemeral index for SELECT DISTINCT	2025-05-22 16:51:03 +03:00
Jussi Saurio	76227ec274	Rename to Distinctness + add distinctness information to SelectPlan	2025-05-22 16:51:03 +03:00
Jussi Saurio	696c98877c	Merge 'btree: Remove assumption that all btrees have a rowid' from Jussi Saurio	2025-05-21 14:53:00 +03:00
For example, implementing `SELECT DISTINCT` (#1517) and `UNION` (#1545) require that we are able to create indexes without a rowid column present. Similarly, `WITHOUT ROWID` tables require this. I implemented this by replacing the `rowid` and `empty_record` properties in `BtreeCursor` with

```rust
/// Whether the cursor is currently pointing to a record.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CursorHasRecord {
    Yes {
        rowid: Option<u64>, // not all indexes and btrees have rowids, so this is optional.
    },
    No,
}
```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #1518
Jussi Saurio	14058357ad	Merge 'refactor: replace Operation::Subquery with Table::FromClauseSubquery' from Jussi Saurio	2025-05-20 14:31:42 +03:00
Previously the Operation enum consisted of:
- Operation::Scan
- Operation::Search
- Operation::Subquery

Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery. And the Operation enum only consists of Scan and Search.

```
SELECT * FROM (SELECT ...) sub;
-- the subquery here was previously interpreted as Operation::Subquery on a Table::Pseudo,
-- with a lot of special handling for Operation::Subquery in different code paths
-- now it's an Operation::Scan on a Table::FromClauseSubquery
```

No functional changes (intended, at least!)

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #1529
Jussi Saurio	a7b33b1509	schema: add Index::has_rowid	2025-05-20 14:22:17 +03:00
Jussi Saurio	9d3aca6e8f	Fix compile error after merge	2025-05-20 14:19:32 +03:00
Pekka Enberg	e102cd0be5	Merge 'Add support for DISTINCT aggregate functions' from Jussi Saurio Reviewable commit by commit. CI failures are not related. Adds support for e.g. `select first_name, sum(distinct age), count(distinct age), avg(distinct age) from users group by 1` Implementation details: - Creates an ephemeral index per distinct aggregate, and jumps over the accumulation step if a duplicate is found Closes #1507	2025-05-20 13:58:57 +03:00
Jussi Saurio	3121c6cdd3	Replace Operation::Subquery with Table::FromClauseSubquery Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. No functional changes (intended, at least!)	2025-05-20 12:56:30 +03:00
pedrocarlo	4a3119786e	refactor BtreeCursor and Sorter to accept Vec of collations	2025-05-19 15:22:55 -03:00
pedrocarlo	5bd47d7462	post rebase adjustments to accommodate new instructions that were created before the merge conflicts	2025-05-19 15:22:15 -03:00
pedrocarlo	d0a63429a6	Naive implementation of collate for queries. Not implemented for column constraints	2025-05-19 15:22:14 -03:00
pedrocarlo	b5b1010e7c	set binary collation as default	2025-05-19 15:22:14 -03:00
Jussi Saurio	d584a1879b	Mark WHERE terms as consumed instead of deleting them We've run into trouble in multiple places due to the fact that we delete terms from the where clause (e.g. when a constant condition is removed, or the term becomes part of an index seek key). A simpler solution is to add a flag indicating that the term is consumed (used), so that it is not translated in the main loop anymore when WHERE clause terms are evaluated.	2025-05-17 15:44:12 +03:00
Jussi Saurio	51c75c6014	Support distinct aggregates in GROUP BY	2025-05-17 15:33:55 +03:00
Jussi Saurio	653a3a7e13	Support distinct aggregates in non-GROUPBY context	2025-05-17 15:33:55 +03:00
Jussi Saurio	415c4ee624	Allocate ephemeral index cursors for DISTINCT aggregates	2025-05-17 15:33:55 +03:00
pedrocarlo	5f2216cf8e	modify explain for MakeRecord to show index name	2025-05-14 13:30:39 -03:00
pedrocarlo	5bae32fe3f	modified OpenWrite to include index or table name in explain	2025-05-14 13:30:39 -03:00
Jussi Saurio	625cf005fd	Add some utilities to constraint related structs	2025-05-14 09:42:26 +03:00
Jussi Saurio	1e46f1d9de	Feature: join reordering optimizer	2025-05-14 09:40:48 +03:00
Jussi Saurio	37097e01ae	GROUP BY: refactor logic to support cases where no sorting is needed	2025-05-08 12:39:26 +03:00
Jussi Saurio	330fedbc2f	Add notion of join ordering to plan + make determining where to eval expr dynamic always	2025-05-03 15:32:06 +03:00
Jussi Saurio	029e5eddde	Fix existing resolve_label() calls to work with new system	2025-04-24 11:05:21 +03:00
Jussi Saurio	a7488496d5	expr.is_nonnull(): return true if col.primary_key || col.notnull	2025-04-23 18:10:33 +03:00
Jussi Saurio	af21f60887	translate/main_loop: create autoindex when index.ephemeral=true	2025-04-21 14:59:13 +03:00
Jussi Saurio	83c509a613	Fix bug: left join null flag not being cleared In left joins, even if the join condition is not matched, the system must emit a row for every row of the outer table: -- this must return t1.count() rows, with NULLs for all columns of t2 SELECT * FROM t1 LEFT JOIN t2 ON FALSE; Our logic for clearing the null flag was to do it in Next/Prev. However, this is problematic for a few reasons: - If the inner table of the left join is using SeekRowid, then Next/Prev is never called on its cursor, so the null flag doesn't get cleared. - If the inner table of the left join is using a non-covering index seek, i.e. it iterates its rows using an index, but seeks to the main table to fetch data, then Next/Prev is never called on the main table, and the main table's null flag doesn't get cleared. What this results in is NULL values incorrectly being emitted for the inner table after the first correct NULL row, since the null flag is correctly set to true, but never cleared. This PR fixes the issue by clearing the null flag whenever seek() is invoked on the cursor. Hence, the null flag is now cleared on: - next() - prev() - seek()	2025-04-19 13:56:52 +03:00
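The fix described above, clearing the null flag on `seek()` as well as on `next()` and `prev()`, can be sketched with a toy cursor. The `Cursor` struct below, its fields, and its method signatures are hypothetical and far simpler than the project's btree cursor.

```rust
/// Toy cursor over a list of rowids.
struct Cursor {
    rows: Vec<i64>,
    pos: Option<usize>,
    /// When set, every column read yields NULL (used for unmatched LEFT JOIN rows).
    null_flag: bool,
}

impl Cursor {
    fn new(rows: Vec<i64>) -> Self {
        Cursor { rows, pos: None, null_flag: false }
    }

    fn set_null_flag(&mut self, value: bool) {
        self.null_flag = value;
    }

    /// Positions the cursor on `key`; clears the null flag, as the fix requires.
    fn seek(&mut self, key: i64) -> bool {
        self.null_flag = false;
        self.pos = self.rows.iter().position(|&r| r == key);
        self.pos.is_some()
    }

    /// Reads the current value, or `None` (NULL) if the null flag is set
    /// or the cursor is not positioned on a row.
    fn column(&self) -> Option<i64> {
        if self.null_flag {
            return None;
        }
        self.pos.map(|p| self.rows[p])
    }
}
```

If `seek()` did not reset the flag, a cursor driven only by seeks (for example via `SeekRowid`) would keep returning NULL after the first unmatched outer row.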
Jussi Saurio	6c73db6fd3	feat: use covering indexes whenever possible	2025-04-18 15:13:09 +03:00
PThorpe92	d02900294e	Remove 2nd shell in vtab tests, fix expr translation in main loop	2025-04-17 14:01:45 -04:00


100 Commits