turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-30 13:24:22 +01:00

Author	SHA1	Message	Date
Levy A.	15e0cab8d8	refactor+fix: precompute default values from schema	2025-06-11 14:18:39 -03:00
Jussi Saurio	18dd87eff1	Fix incorrect handling of OR clauses in HAVING	2025-06-10 18:02:14 +03:00
Jussi Saurio	819a6138d0	Merge 'Fix: aggregate regs must be initialized as NULL at the start' from Jussi Saurio Again found when fuzzing nested where clause subqueries: Aggregate registers need to be NULLed at the start because the same registers might be reused on another invocation of a subquery, and if they are not NULLed, the 2nd invocation of the same subquery will have values left over from the first invocation. Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1614	2025-05-30 09:39:37 +03:00
Jussi Saurio	f8257df77b	Fix: aggregate regs must be initialized as NULL at the start	2025-05-29 18:44:53 +03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: `program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (`Option<String>`, in case of index cursors). This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second 't', so there is no clash.	2025-05-29 00:59:24 +03:00
Jussi Saurio	4e9d9a2470	Fix LIMIT handling Currently we have some usages of LIMIT where the actual limit counter is initialized next to the DecrJumpZero instruction, and then `program.mark_last_insn_constant()` is used to hoist the counter initialization to the beginning of the program. This is very fragile, and already FROM clause subquery handling works around this with a hack (removed in this PR), and (upcoming) WHERE clause subqueries would also run into problems because of this, because the LIMIT might need to be initialized once for every iteration of the subquery. This PR removes those usages for LIMIT, and LIMIT processing is now more intuitive: - limit counter is now initialized at the start of the query processing - a function init_limit() is extracted to do this for select/update/delete	2025-05-27 21:12:22 +03:00
Jussi Saurio	70965f4b28	Insn::Return: add possibility to fallthrough on non-integer values as per sqlite spec	2025-05-27 19:09:10 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could of course traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that these kinds of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
Jussi Saurio	f6443ae742	Support LIMIT with UNION ALL	2025-05-24 13:12:41 +03:00
Jussi Saurio	0c4c451d2a	rename	2025-05-22 16:51:03 +03:00
Jussi Saurio	f3ea9a603a	add support for SELECT DISTINCT	2025-05-22 16:51:03 +03:00
Jussi Saurio	76227ec274	Rename to Distinctness + add distinctness information to SelectPlan	2025-05-22 16:51:03 +03:00
Pekka Enberg	e102cd0be5	Merge 'Add support for DISTINCT aggregate functions' from Jussi Saurio Reviewable commit by commit. CI failures are not related. Adds support for e.g. `select first_name, sum(distinct age), count(distinct age), avg(distinct age) from users group by 1` Implementation details: - Creates an ephemeral index per distinct aggregate, and jumps over the accumulation step if a duplicate is found Closes #1507	2025-05-20 13:58:57 +03:00
pedrocarlo	4a3119786e	refactor BtreeCursor and Sorter to accept Vec of collations	2025-05-19 15:22:55 -03:00
pedrocarlo	bf1fe9e0b3	Actually fixed group by and order by collation	2025-05-19 15:22:15 -03:00
pedrocarlo	0df6c87f07	Fixed Group By collation	2025-05-19 15:22:14 -03:00
pedrocarlo	f8854f180a	Added collation to create table columns	2025-05-19 15:22:14 -03:00
Jussi Saurio	51c75c6014	Support distinct aggregates in GROUP BY	2025-05-17 15:33:55 +03:00
pedrocarlo	bb158a5433	add unique field to Column	2025-05-14 11:34:11 -03:00
Jussi Saurio	37097e01ae	GROUP BY: refactor logic to support cases where no sorting is needed	2025-05-08 12:39:26 +03:00
Jussi Saurio	306e097950	Merge 'Fix bug: we cant remove order by terms from the head of the list' from Jussi Saurio we had an incorrect optimization in `eliminate_orderby_like_groupby()` where it could remove e.g. the first term of the ORDER BY if it matched the first GROUP BY term and the result set was naturally ordered by that term. this is invalid. see e.g.: ```sql main branch - BAD: removes the `ORDER BY id` term because the results are naturally ordered by id. However, this results in sorting the entire thing by last name only! limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌──────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├──────┼───────────┼───────────┤ │ 6235 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 8043 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 944 │ Zimmerman │ 1 │ └──────┴───────────┴───────────┘ after fix - GOOD: limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├────┼───────────┼───────────┤ │ 1 │ Foster │ 1 │ ├────┼───────────┼───────────┤ │ 2 │ Salazar │ 1 │ ├────┼───────────┼───────────┤ │ 3 │ Perry │ 1 │ └────┴───────────┴───────────┘ I also refactored sorters to always use the ast `SortOrder` instead of boolean vectors, and use the `compare_immutable()` utility we use inside btrees too. Closes #1365	2025-05-03 12:48:08 +03:00
Jussi Saurio	029e5eddde	Fix existing resolve_label() calls to work with new system	2025-04-24 11:05:21 +03:00
Jussi Saurio	3798b4aa8b	use SortOrder in sorters always	2025-04-24 10:34:06 +03:00
Ihor Andrianov	0c9464e3fc	reduce vec allocations, add comments for magic ifs	2025-04-05 15:15:10 +03:00
Ihor Andrianov	d4b8fa17f8	fix tests	2025-04-03 22:28:14 +03:00
Ihor Andrianov	34a132fcd3	fix output when group by is not part of resulting set	2025-04-03 22:28:13 +03:00
Ihor Andrianov	91ceab1626	improve naming and add comments for context	2025-04-03 22:28:13 +03:00
Ihor Andrianov	816cbacc9c	some smartie optimizations	2025-04-03 22:28:12 +03:00
Ihor Andrianov	2bcdd4e404	non group by cols are displayed in group by agg statements	2025-04-03 22:28:12 +03:00
Ihor Andrianov	40bb867d54	clippy	2025-03-30 19:01:16 +03:00
Ihor Andrianov	db5e364210	made json an optional module again	2025-03-30 19:01:03 +03:00
Ihor Andrianov	101dd51d7c	add jsonb_group_object and array	2025-03-30 18:58:39 +03:00
Ihor Andrianov	a983c979c6	jsonb_merge, json_group_array, json_group_object	2025-03-30 18:47:33 +03:00
Jussi Saurio	89e48a16db	Add affinity() function to Column	2025-02-18 10:56:30 +02:00
Pekka Enberg	ac54c35f92	Switch to workspace dependencies ...makes it easier to specify a version, which is needed for `cargo publish`.	2025-02-12 17:28:04 +02:00
Pekka Enberg	f3902ef9b6	core: Rename OwnedRecord to Record We only have one record type so let's call it `Record`.	2025-02-06 13:40:34 +02:00
Jussi Saurio	795576b2ec	dont eagerly allocate result column name strings	2025-02-05 17:53:23 +02:00
Jussi Saurio	c18c6ad64d	Marginal changes to use new data structures and field names	2025-02-02 10:18:13 +02:00
Glauber Costa	249a8cf8d2	keep type information as a string in column metadata SQLite holds on to it deeply, for example: sqlite> create table a(a int); sqlite> create table b(b integer); sqlite> create table c(c glauber); sqlite> pragma table_info=a; 0|a|INT|0||0 sqlite> pragma table_info=b; 0|b|INTEGER|0||0 sqlite> pragma table_info=c; 0|c|glauber|0||0 So we'll keep it as well so we can produce the same responses.	2025-01-30 19:53:36 -05:00
Glauber Costa	69d3fbc797	keep track of notnull constraint on column creation	2025-01-30 17:04:12 -05:00
Glauber Costa	42f93e9bea	add default type to Column definition	2025-01-30 16:45:57 -05:00
ben594	983fe4c151	Emit Integer, OffsetLimit instructions, and emit IfPos instruction to skip rows Emit Integer, OffsetLimit instructions for offset, and define function to emit IfPos instruction to skip rows Emit IfPos instructions to handle offset for simple select Emit IfPos to handle offset for select with order by Moved repeated emit_offset function call into emit_select_result	2025-01-26 16:40:30 -05:00
Jussi Saurio	2cd9118be6	Fix jump_if_true to be a bool literal in places where it was used as a register number	2025-01-20 17:13:34 +02:00
Jussi Saurio	f88a4d6ac6	Add jump_if_null to cmp insns to account for either operand being NULL	2025-01-20 16:54:39 +02:00
Krishna Vishal	6173aeeb3b	1. Fix merge conflicts 2. change tests for extensions to return error instead of null (Preston)	2025-01-19 04:39:25 +05:30
PThorpe92	0c737d88f7	Support aggregate functions in Extensions	2025-01-17 14:13:57 -05:00
Jussi Saurio	f8b3b06163	Expr: fix recursive binary operation logic	2025-01-15 14:12:08 +02:00
Jussi Saurio	9909539b9d	Store cursor type (table,index,pseudo,sorter) when allocating cursor	2025-01-11 17:04:16 +02:00
PThorpe92	fa0e7d5729	Support nested parenthesized conditional exprs in translator	2025-01-08 17:16:17 -05:00
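
Commit 7c07c09300 above argues that column references should carry a stable table identifier rather than a position in `table_references`. The following is a minimal sketch of that idea, not the project's actual code: `TableRef`, `ColumnRef`, `position_of`, and `fold_subquery` are hypothetical names invented for illustration, and `TableInternalId` here is only a stand-in for the type the commit mentions.

```rust
/// Hypothetical stand-in for the `TableInternalId` named in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TableInternalId(pub usize);

/// Hypothetical, simplified table reference.
#[derive(Debug)]
pub struct TableRef {
    pub internal_id: TableInternalId,
    pub name: String,
}

/// A column reference that stores a stable id instead of a list position.
#[derive(Debug)]
pub struct ColumnRef {
    pub table: TableInternalId,
    pub column: usize,
}

/// Look up where the referenced table currently sits in the list.
pub fn position_of(tables: &[TableRef], col: &ColumnRef) -> Option<usize> {
    tables.iter().position(|t| t.internal_id == col.table)
}

/// Simulate subquery unnesting: drop the subquery placeholder from the
/// parent list and append the subquery's own tables.
pub fn fold_subquery(
    parent: &mut Vec<TableRef>,
    placeholder: TableInternalId,
    sub: Vec<TableRef>,
) {
    parent.retain(|t| t.internal_id != placeholder);
    parent.extend(sub);
}
```

With `users` and a `sub` placeholder in the parent list and `orders` in the subquery list, a `ColumnRef` pointing at `orders` resolves to position 0 before folding and position 1 afterwards, without the reference itself being rewritten.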


52 Commits
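
Commit 77ce4780d9 describes keying cursors by a `CursorKey` (a `TableInternalId` plus an optional index name) so that two references to the same table name get distinct cursors. A rough sketch of how such a key might behave is shown here; `CursorAllocator`, `alloc`, and `resolve` are hypothetical names for illustration and are not taken from the repository's `ProgramBuilder`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `TableInternalId` named in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TableInternalId(pub usize);

/// Key shape described in the commit: table id plus optional index name.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CursorKey {
    pub table: TableInternalId,
    pub index_name: Option<String>,
}

/// Hypothetical allocator that hands out cursor ids and remembers them by key.
#[derive(Default)]
pub struct CursorAllocator {
    next_id: usize,
    by_key: HashMap<CursorKey, usize>,
}

impl CursorAllocator {
    /// Allocate a fresh cursor id and register it under `key`.
    pub fn alloc(&mut self, key: CursorKey) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        self.by_key.insert(key, id);
        id
    }

    /// Find the cursor id previously registered under `key`, if any.
    pub fn resolve(&self, key: &CursorKey) -> Option<usize> {
        self.by_key.get(key).copied()
    }
}
```

For `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`, the outer and inner `t` would carry different `TableInternalId` values, so their keys differ and each resolves to its own cursor even though both share the name `t`.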