turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-15 04:54:20 +01:00

Author	SHA1	Message	Date
Jussi Saurio	4e48e1ffad	Make an exception for Expr::SubqueryResult in collect_result_columns()	2025-10-28 13:11:12 +02:00
Jussi Saurio	c80cf2831d	Support subqueries in all positions of a SELECT statement	2025-10-28 13:11:12 +02:00
Jussi Saurio	49ee5529cb	Evaluate uncorrelated subqueries as early as possible even LIMIT can reference an uncorrelated subquery, so we need to translate them before we do anything with LIMIT.	2025-10-28 13:11:11 +02:00
Jussi Saurio	3294b78051	Initialize LIMIT after ORDER BY / GROUP BY initialization Currently LIMIT 0 jumps to "after the main loop", and it is done before the ORDER BY and GROUP BY cursors have had a chance to be initialized, which causes a panic. Simplest fix for now is to delay the LIMIT initialization.	2025-10-28 13:08:05 +02:00
Jussi Saurio	dae2441dd1	Fix compilation error after incompatible merges	2025-10-28 07:05:18 +02:00
Jussi Saurio	d993ac8157	Merge 'index_method: implement basic trait and simple toy index' from Nikita Sivukhin This PR adds the `index_method` trait and an implementation of a toy sparse vector index. To keep the PR lightweight, index methods are for now not deeply integrated into the query planner; only the components needed to make integration tests that use the `index_method` API directly work are added. Primary changes introduced in this PR are: 1. `SymbolTable` extended with `index_methods` field and builtin extensions populated with 2 native indices: `backing_btree` and `toy_vector_sparse_ivf` 2. `Index` struct extended with `index_method` field which holds `IndexMethodAttachment` constructed for the table with given parameters from `IndexMethod` "factory" trait The toy index implementation stores inverted index pairs `(dimension, rowid)` in the auxiliary BTree index. This index uses the special `backing_btree` index_method, which is marked as `backing_btree: true` and treated in a special way by the db core: this is a real BTree index which is not managed by the tursodb core and must be managed by the index_method that created it (so it is responsible for data population, creation, destruction of this btree). Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3846	2025-10-28 07:01:36 +02:00
Jussi Saurio	9c87b20cb2	Merge 'Where clause subquery support' from Jussi Saurio Closes #1282 # Support for WHERE clause subqueries This PR implements support for subqueries that appear in the WHERE clause of SELECT statements. ## What are those lol 1. EXISTS subqueries: `WHERE EXISTS (SELECT ...)` 2. Row value subqueries: `WHERE x = (SELECT ...)` or `WHERE (x, y) = (SELECT ...)`. The latter are not yet supported - only the single-column ("scalar subquery") case is. 3. IN subqueries: `WHERE x IN (SELECT ...)` or `WHERE (x, y) IN (SELECT ...)` ## Correlated vs Uncorrelated Subqueries - Uncorrelated subqueries reference only their own tables and can be evaluated once. - Correlated subqueries reference columns from the outer query (e.g., `WHERE EXISTS (SELECT * FROM t2 WHERE t2.id = t1.id)`) and must be re-evaluated for each row of the outer query ## Implementation ### Planning During query planning, the WHERE clause is walked to find subquery expressions (`Expr::Exists`, `Expr::Subquery`, `Expr::InSelect`). Each subquery is: 1. Assigned a unique internal ID 2. Compiled into its own `SelectPlan` with outer query tables provided as available references 3. Replaced in the AST with an `Expr::SubqueryResult` node that references the subquery with its internal ID 4. Stored in a `Vec<NonFromClauseSubquery>` on the `SelectPlan` For IN subqueries, an ephemeral index is created to store the subquery results; for other kinds, the results are stored in register(s). ### Translation Before emitting bytecode, we need to determine when each subquery should be evaluated: - Uncorrelated: Evaluated once before opening any table cursors - Correlated: Evaluated at the appropriate nested loop depth after all referenced outer tables are in scope This is calculated by examining which outer query tables the subquery references and finding the right-most (innermost) loop that opens those tables - using similar mechanisms that we use for figuring out when to evaluate other `WhereTerm`s too. 
### Code Generation - EXISTS: Sets a register to 1 if any row is produced, 0 otherwise. Has new `QueryDestination::ExistsSubqueryResult` variant. - IN: Results stored in an ephemeral index and the index is probed. - RowValue: Results stored in a range of registers. Has new `QueryDestination::RowValueSubqueryResult` variant. ## Annoying details ### Which cursor to read from in a subquery? Sometimes a query will use a covering index, i.e. skip opening the table cursor at all if the index contains All The Needed Stuff. Correlated subqueries reading columns from outer tables is a bit problematic in this regard: with our current translation code, the subquery doesn't know whether the outer query opened a table cursor, index cursor, or both. So, for now, we try to find a table cursor first, then fall back to finding any index cursor for that table. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3847	2025-10-28 06:36:55 +02:00
Jussi Saurio	f288dfd3d0	TableMask: take tables referenced in subqueries into account This influences valid potential join orders.	2025-10-27 16:10:49 +02:00
Jussi Saurio	59363a1be3	Translate Expr::SubqueryResult into bytecode	2025-10-27 16:01:39 +02:00
Jussi Saurio	bc2a7c79f9	Add TODO comment about subquery positions we don't support yet	2025-10-27 16:01:39 +02:00
Jussi Saurio	8fecd82311	Emit non from clause subqueries in translation	2025-10-27 16:01:39 +02:00
Jussi Saurio	bf66999f64	Add emit_non_from_clause_subquery() method	2025-10-27 16:01:39 +02:00
Jussi Saurio	8e1987bd5d	Rename emit_subqueries() to emit_from_clause_subqueries() to disambiguate	2025-10-27 16:01:39 +02:00
Jussi Saurio	58caf32fe2	Add plan_subqueries_from_where_clause() method and use it in Select planning	2025-10-27 16:01:39 +02:00
Jussi Saurio	c54988192e	Add SelectPlan::is_correlated() method	2025-10-27 16:01:39 +02:00
Jussi Saurio	9b62687c41	Change unwrap_parens() to return Parenthesized as is, if it contains multiple values	2025-10-27 16:01:39 +02:00
Jussi Saurio	580333ddd3	Add NonFromClauseSubquery struct and add a Vec of them to SelectPlan	2025-10-27 16:01:39 +02:00
Jussi Saurio	609d9957c1	Add new QueryDestination variants for subquery types	2025-10-27 16:01:39 +02:00
Jussi Saurio	5bd6e033e6	Rename emit_subquery() to emit_from_clause_subquery() to disambiguate	2025-10-27 16:01:39 +02:00
Jussi Saurio	5eb74ce8e6	AST: Add Expr::SubqueryResult variant and enum SubqueryType	2025-10-27 16:01:39 +02:00
Nikita Sivukhin	05f0ee6a72	add more integration in order to properly skip backing_btree index_method	2025-10-27 17:00:26 +04:00
Nikita Sivukhin	bdbfac20fb	resolve index method parameters	2025-10-27 16:39:22 +04:00
Nikita Sivukhin	a151770cea	add minimal support of index_methods in the query planner in order to make integration tests work	2025-10-27 16:34:49 +04:00
Nikita Sivukhin	97dcc0869e	register index_methods as db builtin extensions	2025-10-27 16:31:31 +04:00
Nikita Sivukhin	cb11417883	add index_method trait and implement simple inverted index for sparse vectors	2025-10-27 16:22:52 +04:00
Jussi Saurio	e7aa7ee2ff	ProgramBuilder: add a few utility methods needed for correlated subqueries	2025-10-27 14:03:41 +02:00
Jussi Saurio	5c05383cc1	Implement union for ColumnUsedMask	2025-10-27 13:57:56 +02:00
Jussi Saurio	3a1d6d8879	Improve error messages in translate_expr() The current error messages are misleading, as the user may encounter these errors in expressions outside the WHERE clause, too.	2025-10-27 13:51:59 +02:00
Jussi Saurio	de81af29e5	find_table_by_internal_id() returns whether table is an outer query reference Unfortunately, our current translation machinery is unable to know for sure whether a subquery reference to an outer table 't1' has opened a table cursor, an index cursor, or both. For this reason, return a flag from `TableReferences::find_table_by_internal_id()` that tells the caller whether the table is an outer query reference, and further commits will have some additional logic to decide which cursor a subquery will read from when referencing a table from the outer query.	2025-10-27 13:47:49 +02:00
Nikita Sivukhin	8a80e8b743	rename custom modules to index_method like in postgresql	2025-10-27 13:18:18 +04:00
Nikita Sivukhin	408ca235d1	small refactoring	2025-10-27 12:43:38 +04:00
Nikita Sivukhin	299533b7b6	hide custom modules syntax behind --experimental-custom-modules flag	2025-10-27 12:29:05 +04:00
Nikita Sivukhin	f178daa373	update comment	2025-10-27 11:47:25 +04:00
Nikita Sivukhin	906bbdd1c4	support deep nestedness	2025-10-27 11:37:42 +04:00
Pekka Enberg	7d035f27d8	Merge 'Strict numeric cast for op_must_be_int' from bit-aloo closes: #3302 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3771	2025-10-26 16:42:35 +02:00
Pekka Enberg	6603f5318a	Merge 'core/vdbe: Reuse cursor in op_open_write()' from Pekka Enberg This optimization reuses an existing cursor when op_open_write() is called on the same table/index (same root_page). This is safe because the cursor position doesn't matter - op_rewind() is always called after op_open_write() to position the cursor at the beginning of the table/index before any operations are performed. This change speeds up op_open_write() by avoiding unnecessary cursor re- initialization. Closes #3815	2025-10-26 12:29:20 +02:00
Pekka Enberg	ca073b5ecd	Merge 'core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager>' from Pekka Enberg We don't actually need the RwLock locking capabilities, just the ability to swap the instance. Closes #3814	2025-10-26 12:29:11 +02:00
Pekka Enberg	6020b3d1ec	Merge 'Always returns Floats for sum and avg on DBSP aggregations' from Glauber Costa Trying to return integer sometimes to match SQLite led to more problems than I anticipated. The reason being, we can't really match SQLite's behavior unless we know the type of every element in the sum. This is not impossible, but it is very hard, for very little gain. Fixes #3831 Closes #3832	2025-10-26 12:28:18 +02:00
Sumit Patel	7f8f1bc074	Update the write_varint method to use an encoded buffer of size 9 instead of 10. The SQLite varint specification states that the varint is guaranteed to be a maximum of 9 bytes, but our version of write_varint initializes a buffer of 10 bytes. Changing the size to match the specification.	2025-10-25 16:53:59 +05:30
Glauber Costa	1ccd61088e	Always returns Floats for sum and avg on DBSP aggregations Trying to return integer sometimes to match SQLite led to more problems than I anticipated. The reason being, we can't really match SQLite's behavior unless we know the type of every element in the sum. This is not impossible, but it is very hard, for very little gain. Fixes #3831	2025-10-24 14:13:53 -05:00
Pekka Enberg	f85ba9198f	Merge 'Add DISTINCT support to aggregate operator' from Glauber Costa Implements COUNT/SUM/AVG(DISTINCT) and SELECT DISTINCT for materialized views. To do this we have to keep a list of the actual distinct values (similarly to how we do for min/max). We then update the operator (and issue deltas) only when there is a state transition (for example, if we already count the value x = 1, and we see an insert for x = 1, we do nothing). SELECT DISTINCT (with no aggregator) is similar. We already have to keep a list of the values anyway to power the aggregates. So we just issue new deltas based on the transition, without updating the aggregator. Closes #3808	2025-10-24 18:47:11 +03:00
Jussi Saurio	8c6a6f0aa1	Merge 'Fix foreign key constraint enforcement on UNIQUE indexes' from Jussi Saurio Our foreign key constraint checks were checking for changes in PRIMARY KEYs, but not unique indexes - which are in practice the same thing, apart from the `INTEGER PRIMARY KEY` special case, where the PRIMARY KEY is an alias for the rowid of the table. Closes #3648 Closes #3652 (reimplements) Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3825	2025-10-24 15:14:03 +03:00
Pekka Enberg	c3fb867173	core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager> We don't actually need the RwLock locking capabilities, just the ability to swap the instance.	2025-10-24 14:10:08 +03:00
Pekka Enberg	ae60b78d82	Merge 'Switch to SQLite's Julian date logic' from Pekka Enberg The `julian_day_converter` crate is GPL, which is problematic for apps embedding Turso. Closes #3822	2025-10-24 13:38:17 +03:00
bit-aloo	b2769afffd	add test	2025-10-24 16:08:15 +05:30
bit-aloo	64bbca9e12	Fix op_must_be_int to use strict numeric cast	2025-10-24 16:08:15 +05:30
Jussi Saurio	18e6a23f23	Fix foreign key constraint enforcement on UNIQUE indexes Closes #3648 Co-authored-by: Pavan-Nambi <pavannambi999@gmail.com>	2025-10-24 11:03:55 +03:00
Pekka Enberg	827b646c24	Switch to SQLite's Julian date logic The `julian_day_converter` crate is GPL, which is problematic for apps embedding Turso. Switch to SQLite's Julian date logic by porting the C code to Rust.	2025-10-24 08:31:28 +03:00
Pekka Enberg	4c59f29931	Merge 'core/storage: Fix WAL already enabled issue' from Pekka Enberg If WAL is already enabled, let's just continue execution instead of erroring out. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #3819	2025-10-23 20:56:57 +03:00
Pekka Enberg	87069fde93	core/storage: Fix WAL already enabled issue If WAL is already enabled, let's just continue execution instead of erroring out.	2025-10-23 19:35:46 +03:00
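
The 'Where clause subquery support' merge (9c87b20cb2) describes when a subquery is evaluated: uncorrelated subqueries once before any table cursor is opened, correlated ones at the innermost loop that opens a referenced outer table. A minimal sketch of that rule follows. The names `EvalAt` and `eval_position` are hypothetical and are not taken from the turso code base, and tables are identified by plain strings rather than internal IDs.

```rust
/// Hypothetical result type: where in the nested-loop plan a subquery runs.
#[derive(Debug, PartialEq)]
enum EvalAt {
    /// Uncorrelated: evaluate once, before any outer loop is opened.
    BeforeLoops,
    /// Correlated: evaluate inside the loop at this depth (0 = outermost).
    Loop(usize),
}

/// `join_order` lists outer tables from outermost to innermost loop.
/// `referenced_outer` lists the outer tables the subquery reads from.
fn eval_position(join_order: &[&str], referenced_outer: &[&str]) -> EvalAt {
    referenced_outer
        .iter()
        // Map each referenced table to the depth of the loop that opens it.
        .filter_map(|t| join_order.iter().position(|j| j == t))
        // The subquery can only run once the innermost of them is in scope.
        .max()
        .map_or(EvalAt::BeforeLoops, EvalAt::Loop)
}
```

With a join order of `t1, t2, t3`, a subquery referencing only `t1` would run at depth 0, while one referencing both `t1` and `t3` has to wait until depth 2.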

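Commit 7f8f1bc074 relies on the SQLite varint being at most 9 bytes long. The sketch below illustrates why a 9-byte buffer is enough: the first eight bytes carry 7 payload bits each and the ninth carries a full 8 bits, for 64 bits in total. It follows the encoding described in the SQLite file format documentation; it is an illustration only, not the `write_varint` from the turso code base, and `read_varint` is added here just to check the round trip.

```rust
/// Encode `value` as a SQLite-style varint; returns the number of bytes used.
fn write_varint(buf: &mut [u8; 9], value: u64) -> usize {
    if value <= 0x7f {
        buf[0] = value as u8;
        return 1;
    }
    if value & 0xff00_0000_0000_0000 != 0 {
        // Top byte in use: the ninth byte stores the low 8 bits verbatim,
        // the first eight bytes store the remaining 56 bits, 7 per byte.
        buf[8] = value as u8;
        let mut v = value >> 8;
        for i in (0..8).rev() {
            buf[i] = ((v & 0x7f) as u8) | 0x80;
            v >>= 7;
        }
        return 9;
    }
    // 2 to 8 bytes: collect 7-bit groups, least significant first.
    let mut tmp = [0u8; 8];
    let mut n = 0;
    let mut v = value;
    while v != 0 {
        tmp[n] = ((v & 0x7f) as u8) | 0x80;
        v >>= 7;
        n += 1;
    }
    tmp[0] &= 0x7f; // the last emitted byte has no continuation bit
    for i in 0..n {
        buf[i] = tmp[n - 1 - i]; // emit most significant group first
    }
    n
}

/// Decode a varint; returns the value and the number of bytes consumed.
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut v: u64 = 0;
    for i in 0..8 {
        let b = *buf.get(i)?;
        v = (v << 7) | u64::from(b & 0x7f);
        if b & 0x80 == 0 {
            return Some((v, i + 1));
        }
    }
    // Ninth byte contributes all 8 bits.
    let b = *buf.get(8)?;
    Some(((v << 8) | u64::from(b), 9))
}
```

Even `u64::MAX` encodes to exactly nine bytes here, so a tenth byte would never be written.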

5917 Commits