turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-14 13:44:21 +01:00

Author	SHA1	Message	Date
Jussi Saurio	6cf2072b51	translate: disallow correlated subqueries in HAVING and ORDER BY These are supported by SQLite, but we cannot handle them correctly yet.	2025-10-29 15:37:19 +02:00
Jussi Saurio	4bf8ad8cfd	Merge 'Support subqueries in all positions of a SELECT statement' from Jussi Saurio Follow-up to #3847. Adds support for subqueries in all other positions of a SELECT (the result list, GROUP BY, ORDER BY, HAVING, LIMIT, OFFSET). Turns out I am a sql noob and didn't realize that correlated subqueries are supported in basically all positions except LIMIT/OFFSET, so added support for those too + accompanying TCL tests. Thankfully the abstractions introduced in #3847 carry over to this very well so the code change is relatively small (over half of the diff is tests and a lot of the remaining diff is just moving logic around). Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3852	2025-10-29 10:19:39 +02:00
Jussi Saurio	29fe3b585a	Add more tests and disable correlated IN-subqueries in HAVING position I discovered a flaw in our current translation that makes queries of type HAVING foo IN (SELECT ...) not work properly - in these cases we need to defer translation of the subquery until later. I will fix this in a future PR because I suspect it's not trivial.	2025-10-29 09:57:55 +02:00
Nikita Sivukhin	0da3b4bfd3	fix after rebase	2025-10-28 11:27:35 +04:00
Nikita Sivukhin	e9b1ca12b6	add new access operation through IndexMethod	2025-10-28 11:27:35 +04:00
Jussi Saurio	d993ac8157	Merge 'index_method: implement basic trait and simple toy index' from Nikita Sivukhin This PR adds the `index_method` trait and an implementation of a toy sparse vector index. To keep the PR lightweight, index methods are not yet deeply integrated into the query planner; only the components needed for integration tests that use the `index_method` API directly are added. Primary changes introduced in this PR are: 1. `SymbolTable` is extended with an `index_methods` field, and builtin extensions are populated with 2 native indices: `backing_btree` and `toy_vector_sparse_ivf`. 2. The `Index` struct is extended with an `index_method` field, which holds the `IndexMethodAttachment` constructed for the table with the given parameters from the `IndexMethod` "factory" trait. The toy index implementation stores inverted index pairs `(dimension, rowid)` in an auxiliary BTree index. This index uses the special `backing_btree` index_method, which is marked as `backing_btree: true` and treated specially by the db core: it is a real BTree index that is not managed by the tursodb core and must be managed by the index_method that created it (so that index_method is responsible for creating, populating, and destroying this btree). Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3846	2025-10-28 07:01:36 +02:00
Jussi Saurio	8fecd82311	Emit non from clause subqueries in translation	2025-10-27 16:01:39 +02:00
Jussi Saurio	c54988192e	Add SelectPlan::is_correlated() method	2025-10-27 16:01:39 +02:00
Jussi Saurio	580333ddd3	Add NonFromClauseSubquery struct and add a Vec of them to SelectPlan	2025-10-27 16:01:39 +02:00
Jussi Saurio	609d9957c1	Add new QueryDestination variants for subquery types	2025-10-27 16:01:39 +02:00
Nikita Sivukhin	05f0ee6a72	add more integration in order to properly skip backing_btree index_method	2025-10-27 17:00:26 +04:00
Jussi Saurio	5c05383cc1	Implement union for ColumnUsedMask	2025-10-27 13:57:56 +02:00
Jussi Saurio	de81af29e5	find_table_by_internal_id() returns whether table is an outer query reference Unfortunately, our current translation machinery is unable to know for sure whether a subquery reference to an outer table 't1' has opened a table cursor, an index cursor, or both. For this reason, return a flag from `TableReferences::find_table_by_internal_id()` that tells the caller whether the table is an outer query reference, and further commits will have some additional logic to decide which cursor a subquery will read from when referencing a table from the outer query.	2025-10-27 13:47:49 +02:00
Jussi Saurio	0173d31c04	clippy: collapse nested if	2025-10-14 15:51:31 +03:00
Jussi Saurio	4b80678898	Allow case where cursor for btree is already opened When populating an ephemeral table for UPDATE, it may open a cursor on the (permanent) table - in this case we don't need to open it again in the UPDATE loop	2025-10-14 15:32:48 +03:00
Jussi Saurio	f5ee4807da	Properly differentiate between source and target in UPDATE - Encode information about ephemeral source table in OperationMode::UPDATE if present - Use OperationMode information to correctly resolve cursors in UPDATE	2025-10-14 14:17:28 +03:00
Jussi Saurio	691dce6b8a	Make decision about UpdatePlan::ephemeral_plan _after_ optimizer An ephemeral table is required if the b-tree key of the table (rowid) or the index (index key) is affected by the UPDATE.	2025-10-14 14:17:28 +03:00
Jussi Saurio	c2fe13ad4f	Update documentation of UpdatePlan::ephemeral_plan It now better reflects when it is used.	2025-10-14 12:18:53 +03:00
Jussi Saurio	3669437482	Add vibecoded tests for ColumnUsedMask	2025-10-13 14:03:34 +03:00
Jussi Saurio	e055ed9a8d	Allow arbitrarily many columns in a table Use roaring bitmaps because ColumnUsedMask is likely to be sparsely populated.	2025-10-13 13:30:26 +03:00
Jussi Saurio	59a1c2ae2e	Disallow joining more than 63 tables Returns an error instead of panicking	2025-10-13 13:30:03 +03:00
Nikita Sivukhin	4313f57ecb	Optimize range scans	2025-10-09 11:47:41 +03:00
Jussi Saurio	f02757fe11	Collate: add proper collation to FROM-clause subquery result cols	2025-10-02 21:49:33 +03:00
Jussi Saurio	c0da38e24a	Merge 'Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN' from Jussi Saurio Closes #3470 ## Background In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN` because NULL values will never be equal to 'foo'. Rewriting as `INNER JOIN` allows the optimizer to also reorder the table join order to come up with a more efficient query plan. In fact, we have this optimization already. ## Problem However, there is a dumb bug where `WhereTerm`s involving this join still retain their `from_outer_join` state, resulting in forcing the evaluation of those terms at the original join index, which results in completely wrong bytecode if the join optimizer decides to reorder the join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after table `s` is open but table `t` is not open yet. ## Fix This PR fixes that issue by clearing `from_outer_join` properly from the relevant `WhereTerm`s. Closes #3475	2025-10-02 06:56:07 +03:00
Jussi Saurio	b2f9854b1c	Add more documentation for WhereTerm::from_outer_join	2025-10-01 13:42:36 +03:00
Jussi Saurio	27b1c1a1db	Merge 'Fix self-insert with nested subquery' from Mikaël Francoeur There were 2 problems: 1. The SELECT wasn't propagating which register it used for its results, so sometimes the INSERT read bad data. 2. `TableReferences::contains_table` was only checking the top-level tables, not the nested tables in FROM queries. This condition is used to emit "template 4", the bytecode template for self-inserts. Closes https://github.com/tursodatabase/turso/issues/3312 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3436	2025-10-01 08:56:16 +03:00
Nikita Sivukhin	a32ed53bd8	remove optimization: even if the index search returns only 1 row, it will still call next in the loop, so we can incorrectly process the same row values multiple times. The following query failed with this optimization: turso> CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT, c0 INT); turso> CREATE UNIQUE INDEX idx_p1_0 ON t(c0); turso> insert into t values (null, 'uu', -1); turso> insert into t values (null, 'uu', -2); turso> UPDATE t SET c0 = NULL WHERE c0 = -1; turso> SELECT * FROM t ┌────┬────┬────┐ │ id │ k │ c0 │ ├────┼────┼────┤ │ 1 │ uu │ │ ├────┼────┼────┤ │ 2 │ uu │ │ └────┴────┴────┘	2025-09-30 16:37:41 +04:00
Mikaël Francoeur	dc231abb2e	fix self-insert bug	2025-09-29 17:18:19 -04:00
Nikita Sivukhin	c4b3074575	format	2025-09-26 13:01:49 +04:00
Nikita Sivukhin	fdf8ca88fd	introduce exact(...) function - because enum variant will disappear	2025-09-26 13:01:49 +04:00
PThorpe92	6dc7d04c5a	Replace translate_expr with translate_condition_expr and fix constraint error	2025-09-20 15:02:06 -04:00
TcMits	88119888d0	reduce allocation needed for break_predicate_at_and_boundaries	2025-09-18 10:52:29 +07:00
Piotr Rzysko	f5efcbe745	Add support for window functions Adds initial support for window functions. For now, only existing aggregate functions can be used as window functions—no specialized window-specific functions are supported yet. Currently, only the default frame definition is implemented: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS.	2025-09-13 11:12:44 +02:00
Piotr Rzysko	c81cd16230	Extract QueryDestination::placeholder_for_subquery	2025-09-13 10:49:14 +02:00
Piotr Rzysko	5f2a3e1242	Handle dummy argument for count() and count(*) in translation Two main reasons for this change: * Improve readability by moving the logic for this special case closer to the code that relies on it. * Decouple AggFunc from the Aggregate struct. In the future, window function processing will use AggFunc directly, without necessarily depending on Aggregate.	2025-09-13 10:49:14 +02:00
Pekka Enberg	f88f39082a	core/vdbe: Fix MakeRecord affinity handling The MakeRecord instruction now accepts an optional affinity_str parameter that applies column-specific type conversions before creating records. When provided, the affinity string is applied character-by-character to each register using the existing apply_affinity_char() function, matching SQLite's behavior. Fixes #2040 Fixes #2041	2025-09-08 18:49:13 +03:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views This fairly long commit implements persistence for materialized views. It is hard to split because of all the interdependencies between components, so it is one big thing. This commit message will at least try to go into details about the basic architecture. Materialized Views as tables ============================ Materialized views are now a normal table - whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write. Materialized Views as Zsets =========================== The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet is that ours is backed by a BTree, not a Hash (since SQLite tables are BTrees). Aggregator State ================ In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values. The Materialized View Cursor ============================ Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction. Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
Pekka Enberg	44357f93a2	Merge branch 'main' into 2025-08-21-make-limit-and-offset-expr	2025-09-04 09:54:45 +03:00
Piotr Rzysko	3ad4016080	Fix handling of zero-argument grouped aggregations This commit consolidates the creation of the Aggregate struct, which was previously handled differently in `prepare_one_select_plan` and `resolve_aggregates`. That discrepancy caused inconsistent handling of zero-argument aggregates. The queries added in the new tests would previously trigger a panic.	2025-08-31 12:02:09 +02:00
bit-aloo	a16bee4574	move to new parser	2025-08-26 19:56:24 +05:30
bit-aloo	28439efd09	make offset and limit Expr	2025-08-26 19:56:11 +05:30
Pekka Enberg	26ba09c45f	Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz" This reverts commit `71c1b357e4`, reversing changes made to `6bc568ff69` because it actually makes things slower.	2025-08-26 14:58:21 +03:00
pedrocarlo	d3240844ec	refactor Core to remove the double indirection	2025-08-25 22:59:31 -03:00
Levy A.	4ba1304fb9	complete parser integration	2025-08-21 15:23:59 -03:00
Levy A.	186e2f5d8e	switch to new parser	2025-08-21 15:19:16 -03:00
Jussi Saurio	5da76c9125	Allow index in UPDATE for point queries (i.e. max 1 row affected)	2025-08-14 15:58:01 +03:00
Nikita Sivukhin	5d0ada9fb9	add "updates" column for cdc table	2025-08-11 12:46:15 +04:00
Jussi Saurio	c498196c7b	fix/perf: fix regression in SELECT 1 benchmark Do not start a read transaction when a SELECT is not going to access the database, which means we can avoid checking whether the schema has changed.	2025-08-05 15:10:55 +03:00
Piotr Rzysko	8fb4fbf8af	Make WhereTerm::consumed a plain bool Now that virtual tables are integrated into the optimizer, this field no longer needs to be wrapped in Cell<bool>.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	82491ceb6a	Integrate virtual tables with optimizer This change connects virtual tables with the query optimizer. The optimizer now considers virtual tables during join order search and invokes their best_index callbacks to determine feasible access paths. Currently, this is not a visible change, since none of the existing extensions return information indicating that a plan is invalid.	2025-08-05 05:48:28 +02:00


188 Commits