turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-23 03:04:19 +01:00

Author	SHA1	Message	Date
Jussi Saurio	a99c8a8ca0	Simplify ORDER BY sorter column remapping In case an ORDER BY column exactly matches a result column in the SELECT, the insertion of the result column into the ORDER BY sorter can be skipped because it's already necessarily inserted as a sorting column. For this reason we have a mapping to know what index a given result column has in the order by sorter. This commit makes that mapping much simpler.	2025-08-15 15:48:41 +03:00
Piotr Rzysko	375b9047e2	Evaluate WHERE conditions after LEFT JOIN Previously, the query from the added test would not filter out rows where `products.price` was NULL.	2025-08-08 06:26:30 +02:00
Piotr Rzysko	92ba25e44d	Extract loop emitting conditions into a method No functional changes — this is just preparation for reusing this code and avoiding polluting future commits with trivial refactoring.	2025-08-08 06:21:08 +02:00
Piotr Rzysko	8986266394	Emit conditions in open_loop in one place The loop emitting conditions is independent of the operation type.	2025-08-07 19:26:32 +02:00
Nikita Sivukhin	c6a87d61c7	emit CDC entries if necessary for schema changes	2025-08-06 01:03:49 +04:00
Nikita Sivukhin	0b4c1ac802	refactor code a little bit	2025-08-06 01:03:48 +04:00
Piotr Rzysko	82491ceb6a	Integrate virtual tables with optimizer This change connects virtual tables with the query optimizer. The optimizer now considers virtual tables during join order search and invokes their best_index callbacks to determine feasible access paths. Currently, this is not a visible change, since none of the existing extensions return information indicating that a plan is invalid.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	718598eab8	Introduce scan type Different scan parameters are required for different table types. Currently, index and iteration direction are only used by B-tree tables, while the remaining table types don’t require any parameters. Planning access to virtual tables, however, will require passing additional information from the planner, such as the virtual table index (distinct from a B-tree index) and the constraints that must be forwarded to the `filter` method.	2025-08-04 20:27:22 +02:00
Piotr Rzysko	61234eeb19	Add ResultCode to best_index result The `best_index` implementation now returns a ResultCode along with the IndexInfo. This allows it to signal specific outcomes, such as errors or constraint violations. This change aligns better with SQLite’s xBestIndex contract, where cases like missing constraints or invalid combinations of constraints must not result in a valid plan.	2025-08-04 20:18:44 +02:00
Piotr Rzysko	c465ce6e7b	Clarify semantics of argv_index Extend the documentation of `argv_index` and add validations enforcing the requirements it must meet.	2025-08-04 19:31:18 +02:00
Piotr Rzysko	b0460a589f	Ensure argv_index is either None or >= 1 Previously, there were two ways to indicate that a constraint should not be passed to the filter function: setting `argv_index` to `None` or to a value less than 1. This was redundant, so now only `None` is used.	2025-08-04 19:27:53 +02:00
Piotr Rzysko	c6f398122d	Add validation for constraint usage length returned by best_index Additional changes: - Update IndexInfo documentation to clarify that constraint_usages must have exact 1:1 correspondence with input ConstraintInfo array. The code translating constraints into VFilter arguments heavily relies on this. - Fix best_index implementation in test extension to comply with new validation requirements by returning usage entry for each constraint	2025-08-04 19:25:10 +02:00
Glauber Costa	988b16f962	Support ATTACH (read only) Support for attaching databases. The main difference from SQLite is that we support an arbitrary number of attached databases, and we are not bound to just 100ish. We for now only support read-only databases. We open them as read-only, but also, to keep things simple, we don't patch any of the insert machinery to resolve foreign tables. So if an insert is tried on an attached database, it will just fail with a "no such table" error - this is perfect for now. The code in core/translate/attach.rs is written by Claude, who also played a key part in the boilerplate for stuff like the .databases command and extending the pragma database_list, and also aided me in the test cases.	2025-07-24 19:19:48 -05:00
PThorpe92	0871a8c7f3	Bail early when we detect a readonly virtual table	2025-07-23 16:57:30 -04:00
Glauber Costa	57a1113460	make readonly a property of the database There's no such thing as a read-only connection. In a normal connection, you can have many attached databases. Some r/o, some r/w. To properly fix that, we also need to fix the OpenWrite opcode. Right now we are passing a name, which is the name of the table. That parameter is not used anywhere. That is also not what the SQLite opcode specifies. Same as OpenRead, the p3 register should be the database index. With that change, we can - for now - pass the index 0, which is all we support anyway, and then use that to test if we are r/o.	2025-07-22 09:41:32 -05:00
Glauber Costa	65312baee6	fix opcodes missing a database register Two of the opcodes we implement (OpenRead and Transaction) should have an opcode specifying the database to use, but they don't. Add it, and for now always use 0 (the main database).	2025-07-20 12:27:26 -05:00
Piotr Rzysko	30ae6538ee	Treat table-valued functions as tables With this change, the following two queries are considered equivalent: ```sql SELECT value FROM generate_series(5, 50); SELECT value FROM generate_series WHERE start = 5 AND stop = 50; ``` Arguments passed in parentheses to the virtual table name are now matched to hidden columns. Column references are still not supported as table-valued function arguments. The only difference is that previously, a query like: ```sql SELECT one.value, series.value FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series; ``` would cause a panic. Now, it returns a proper error message instead. Adding support for column references is more nuanced for two main reasons: - We need to ensure that in joins where a TVF depends on other tables, those other tables are processed first. For example, in: ```sql SELECT one.value, series.value FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one; ``` the one table must be processed by the top-level loop, and series must be nested. - For outer joins involving TVFs, the arguments must be treated as ON predicates, not WHERE predicates.	2025-07-14 07:16:53 +02:00
Piotr Rzysko	44b1b1852a	Fix referencing virtual table predicates We need to enumerate first and filter afterward — not the other way around — because we later use the indexes produced by `enumerate` to access the original `predicates` slice.	2025-07-14 07:16:53 +02:00
Nikita Sivukhin	32fa2ac3ee	avoid capturing changes in cdc table	2025-07-06 22:24:35 +04:00
Nikita Sivukhin	a988bbaffe	allow to specify table in the capture_data_changes PRAGMA	2025-07-06 22:19:32 +04:00
Nikita Sivukhin	40769618c1	small refactoring	2025-07-06 21:16:58 +04:00
Nikita Sivukhin	04f2efeaa4	small renames	2025-07-06 21:16:57 +04:00
Nikita Sivukhin	a82529f55a	emit cdc changes for UPDATE / DELETE statements	2025-07-06 21:16:25 +04:00
Levy A.	ffd6844b5b	refactor: remove `PseudoTable` from `Table` the only reason for `PseudoTable` to exist, is to provide column information for `PseudoCursor` creation. this should not be part of the schema.	2025-06-30 14:31:58 -03:00
Pekka Enberg	725c3e4ddc	Rename `limbo_sqlite3_parser` crate to `turso_sqlite3_parser`	2025-06-29 12:34:46 +03:00
Pekka Enberg	eb0de4066b	Rename `limbo_ext` crate to `turso_ext`	2025-06-29 12:14:08 +03:00
Nils Koch	2827b86917	chore: fix clippy warnings	2025-06-23 19:52:13 +01:00
pedrocarlo	e53a290a48	move ephemeral table logic to update plan and reuse select logic for ephemeral index	2025-06-20 16:30:21 -03:00
pedrocarlo	b3351dc709	tests + adjustment to halt error message	2025-06-20 16:29:10 -03:00
pedrocarlo	9048ad398b	modify loop functions to accomodate for ephemeral tables	2025-06-20 16:29:10 -03:00
pedrocarlo	74beac5ea8	ephemeral table for update when rowid is being update	2025-06-20 16:28:10 -03:00
Jussi Saurio	f396528d53	Merge 'Fix DELETE not emitting constant `WhereTerms`' from Pedro Muniz Fixes DELETE not emitting conditional jumps at all if the associated WhereTerm is a constant, e.g. ```sql limbo> create table t(x); limbo> explain DELETE FROM t WHERE 5-5; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 7 0 0 Start at 7 1 OpenWrite 0 2 0 0 root=2; t 2 Rewind 0 6 0 0 Rewind table t 3 RowId 0 1 0 0 r[1]=t.rowid 4 Delete 0 0 0 0 5 Next 0 3 0 0 6 Halt 0 0 0 0 7 Transaction 0 1 0 0 write=true 8 Goto 0 1 0 0 ``` I was adding more stuff to the simulator in a Branch of mine, and I caught this error with delete. Upstreaming the fix here. As we do with Update, I added the translation step for the `WhereTerms` of the query. Edit: Closes #1732. Closes #1733. Closes #1734. Closes #1735. Closes #1736. Closes #1738. Closes #1739. Closes #1740. Edit: Also pushes constant where term translation to `init_loop` for Update and Select as well. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1746	2025-06-20 22:00:32 +03:00
Piotr Rzysko	08c1767ba7	Collect non-aggregate columns in one place Previously, the logic for collecting non-aggregate columns was duplicated across multiple locations and implemented inconsistently. This caused a bug that was revealed by the refactoring in this commit (see the added test).	2025-06-20 06:17:14 +02:00
pedrocarlo	fcff306f98	emit constant where terms in init_loop	2025-06-19 13:50:38 -03:00
Levy A.	15e0cab8d8	refactor+fix: precompute default values from schema	2025-06-11 14:18:39 -03:00
Levy A.	7638b0dab7	fix: use default value on empty columns added via ALTER TABLE	2025-06-11 14:18:19 -03:00
krishvishal	5837f7329f	clean up	2025-06-11 00:33:47 +05:30
krishvishal	6c04c18f87	Add affinity flag to comparison opcodes	2025-06-11 00:33:47 +05:30
krishvishal	9130b25111	Add `jump_if_null` flag for rowid alias based seeks	2025-06-11 00:33:05 +05:30
Jussi Saurio	2bac140d73	Remove SeekOp::EQ and encode eq_only in LE&GE - needed for iteration direction aware equality seeks	2025-06-10 14:16:26 +03:00
Jussi Saurio	31b37332d5	all index cursors must be opened when DELETE does an index seek too	2025-06-03 15:18:45 +03:00
Jussi Saurio	06626f72eb	Fix cursors not being opened for indexes in DELETE	2025-06-03 14:45:01 +03:00
Jussi Saurio	819a6138d0	Merge 'Fix: aggregate regs must be initialized as NULL at the start' from Jussi Saurio Again found when fuzzing nested where clause subqueries: Aggregate registers need to be NULLed at the start because the same registers might be reused on another invocation of a subquery, and if they are not NULLed, the 2nd invocation of the same subquery will have values left over from the first invocation. Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1614	2025-05-30 09:39:37 +03:00
Jussi Saurio	f8257df77b	Fix: aggregate regs must be initialized as NULL at the start	2025-05-29 18:44:53 +03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (Option<String> -- in case of index cursors. This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second `t`, so there is no clash.	2025-05-29 00:59:24 +03:00
Jussi Saurio	73e806ad84	Make WhereTerm::consumed a Cell<bool> Currently in the main translation logic after planning and optimization, we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except for the fact that virtual table constraint resolution is done ad-hoc in `init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed` which is a boolean indicating that the term has been "used up" by the optimizer and shouldn't be evaluated as a normal where clause condition anymore. In the upcoming branch for WHERE clause subqueries, I want to store immutable references to WHERE clause expressions in `Resolver`, but this is unfortunately not possible if we still use the aforementioned mutable references. Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed once the virtual table constraint resolution is moved to an earlier part of the query processing pipeline.	2025-05-28 11:02:39 +03:00
Jussi Saurio	4e9d9a2470	Fix LIMIT handling Currently we have some usages of LIMIT where the actual limit counter is initialized next to the DecrJumpZero instruction, and then `program.mark_last_insn_constant()` is used to hoist the counter initialization to the beginning of the program. This is very fragile, and already FROM clause subquery handling works around this with a hack (removed in this PR), and (upcoming) WHERE clause subqueries would also run into problems because of this, because the LIMIT might need to be initialized once for every iteration of the subquery. This PR removes those usages for LIMIT, and LIMIT processing is now more intuitive: - limit counter is now initialized at the start of the query processing - a function init_limit() is extracted to do this for select/update/delete	2025-05-27 21:12:22 +03:00
Jussi Saurio	07fa3a9668	Rename SelectQueryType to QueryDestination	2025-05-25 21:23:04 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00

134 Commits