turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-20 09:54:19 +01:00

Author	SHA1	Message	Date
Jussi Saurio	5da76c9125	Allow index in UPDATE for point queries (i.e. max 1 row affected)	2025-08-14 15:58:01 +03:00
Jussi Saurio	cd3b4bccd3	Fix UPDATE: Do not use an index for iteration if that index is going to be updated Closes #2598	2025-08-14 15:35:00 +03:00
Mikaël Francoeur	2cf4e4fe96	handle single, double and unquoted strings in values clause	2025-08-08 09:03:38 -04:00
Piotr Rzysko	59ec2d3949	Replace ConstraintInfo::plan_info with ConstraintInfo::index The side of the binary expression no longer needs to be stored in `ConstraintInfo`, since the optimizer now guarantees that it is always on the right. As a result, only the index of the corresponding constraint needs to be preserved.	2025-08-05 05:48:29 +02:00
Piotr Rzysko	8fb4fbf8af	Make WhereTerm::consumed a plain bool Now that virtual tables are integrated into the optimizer, this field no longer needs to be wrapped in Cell<bool>.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	82491ceb6a	Integrate virtual tables with optimizer This change connects virtual tables with the query optimizer. The optimizer now considers virtual tables during join order search and invokes their best_index callbacks to determine feasible access paths. Currently, this is not a visible change, since none of the existing extensions return information indicating that a plan is invalid.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	718598eab8	Introduce scan type Different scan parameters are required for different table types. Currently, index and iteration direction are only used by B-tree tables, while the remaining table types don’t require any parameters. Planning access to virtual tables, however, will require passing additional information from the planner, such as the virtual table index (distinct from a B-tree index) and the constraints that must be forwarded to the `filter` method.	2025-08-04 20:27:22 +02:00
Piotr Rzysko	9167b30c7c	Introduce AccessMethodParams Previously, AccessMethod stored fields like `iter_dir`, `index`, and `constraint_refs` directly, but these only applied to BTree tables. Other table types (virtual tables, subqueries) either ignored these fields or required different parameters entirely. This change prepares the planner to handle virtual table access methods with their own specialized parameters.	2025-08-04 20:23:44 +02:00
bit-aloo	9a54ef214e	parser: Distinguish quoted identifiers and unify Id into Name enum This commit replaces the `Name(pub String)` struct with a `Name` enum that explicitly models how the name appeared in the source either as an unquoted identifier (`Ident`) or a quoted string (`Quoted`). In the process, the separate `Id` wrapper type has been coalesced into the `Name` enum, simplifying the AST and reducing duplication in identifier handling logic. While this increases the size of some AST nodes (notably `yyStackEntry`), it improves correctness and makes source structure more explicit for later phases.	2025-07-24 14:40:19 +05:30
Glauber Costa	cbdd5c5fc7	improve handling of double quotes I ended up hitting #1974 today and wanted to fix it. I worked with Claude to generate a more comprehensive set of queries that could fail aside from just the insert query described in the issue. He got most of them right - lots of cases were indeed failing. The ones that were gibberish, he told me I was absolutely right for pointing out they were bad. But alas. With the test cases generated, we can work on fixing it. The place where the assertion was hit, all we need to do there is return true (but we assert that this is indeed a string literal, it shouldn't be anything else at this point). There are then just a couple of places where we need to make sure we handle double quotes correctly. We already tested for single quotes in a couple of places, but never for double quotes. There is one funny corner case where you can just select "col" from tbl, and if there is no column "col" on the table, that is treated as a string literal. We handle that too. Fixes #1974	2025-07-18 10:39:02 -05:00
Levy A.	6fe2505425	add more `ToTokens` impls	2025-07-16 12:16:31 -03:00
Piotr Rzysko	319cdbe3af	Don't use search for virtual tables Previously, the test queries added in this commit would fail with: thread 'main' panicked at core/schema.rs:129:34: not implemented stack backtrace: 0: rust_begin_unwind at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/std/src/panicking.rs:665:5 1: core::panicking::panic_fmt at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:74:14 2: core::panicking::panic at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:148:5 3: limbo_core::schema::Table::get_root_page at ./core/schema.rs:129:34 4: limbo_core::translate::main_loop::init_loop at ./core/translate/main_loop.rs:260:44 5: limbo_core::translate::emitter::emit_query at ./core/translate/emitter.rs:568:5 6: limbo_core::translate::emitter::emit_program_for_select at ./core/translate/emitter.rs:496:5 7: limbo_core::translate::emitter::emit_program at ./core/translate/emitter.rs:187:31 8: limbo_core::translate::select::translate_select at ./core/translate/select.rs:82:5 9: limbo_core::translate::translate_inner at ./core/translate/mod.rs:241:13 10: limbo_core::translate::translate at ./core/translate/mod.rs:95:17 11: limbo_core::Connection::run_cmd at ./core/lib.rs:416:31 12: <limbo_core::QueryRunner as core::iter::traits::iterator::Iterator>::next at ./core/lib.rs:916:22 13: limbo::app::Limbo::run_query at ./cli/app.rs:442:27 14: limbo::app::Limbo::handle_input_line at ./cli/app.rs:544:13 15: limbo::main at ./cli/main.rs:51:31 16: core::ops::function::FnOnce::call_once	2025-07-14 07:16:53 +02:00
Nils Koch	828d4f5016	fix clippy errors for rust 1.88.0 (auto fix)	2025-07-12 18:58:41 +03:00
Pekka Enberg	b87ce6d178	Merge 'Fix deleting previous rowid when rowid is in the Set Clause' from Pedro Muniz Closes #1888 . This PR fixes UPDATE translation by not emitting an ephemeral plan when we are doing a `RowIdEq` search. Also, we should delete the previous rowid when the rowid is in the set clause. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1891	2025-06-30 11:58:05 +03:00
Pekka Enberg	9c1b7897ac	Fix URLs to point to github.com/tursodatabase/turso	2025-06-30 11:23:53 +03:00
pedrocarlo	738e2cc06c	do not emit ephemeral plan when doing a SeekRowId + emit Delete instruction when rowid in set clause	2025-06-29 17:12:24 -03:00
Pekka Enberg	725c3e4ddc	Rename `limbo_sqlite3_parser` crate to `turso_sqlite3_parser`	2025-06-29 12:34:46 +03:00
Pekka Enberg	2fc5c0ce5c	Switch to runtime flag for enabling indexes Makes it easier to test the feature: ``` $ cargo run -- --experimental-indexes Limbo v0.0.22 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database limbo> CREATE TABLE t(x); limbo> CREATE INDEX t_idx ON t(x); limbo> DROP INDEX t_idx; ```	2025-06-26 10:07:28 +03:00
Nils Koch	2827b86917	chore: fix clippy warnings	2025-06-23 19:52:13 +01:00
pedrocarlo	e53a290a48	move ephemeral table logic to update plan and reuse select logic for ephemeral index	2025-06-20 16:30:21 -03:00
pedrocarlo	9048ad398b	modify loop functions to accommodate ephemeral tables	2025-06-20 16:29:10 -03:00
Pere Diaz Bou	9ae4563bcd	`index_experimental` flag to enable index usages Currently indexes are the bulk of the problem with `UPDATE` and `DELETE`, while we work on fixing those it makes sense to disable indexing since they are not stable. We want to try to make everything else stable before we continue with indexing.	2025-06-17 19:33:23 +02:00
meteorgan	6179d8de23	refactor compound select	2025-06-13 10:39:32 +03:00
Levy A.	7638b0dab7	fix: use default value on empty columns added via ALTER TABLE	2025-06-11 14:18:19 -03:00
Jussi Saurio	e9d1f0823b	Disable index usage in DELETE because it does not work safely	2025-06-11 12:15:20 +03:00
Jussi Saurio	85972fd744	Merge 'Fix rowid to_sql_string' from Pedro Muniz Addresses the panic encountered here: https://github.com/tursodatabase/limbo/pull/1690 . Sorry about that. Closes #1693	2025-06-10 18:11:51 +03:00
pedrocarlo	36f60e4dd1	Fix rowid to_sql_string printing	2025-06-10 10:48:05 -03:00
Jussi Saurio	844461d20b	update and delete fixes	2025-06-10 14:16:26 +03:00
Jussi Saurio	2bac140d73	Remove SeekOp::EQ and encode eq_only in LE&GE - needed for iteration direction aware equality seeks	2025-06-10 14:16:26 +03:00
Jussi Saurio	18e6987904	Remove plan.to_sql_string() from optimize_plan() as it panics on TODOs	2025-06-09 09:45:06 +03:00
pedrocarlo	3c1b984b78	use table_references for `PlanContext`	2025-06-04 12:06:43 -03:00
pedrocarlo	bfc8cb6d4c	move display and to_sql_string impls to separate modules for plan	2025-06-04 12:06:43 -03:00
pedrocarlo	f90bebbfbc	small fix and remove dbg	2025-06-04 12:06:43 -03:00
pedrocarlo	fa0dff9843	Fix rebase changes	2025-06-04 12:06:43 -03:00
pedrocarlo	a96577529e	impl ToSqlString for Update Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	d243d1015c	impl ToSqlString for Delete Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	ff5aa17769	impl ToSqlString for CompoundSelect Plan	2025-06-04 12:06:43 -03:00
pedrocarlo	51014d01c3	impl ToSqlString for SelectPlan	2025-06-04 12:06:43 -03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	73e806ad84	Make WhereTerm::consumed a Cell<bool> Currently in the main translation logic after planning and optimization, we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except for the fact that virtual table constraint resolution is done ad-hoc in `init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed` which is a boolean indicating that the term has been "used up" by the optimizer and shouldn't be evaluated as a normal where clause condition anymore. In the upcoming branch for WHERE clause subqueries, I want to store immutable references to WHERE clause expressions in `Resolver`, but this is unfortunately not possible if we still use the aforementioned mutable references. Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed once the virtual table constraint resolution is moved to an earlier part of the query processing pipeline.	2025-05-28 11:02:39 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
Jussi Saurio	08bda9cc58	UNION ALL	2025-05-24 13:12:41 +03:00
Jussi Saurio	fbfd2b2c38	refactor: use walk_expr_mut() in rewrite_expr()	2025-05-23 16:27:28 +03:00
Jussi Saurio	696c98877c	Merge 'btree: Remove assumption that all btrees have a rowid' from Jussi Saurio For example, implementing `SELECT DISTINCT` (#1517) and `UNION` (#1545) require that we are able to create indexes without a rowid column present. Similarly, `WITHOUT ROWID` tables require this. I implemented this by replacing the `rowid` and `empty_record` properties in `BtreeCursor` with ```rust /// Whether the cursor is currently pointing to a record. #[derive(Debug, Clone, Copy, PartialEq)] enum CursorHasRecord { Yes { rowid: Option<u64>, // not all indexes and btrees have rowids, so this is optional. }, No, } ``` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1518	2025-05-21 14:53:00 +03:00
Jussi Saurio	c4548b51f1	Merge 'Optimization: lift common subexpressions from OR terms' from Jussi Saurio ```sql -- This PR does effectively this transformation: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); -- Same query with common conjuncts (ANDs) extracted: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where p_partkey = l_partkey and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' and ( ( p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 ) or ( p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 ) or ( p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 ) ); ``` This allows Limbo's optimizer to 1. recognize `p_partkey=l_partkey` as an index constraint on `part`, and 2. filter out `lineitem` rows before joining. With this optimization, Limbo completes TPC-H `19.sql` nearly as fast as SQLite on my machine. Without it, Limbo takes forever. This branch: `939ms` Main: `uh, i started running it a few minutes ago and it hasnt finished, and i dont feel like waiting i guess` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1520	2025-05-20 14:33:49 +03:00
Jussi Saurio	14058357ad	Merge 'refactor: replace Operation::Subquery with Table::FromClauseSubquery' from Jussi Saurio Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. ``` SELECT * FROM (SELECT ...) sub; -- the subquery here was previously interpreted as Operation::Subquery on a Table::Pseudo, -- with a lot of special handling for Operation::Subquery in different code paths -- now it's an Operation::Scan on a Table::FromClauseSubquery ``` No functional changes (intended, at least!) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1529	2025-05-20 14:31:42 +03:00
Jussi Saurio	6790b7479c	Optimization: lift common subexpressions from OR terms ```sql -- This PR does effectively this transformation: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); -- Same query with common conjuncts (ANDs) extracted: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where p_partkey = l_partkey and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' and ( ( p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 ) or ( p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 ) or ( p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 ) ); ```	2025-05-20 14:25:15 +03:00
Jussi Saurio	a7b33b1509	schema: add Index::has_rowid	2025-05-20 14:22:17 +03:00
Jussi Saurio	3121c6cdd3	Replace Operation::Subquery with Table::FromClauseSubquery Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. No functional changes (intended, at least!)	2025-05-20 12:56:30 +03:00
Jussi Saurio	9c710b5292	Add collation column to Index struct	2025-05-20 12:52:54 +03:00
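
Note on 6790b7479c ("Optimization: lift common subexpressions from OR terms"): the transformation described in that message can be sketched roughly as below. This is an illustrative sketch only, not the code from the commit. `Term` and `lift_common_conjuncts` are hypothetical names, and terms are compared as plain strings here, whereas a real planner would presumably compare expression AST nodes.

```rust
/// Hypothetical stand-in for a WHERE-clause term; assumed for illustration only.
type Term = &'static str;

/// Given the branches of an OR, each written as a list of AND-ed terms,
/// return (terms present in every branch, branches with those terms removed).
///
/// If a branch ends up empty, that branch is trivially true once the common
/// terms hold, so a caller could drop the whole OR in that case.
fn lift_common_conjuncts(disjuncts: &[Vec<Term>]) -> (Vec<Term>, Vec<Vec<Term>>) {
    let Some((first, rest)) = disjuncts.split_first() else {
        // No branches at all: nothing to lift.
        return (Vec::new(), Vec::new());
    };

    // A term is common if it appears in the first branch and in all others.
    // Order follows the first branch; duplicates are skipped.
    let mut common: Vec<Term> = Vec::new();
    for t in first {
        if !common.contains(t) && rest.iter().all(|d| d.contains(t)) {
            common.push(*t);
        }
    }

    // Strip the common terms out of every branch.
    let remaining: Vec<Vec<Term>> = disjuncts
        .iter()
        .map(|d| {
            d.iter()
                .copied()
                .filter(|t| !common.contains(t))
                .collect::<Vec<Term>>()
        })
        .collect();

    (common, remaining)
}
```

Applied to two branches that both contain `p_partkey = l_partkey`, the function returns that term once as a top-level conjunct and leaves only the branch-specific terms inside the OR, which is what lets an optimizer treat the join condition as an ordinary index constraint.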


62 Commits
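
Note on 7c07c09300 ("Add stable internal_id property to TableReference"): the fragility that message describes, and why a stable identifier avoids it, can be illustrated with the small sketch below. The struct layouts, the `ColumnRef` type, and the `resolve` helper are simplified assumptions made for this note; only the names `TableInternalId` and `TableReference` come from the commit message, and the real definitions in the repository may look quite different.

```rust
/// Simplified, assumed shape of a stable table identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableInternalId(usize);

/// Simplified, assumed shape of a table reference in a plan.
#[derive(Debug, Clone)]
struct TableReference {
    internal_id: TableInternalId,
    name: String,
}

/// Hypothetical column reference that names its table by stable id
/// rather than by position in the table_references list.
#[derive(Debug, Clone, Copy)]
struct ColumnRef {
    table: TableInternalId,
    column: usize,
}

/// Look a column's table up by id, so the result does not depend on
/// where the table currently sits in the list.
fn resolve<'a>(tables: &'a [TableReference], col: &ColumnRef) -> Option<&'a TableReference> {
    tables.iter().find(|t| t.internal_id == col.table)
}

/// Convenience constructor for the examples.
fn table(id: usize, name: &str) -> TableReference {
    TableReference {
        internal_id: TableInternalId(id),
        name: name.to_string(),
    }
}
```

With positional indexes, `o.amount` in the subquery would point at index 0 and silently resolve to `users` once `orders` is folded into the parent list at index 1. With an id-based lookup like the one sketched here, the same reference still resolves to `orders` regardless of its new position.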