turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-18 17:14:20 +01:00

Author	SHA1	Message	Date
Pekka Enberg	f24e254ec6	core/translate: Fix "misuse of aggregate function" error message ``` sqlite> CREATE TABLE test1(f1, f2); sqlite> SELECT SUM(min(f1)) FROM test1; Parse error: misuse of aggregate function min() SELECT SUM(min(f1)) FROM test1; ^--- error here ``` Spotted by SQLite TCL tests.	2025-07-10 14:29:59 +03:00
KaguraMilet	9d6ae78786	Merge branch 'tursodatabase:main' into distance	2025-07-10 19:15:08 +08:00
Pekka Enberg	3f10427f52	core: Fix resolve_function() error messages We need to return the original function name, not normalized one to be compatible with SQLite. Spotted by SQLite TCL tests.	2025-07-09 15:30:57 +03:00
pedrocarlo	b85687658d	change instrumentation level to INFO	2025-07-07 11:53:45 -03:00
pedrocarlo	5559c45011	more instrumentation + write counter should decrement if pwrite fails	2025-07-07 11:50:21 -03:00
pedrocarlo	897426a662	add error tracing to relevant functions + rollback transaction in step_end_write_txn + make move_to_root return result	2025-07-07 11:50:21 -03:00
KaguraMilet	ac95758f76	feat(vector): integrate euclidean distance into limbo	2025-07-07 21:11:51 +08:00
Levy A.	ffd6844b5b	refactor: remove `PseudoTable` from `Table` the only reason for `PseudoTable` to exist, is to provide column information for `PseudoCursor` creation. this should not be part of the schema.	2025-06-30 14:31:58 -03:00
Pekka Enberg	725c3e4ddc	Rename `limbo_sqlite3_parser` crate to `turso_sqlite3_parser`	2025-06-29 12:34:46 +03:00
Piotr Rzysko	116df2ec86	Fix evaluation of ISNULL/NOTNULL in OR expressions Previously, the `jump_if_condition_is_true` flag was not respected. As a result, for expressions like <`ISNULL`/`NOTNULL`> `OR` <rhs>, the <rhs> expression was evaluated even when the left-hand side was true, and its value was incorrectly used as the final result.	2025-06-27 08:21:40 +02:00
Nils Koch	2827b86917	chore: fix clippy warnings	2025-06-23 19:52:13 +01:00
Piotr Rzysko	64a0333119	Fix missing column references in non-aggregate expressions Previously, queries like: ``` SELECT CASE WHEN c0 != 'x' THEN group_concat(c1, ',') ELSE 'x' END FROM t0 GROUP BY c0; ``` would return incorrect results because c0 was not copied during the aggregation loop into a register accessible to the logic processing the grouped results (e.g., the CASE WHEN expression in this example). The same issue applied to expressions in the HAVING and ORDER BY clauses.	2025-06-20 06:19:16 +02:00
Levy A.	b88cb99ff0	fix warnings and some refactoring	2025-06-11 14:19:06 -03:00
Levy A.	49a6ddad97	wip	2025-06-11 14:19:04 -03:00
Levy A.	15e0cab8d8	refactor+fix: precompute default values from schema	2025-06-11 14:18:39 -03:00
Krishna Vishal	0d5cbc4f1d	Add affinity check as a function as `ast::Operator` impl	2025-06-11 00:33:48 +05:30
Krishna Vishal	712c94537c	Add affinity flags to `IS` and `IS NOT` operators	2025-06-11 00:33:48 +05:30
krishvishal	5837f7329f	clean up	2025-06-11 00:33:47 +05:30
krishvishal	7bd1589615	Added affinity inference and conversion for comparison ops. Added affinity helper function for `CmpInsFlags`	2025-06-11 00:33:44 +05:30
pedrocarlo	80c480517a	incorrect placeholder label in where clause translation	2025-06-10 12:00:19 -03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: `program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (Option<String> -- in case of index cursors). This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second `t`, so there is no clash.	2025-05-29 00:59:24 +03:00
Jussi Saurio	51605ad2a4	Use lifetimes in walk_expr() to guarantee that child expr has same lifetime as parent expr	2025-05-28 10:56:30 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
Jussi Saurio	40a4d162bc	Introduce walker expressions for ast::Expr	2025-05-23 15:56:27 +03:00
Jussi Saurio	c4548b51f1	Merge 'Optimization: lift common subexpressions from OR terms' from Jussi Saurio ```sql -- This PR does effectively this transformation: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); -- Same query with common conjuncts (ANDs) extracted: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where p_partkey = l_partkey and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' and ( ( p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 ) or ( p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 ) or ( p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 ) ); ``` This allows Limbo's optimizer to 1. recognize `p_partkey=l_partkey` as an index constraint on `part`, and 2. filter out `lineitem` rows before joining. With this optimization, Limbo completes TPC-H `19.sql` nearly as fast as SQLite on my machine. Without it, Limbo takes forever. This branch: `939ms` Main: `uh, i started running it a few minutes ago and it hasnt finished, and i dont feel like waiting i guess` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1520	2025-05-20 14:33:49 +03:00
Jussi Saurio	6790b7479c	Optimization: lift common subexpressions from OR terms ```sql -- This PR does effectively this transformation: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ); -- Same query with common conjuncts (ANDs) extracted: select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where p_partkey = l_partkey and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' and ( ( p_brand = 'Brand#22' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 8 and l_quantity <= 8 + 10 and p_size between 1 and 5 ) or ( p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 10 ) or ( p_brand = 'Brand#12' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 24 and l_quantity <= 24 + 10 and p_size between 1 and 15 ) ); ```	2025-05-20 14:25:15 +03:00
Jussi Saurio	3121c6cdd3	Replace Operation::Subquery with Table::FromClauseSubquery Previously the Operation enum consisted of: - Operation::Scan - Operation::Search - Operation::Subquery Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. No functional changes (intended, at least!)	2025-05-20 12:56:30 +03:00
pedrocarlo	5bd47d7462	post rebase adjustments to accommodate new instructions that were created before the merge conflicts	2025-05-19 15:22:15 -03:00
pedrocarlo	bba9689674	Fixed matching bug for defining collation context to use	2025-05-19 15:22:14 -03:00
pedrocarlo	a818b6924c	Removed repeated binary expression translation. Adjusted the set_collation to capture additional context of whether it was set by a Collate expression or not. Added some tests to prove those modifications were necessary.	2025-05-19 15:22:14 -03:00
pedrocarlo	f8854f180a	Added collation to create table columns	2025-05-19 15:22:14 -03:00
pedrocarlo	d0a63429a6	Naive implementation of collate for queries. Not implemented for column constraints	2025-05-19 15:22:14 -03:00
pedrocarlo	b5b1010e7c	set binary collation as default	2025-05-19 15:22:14 -03:00
Pekka Enberg	e3f71259d8	Rename OwnedValue -> Value We have not had enough merge conflicts for a while so let's do a tree-wide rename.	2025-05-15 09:59:46 +03:00
Jussi Saurio	5386859b44	as_binary-components: simplify	2025-05-14 09:42:26 +03:00
Jussi Saurio	bd875e3876	optimizer module split	2025-05-14 09:42:26 +03:00
Jussi Saurio	501e95637a	Merge 'Support isnull and notnull expr' from meteorgan Limbo `isnull` output: ``` limbo> explain select 1 isnull; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 3 0 0 Start at 3 1 ResultRow 1 1 0 0 output=r[1] 2 Halt 0 0 0 0 3 Integer 1 2 0 0 r[2]=1 4 Integer 1 1 0 0 r[1]=1 5 IsNull 2 7 0 0 if (r[2]==NULL) goto 7 6 Integer 0 1 0 0 r[1]=0 7 Goto 0 1 0 0 ``` Sqlite `isnull` output: ``` sqlite> explain select 1 isnull; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 6 0 0 Start at 6 1 Integer 1 1 0 0 r[1]=1 2 IsNull 2 4 0 0 if r[2]==NULL goto 4 3 Integer 0 1 0 0 r[1]=0 4 ResultRow 1 1 0 0 output=r[1] 5 Halt 0 0 0 0 6 Integer 1 2 0 0 r[2]=1 7 Goto 0 1 0 0 ``` ------------------------------------------------------------------------ ------------------- Limbo `notnull` output: ``` limbo> explain select 1 notnull; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 3 0 0 Start at 3 1 ResultRow 1 1 0 0 output=r[1] 2 Halt 0 0 0 0 3 Integer 1 2 0 0 r[2]=1 4 Integer 1 1 0 0 r[1]=1 5 NotNull 2 7 0 0 r[2]!=NULL -> goto 7 6 Integer 0 1 0 0 r[1]=0 7 Goto 0 1 0 0 ``` Sqlite `notnull` output: ``` sqlite> explain select 1 notnull; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 6 0 0 Start at 6 1 Integer 1 1 0 0 r[1]=1 2 NotNull 2 4 0 0 if r[2]!=NULL goto 4 3 Integer 0 1 0 0 r[1]=0 4 ResultRow 1 1 0 0 output=r[1] 5 Halt 0 0 0 0 6 Integer 1 2 0 0 r[2]=1 7 Goto 0 1 0 0 ``` Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1468	2025-05-12 10:06:35 +03:00
PThorpe92	ab23f2a24f	Add comments and reorganize fix of ordering parameters for insert statements	2025-05-11 14:20:57 -04:00
meteorgan	5185f4bf9e	Support isnull and notnull expr	2025-05-11 23:47:30 +08:00
PThorpe92	1c7a50de96	Update comments and correct vtab insert behavior	2025-05-10 10:03:00 -04:00
PThorpe92	0d73fe0fe7	Fix parameter position on insert by handling before vdbe layer	2025-05-10 07:46:29 -04:00
PThorpe92	50f2621c12	Add several more rust tests for parameter binding	2025-05-10 07:46:29 -04:00
PThorpe92	c4aee50b58	Fix unclear comments in translator	2025-05-10 07:46:29 -04:00
PThorpe92	d908e78729	Use positional offsets in translate::expr to remap parameters to their correct offsets	2025-05-10 07:46:27 -04:00
meteorgan	ef3f004e30	refactor numeric literal	2025-05-08 18:37:17 +08:00
meteorgan	51d43074f3	Support literal-value current_time, current_date and current_timestamp	2025-04-29 22:35:26 +08:00
Jussi Saurio	e557503091	expr.rs: use constant spans to optimize constant expressions	2025-04-24 11:05:21 +03:00
pedrocarlo	1928dcfa10	Correct docs regarding between	2025-04-21 23:05:01 -03:00
Jussi Saurio	6c73db6fd3	feat: use covering indexes whenever possible	2025-04-18 15:13:09 +03:00
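
Commit 6790b7479c above describes lifting conjuncts that are shared by every OR term out of the disjunction. The following is only a rough sketch of that idea, not the code from the commit: it assumes each OR term has already been flattened into a list of AND-ed conjuncts, and it models conjuncts as plain strings instead of AST nodes. The function name `lift_common_conjuncts` is hypothetical.

```rust
/// Hypothetical, simplified model: every OR term is a list of AND-ed
/// conjuncts, and a conjunct is just a string (the real optimizer works on
/// AST expressions, which this sketch does not attempt to reproduce).
///
/// Returns the conjuncts shared by all OR terms, plus each OR term with
/// those shared conjuncts removed.
fn lift_common_conjuncts<'a>(
    or_terms: &[Vec<&'a str>],
) -> (Vec<&'a str>, Vec<Vec<&'a str>>) {
    let Some((first, rest)) = or_terms.split_first() else {
        return (Vec::new(), Vec::new());
    };

    // A conjunct may be lifted only if it appears in every OR term.
    let mut common: Vec<&'a str> = Vec::new();
    for &conjunct in first {
        let in_all_terms = rest.iter().all(|term| term.contains(&conjunct));
        if in_all_terms && !common.contains(&conjunct) {
            common.push(conjunct);
        }
    }

    // What is left of each OR term once the shared conjuncts are removed.
    // An empty residual term would mean that branch is always true once the
    // shared conjuncts hold; a real optimizer has to handle that case.
    let residual = or_terms
        .iter()
        .map(|term| {
            term.iter()
                .copied()
                .filter(|c| !common.contains(c))
                .collect()
        })
        .collect();

    (common, residual)
}
```

Applied to the TPC-H query in the commit message, the shared conjuncts would be `p_partkey = l_partkey`, the `l_shipmode` check and the `l_shipinstruct` check, which then sit outside the OR where they can act as join and filter constraints.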


459 Commits
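
Commits 7c07c09300 and 77ce4780d9 in the log above argue for keying cursors by a stable table identifier instead of by table name or alias. A minimal sketch of why that avoids the `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` clash might look like the following. The names `TableInternalId` and `CursorKey` come from the commit messages; the field names, the `CursorRegistry` type and the method signatures are assumptions made for illustration and are not the project's actual API.

```rust
use std::collections::HashMap;

/// Stable identifier for one table reference in a query (name taken from the
/// commit message; the inner representation is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TableInternalId(usize);

/// Key combining the table reference with an optional index name, as the
/// commit message describes. Field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CursorKey {
    table_id: TableInternalId,
    index_name: Option<String>,
}

/// Hypothetical stand-in for the part of a program builder that tracks cursors.
#[derive(Default)]
struct CursorRegistry {
    cursors: HashMap<CursorKey, usize>,
}

impl CursorRegistry {
    /// Returns the cursor id for `key`, allocating the next free id on first use.
    fn alloc_cursor_id(&mut self, key: CursorKey) -> usize {
        let next_id = self.cursors.len();
        *self.cursors.entry(key).or_insert(next_id)
    }

    /// Looks up a previously allocated cursor id.
    fn resolve(&self, key: &CursorKey) -> Option<usize> {
        self.cursors.get(key).copied()
    }
}
```

With this shape, the outer and inner references to `t` carry different `TableInternalId` values, so they resolve to different cursors even though both use the same table name.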