
1856 Commits

Author SHA1 Message Date
Nikita Sivukhin
a151770cea add minimal support of index_methods in the query planner in order to make integration tests work 2025-10-27 16:34:49 +04:00
Jussi Saurio
5c05383cc1 Implement union for ColumnUsedMask 2025-10-27 13:57:56 +02:00
Jussi Saurio
3a1d6d8879 Improve error messages in translate_expr()
The current error messages are misleading, as the user may encounter
these errors in expressions outside the WHERE clause, too.
2025-10-27 13:51:59 +02:00
Jussi Saurio
de81af29e5 find_table_by_internal_id() returns whether table is an outer query reference
Unfortunately, our current translation machinery is unable to know for sure
whether a subquery reference to an outer table 't1' has opened a table cursor,
an index cursor, or both.

For this reason, return a flag from `TableReferences::find_table_by_internal_id()`
that tells the caller whether the table is an outer query reference, and further
commits will have some additional logic to decide which cursor a subquery will
read from when referencing a table from the outer query.
2025-10-27 13:47:49 +02:00
Nikita Sivukhin
8a80e8b743 rename custom modules to index_method like in postgresql 2025-10-27 13:18:18 +04:00
Nikita Sivukhin
408ca235d1 small refactoring 2025-10-27 12:43:38 +04:00
Nikita Sivukhin
299533b7b6 hide custom modules syntax behind --experimental-custom-modules flag 2025-10-27 12:29:05 +04:00
Pavan-Nambi
8d0ae362da Merge branch 'main' of github.com:tursodatabase/turso into avcm 2025-10-24 18:58:30 +05:30
Jussi Saurio
18e6a23f23 Fix foreign key constraint enforcement on UNIQUE indexes
Closes #3648

Co-authored-by: Pavan-Nambi <pavannambi999@gmail.com>
2025-10-24 11:03:55 +03:00
Jussi Saurio
ae22468d8b Merge 'Order by heap sort' from Nikita Sivukhin
This PR implements a simple heap-sort approach for query plans like
`SELECT ... FROM t WHERE ... ORDER BY ... LIMIT N` in order to maintain a
small set of the top N elements in the ephemeral B-tree and avoid sorting
and materializing the whole dataset.
I removed all optimizations not related to this particular change in
order to keep the branch lightweight.
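
As a rough illustration of the top-N idea described above, the sketch below keeps at most `limit` keys in memory while scanning. It is not the code from this PR: it uses Rust's standard `BinaryHeap` in place of the ephemeral B-tree, and the function name `top_n_ascending` and the per-row sequence number used as a tie-breaker are assumptions made for the example.

```rust
use std::collections::BinaryHeap;

/// Returns the `limit` smallest keys in ascending order while holding at
/// most `limit + 1` entries in memory at any time.
/// Hypothetical helper for illustration; not part of the turso codebase.
fn top_n_ascending(rows: impl IntoIterator<Item = i64>, limit: usize) -> Vec<i64> {
    if limit == 0 {
        return Vec::new();
    }
    // Max-heap of (key, sequence): the largest retained entry sits on top,
    // so evicting it is cheap. The sequence number breaks ties between equal
    // keys in favor of rows that were seen earlier.
    let mut heap: BinaryHeap<(i64, usize)> = BinaryHeap::new();
    for (seq, key) in rows.into_iter().enumerate() {
        heap.push((key, seq));
        if heap.len() > limit {
            heap.pop(); // drop the current largest entry
        }
    }
    // Only the retained entries are sorted, never the whole input.
    let mut kept = heap.into_vec();
    kept.sort();
    kept.into_iter().map(|(key, _)| key).collect()
}
```

With this shape, the sort at the end touches at most `limit` entries, which is the saving the description refers to.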

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3726
2025-10-23 15:00:42 +03:00
Jussi Saurio
fe51804e6b Implement crude way of making opening subtransaction conditional
We don't want something like `BEGIN IMMEDIATE` to start a subtransaction,
so instead we will open it if:

- Statement is write, AND

a) Statement has >0 table_references, or
b) The statement is an INSERT (INSERT doesn't track table_references in
   the same way as other program types)
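
The condition above can be sketched as a single predicate. The struct `StatementInfo` and its field names are hypothetical and only mirror the wording of this commit message; the actual types in the codebase may differ.

```rust
/// Hypothetical summary of a statement; names are illustrative only.
struct StatementInfo {
    is_write: bool,
    table_reference_count: usize,
    is_insert: bool,
}

/// Mirrors the rule in the commit message: open a subtransaction only for
/// write statements that either reference at least one table or are INSERTs.
fn should_open_subtransaction(stmt: &StatementInfo) -> bool {
    stmt.is_write && (stmt.table_reference_count > 0 || stmt.is_insert)
}
```

Under this rule a write statement with no table references that is not an INSERT (the `BEGIN IMMEDIATE` case mentioned above) does not open a subtransaction.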
2025-10-22 23:40:45 +03:00
Jussi Saurio
ea98d8086f Change default ON CONFLICT mode back to ABORT now that we support it 2025-10-22 23:40:45 +03:00
Jussi Saurio
ad80285437 Rename is_scope to deferred and invert respective boolean logic
Much clearer name for what it is/does
2025-10-22 23:40:44 +03:00
Jussi Saurio
6557a41503 Refactor emit_fk_violation() to always issue a FkCounter instruction 2025-10-22 23:40:44 +03:00
Nikita Sivukhin
0fb149c4c9 fix bug 2025-10-22 17:44:02 +04:00
Nikita Sivukhin
bf77862fab Merge branch 'main' into order-by-heap-sort 2025-10-22 11:44:55 +04:00
Pekka Enberg
d2d995a9c0 Merge 'Make sure explicit column aliases have binding precedence in orderby' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3684

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3709
2025-10-21 19:04:42 +03:00
Pavan-Nambi
9841f487a6 don't allow autovacuum on non-empty DBs; add an is_db_empty fn 2025-10-18 19:01:21 +05:30
PThorpe92
ddd674c340 Move duplicate table identifier checking to parse_join to allow for natural joins 2025-10-16 18:32:48 -04:00
PThorpe92
10c69b910e Prevent ambiguous self-join table reference 2025-10-16 16:39:10 -04:00
PThorpe92
edaa1b675e Prevent CREATE TABLE or opening a DB with ON CONFLICT on a column definition 2025-10-16 15:45:20 -04:00
PThorpe92
04c9eee4f1 Throw parse error on GENERATED constraint when creating new table 2025-10-16 14:27:22 -04:00
Preston Thorpe
b31908fe99 Merge 'translate/select: Fix rewriting Rowid expression when no btree table exists in joined table refs ' from Preston Thorpe
closes https://github.com/tursodatabase/turso/issues/3667

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3754
2025-10-16 14:22:51 -04:00
PThorpe92
e417188cb2 Fix panic when selecting explicit rowid from FROM clause subquery 2025-10-16 13:40:01 -04:00
PThorpe92
bd33b3fa83 Throw parse error on CHECK constraint in create table 2025-10-16 13:07:12 -04:00
Jussi Saurio
e8e583ace6 Default ON CONFLICT behavior should be ROLLBACK 2025-10-16 14:28:18 +03:00
Jussi Saurio
95f375791b refactor: move condition outside init_autoincrement 2025-10-16 09:34:13 +03:00
Jussi Saurio
25339a5200 rename: CheckConstraints -> ConstraintsToCheck
CHECK constraints are a separate SQL concept, so let's remove some
potential confusion from the naming.
2025-10-16 09:30:41 +03:00
PThorpe92
41d2a0af77 Add INSERT OR IGNORE handling and refactor INSERT further 2025-10-15 22:51:10 -04:00
Nikita Sivukhin
4b3689e9e7 avoid doing work in case of heap-sort optimization 2025-10-15 17:27:22 +04:00
Nikita Sivukhin
af4c1e8bd4 use proper register for limit 2025-10-15 17:27:22 +04:00
Nikita Sivukhin
b065e7d380 emit Sequence column for heap-sort in order to distinguish between rows with same order by key and result columns 2025-10-15 17:27:22 +04:00
Nikita Sivukhin
5868270b06 fix clippy 2025-10-15 17:27:22 +04:00
Nikita Sivukhin
1a24139359 fix limit for order by queries with heap-sort style execution 2025-10-15 17:27:22 +04:00
Nikita Sivukhin
7c919314a9 use heap-sort style algorithm for order by ... limit k queries 2025-10-15 17:27:22 +04:00
Jussi Saurio
d7a719418e Fix: outer CTEs should be available in subqueries 2025-10-15 15:15:55 +03:00
Jussi Saurio
4e6f373e3d Merge 'Fix: Evaluating expression in LIMIT and OFFSET clauses.' from
Closes #3687.
Previously, the `try_fold_expr_to_i64` function cast `NULL` to `0`
when evaluating expressions in `LIMIT` or `OFFSET` clauses. I removed
this function since evaluating the expression directly and relying on
the MustBeInt operation for casting seems to handle everything.

Closes #3695
2025-10-15 10:36:36 +03:00
Jussi Saurio
bae33cb52c Avoid unwrapping failed f64 parsing attempts 2025-10-15 09:47:47 +03:00
Jussi Saurio
b1cb897216 Merge 'Fix another "should have been rewritten" translation panic' from Jussi Saurio
Closes #2158

Closes #3702
2025-10-15 09:25:01 +03:00
Preston Thorpe
74bbb0d5a3 Merge 'Allow using indexes to iterate rows in UPDATE statements' from Jussi Saurio
Closes #2600
## Problem
Every btree has a key it is sorted by - this is the integer `rowid` for
tables and an arbitrary-sized, potentially multi-column key for indexes.
Executing an UPDATE in a loop is not safe if the update modifies any
part of the key of the btree that is used for iterating the rows in said
loop. For example:
- Using the table itself to iterate rows is not safe if the UPDATE
modifies the rowid (or rowid alias) of a row, because that changes the
iteration order itself and may cause rows to be skipped:
```sql
CREATE TABLE t(x INTEGER PRIMARY KEY, y);
INSERT <something>
UPDATE t SET y = RANDOM() where x > 100; -- safe to iterate 't', the key 'x' is not being modified
UPDATE t SET x = RANDOM() where x > 100; -- not safe to iterate 't', 'x' is being modified
```
- Using an index to iterate rows is not safe if the UPDATE modifies any
of the columns in the index key:
```sql
CREATE TABLE t(x, y, z);
CREATE INDEX txy ON t (x,y);
INSERT <something>
UPDATE t SET z = RANDOM() where x = 100 and y > 0; -- safe to iterate txy, neither x nor y is being modified
UPDATE t SET x = RANDOM() where x = 100 and y > 0; -- not safe to iterate txy, 'x' is being modified
UPDATE t SET y = RANDOM() where x = 100 and y > 0; -- not safe to iterate txy, 'y' is being modified
```
```
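Both cases above reduce to one question: does the UPDATE assign to any column that is part of the key of the b-tree being iterated? A minimal sketch of that check follows, assuming columns are identified by name; the function `update_modifies_iteration_key` is hypothetical and not taken from the codebase.

```rust
/// Returns true when any column assigned by the UPDATE belongs to the key
/// of the b-tree (table or index) used to iterate the rows.
/// Hypothetical helper; columns are compared by name for simplicity.
fn update_modifies_iteration_key(set_columns: &[&str], key_columns: &[&str]) -> bool {
    set_columns.iter().any(|column| key_columns.contains(column))
}
```

For the index `txy` above the key columns are `x` and `y`, so `SET z = ...` is safe to iterate while `SET x = ...` and `SET y = ...` are not.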
## Current solution in tursodb
Our current `main` code recognizes this issue and adopts this pseudocode
algorithm from SQLite:
- open a table or index for reading the rows of the source table,
- for each row that matches the condition in the UPDATE statement, write
the row into a temporary table
- then use that temporary table for iteration in the UPDATE loop.
This guarantees that the iteration order will not be affected by the
UPDATEs because the ephemeral table is not under modification.
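The two-phase idea can be sketched with an in-memory map standing in for the table b-tree and a `Vec` standing in for the ephemeral table. This is only an illustration under those assumptions: the function name `update_keys_via_snapshot` is hypothetical, and the sketch assumes new keys do not collide with existing ones (a real database would raise a constraint error instead).

```rust
use std::collections::BTreeMap;

/// Rewrites the key of every row whose key satisfies `matches`, using a
/// snapshot of the matching keys taken before any row is modified.
/// Returns the number of rows updated. Hypothetical helper for illustration.
fn update_keys_via_snapshot(
    table: &mut BTreeMap<i64, String>,
    matches: impl Fn(i64) -> bool,
    new_key: impl Fn(i64) -> i64,
) -> usize {
    // Phase 1: record the matching keys in a scratch list, which plays the
    // role of the ephemeral table.
    let snapshot: Vec<i64> = table.keys().copied().filter(|k| matches(*k)).collect();

    // Phase 2: iterate the scratch list, not the table, so moving a row to a
    // new key cannot cause it to be visited again or another row to be skipped.
    let mut updated = 0;
    for old_key in snapshot {
        if let Some(payload) = table.remove(&old_key) {
            // Assumes the new key is not already present in the table.
            table.insert(new_key(old_key), payload);
            updated += 1;
        }
    }
    updated
}
```

Iterating the table directly would revisit a row whenever its new key still satisfies the condition and sorts after the current position; the snapshot avoids that.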
## Problem with current solution
Our `main` code special-cases the ephemeral table solution to rowids /
rowid aliases only. Using indexes for UPDATE iteration was disabled in
an earlier PR (#2599) due to the safety issue mentioned above, which
means that many UPDATE statements become full table scans:
```sql
turso> create table t(x PRIMARY KEY);
turso> insert into t select value from generate_series(1,10000);
turso> explain update t set x = x + 100000 where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     28    0                    0   Start at 28
1     OpenWrite          0     2     0                    0   root=2; iDb=0
2     OpenWrite          1     3     0                    0   root=3; iDb=0
-- scan entire 't' despite very narrow update range!
3     Rewind             0     27    0                    0   Rewind table t
...
```
## Solution
We move the ephemeral table logic to _after_ the optimizer has selected
the best access path for the table, and then, if the UPDATE modifies the
key of the chosen access path (table or index; whichever was selected by
the optimizer), we change the plan to include the ephemeral table
prepopulation. Hence, the same query from above becomes:
```sql
turso> explain update t set x = x + 100000 where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     35    0                    0   Start at 35
1     OpenEphemeral      0     1     0                    0   cursor=0 is_table=true
2     OpenRead           1     3     0                    0   index=sqlite_autoindex_t_1, root=3, iDb=0
3     Integer            50    2     0                    0   r[2]=50
-- index seek on PRIMARY KEY index
4     SeekGT             1     10    2                    0   key=[2..2]
5       Integer          60    2     0                    0   r[2]=60
6       IdxGE            1     10    2                    0   key=[2..2]
7       IdxRowId         1     1     0                    0   r[1]=cursor 1 for index sqlite_autoindex_t_1.rowid
8       Insert           0     3     1     ephemeral_scratch  2   intkey=r[1] data=r[3]
9     Next               1     6     0                    0   
10    OpenWrite          2     2     0                    0   root=2; iDb=0
11    OpenWrite          3     3     0                    0   root=3; iDb=0
-- only scan rows that were inserted into the ephemeral table
12    Rewind             0     34    0                    0   Rewind table ephemeral_scratch
13      RowId            0     5     0                    0   r[5]=ephemeral_scratch.rowid
```
Note that an ephemeral table does not have to be used if the key of the
chosen access path is not affected:
```sql
turso> create table t(x PRIMARY KEY, data);
turso> explain update t set data = 'some_data' where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     15    0                    0   Start at 15
1     OpenWrite          0     2     0                    0   root=2; iDb=0
2     OpenWrite          1     3     0                    0   root=3; iDb=0
3     Integer            50    1     0                    0   r[1]=50
-- direct index seek
4     SeekGT             1     14    1                    0   key=[1..1]
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3728
2025-10-14 16:11:25 -04:00
PThorpe92
792877d421 add doc comments to InsertEmitCtx 2025-10-14 13:22:32 -04:00
PThorpe92
20bdb1133d fix clippy warnings 2025-10-14 13:00:31 -04:00
PThorpe92
22e98964cc Refactor INSERT translation to a modular setup with emitter context 2025-10-14 12:48:34 -04:00
Pavan-Nambi
796ff4b2ac resolve explicit aliases for canonical col binding 2025-10-14 20:46:15 +05:30
Jussi Saurio
b3be21f472 Do not count ephemeral table INSERTs as changes 2025-10-14 16:15:20 +03:00
Jussi Saurio
87434b8a72 Do not count DELETEs occurring in an UPDATE stmt as separate changes 2025-10-14 16:11:43 +03:00
Jussi Saurio
0173d31c04 clippy: collapse nested if 2025-10-14 15:51:31 +03:00
Jussi Saurio
4b80678898 Allow case where cursor for btree is already opened
When populating an ephemeral table for UPDATE, it may open a cursor
on the (permanent) table - in this case we don't need to open it
again in the UPDATE loop
2025-10-14 15:32:48 +03:00
Jussi Saurio
f5ee4807da Properly differentiate between source and target in UPDATE
- Encode information about ephemeral source table in OperationMode::UPDATE
  if present
- Use OperationMode information to correctly resolve cursors in UPDATE
2025-10-14 14:17:28 +03:00
Jussi Saurio
691dce6b8a Make decision about UpdatePlan::ephemeral_plan _after_ optimizer
An ephemeral table is required if the b-tree key of the table (rowid)
or the index (index key) is affected by the UPDATE.
2025-10-14 14:17:28 +03:00