Commit Graph

10285 Commits

Author SHA1 Message Date
Preston Thorpe
7cc351afff Merge 'translate/insert: more refactoring and support INSERT OR IGNORE' from Preston Thorpe
`INSERT OR IGNORE INTO t VALUES (...)` can trivially be rewritten to
`INSERT INTO t VALUES (...) ON CONFLICT DO NOTHING`
This PR does this rewriting, as well as finishes a large refactor on
INSERT translation in general.. I just need a break from the rest of
this feature tbh.. just was getting under my skin and I have been in
`translate` land for too long.

Closes #3742
2025-10-16 08:18:28 -04:00
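The rewrite described in the INSERT OR IGNORE merge above can be tried against stock SQLite (not Turso) with Python's standard `sqlite3` module. This is only a sketch: the table, the values, and the `rows_after` helper are made up for illustration, it exercises a PRIMARY KEY uniqueness conflict only, and it assumes the bundled SQLite supports upsert syntax (3.24 or later).

```python
import sqlite3


def rows_after(insert_sql):
    """Run one INSERT against a fresh table that already holds x = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t(x INTEGER PRIMARY KEY, y TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'original')")
    conn.execute(insert_sql)
    rows = conn.execute("SELECT x, y FROM t ORDER BY x").fetchall()
    conn.close()
    return rows


# Row 1 conflicts with the existing primary key; row 2 does not.
or_ignore = rows_after(
    "INSERT OR IGNORE INTO t VALUES (1, 'dup'), (2, 'new')"
)
do_nothing = rows_after(
    "INSERT INTO t VALUES (1, 'dup'), (2, 'new') ON CONFLICT DO NOTHING"
)

print(or_ignore)
print(do_nothing)
```

Both forms keep the existing row and insert only the non-conflicting one.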
Jussi Saurio
e8e583ace6 Default ON CONFLICT behavior should be ROLLBACK 2025-10-16 14:28:18 +03:00
Pekka Enberg
e9c0fdcb4b Turso 0.3.0-pre.2 2025-10-16 11:31:30 +03:00
Pekka Enberg
a450a43d6d dist: Add Linux/arm64 target for install package 2025-10-16 11:30:30 +03:00
Pekka Enberg
b64ce77e47 Merge 'core: Don't run build.rs in debug mode' from Pedro Muniz
Hopefully this will help a bit with compile times when developing and
with `cargo check`

Closes #3744
2025-10-16 10:52:03 +03:00
Pekka Enberg
f8e8ed6228 Merge 'Run SQLite integrity check after stress test run' from Pedro Muniz
Closes #2916

Closes #3730
2025-10-16 10:07:09 +03:00
Jussi Saurio
95f375791b refactor: move condition outside init_autoincrement 2025-10-16 09:34:13 +03:00
Jussi Saurio
25339a5200 rename: CheckConstraints -> ConstraintsToCheck
CHECK constraints is a separate SQL concept, so let's remove some
potential confusion from the naming.
2025-10-16 09:30:41 +03:00
Pekka Enberg
012ac00e46 Merge 'Document ThreadSanitizer in CONTRIBUTING.md' from Pekka Enberg
Closes #3739
2025-10-16 08:43:33 +03:00
pedrocarlo
2a1be48f3a do not run build.rs on debug mode 2025-10-16 01:22:54 -03:00
PThorpe92
3112f55e05 Add TCL tests for INSERT OR IGNORE handling 2025-10-15 22:51:10 -04:00
PThorpe92
41d2a0af77 Add INSERT OR IGNORE handling and refactor INSERT further 2025-10-15 22:51:10 -04:00
Preston Thorpe
0ae53e6270 Merge 'tests/fuzz: Accept SEED env var for all fuzz tests' from Preston Thorpe
Closes #3741
2025-10-15 22:50:57 -04:00
PThorpe92
48eb456a12 Accept SEED env var for all fuzz tests 2025-10-15 20:08:13 -04:00
Preston Thorpe
4873103660 Merge 'Fix: outer CTEs should be available in subqueries' from Jussi Saurio
Closes #3670

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3737
2025-10-15 19:07:42 -04:00
Pekka Enberg
6fe9ea0925 Merge 'Make Rust bindings actually async' from Pedro Muniz
This PR introduces a `Context` object that is stored in the `Completion`
that currently only stores a `Waker`. In the future, I want to add some
sort of abort signal so that we can abort tasks that share the same
Context. To pass the Waker, I introduced a `step_with_waker` function in
`Statement` that delegates to an internal `_step` function. `_step` is
the previous `step` but just with the `Option<&Waker>` argument.
I was going to try and have the BusyHandler be truly async as well, but
I decided to not do it here, because it will be slightly complicated to
achieve.

Closes #3535
2025-10-15 19:38:24 +03:00
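The waker idea in the async bindings merge above can be pictured with a loose Python analogy. This is not the Rust API: the `Completion` and `Statement` classes below are hypothetical stand-ins, the "waker" is just a callback, and the I/O is faked with a timer. The sketch only shows the shape described in the PR: `step_with_waker` delegates to an internal `_step` that takes an optional waker, and the completion calls the waker when it finishes so the waiting task does not have to poll.

```python
import asyncio


class Completion:
    """Hypothetical stand-in for an I/O completion that may carry a waker."""

    def __init__(self, waker=None):
        self.waker = waker
        self.result = None
        self.done = False

    def complete(self, result):
        self.result = result
        self.done = True
        if self.waker is not None:
            self.waker()  # notify the waiting task instead of making it poll


class Statement:
    """Hypothetical statement whose I/O is simulated with an event-loop timer."""

    def __init__(self, loop):
        self._loop = loop

    def _step(self, waker=None):
        completion = Completion(waker)
        # Pretend the I/O finishes a little later.
        self._loop.call_later(0.01, completion.complete, "row 1")
        return completion

    def step(self):
        # Old entry point: no waker, the caller would have to poll `done`.
        return self._step(None)

    def step_with_waker(self, waker):
        return self._step(waker)


async def next_row(stmt):
    woke = asyncio.Event()
    completion = stmt.step_with_waker(woke.set)
    await woke.wait()  # suspended until the completion calls the waker
    return completion.result


async def main():
    stmt = Statement(asyncio.get_running_loop())
    return await next_row(stmt)


row = asyncio.run(main())
print(row)
```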
Pekka Enberg
08f010b969 Document ThreadSanitizer in CONTRIBUTING.md 2025-10-15 18:16:31 +03:00
Jussi Saurio
d7a719418e Fix: outer CTEs should be available in subqueries 2025-10-15 15:15:55 +03:00
Jussi Saurio
4e6f373e3d Merge 'Fix: Evaluating expression in LIMIT and OFFSET clauses.' from
Closes #3687.
Previously, the `try_fold_expr_to_i64` function cast `NULL` to `0`
when evaluating expressions in `LIMIT` or `OFFSET` clauses. I removed
this function since evaluating the expression directly and relying on
the MustBeInt operation for casting seems to handle everything.

Closes #3695
2025-10-15 10:36:36 +03:00
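For reference, the LIMIT/OFFSET behaviour discussed in the merge above can be probed against stock SQLite (not Turso) with Python's standard `sqlite3` module. The table and values are made up for illustration. The sketch records what the local SQLite does with `LIMIT NULL` rather than assuming an outcome.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 6)])

# LIMIT and OFFSET accept expressions, not only integer literals.
rows = conn.execute(
    "SELECT x FROM t ORDER BY x LIMIT 1 + 1 OFFSET 2 - 1"
).fetchall()
print(rows)

# Probe the NULL case and report whatever the local SQLite does with it.
try:
    null_limit = conn.execute("SELECT x FROM t ORDER BY x LIMIT NULL").fetchall()
except sqlite3.Error as exc:
    null_limit = f"error: {exc}"
print(null_limit)
```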
Pekka Enberg
d3aad90820 Merge 'perf/throughput: force sqlite to use fullfsync' from Pedro Muniz
I was sampling our performance and noticed that in the throughput test,
we were only setting the `PRAGMA synchronous = full` in `setup_database`
and not in the connection. Also we were not setting `PRAGMA fullfsync =
true`.
Now SQLite numbers and Turso numbers are much closer:
`Turso,1,1000,0,215095.94`
vs
`SQLite,1,1000,0,219748.39`
Specs: M2 Air, 8 GB
related: #3027

Closes #3734
2025-10-15 10:24:13 +03:00
Jussi Saurio
bae33cb52c Avoid unwrapping failed f64 parsing attempts 2025-10-15 09:47:47 +03:00
Jussi Saurio
25cf56b8e8 Fix expected error message 2025-10-15 09:41:44 +03:00
Pekka Enberg
ab2911b370 Merge 'Fix change counter incrementation' from Jussi Saurio
We merged two concurrent fixes to `nchange` handling last night and
AFAICT the fix in #3692 was incorrect because it doesn't count UPDATEs
in cases where the original row was DELETEd as part of the UPDATE
statement.
The correct fix was in 87434b8
EDIT: okay, it's not strictly _incorrect_ in #3692 I guess, I just think
it's more intuitive to increment the change for UPDATE in the `insert`
opcode because that's what performs the actual update.

Closes #3735
2025-10-15 09:28:17 +03:00
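As a reference point for the change-counter fix above, stock SQLite (not Turso) counts one change per updated row even when the UPDATE moves the row to a new rowid. A minimal check with Python's standard `sqlite3` module, using a made-up table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# x is a rowid alias, so changing x relocates the row inside the table btree.
conn.execute("CREATE TABLE t(x INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

conn.execute("UPDATE t SET x = x + 100 WHERE x >= 2")
changes = conn.execute("SELECT changes()").fetchone()[0]
remaining = conn.execute("SELECT x FROM t ORDER BY x").fetchall()

print(changes, remaining)
```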
Jussi Saurio
0d8a3dda8c Merge 'sql_generation: Fix implementation of LTValue and GTValue for Text types' from Jussi Saurio
## Background
Simulator wants to create predicates that it knows will be Greater or
Less than some known value. It uses `LTValue` and `GTValue` for
generating these.
## Problem
Current implementation simply decrements or increments a random char by
1, and can thus generate strings with control characters like null
terminators that result in parse errors, as seen in e.g. this CI run
https://github.com/tursodatabase/turso/actions/runs/18459131141/job/52586305749?pr=3702
of PR #3702
EDIT: I realized the _actual_ problem is in `GTValue` when it decides to
make the string longer, it uses a random char value from `0..255` which
can include null terminators etc. Fixed that too. I think in general
this PR's approach is a bit more predictable so let's keep it.
## Solution
Restrict string mutations to ascii string characters so that the
mutation always results in another ascii string character.

Closes #3708
2025-10-15 09:25:17 +03:00
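The approach described in the LTValue/GTValue merge above, restricting string mutations to printable ASCII, could look roughly like the following Python sketch. It is not the simulator's Rust code: the names `gt_text` and `lt_text` are hypothetical, and the mutation strategy (bump one character or change the length) is an assumption chosen to keep the result strictly ordered under bytewise comparison.

```python
import random

PRINTABLE = [chr(c) for c in range(0x20, 0x7F)]  # ' ' through '~'


def gt_text(s, rng):
    """Return a printable-ASCII string that sorts strictly after `s`."""
    # Positions whose character can be incremented and stay printable.
    bumpable = [i for i, c in enumerate(s) if " " <= c < "~"]
    if bumpable and rng.random() < 0.5:
        i = rng.choice(bumpable)
        return s[:i] + chr(ord(s[i]) + 1) + s[i + 1:]
    # Otherwise make the string longer; `s` is a proper prefix, so it sorts first.
    return s + rng.choice(PRINTABLE)


def lt_text(s, rng):
    """Return a string that sorts strictly before `s`, or None if `s` is empty."""
    if not s:
        return None  # nothing sorts before the empty string
    # Positions whose character can be decremented and stay printable.
    lowerable = [i for i, c in enumerate(s) if " " < c <= "~"]
    if lowerable and rng.random() < 0.5:
        i = rng.choice(lowerable)
        return s[:i] + chr(ord(s[i]) - 1) + s[i + 1:]
    # Otherwise drop the last character; a proper prefix sorts first.
    return s[:-1]


rng = random.Random(0)
samples = ["", "a", "abc", "~~~", "   ", "Zz9"]
checks = [(s, gt_text(s, rng), lt_text(s, rng)) for s in samples]
for s, greater, lesser in checks:
    print(repr(s), repr(greater), repr(lesser))
```

Because every generated character comes from the printable range, the mutated strings never contain NUL or other control characters.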
Jussi Saurio
b1cb897216 Merge 'Fix another "should have been rewritten" translation panic' from Jussi Saurio
Closes #2158

Closes #3702
2025-10-15 09:25:01 +03:00
Jussi Saurio
2791f2f479 Fix change counter incrementation
We merged two concurrent fixes to `nchange` handling last night and
AFAICT the fix in #3692 was incorrect because it doesn't count UPDATEs
in cases where the original row was DELETEd as part of the UPDATE
statement.

The correct fix was in 87434b8
2025-10-15 08:51:27 +03:00
Preston Thorpe
e5a74b347a Merge 'relax check in the vector test' from Nikita Sivukhin
- fixes https://github.com/tursodatabase/turso/issues/3732

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3733
2025-10-14 16:14:33 -04:00
Preston Thorpe
74bbb0d5a3 Merge 'Allow using indexes to iterate rows in UPDATE statements' from Jussi Saurio
Closes #2600
## Problem
Every btree has a key it is sorted by - this is the integer `rowid` for
tables and an arbitrary-sized, potentially multi-column key for indexes.
Executing an UPDATE in a loop is not safe if the update modifies any
part of the key of the btree that is used for iterating the rows in said
loop. For example:
- Using the table itself to iterate rows is not safe if the UPDATE
modifies the rowid (or rowid alias) of a row, because since it modifies
the iteration order itself, it may cause rows to be skipped:
```sql
CREATE TABLE t(x INTEGER PRIMARY KEY, y);
-- INSERT <something>
UPDATE t SET y = RANDOM() where x > 100; -- safe to iterate 't', 'y' is not part of the key
UPDATE t SET x = RANDOM() where x > 100; -- not safe to iterate 't', 'x' is being modified
```
- Using an index to iterate rows is not safe if the UPDATE modifies any
of the columns in the index key
```sql
CREATE TABLE t(x, y, z);
CREATE INDEX txy ON t (x,y);
-- INSERT <something>
UPDATE t SET z = RANDOM() where x = 100 and y > 0; -- safe to iterate txy, neither x nor y is being modified
UPDATE t SET x = RANDOM() where x = 100 and y > 0; -- not safe to iterate txy, 'x' is being modified
UPDATE t SET y = RANDOM() where x = 100 and y > 0; -- not safe to iterate txy, 'y' is being modified
```
## Current solution in tursodb
Our current `main` code recognizes this issue and adopts this pseudocode
algorithm from SQLite:
- open a table or index for reading the rows of the source table,
- for each row that matches the condition in the UPDATE statement, write
the row into a temporary table
- then use that temporary table for iteration in the UPDATE loop.
This guarantees that the iteration order will not be affected by the
UPDATEs because the ephemeral table is not under modification.
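The hazard and the ephemeral-table remedy described above can be reproduced in a toy model. The Python sketch below is not how tursodb or SQLite implement cursors: the "btree" is a dict walked in key order, and all function names are hypothetical. In this model the symptom of updating the iteration key in place is that rows are visited and updated twice; other cursor implementations can skip rows instead.

```python
import bisect


def update_while_iterating(rows, matches, new_key):
    """Toy model: walk a sorted-key 'btree' while moving rows to new keys."""
    table = dict(rows)
    current = None
    while True:
        keys = sorted(table)
        # Cursor advance: the smallest key strictly greater than the current one.
        i = 0 if current is None else bisect.bisect_right(keys, current)
        if i == len(keys):
            break
        current = keys[i]
        if matches(current):
            table[new_key(current)] = table.pop(current)
    return table


def update_via_snapshot(rows, matches, new_key):
    """Collect matching keys first (the 'ephemeral table'), then update."""
    table = dict(rows)
    snapshot = [k for k in sorted(table) if matches(k)]
    for k in snapshot:
        table[new_key(k)] = table.pop(k)
    return table


# Roughly: UPDATE t SET x = x + 10 WHERE x < 15, with x as the iteration key.
rows = {1: "a", 2: "b", 3: "c"}
matches = lambda x: x < 15
new_key = lambda x: x + 10

unsafe = update_while_iterating(rows, matches, new_key)
safe = update_via_snapshot(rows, matches, new_key)
print(unsafe)
print(safe)
```

The in-place walk meets the moved rows again at keys 11, 12 and 13 and bumps them a second time, while the snapshot version updates each row exactly once.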
## Problem with current solution
Our `main` code special-cases the ephemeral table solution to rowids /
rowid aliases only. Using indexes for UPDATE iteration was disabled in
an earlier PR (#2599) due to the safety issue mentioned above, which
means that many UPDATE statements become full table scans:
```sql
turso> create table t(x PRIMARY KEY);
turso> insert into t select value from generate_series(1,10000);
turso> explain update t set x = x + 100000 where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     28    0                    0   Start at 28
1     OpenWrite          0     2     0                    0   root=2; iDb=0
2     OpenWrite          1     3     0                    0   root=3; iDb=0
-- scan entire 't' despite very narrow update range!
3     Rewind             0     27    0                    0   Rewind table t
...
```
## Solution
We move the ephemeral table logic to _after_ the optimizer has selected
the best access path for the table, and then, if the UPDATE modifies the
key of the chosen access path (table or index; whichever was selected by
the optimizer), we change the plan to include the ephemeral table
prepopulation. Hence, the same query from above becomes:
```sql
turso> explain update t set x = x + 100000 where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     35    0                    0   Start at 35
1     OpenEphemeral      0     1     0                    0   cursor=0 is_table=true
2     OpenRead           1     3     0                    0   index=sqlite_autoindex_t_1, root=3, iDb=0
3     Integer            50    2     0                    0   r[2]=50
-- index seek on PRIMARY KEY index
4     SeekGT             1     10    2                    0   key=[2..2]
5       Integer          60    2     0                    0   r[2]=60
6       IdxGE            1     10    2                    0   key=[2..2]
7       IdxRowId         1     1     0                    0   r[1]=cursor 1 for index sqlite_autoindex_t_1.rowid
8       Insert           0     3     1     ephemeral_scratch  2   intkey=r[1] data=r[3]
9     Next               1     6     0                    0   
10    OpenWrite          2     2     0                    0   root=2; iDb=0
11    OpenWrite          3     3     0                    0   root=3; iDb=0
-- only scan rows that were inserted into the ephemeral table
12    Rewind             0     34    0                    0   Rewind table ephemeral_scratch
13      RowId            0     5     0                    0   r[5]=ephemeral_scratch.rowid
```
Note that the ephemeral table is not needed if the UPDATE does not modify
the key of the chosen index:
```sql
turso> create table t(x PRIMARY KEY, data);
turso> explain update t set data = 'some_data' where x > 50 and x < 60;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     15    0                    0   Start at 15
1     OpenWrite          0     2     0                    0   root=2; iDb=0
2     OpenWrite          1     3     0                    0   root=3; iDb=0
3     Integer            50    1     0                    0   r[1]=50
-- direct index seek
4     SeekGT             1     14    1                    0   key=[1..1]
```
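The decision described in this Solution section, made after the optimizer has picked an access path, reduces to checking whether the UPDATE touches any column of that path's key. A minimal Python sketch follows; the function name and the way keys and columns are represented are hypothetical, not the planner's actual data structures.

```python
def needs_ephemeral_table(access_path_key, updated_columns):
    """True if the UPDATE modifies any column of the chosen access path's key.

    access_path_key: key columns of the btree the optimizer chose to iterate
                     (the rowid alias for a table scan, or the index columns).
    updated_columns: columns assigned in the UPDATE's SET clause.
    """
    return any(col in updated_columns for col in access_path_key)


# Index txy on t(x, y), taken from the examples earlier in the text.
txy = ("x", "y")
set_z = needs_ephemeral_table(txy, {"z"})  # iterate txy directly
set_x = needs_ephemeral_table(txy, {"x"})  # prepopulate an ephemeral table
set_y = needs_ephemeral_table(txy, {"y"})  # prepopulate an ephemeral table

# PRIMARY KEY index on t(x): updating `data` leaves the key untouched.
pk_index = ("x",)
set_data = needs_ephemeral_table(pk_index, {"data"})
set_pk = needs_ephemeral_table(pk_index, {"x"})

print(set_z, set_x, set_y, set_data, set_pk)
```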

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3728
2025-10-14 16:11:25 -04:00
Preston Thorpe
a6b0778fb3 Merge 'Refactor INSERT translation to a modular setup with emitter context' from Preston Thorpe
This PR contains NO semantic changes at all, this simply refactors
existing INSERT code to be easier to reason about.
Very sorry I know I've been working on `INSERT OR IGNORE|REPLACE|etc..`
for days now but the insert translation was literally unbearable and I
got it working but I barely could wrap my head around that whole
`translate_insert` function, so I spent a bunch of time refactoring the
whole INSERT handling out into different "Plans"...  which turned into a
whole different clusterf***... So I just went back and made the existing
insert emission more modular and created some context that can make it
easier to reason about.
This should be able to just be merged quickly

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3731
2025-10-14 16:10:43 -04:00
pedrocarlo
aabc7b87a4 perf/throughput force sqlite to use fullfsync 2025-10-14 15:50:45 -03:00
Pere Diaz Bou
1a464664a7 Merge 'increment Changes() only once conditionally ' from Pavan Nambi
closes #3688

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3692
2025-10-14 20:26:04 +02:00
Nikita Sivukhin
427a145663 fmt 2025-10-14 22:22:14 +04:00
Pere Diaz Bou
a2097188f0 Merge 'make comparison case sensitive' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3672

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3686
2025-10-14 20:20:02 +02:00
Nikita Sivukhin
9dac7e00ba relax check in the vector test
- fixes https://github.com/tursodatabase/turso/issues/3732
2025-10-14 22:19:19 +04:00
PThorpe92
792877d421 add doc comments to InsertEmitCtx 2025-10-14 13:22:32 -04:00
PThorpe92
20bdb1133d fix clippy warnings 2025-10-14 13:00:31 -04:00
pedrocarlo
d3bb8beb17 Run SQLite integrity check after stress test run 2025-10-14 13:50:50 -03:00
PThorpe92
22e98964cc Refactor INSERT translation to a modular setup with emitter context 2025-10-14 12:48:34 -04:00
pedrocarlo
818a68b3dd ignore busy errors for test_concurrent_unique_constraint_regression 2025-10-14 12:33:36 -03:00
pedrocarlo
23380a58d7 make next truly async and non blocking 2025-10-14 12:33:36 -03:00
pedrocarlo
ff955aeee9 simplify clock code by using a common struct 2025-10-14 12:33:36 -03:00
pedrocarlo
943ade7293 pass waker to completion for more efficient task scheduling 2025-10-14 12:33:36 -03:00
pedrocarlo
0d95a2924a pass optional waker to step 2025-10-14 12:33:36 -03:00
pedrocarlo
e64aa5d014 add tokio console to write-throughput test 2025-10-14 12:33:36 -03:00
Jussi Saurio
3cbdf433a9 fuzz: update multiple columns in table_index_mutation_fuzz 2025-10-14 17:26:21 +03:00
Jussi Saurio
0ae4425e4c fuzz: create multi-column indices in table_index_mutation_fuzz 2025-10-14 17:23:21 +03:00
Jussi Saurio
b3b07252dc Add TCL smoke tests for UPDATEs affecting indexes 2025-10-14 16:25:05 +03:00
Jussi Saurio
b3be21f472 Do not count ephemeral table INSERTs as changes 2025-10-14 16:15:20 +03:00
Jussi Saurio
87434b8a72 Do not count DELETEs occurring in an UPDATE stmt as separate changes 2025-10-14 16:11:43 +03:00
Jussi Saurio
0173d31c04 clippy: collapse nested if 2025-10-14 15:51:31 +03:00