Commit Graph

8754 Commits

Author SHA1 Message Date
Pere Diaz Bou
e87226548c core/mvcc: fix concurrent tests mvcc 2025-09-12 13:49:40 +00:00
Pere Diaz Bou
9b6d181be4 wal: add hacky update max frame for mvcc use
When multiple transactions write concurrently in MVCC, the max frame will
be updated. From the point of view of the other transaction, this new
max_frame makes it return busy because its current WAL snapshot is outdated.
2025-09-12 13:49:14 +00:00
Pere Diaz Bou
66b5630870 vdbe/mvcc: rollback mvcc txn on vdbe error 2025-09-12 13:47:45 +00:00
Preston Thorpe
b09dcceeef Merge 'Fixes views' from Glauber Costa
This is a collection of fixes for materialized views ahead of adding
support for JOINs.
It is mostly issues with how we assume there is a single table, with a
single delta, but we have to send more than one.
Those are things that are just objectively wrong, so I am sending it
separately to make the JOIN PR smaller.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3009
2025-09-12 07:43:32 -04:00
Preston Thorpe
16a3410934 Merge 'Fix checkpoint fast-path, don't use cached pages w/o write lock' from Preston Thorpe
closes #3024
Don't use pages from the cache unless we hold an exclusive write lock,
because a page could be updated by a writer in-memory at any point
before we backfill it.
Clear the WAL tag in other areas to prevent any stale tags. Also, we
will just snapshot the page when we determine that it's eligible, and
pay a memcpy instead of the read from disk, but this further prevents
any in-memory changes to the page/TOCTOU issues, and we also assert that
it's still eligible after we copy it to a new buffer.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3036
2025-09-12 07:39:32 -04:00
Preston Thorpe
f55023acc8 Merge 'Refactor UPSERT to use walk_expr_mut to walk AST.' from Preston Thorpe
Working on https://github.com/tursodatabase/turso/issues/2964 I came
upon `walk_expr_mut`, I don't think it existed last time I really spent
much time in the translator. So quickly went back and cleaned this up.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3044
2025-09-12 06:45:13 -04:00
PThorpe92
f60ca3970f Remove old comment from wal 2025-09-12 06:39:59 -04:00
PThorpe92
faf3531a4e Fix checkpoint fast-path, don't use cached pages w/o write lock
closes #3024
Also we snapshot the page when we determine that it's eligible, and pay a
memcpy instead of the read from disk, but this further prevents any in-memory
changes to the page/TOCTOU issues.
2025-09-12 06:38:02 -04:00
Pekka Enberg
6a992e551c Merge 'core: Fix reprepare to properly reset statement cursors and registers' from Pedro Muniz
Before we were not updating the number of registers and cursors, which
meant that on a schema change the Program could now open an additional
cursor and we would not have space for it in the ProgramState, which
led to the panic.
Closes #3002

Closes #3034
2025-09-12 12:29:53 +03:00
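The reprepare fix described in the entry above (6a992e551c) can be illustrated with a minimal sketch. The `Program` and `ProgramState` types below are hypothetical stand-ins that share only their names with the real ones, and `reset_for` is an assumed helper, not the project's API. The idea shown is that state sized for the old program must be resized when a schema change produces a program with more cursors or registers.

```rust
/// Hypothetical compiled program: only the sizes matter for this sketch.
pub struct Program {
    pub n_cursors: usize,
    pub n_registers: usize,
}

/// Hypothetical per-execution state with one slot per cursor and register.
pub struct ProgramState {
    pub cursors: Vec<Option<u32>>,
    pub registers: Vec<i64>,
}

impl ProgramState {
    /// Allocates state sized for `program`.
    pub fn new(program: &Program) -> Self {
        ProgramState {
            cursors: vec![None; program.n_cursors],
            registers: vec![0; program.n_registers],
        }
    }

    /// Called after a reprepare: resize to the new program's counts so every
    /// cursor and register index the new instructions use is in bounds.
    pub fn reset_for(&mut self, program: &Program) {
        self.cursors.clear();
        self.cursors.resize(program.n_cursors, None);
        self.registers.clear();
        self.registers.resize(program.n_registers, 0);
    }
}
```

Without the resize step, a reprepared program that opens a second cursor would index past the end of a one-slot `cursors` vector, which corresponds to the panic the commit message describes.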
Pekka Enberg
162c3a5644 Merge 'Commit uncommitted whopper lockfile' from Jussi Saurio
Closes #3048
2025-09-12 10:13:41 +03:00
Jussi Saurio
4f7ffa0e62 Commit uncommitted whopper lockfile 2025-09-12 08:43:59 +03:00
Pekka Enberg
aa32574554 core/mvcc: Fix begin_exclusive_tx()
The RwLock elimination patches conflicted with the BEGIN CONCURRENT
changes.
2025-09-12 08:42:14 +03:00
Pekka Enberg
a9a48f6272 Merge 'core/schema: Optimize get_dependent_materialized_views() when no views' from Pekka Enberg
Eliminates get_dependent_materialized_views() overhead when there are no
views. Note that we need to optimize the case when there are views as
well because this ends up being pretty hot in write-intensive workloads.

Closes #3046
2025-09-12 08:29:24 +03:00
Pekka Enberg
a349e7684a Merge 'perf: Add simple throughput benchmark' from Pekka Enberg
This adds a simple throughput benchmark for rusqlite and Turso, making it
possible to compare the two, as well as MVCC and SQLite transactions.

Closes #3047
2025-09-12 08:29:14 +03:00
Pekka Enberg
06371d8894 Merge 'Add BEGIN CONCURRENT support for MVCC mode' from Pekka Enberg
Currently, when MVCC is enabled, every transaction mode supports
concurrent reads and writes, which makes it hard to adopt for existing
applications that use `BEGIN DEFERRED` or `BEGIN IMMEDIATE`.
Therefore, add support for `BEGIN CONCURRENT` transactions when MVCC is
enabled. The transaction mode allows multiple concurrent read/write
transactions that don't block each other, with conflicts resolved at
commit time. Furthermore, implement the correct semantics for `BEGIN
DEFERRED` and `BEGIN IMMEDIATE` by taking advantage of the pager level
write lock when a transaction upgrades to write. This means that now
concurrent MVCC transactions are serialized against the legacy ones when
needed.
The implementation includes:
- Parser support for CONCURRENT keyword in BEGIN statements
- New Concurrent variant in TransactionMode to distinguish from regular
read/write transactions
- MVCC store tracking of exclusive transactions to support IMMEDIATE and
EXCLUSIVE modes alongside CONCURRENT
- Proper transaction state management for all transaction types in MVCC
This enables better concurrency for applications that can handle
optimistic concurrency control, while still supporting traditional
SQLite transaction semantics via IMMEDIATE and EXCLUSIVE modes.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3021
2025-09-12 07:38:53 +03:00
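The transaction modes discussed in the entry above (06371d8894) can be sketched as follows. This is an illustration under stated assumptions, not the project's parser: `parse_begin` and `serializes_writes` are hypothetical helpers, and only the `TransactionMode` name and its `Concurrent` variant come from the commit message. Treating EXCLUSIVE like IMMEDIATE for write serialization is also an assumption of this sketch.

```rust
/// Transaction modes named in the commit message. `Concurrent` is the new one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransactionMode {
    Deferred,
    Immediate,
    Exclusive,
    Concurrent,
}

/// Hypothetical, simplified recognizer for `BEGIN [mode]` statements.
/// A bare `BEGIN` or `BEGIN TRANSACTION` defaults to DEFERRED, as in SQLite.
pub fn parse_begin(sql: &str) -> Option<TransactionMode> {
    let mut words = sql.trim().trim_end_matches(';').split_whitespace();
    if !words.next()?.eq_ignore_ascii_case("BEGIN") {
        return None;
    }
    let mode = match words.next().map(|w| w.to_ascii_uppercase()).as_deref() {
        None | Some("TRANSACTION") | Some("DEFERRED") => TransactionMode::Deferred,
        Some("IMMEDIATE") => TransactionMode::Immediate,
        Some("EXCLUSIVE") => TransactionMode::Exclusive,
        Some("CONCURRENT") => TransactionMode::Concurrent,
        Some(_) => return None,
    };
    Some(mode)
}

/// Assumption for illustration: every mode except CONCURRENT serializes its
/// writes behind a single write lock, while CONCURRENT transactions run
/// optimistically and resolve conflicts at commit time.
pub fn serializes_writes(mode: TransactionMode) -> bool {
    mode != TransactionMode::Concurrent
}
```

For example, `parse_begin("BEGIN CONCURRENT;")` yields `Some(TransactionMode::Concurrent)`, for which `serializes_writes` returns `false`.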
Pekka Enberg
964dd0cd43 perf: Add simple throughput benchmark
This adds a simple throughput benchmark for rusqlite and Turso, making it
possible to compare the two, as well as MVCC and SQLite transactions.
2025-09-12 07:35:57 +03:00
Pekka Enberg
d80814fa2c core/schema: Optimize get_dependent_materialized_views() when no views
Eliminates get_dependent_materialized_views() overhead when there are no
views. Note that we need to optimize the case when there are views as
well because this ends up being pretty hot in write-intensive workloads.
2025-09-12 07:22:18 +03:00
Preston Thorpe
f9f7a44955 Merge 'add explicit usize type annotation to range iterator in test' from Denizhan Dakılır
A very small fix made while reading the codebase with rust-analyzer,
trying to find a bug for the simulator.
original error:
`non-primitive cast: <Range<i32> as Iterator>::Item as i32 rust-analyzer
E0605`

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3043
2025-09-11 21:12:32 -04:00
PThorpe92
36425b2ada Refactor UPSERT to use walk_expr_mut to walk AST.
Working on https://github.com/tursodatabase/turso/issues/2964 I came
upon `walk_expr_mut`, I don't think it existed last time I really spent
much time in the translator. So quickly went back and cleaned this up.
2025-09-11 21:08:11 -04:00
Denizhan Dakılır
70102f5f6e add explicit usize type annotation to range iterator in test 2025-09-12 02:18:49 +03:00
Preston Thorpe
e9944f5d1f Merge 'Fix automatic indexes' from Jussi Saurio
Closes #2993
## Background
When a `CREATE TABLE` statement specifies constraints like `col UNIQUE`,
`col PRIMARY KEY`, `UNIQUE (col1, col2)`, `PRIMARY KEY(col3, col4)`,
SQLite creates indexes for these constraints automatically with the
naming scheme `sqlite_autoindex_<table_name>_<increasing_number>`.
## Problem
SQLite expects these indexes to be created in table definition order.
For example:
```sql
CREATE TABLE t(x UNIQUE, y PRIMARY KEY, c, d, UNIQUE(c,d));
```
Should result in:
```sql
sqlite_autoindex_t_1 -- x UNIQUE
sqlite_autoindex_t_2 -- y PRIMARY KEY
sqlite_autoindex_t_3 -- UNIQUE(c,d)
```
However, `tursodb` currently doesn't uphold this invariant -- for
example: the PRIMARY KEY index is always constructed first. SQLite flags
this as a corruption error (see #2993).
## Solution
- Process "unique sets" in table definition order. "Unique sets" are
groups of 1-n columns that are part of either a UNIQUE or a PRIMARY KEY
constraint.
- Deduplicate unique sets properly: a PRIMARY KEY of a rowid alias
(INTEGER PRIMARY KEY) is not a unique set. `UNIQUE (a desc, b)` and
`PRIMARY KEY(a, b)` are a single unique set, not two.
- Unify logic for creating automatic indexes and parsing them - remove
separate logic in `check_automatic_pk_index_required()` and use the
existing `create_table()` utility in both index creation and
deserialization.
- Deserialize a single automatic index per unique set, and assert that
`unique_sets.len() == autoindexes.len()`.
- Verify consistent behavior by adding a fuzz test that creates 1000
databases with 1 table each and runs `PRAGMA integrity_check` on all of
them with SQLite.
## Trivia
Apart from fixing the exact issue #2993, this PR also fixes other bugs
related to autoindex construction - namely cases where too many indexes
were created due to improper deduplication of unique sets.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3018
2025-09-11 17:04:53 -04:00
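The ordering and deduplication rules from the entry above (e9944f5d1f) can be sketched as a small function. This is a simplified model, not the code from the PR: `Constraint` and `autoindex_names` are hypothetical, column lists are assumed to have sort order (`desc`) already stripped, and two constraints count as the same unique set only when their column lists match exactly.

```rust
/// Hypothetical description of one UNIQUE / PRIMARY KEY constraint,
/// listed in the order it appears in the CREATE TABLE statement.
pub struct Constraint {
    /// Column names covered by the constraint (sort order already stripped).
    pub columns: Vec<String>,
    /// True for `INTEGER PRIMARY KEY`, which aliases the rowid and needs no index.
    pub is_rowid_alias: bool,
}

/// Assigns `sqlite_autoindex_<table>_<n>` names in definition order,
/// skipping rowid aliases and constraints whose column list was already seen.
pub fn autoindex_names(table: &str, constraints: &[Constraint]) -> Vec<(String, Vec<String>)> {
    let mut seen: Vec<Vec<String>> = Vec::new();
    let mut out = Vec::new();
    for c in constraints {
        if c.is_rowid_alias || seen.contains(&c.columns) {
            continue;
        }
        seen.push(c.columns.clone());
        // The counter is the number of distinct unique sets seen so far.
        let name = format!("sqlite_autoindex_{}_{}", table, seen.len());
        out.push((name, c.columns.clone()));
    }
    out
}
```

Fed the three constraints from the `CREATE TABLE t(...)` example in the message, in definition order, the function returns the names `sqlite_autoindex_t_1`, `sqlite_autoindex_t_2`, and `sqlite_autoindex_t_3` for `x`, `y`, and `(c, d)` respectively.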
pedrocarlo
dbb7d6f532 reprepare optimization using reset() 2025-09-11 16:59:53 -03:00
pedrocarlo
c04cf535b0 flip is_done to is_busy 2025-09-11 14:58:51 -03:00
Pekka Enberg
5b9e849415 Merge 'core/mvcc: Eliminate RwLock wrapping Transaction' from Pekka Enberg
The write and read sets in Transaction use SkipSet, which is thread-
safe. Therefore, drop the RwLock wrapping Transaction everywhere,
increasing MVCC throughput by almost 30%.
Before:
```
Running write throughput benchmark with 1 threads, 1000 batch size, 1000 iterations, mode: Mvcc
Database created at: write_throughput_test.db
Thread 0: 1000000 inserts in 6.50s (153927.21 inserts/sec)

=== BENCHMARK RESULTS ===
Total inserts: 1000000
Total time: 6.50s
Overall throughput: 153758.85 inserts/sec
Threads: 1
Batch size: 1000
Iterations per thread: 1000
```
After:
```
Running write throughput benchmark with 1 threads, 1000 batch size, 1000 iterations, mode: Mvcc
Database created at: write_throughput_test.db
Thread 0: 1000000 inserts in 5.10s (195927.13 inserts/sec)

=== BENCHMARK RESULTS ===
Total inserts: 1000000
Total time: 5.11s
Overall throughput: 195663.94 inserts/sec
Threads: 1
Batch size: 1000
Iterations per thread: 1000
```

Closes #3035
2025-09-11 20:55:14 +03:00
Pekka Enberg
45288b1297 core/mvcc: Eliminate RwLock wrapping Transaction
The write and read sets in Transaction use SkipSet, which is thread-safe.
Therefore, drop the RwLock wrapping Transaction everywhere, increasing
MVCC throughput by almost 30%.

Before:

```
Running write throughput benchmark with 1 threads, 1000 batch size, 1000 iterations, mode: Mvcc
Database created at: write_throughput_test.db
Thread 0: 1000000 inserts in 6.50s (153927.21 inserts/sec)

=== BENCHMARK RESULTS ===
Total inserts: 1000000
Total time: 6.50s
Overall throughput: 153758.85 inserts/sec
Threads: 1
Batch size: 1000
Iterations per thread: 1000
```

After:

```
Running write throughput benchmark with 1 threads, 1000 batch size, 1000 iterations, mode: Mvcc
Database created at: write_throughput_test.db
Thread 0: 1000000 inserts in 5.10s (195927.13 inserts/sec)

=== BENCHMARK RESULTS ===
Total inserts: 1000000
Total time: 5.11s
Overall throughput: 195663.94 inserts/sec
Threads: 1
Batch size: 1000
Iterations per thread: 1000
```
2025-09-11 20:31:19 +03:00
pedrocarlo
6264d694d5 on reprepare create new state with updated number of cursors and
registers, so that the Program insns are in sync with ProgramState
2025-09-11 12:50:22 -03:00
Pekka Enberg
1559cd127e Merge 'bindings/java: PreparedStatement executeUpdate' from zongkx
JDBC driver: add the return value (updated count) for PreparedStatement executeUpdate

Closes #3022
2025-09-11 18:43:58 +03:00
Pekka Enberg
61c5b4530c Merge 'handle EXPLAIN like sqlite' from Lâm Hoàng Phúc
we are hard coding `EXPLAIN` for debugging
```sh
turso> EXPLAIN SELECT 1; EXPLAIN SELECT 1;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     3     0                    0   Start at 3
1     ResultRow          1     1     0                    0   output=r[1]
2     Halt               0     0     0                    0
3     Integer            1     1     0                    0   r[1]=1
4     Goto               0     1     0                    0
```
```sh
sqlite> EXPLAIN SELECT 1; EXPLAIN SELECT 1;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     4     0                    0   Start at 4
1     Integer        1     1     0                    0   r[1]=1
2     ResultRow      1     1     0                    0   output=r[1]
3     Halt           0     0     0                    0
4     Goto           0     1     0                    0
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     4     0                    0   Start at 4
1     Integer        1     1     0                    0   r[1]=1
2     ResultRow      1     1     0                    0   output=r[1]
3     Halt           0     0     0                    0
4     Goto           0     1     0                    0
```

Closes #3005
2025-09-11 18:43:24 +03:00
Pekka Enberg
7d8a1a0d5f Merge 'whopper: A new DST with concurrency' from Pekka Enberg
Our simulator is currently limited to concurrency of one. This
introduces a much less sophisticated DST with focus on finding
concurrency bugs.

Closes #2985
2025-09-11 18:42:45 +03:00
Pekka Enberg
453ca6c531 Merge 'Document DEFERRED and IMMEDIATE transaction modes' from Pekka Enberg
Closes #3011
2025-09-11 18:21:17 +03:00
Pekka Enberg
ebd9da4369 Merge 'Fix tx isolation test semantics after #3023' from Jussi Saurio
After the fix in #3023, the transaction isolation fuzz test now
incorrectly takes a shadow snapshot of the DB state too early - before
it is determined that the connection successfully started a read
transaction.
Fix: take the snapshot after we've verified that the read TX started.
Closes #3025

Closes #3026
2025-09-11 18:21:00 +03:00
Jussi Saurio
6a7bead482 Fix tx isolation test semantics after #3023
The test now incorrectly takes a shadow snapshot of the DB state
before it is determined that the connection successfully started
a read transaction.

Fix: take the snapshot after we've verified that the read TX started.
2025-09-11 16:44:28 +03:00
Jussi Saurio
aeb3c217e1 Merge 'Fix: read transaction cannot be allowed to start with a stale max frame' from Jussi Saurio
If both of the following are true:
1. All read locks are already held
2. The highest readmark of any read lock is less than the committed max
frame
Then we must return Busy to the reader, because otherwise they would
begin a transaction with a stale local max frame, and thus not see some
committed changes.
Closes #3016

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3023
2025-09-11 16:20:05 +03:00
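The two-part condition in the entry above (aeb3c217e1) can be written as a small predicate. This is a minimal sketch, not the project's WAL code: `ReadLock`, its fields, and `must_return_busy` are hypothetical names, and the real implementation tracks more state than this model does.

```rust
/// Hypothetical model of one WAL read-lock slot (not the real type).
#[derive(Clone, Copy, Debug)]
pub struct ReadLock {
    /// Whether some reader currently holds this slot.
    pub held: bool,
    /// The highest frame that readers using this slot are allowed to see.
    pub read_mark: u64,
}

/// Returns true when a new reader must get Busy: every slot is taken and
/// even the newest read mark is older than the committed max frame.
/// An empty slice is treated as "no free slot" in this sketch.
pub fn must_return_busy(locks: &[ReadLock], committed_max_frame: u64) -> bool {
    let all_held = locks.iter().all(|l| l.held);
    let highest_mark = locks.iter().map(|l| l.read_mark).max().unwrap_or(0);
    all_held && highest_mark < committed_max_frame
}
```

If the reader were instead allowed to share the slot with the highest read mark, it would start with a local max frame below the committed one and miss committed changes, which is the case the fix rules out.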
Pekka Enberg
433b60555f Add BEGIN CONCURRENT support for MVCC mode
Currently, when MVCC is enabled, every transaction mode supports
concurrent reads and writes, which makes it hard to adopt for existing
applications that use `BEGIN DEFERRED` or `BEGIN IMMEDIATE`.

Therefore, add support for `BEGIN CONCURRENT` transactions when MVCC is
enabled. The transaction mode allows multiple concurrent read/write
transactions that don't block each other, with conflicts resolved at
commit time. Furthermore, implement the correct semantics for `BEGIN
DEFERRED` and `BEGIN IMMEDIATE` by taking advantage of the pager level
write lock when a transaction upgrades to write. This means that now
concurrent MVCC transactions are serialized against the legacy ones when
needed.

The implementation includes:

- Parser support for CONCURRENT keyword in BEGIN statements

- New Concurrent variant in TransactionMode to distinguish from regular
  read/write transactions

- MVCC store tracking of exclusive transactions to support IMMEDIATE and
  EXCLUSIVE modes alongside CONCURRENT

- Proper transaction state management for all transaction types in MVCC

This enables better concurrency for applications that can handle
optimistic concurrency control, while still supporting traditional
SQLite transaction semantics via IMMEDIATE and EXCLUSIVE modes.
2025-09-11 16:05:52 +03:00
Jussi Saurio
c30d320cab Fix: read transaction cannot be allowed to start with a stale max frame
If both of the following are true:

1. All read locks are already held
2. The highest readmark of any read lock is less than the committed max frame

Then we must return Busy to the reader, because otherwise they would begin a
transaction with a stale local max frame, and thus not see some committed
changes.
2025-09-11 15:58:13 +03:00
Glauber Costa
874047276e views: pass a DeltaSet for merge_delta
A DeltaSet is a collection of Deltas, one per table.
We'll need that for joins. The populate step for now will still generate
a single set. That will be our next step to fix.
2025-09-11 05:30:46 -07:00
Glauber Costa
841de334b7 view: catch all tables mentioned, instead of just one.
Ahead of the implementation of JOINs, we need to evolve the
IncrementalView, which currently only accepts a single base table,
to keep a list of tables mentioned in the statement.
2025-09-11 05:30:46 -07:00
Glauber Costa
98ed6c2b0e keep alias in logical plan
We have been ignoring the alias in the logical plan, but we have to keep
it. Implementing joins in particular is made hard without it, because it
is common that one has the same column name in different tables, just
differentiated by the alias
2025-09-11 05:30:46 -07:00
Glauber Costa
c15ac87a3c fix cursor validation
We are validating that the weights on the materialized view table are
-1, 0, and 1. This is only true for the aggregator operator. For DBSP
in general, any number will do.

Our algorithm, however, would have deleted anything from the BTree that
is <= 0. So we don't expect them here.
2025-09-11 05:30:46 -07:00
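The weight argument in the entry above (c15ac87a3c) can be illustrated with a toy Z-set. This is a sketch under simplifying assumptions: rows are plain strings, the store is an in-memory `BTreeMap` rather than the BTree the commit refers to, and `apply_delta` is a hypothetical helper. It shows why stored weights can be any positive number while weights of zero or below never appear.

```rust
use std::collections::BTreeMap;

/// Applies a delta of (row, weight change) pairs to a stored Z-set,
/// dropping any row whose accumulated weight is no longer positive.
pub fn apply_delta(zset: &mut BTreeMap<String, i64>, delta: &[(&str, i64)]) {
    for (row, change) in delta {
        let weight = zset.entry(row.to_string()).or_insert(0);
        *weight += *change;
    }
    // Rows with weight <= 0 are deleted, so every stored weight is >= 1.
    zset.retain(|_, w| *w > 0);
}
```

Applying `[("a", 2), ("b", 1)]` and then `[("a", 1), ("b", -1)]` leaves `a` with weight 3 and removes `b`, so a validation that only accepts -1, 0, and 1 would reject a legitimate row.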
Glauber Costa
e6008e532a Add a second delta to the EvalState, Commit
We will assert that the second one is always empty for the existing
operators - as they should be!

But joins will need both.
2025-09-11 05:30:46 -07:00
Glauber Costa
6541a43670 move hashable_row to dbsp.rs
There will be a new type for joins, so it makes less sense to have
a separate file just for it. dbsp.rs is good.
2025-09-11 05:30:46 -07:00
Glauber Costa
1fd345f382 unify code used for persistence.
We have code written for BTree (ZSet) persistence in both compiler.rs
and operator.rs, because there are minor differences between them. With
joins coming, it is time to unify this code.
2025-09-11 05:30:46 -07:00
Glauber Costa
8997670936 include dbsp tables in the list of tables that cannot be modified 2025-09-11 05:30:46 -07:00
zongkx
92e211f278 Merge remote-tracking branch 'origin/main' 2025-09-11 12:26:52 +00:00
zongkx
d7096bdd28 fix executeUpdate updated count 2025-09-11 12:25:14 +00:00
zongkx
22cbd3a02c Merge branch 'tursodatabase:main' into main 2025-09-11 20:18:11 +08:00
zongkx
5d6e97b46b add executeUpdate updated count 2025-09-11 12:17:05 +00:00
TcMits
4c17fa87c5 remove .explain() 2025-09-11 18:28:46 +07:00
TcMits
68e8d5a36b clippy 2025-09-11 18:16:01 +07:00
TcMits
830e10da8f resolve merge conflict 2025-09-11 18:13:29 +07:00