Commit Graph

8910 Commits

Author SHA1 Message Date
Jussi Saurio
139ce39a00 mvcc: fix logic bug in MvStore::insert_version_raw()
In insert_version_raw(), we correctly iterate the versions backwards
because we want to find the newest version that is still older than
the one we are inserting.

However, the order of `.enumerate()` and `.rev()` was wrong, so the
insertion position was calculated based on the position in the
_reversed_ iterator, not the original iterator.
2025-09-16 12:56:17 +03:00
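The ordering issue described in this commit can be reproduced in isolation. The following is a minimal sketch, not the actual `MvStore` code: it assumes versions are plain `u64` begin timestamps kept in ascending order, and the function names are hypothetical.

```rust
/// Correct order: `enumerate()` first, so every index refers to the
/// original slice, then `rev()` to walk from newest to oldest.
fn insertion_index(versions: &[u64], new_begin: u64) -> usize {
    for (i, &begin) in versions.iter().enumerate().rev() {
        if begin < new_begin {
            // Insert right after the newest version that is still older.
            return i + 1;
        }
    }
    0
}

/// Buggy order: `rev()` first, so `i` counts positions in the
/// reversed iterator instead of positions in the slice.
fn insertion_index_buggy(versions: &[u64], new_begin: u64) -> usize {
    for (i, &begin) in versions.iter().rev().enumerate() {
        if begin < new_begin {
            return i + 1;
        }
    }
    0
}
```

With versions `[10, 20, 30]` and a new version at `35`, the first function returns index 3 (append at the end), while the second returns index 1.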
Jussi Saurio
847e413c34 mvcc: assert that DeleteRowStateMachine must find the row it is deleting 2025-09-16 12:56:17 +03:00
Jussi Saurio
ea6373b8ae Switch to BTreeMap for deterministic iteration 2025-09-16 12:56:17 +03:00
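As a small illustration of why `BTreeMap` helps here: it always iterates in key order, whereas `HashMap` iteration order is unspecified and can differ between runs. The helper below is a hypothetical sketch and is not taken from the commit.

```rust
use std::collections::BTreeMap;

/// Collects entries into a BTreeMap and returns the keys in iteration
/// order, which is always ascending key order regardless of insertion order.
fn ordered_keys(entries: &[(u64, &str)]) -> Vec<u64> {
    let map: BTreeMap<u64, &str> = entries.iter().copied().collect();
    map.keys().copied().collect()
}
```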
Pekka Enberg
74331898a3 Merge 'Add quoted identifier test cases for ALTER TABLE' from Levy A.
Resolves #2093
There is a small incompatibility in how we quote the added column in the
final schema, but it doesn't change any behavior.

Closes #2943
2025-09-16 11:46:12 +03:00
Pekka Enberg
4b12ce954a Merge 'core/mvcc: Specify level for tracing' from Pekka Enberg
..otherwise we perform the tracing for every step() dropping write
throughput by 40%.

Closes #3145
2025-09-16 10:44:17 +03:00
Pekka Enberg
b625e73355 Merge 'Switch to GitHub runners for performance workflows' from Diego Reis
Blacksmith runners have a lot of variance in performance, making it hard
for Nyrkiö to do its job. Discussed on [Discord](https://discord.com/channels/1258658826257961020/1402269486752469085)

Reviewed-by: Henrik Ingo <henrik@nyrk.io>

Closes #2448
2025-09-16 10:40:08 +03:00
Pekka Enberg
3c62352bcb core/mvcc: Specify level for tracing
..otherwise we perform the tracing for every step() dropping write
throughput by 40%.
2025-09-16 09:51:08 +03:00
Pekka Enberg
950cb8a818 Merge 'Move common dependencies to workspace ' from Pedro Muniz
This removes 4 crates from the `cargo build` and tries to ensure that in
the future we avoid the same crates with different versions.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3141
2025-09-16 08:30:06 +03:00
Preston Thorpe
3aa477f819 Merge 'fix re-entrancy issue in Pager::free_page' from Jussi Saurio
current logic can lead to a situation where:
- we call read_page(trunk_page_id)
- we assign trunk_page in the FreePageState state machine
- the page read fails and cache marks it as !locked && !loaded
- next call to Pager::free_page() asserts that the page is loaded and
panics
Whopper takes so long to run that I wasn't patient enough, but I'm
pretty sure this closes #3101

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3139
2025-09-15 19:12:28 -04:00
pedrocarlo
3c91ae206b move as many dependencies as possible to workspace to avoid multiple versions of the same dependency 2025-09-15 17:19:36 -03:00
Jussi Saurio
d2d1d1bc61 fix re-entrancy issue in Pager::free_page
current logic can lead to a situation where:

- we call read_page(trunk_page_id)
- we assign trunk_page in the FreePageState state machine
- the page read fails and cache marks it as !locked && !loaded
- next call to Pager::free_page() asserts that the page is loaded and panics
2025-09-15 21:41:18 +03:00
Pekka Enberg
0dcd38a3c3 Merge 'stress: Retry sync on error to avoid a panic' from Pekka Enberg
We now panic on fsync error by default to be safe against fsyncgate.
However, no reason to do that in the stress tester, especially since we
test out of disk space errors under Antithesis.

Closes #3131
2025-09-15 19:03:46 +03:00
Pekka Enberg
bfce9e02a0 Merge 'move divider_cell_is_overflow_cell to debug assertions' from Pedro Muniz
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3135
2025-09-15 17:42:55 +03:00
pedrocarlo
7021386f86 move divider_cell_is_overflow_cell to debug assertions so it stops appearing in release builds 2025-09-15 11:11:28 -03:00
Pekka Enberg
e79dfd2f50 Merge 'Fix SharedWalFile deadlock in multithreaded context' from Jussi Saurio
Fixes `write-throughput` benchmark deadlocking on 2 threads or more. The
gist of the PR is in the big code comment:
```rust
            // important not to hold shared lock beyond this point to avoid deadlock scenario where:
            // thread 1: takes readlock here, passes reference to shared.file to begin_read_wal_frame
            // thread 2: tries to acquire write lock elsewhere
            // thread 1: tries to re-acquire read lock in the completion (see 'complete' above)
            //
            // this causes a deadlock due to the locking policy in parking_lot:
            // from https://docs.rs/parking_lot/latest/parking_lot/type.RwLock.html:
            // "This lock uses a task-fair locking policy which avoids both reader and writer starvation.
            // This means that readers trying to acquire the lock will block even if the lock is unlocked
            // when there are writers waiting to acquire the lock.
            // Because of this, attempts to recursively acquire a read lock within a single thread may result in a deadlock."
 ```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3132
2025-09-15 15:25:04 +03:00
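The locking pattern from the code comment in this PR can be sketched as follows. This is an illustration only: it uses `std::sync::RwLock` rather than `parking_lot` (the standard library likewise warns that recursive read locking may deadlock or panic), and `SharedWal`, its fields, and `read_frame` are hypothetical stand-ins for the real types.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the shared WAL state.
struct SharedWal {
    file: Arc<String>,
    max_frame: u64,
}

/// Returns (whether a writer could take the lock, file length, max_frame).
fn read_frame(shared: &Arc<RwLock<SharedWal>>) -> (bool, usize, u64) {
    // Clone what we need inside a narrow scope so the read guard is
    // dropped before any later code tries to lock again.
    let file = {
        let guard = shared.read().unwrap();
        Arc::clone(&guard.file)
    };

    // No read lock is held here, so a writer is free to proceed.
    let writer_can_proceed = shared.try_write().is_ok();

    // The completion re-acquires the read lock; this is safe only
    // because the earlier guard has already been released.
    let complete = {
        let shared = Arc::clone(shared);
        move || {
            let guard = shared.read().unwrap();
            guard.max_frame
        }
    };

    (writer_can_proceed, file.len(), complete())
}
```

Holding the first guard across the call to `complete` would recreate the scenario in the comment: a writer queued between the two read acquisitions blocks the second one.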
Jussi Saurio
32cd01a615 fix deadlock 2025-09-15 14:48:26 +03:00
Jussi Saurio
d493a72cc0 dont unwrap begin_tx 2025-09-15 14:48:26 +03:00
Jussi Saurio
26c0d72c25 perf/thrpt: add tracing 2025-09-15 14:25:18 +03:00
Pekka Enberg
247d4c06c6 Merge 'Fix MVCC update' from Jussi Saurio
Based on #3126
Closes #3029
Closes #3030
Closes #3065
Closes #3083
Closes #3084
Closes #3085
simple reason why mvcc update didn't work: it didn't try to update.

Closes #3127
2025-09-15 14:24:59 +03:00
Pekka Enberg
a5eac9b700 Merge 'avoid unnecessary cloning when formatting Txn for Display' from Avinash Sajjanshetty
Closes #3109
2025-09-15 14:24:32 +03:00
Pekka Enberg
dd06d2eb99 Merge 'add perf/throughput/rusqlite to workspace' from Pedro Muniz
Closes #3116
2025-09-15 14:24:13 +03:00
Pekka Enberg
244458199f Merge 'Various fixes to sync' from Nikita Sivukhin
This PR fixes incorrect path registration for sync in the browser, adds
tests, and also exposes the revision string in the `stats()` method of
the synced database

Closes #3124
2025-09-15 14:24:02 +03:00
Pekka Enberg
eeab6d5ce0 stress: Retry sync on error to avoid a panic
We now panic on fsync error by default to be safe against fsyncgate.
However, no reason to do that in the stress tester, especially since we
test out of disk space errors under Antithesis.
2025-09-15 14:21:53 +03:00
Pekka Enberg
877b28bcb3 perf/throughput/turso: Use 30 second busy timeout like in rusqlite 2025-09-15 13:57:58 +03:00
Pekka Enberg
380b27f58a Merge 'Busy handler' from Pedro Muniz
I searched using deepwiki how SQLite implements their busy handler. They
use a callback system with exponential backoff, where it stores the
callback in the pager and in the database. I confess I found this
slightly confusing, so I just implemented a simple exponential backoff
directly in the `Statement` struct. I imagine SQLite does this in a more
convoluted manner, as they do not have a concept of yielding as we do.
https://deepwiki.com/search/where-is-the-code-for-the-busy_4a5ed006-4eed-479f-80c3-dd038832831b
I also fixed the rust bindings so that it yields when we return
`StepResult::IO`, instead of just blocking the async function. To
achieve this I implemented the `Stream` trait for `Rows` struct, which
unfortunately came with a slight change to the function signature of
`rows.next()` to `rows.try_next()`.
EDIT:
~test `test_multiple_connections_fuzz` timeouts because now it has the
busy handler "slowing" things down (this test generates a lot of busy
transactions), so it takes a lot longer for the test to run. Not sure if
it is acceptable for us to reduce the number of operations so the test
is shorter.~
EDIT:
Adjusted the API to be more in line with
https://www.sqlite.org/c3ref/busy_timeout.html.
Sets maximum total accumulated timeout. If the duration is None or Zero,
we unset the busy handler for this Connection.
This API differs slightly from SQLite: instead of sleeping for a linear
amount of time specified by the user, we sleep in phases until the
total amount of time requested is reached. This means we first sleep
for 1 ms, then if we still return busy, we sleep for 2 ms, and repeat
until a maximum of 100 ms per phase or until we reach the total timeout.
Example:
1. Set duration to 5ms
2. Step through query -> returns Busy -> sleep/yield for 1 ms
3. Step through query -> returns Busy -> sleep/yield for 2 ms
4. Step through query -> returns Busy -> sleep/yield for 2 ms (totaling
5 ms of sleep)
5. Step through query -> returns Busy -> return Busy to user
This slight API change demonstrated better throughput in the
`perf/throughput/turso` benchmark
```sh
cargo run -p write-throughput --release -- -t 2

Running write throughput benchmark with 2 threads, 100 batch size, 10 iterations, mode: Legacy
Database created at: write_throughput_test.db
Thread 1: 1000 inserts in 0.04s (23438.42 inserts/sec)
Thread 0: 1000 inserts in 0.08s (12385.64 inserts/sec)

=== BENCHMARK RESULTS ===
Total inserts: 2000
Total time: 0.08s
Overall throughput: 24762.60 inserts/sec
Threads: 2
Batch size: 100
Iterations per thread: 10
Database file exists: true
Database file size: 4096 bytes
```
Depends on #3102
Closes #3067

Closes #3074
2025-09-15 13:52:49 +03:00
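The phased sleep described in this PR can be sketched as a pure function that returns the sequence of sleep phases for a given total timeout. This is a hedged illustration, not the code from the PR: it assumes each phase doubles (per the "exponential backoff" wording), and `backoff_schedule` and `MAX_PHASE` are hypothetical names.

```rust
use std::time::Duration;

// Assumed cap per phase, taken from the "maximum of 100 ms per phase" wording.
const MAX_PHASE: Duration = Duration::from_millis(100);

/// Returns the sleep phases used before Busy is finally returned to the
/// caller. A zero timeout yields no phases (busy handler unset).
fn backoff_schedule(total: Duration) -> Vec<Duration> {
    let mut phases = Vec::new();
    let mut slept = Duration::ZERO;
    let mut next = Duration::from_millis(1);

    while slept < total {
        // Never sleep past the total timeout requested by the user.
        let phase = next.min(total - slept);
        phases.push(phase);
        slept += phase;
        // Assumption: double each phase, capped at MAX_PHASE.
        next = (next * 2).min(MAX_PHASE);
    }
    phases
}
```

For a 5 ms timeout this yields phases of 1 ms, 2 ms and 2 ms, matching the example in the PR description; the last phase is truncated so the total never exceeds the requested duration.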
Pekka Enberg
07c580aadf Merge 'mvcc: fix hang when CONCURRENT tx tries to commit and non-CONCURRENT tx is active' from Jussi Saurio
Based on #3125
Closes #3120

Closes #3126
2025-09-15 11:45:30 +03:00
Jussi Saurio
61764bf415 clippy 2025-09-15 11:37:17 +03:00
Jussi Saurio
1fa57b2dec add test demonstrating that issue 3085 can be closed 2025-09-15 11:36:19 +03:00
Jussi Saurio
88856de48e fmt 2025-09-15 11:33:15 +03:00
Jussi Saurio
f2dbf1eeb0 add test demonstrating that issue 3084 can be closed 2025-09-15 11:32:39 +03:00
Jussi Saurio
d643bb2092 add test that demonstrates issue 3083 can be closed 2025-09-15 11:30:56 +03:00
Jussi Saurio
59f18e2dc8 fix mvcc update
simple reason why mvcc update didn't work: it didn't try to update.
2025-09-15 11:27:56 +03:00
Pekka Enberg
54c79b879b Merge 'mvcc: fix two sources of panic' from Jussi Saurio
1. commit state machine was assuming that begin_write_tx() cannot fail,
but it can fail if there is another tx that is not using BEGIN
CONCURRENT.
2. if a brand new non-CONCURRENT transaction attempts to start exclusive
transaction but fails with Busy, we must end the pager read tx it
just started, because otherwise the next time it attempts to do
something it will panic with: `"cannot start a new read tx without
ending an existing one"`

Closes #3125
2025-09-15 11:26:05 +03:00
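The second fix above can be illustrated with a toy model. Everything here is hypothetical (`ToyPager`, `TxError`, `begin_exclusive`) and only mirrors the behavior described in the PR text: a failed upgrade to a write transaction must end the read transaction that was just started.

```rust
#[derive(Debug, PartialEq)]
enum TxError {
    Busy,
}

// Hypothetical pager that tracks only the two flags relevant here.
struct ToyPager {
    read_tx_open: bool,
    // Models another connection already holding the write lock.
    writer_active: bool,
}

impl ToyPager {
    fn begin_read_tx(&mut self) {
        assert!(
            !self.read_tx_open,
            "cannot start a new read tx without ending an existing one"
        );
        self.read_tx_open = true;
    }

    fn end_read_tx(&mut self) {
        self.read_tx_open = false;
    }

    fn begin_write_tx(&mut self) -> Result<(), TxError> {
        if self.writer_active {
            Err(TxError::Busy)
        } else {
            self.writer_active = true;
            Ok(())
        }
    }
}

/// Starts an exclusive transaction; on Busy, the read tx is ended so a
/// later retry does not trip the assertion in `begin_read_tx`.
fn begin_exclusive(pager: &mut ToyPager) -> Result<(), TxError> {
    pager.begin_read_tx();
    if let Err(e) = pager.begin_write_tx() {
        pager.end_read_tx();
        return Err(e);
    }
    Ok(())
}
```

Without the `end_read_tx()` call in the error path, a second `begin_exclusive` on the same pager would panic instead of returning `Busy` again.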
Jussi Saurio
aa7a853cd2 mvcc: fix hang when CONCURRENT tx tries to commit and non-CONCURRENT tx is active 2025-09-15 11:09:19 +03:00
Jussi Saurio
9234ef86ae mvcc: fix two sources of panic
1. commit state machine was assuming that begin_write_tx() cannot
fail, but it can fail if there is another tx that is not using
BEGIN CONCURRENT.

2. if a brand new non-CONCURRENT transaction attempts to start
exclusive transaction but fails with Busy, we must end the
pager read tx it just started, because otherwise the next time
it attempts to do something it will panic with:

"cannot start a new read tx without ending an existing one"
2025-09-15 10:59:44 +03:00
Nikita Sivukhin
3bcac441e4 reduce log level of some very frequent logs 2025-09-15 11:35:41 +04:00
Pekka Enberg
eb3f17a0a9 Merge 'Fix MVCC rollback' from Jussi Saurio
Closes #3119
Closes #3121
executing ROLLBACK did not roll back the mv-store transaction

Closes #3123
2025-09-15 10:05:59 +03:00
Nikita Sivukhin
9b5656d4dc fix stats method 2025-09-15 11:05:49 +04:00
Nikita Sivukhin
23e8204bfc yarn build 2025-09-15 10:57:03 +04:00
Nikita Sivukhin
e8b076ebe5 export SyncEngineStats type 2025-09-15 10:56:44 +04:00
Nikita Sivukhin
527d0cb1f3 expose revision in the stats method 2025-09-15 10:56:13 +04:00
Nikita Sivukhin
ebf042cf6b refine error message 2025-09-15 10:55:43 +04:00
Nikita Sivukhin
aa65c910bf fix sync-browser bug and add more tests 2025-09-15 10:55:01 +04:00
Jussi Saurio
8f43741513 fix mvcc rollback
executing ROLLBACK did not roll back the mv-store transaction
2025-09-15 09:29:08 +03:00
Jussi Saurio
0e7eecc7a1 Merge 'test/fuzz: improve maintainability/usability of tx isolation test' from Jussi Saurio
**test/fuzz: introduce fuzzoptions to tx isolation test**
this makes it significantly easier to tweak the tx isolation test
parameters, and also makes it much easier to run the MVCC version of the
test without manually tweaking code inline to make it work.
introduces default options for the non-mvcc and mvcc test variants.
---
**test/fuzz: improve error handling in tx isolation fuzz test**
- extract out common behavior for checking acceptable errors
- add functionality to check which errors require rolling back
  a transaction

Closes #3118
2025-09-15 09:21:49 +03:00
pedrocarlo
bd5dcd8d3c add timeout flag to throughput benchmark 2025-09-15 02:20:32 -03:00
pedrocarlo
3d265489dc modify semantics of busy_timeout to be more on par with sqlite 2025-09-15 02:20:32 -03:00
pedrocarlo
0586b75fbe expose function to set busy timeout duration 2025-09-15 02:20:32 -03:00
Pekka Enberg
246799c603 Fix simulator and Antithesis Docker images 2025-09-15 08:16:38 +03:00
pedrocarlo
16e79ed508 slight adjustment in perf throughput printing 2025-09-15 02:16:18 -03:00