It was mentioned in https://github.com/tursodatabase/turso/pull/3720
that adding Miri support for `turso_stress` would be useful. And, that a
bash script to start Miri with the right config would be a big help.
Notable changes:
- `antithesis_sdk`'s default features are disabled at the workspace
level, and only enabled as needed with the `antithesis` feature flag in
the various turso crates. Miri needs the noop version of
`antithesis_sdk` to run `turso_stress`, and feature unification
previously prevented this. I'm not able to ensure locally that all the
Antithesis stuff is still happy with these changes.
- Bash script to run `turso_stress` - this is barebones for now, see
below
- Bash script to run `simulator` - this passes any args to the `cargo
run` invocation inside, intercepting `--seed` if it's present, and
generating one from `/dev/random` if it's not. The seed is passed to
both Miri and the simulator to keep the overall execution reproducible.
(I checked this with a simple case)
- A `const fn`, `normal_or_miri` to supply different defaults in things
like CLI args for normal operation and Miri, since it's so slow. (An
idea I stole from tokio.) Right now the relevant values are 100x smaller
for Miri, although Miri is probably 1000 to 10,000x slower overall from
a rough estimation.
Caught UB from running `turso_stress` with Miri:
- An unsafe cast of a `*u8` to `*u32` inside the BTree implementation
resulted in the `*u32` making an unaligned read: `read()` ->
`read_unaligned()` fixes this
Future work - Making `turso_stress` reproducible under Miri:
- Right now `turso_stress` is plugged in to Antithesis, which is great!
But, `antithesis_sdk`'s noop mode (`default-features = false`) turns
`antithesis_sdk::random::get_random()` into `rand::random<u64>()`, which
isn't seedable/reproducible. It's more work than I wanted to take on in
this PR, but I'd like to instead conditionally replace `get_random` with
a seedable `ChaCha8Rng` like in the simulator, if Miri is being used.
Comment:
- On a machine without all necessary dependencies, running the bash
scripts fails in a way that cargo prompts you through installing the
nightly toolchain, Miri, etc. until it works
- Below is a snippet of the output from Miri on the Btree alignment
issue. Because turso_stress isn't yet deterministic/reproducible under
Miri, I can't always reproduce it. (It doesn't always happen like the
ones in my last MR)
```
error: Undefined Behavior: accessing memory based on pointer with alignment 1, but alignment 4 is required
--> /home/rwp/git/turso/core/storage/btree.rs:2860:50
|
2860 | let mut pgno: u32 = unsafe { right_pointer.cast::<u32>().read().swap_bytes() };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
```
Closes#3790
closes#3666
and probably other issues i'll have to go digging through to see if
there is any others.
<img width="948" height="445" alt="image" src="https://github.com/user-
attachments/assets/2844e09b-109a-4a70-bd18-d8a814e49ea0" />
Any ALTER COLUMN stmt will now update the constraints on the table
(primary key, foreign key, unique)
Closes#3776
After https://github.com/tursodatabase/turso/pull/3759 was merged, I
went back to the code to see if we could try and avoid this problem in
the future. One way I tried to achieve this is with scoped locking by
forcing all operations on the `SharedWalFile` to go through
`with_shared` and `with_shared_mut`. Also, I noticed some functions
still held locks across IO calls, so I fixed that as well.
If Rust already had`negative_impls` or custom auto traits in stable, I
could try to create a marker trait where you can mark any closure that
contains a `Completion` as IO related and throw a compile error when you
try to execute IO inside it.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3781
DEFERRED was a bit too deferred - it allowed the dirty pages to be
written out to WAL before checking for violations, resulting in the
violations effectively being committed even though the transaction ended
up aborting
Closes#3784Closes#3785
DEFERRED was a bit too deferred - it allowed the dirty pages to be
written out to WAL before checking for violations, resulting in the
violations effectively being committed even though the transaction
ended up aborting
This adds a new fuzz test case to verify that any query returns the same
results with and without a rowid alias.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2952
When a stress thread runs out of work, execute COMMIT and ignore the
result.
This prevents the currently extremely common scenario where:
- Thread x does BEGIN
- Thread x does e.g. INSERT ...
Then runs out of iterations and stops. No other threads can write
anything and they just wait for 5 seconds (busy timeout) and then try
again, forever.
Closes#3697Closes#3762
Without this change and running:
```
cd stress
while cargo run -- --nr-threads=4 -i 1000 --verbose --busy-timeout=0; do; done
```
I can produce a deadlock quite reliably.
With this change, I can't.
Even with 5 second busy timeout (the default), the run makes progress
although it is slow as hell because of the busy timeout.
Full disclosure: i couldn't figure out based on parking lot RwLock
semantics why this would fix it, so maybe it just lessens the
probability
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3759
When a stress thread runs out of work, execute COMMIT and ignore
the result.
This prevents the currently extremely common scenario where:
- Thread x does BEGIN
- Thread x does e.g. INSERT ...
Then runs out of iterations and stops. No other threads can write
anything and they just wait for 5 seconds (busy timeout) and then
try again, forever.
Without this change and running:
```
cd stress
cargo run -- --nr-threads=4 -i 1000 --verbose --busy-timeout=0
```
I can produce a deadlock quite reliably.
With this change, I can't.
Even with 5 second busy timeout (the default), the run makes progress although it is slow as hell because of the busy timeout.