
10419 Commits

Author SHA1 Message Date
Pekka Enberg
1aba105df4 Merge 'Vector speedup' from Nikita Sivukhin
This PR reduces allocations in vector distance calculation and uses the
[simsimd](https://github.com/ashvardanian/SimSIMD) library to compute
cosine/L2 distances for dense vectors.
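
For reference, the two distances named above can be written as a plain scalar sketch. This is not the PR's code and it does not call SimSIMD; the function names `l2_distance` and `cosine_distance` are made up for illustration, and cosine distance is assumed to mean `1 - cosine similarity`.

```rust
/// Euclidean (L2) distance between two equal-length dense vectors.
/// Operates directly on the slices, so no intermediate buffer is allocated.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine distance, taken here as 1 - cos(angle between a and b).
/// The dot product and both squared norms are accumulated in one pass.
/// A zero vector yields NaN; real code would need to decide how to handle that.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let (mut dot, mut norm_a, mut norm_b) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    1.0 - dot / (norm_a.sqrt() * norm_b.sqrt())
}
```

A SIMD library computes the same quantities over several lanes at once, which is presumably where the speedup comes from.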

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3802
2025-10-22 12:28:43 +03:00
Jussi Saurio
ffc601b4b0 Merge 'Return better syntax error messages' from Diego Reis
Current error messages are too "low level", e.g., returning tokens in
messages. This PR improves this a bit.
Before:
```text
 turso> with t as (select * from pragma_schema_version); select c.schema_version from t as c;

  × unexpected token at SourceSpan { offset: SourceOffset(47), length: 1 }
   ╭────
 1 │ with t as (select * from pragma_schema_version); select c.schema_version from t as c;
   ·                                                ┬
   ·                                                ╰── here
   ╰────
  help: expected [TK_SELECT, TK_VALUES, TK_UPDATE, TK_DELETE, TK_INSERT, TK_REPLACE] but found TK_SEMI
```
Now:
```text
 turso> with t as (select * from pragma_schema_version); select c.schema_version from t as c;

  × unexpected token ';' at offset 47
   ╭────
 1 │ with t as (select * from pragma_schema_version);select c.schema_version from t as c;
   ·                                                ┬
   ·                                                ╰── here
   ╰────
  help: expected SELECT, VALUES, UPDATE, DELETE, INSERT, or REPLACE but found ';'
```
@TcMits WDYT?

Closes #3190
2025-10-22 10:57:54 +03:00
Nikita Sivukhin
7d423d358f avoid unnecessary time measures 2025-10-22 11:08:46 +04:00
Pekka Enberg
d45a8da2f2 Merge 'parser: translate boolean values to literal when parsing column constraints' from Preston Thorpe
closes #3796

Closes #3804
2025-10-22 09:11:03 +03:00
Pekka Enberg
1aad1b224a Merge 'core/io: Make random generation deterministically simulated' from Pedro Muniz
Depends on #3584 to use the most up-to-date implementation of
`ThreadRng`
- Add `fill_bytes` method to `IO`
- use `thread_rng` instead of `getrandom`, as `getrandom` is much slower
and `thread_rng` offers enough security
- modify `exec_randomblob`, `exec_random` and random_rowid generation to
use methods from IO for determinism
- modified simulator IO to implement `fill_bytes`
This is the PRNG for SQLite, if anyone is curious. It is similar to
`thread_rng`:
```c
/* Initialize the state of the random number generator once,
  ** the first time this routine is called.
  */
  if( wsdPrng.s[0]==0 ){
    sqlite3_vfs *pVfs = sqlite3_vfs_find(0);
    static const u32 chacha20_init[] = {
      0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
    };
    memcpy(&wsdPrng.s[0], chacha20_init, 16);
    if( NEVER(pVfs==0) ){
      memset(&wsdPrng.s[4], 0, 44);
    }else{
      sqlite3OsRandomness(pVfs, 44, (char*)&wsdPrng.s[4]);
    }
    wsdPrng.s[15] = wsdPrng.s[12];
    wsdPrng.s[12] = 0;
    wsdPrng.n = 0;
  }

  assert( N>0 );
  while( 1 /* exit by break */ ){
    if( N<=wsdPrng.n ){
      memcpy(zBuf, &wsdPrng.out[wsdPrng.n-N], N);
      wsdPrng.n -= N;
      break;
    }
    if( wsdPrng.n>0 ){
      memcpy(zBuf, wsdPrng.out, wsdPrng.n);
      N -= wsdPrng.n;
      zBuf += wsdPrng.n;
    }
    wsdPrng.s[12]++;
    chacha_block((u32*)wsdPrng.out, wsdPrng.s);
    wsdPrng.n = 64;
  }
  sqlite3_mutex_leave(mutex);
```
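
The idea of routing randomness through the IO layer can be sketched as follows. This is only an illustration under stated assumptions: the trait `Io`, the type `SimulatedIo`, and the helper `random_blob` are hypothetical stand-ins, the real `fill_bytes` signature may differ, and a SplitMix64 generator is swapped in here to keep the sketch dependency-free (it is not what the PR uses).

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the `IO` trait; only the random-bytes hook is shown.
trait Io {
    fn fill_bytes(&self, dest: &mut [u8]);
}

/// Simulator-style IO: every random byte comes from one seeded generator,
/// so a run can be replayed from its seed.
struct SimulatedIo {
    state: Cell<u64>,
}

impl SimulatedIo {
    fn new(seed: u64) -> Self {
        Self { state: Cell::new(seed) }
    }

    /// One SplitMix64 step (illustrative PRNG, not cryptographically secure).
    fn next_u64(&self) -> u64 {
        let mut z = self.state.get().wrapping_add(0x9E3779B97F4A7C15);
        self.state.set(z);
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

impl Io for SimulatedIo {
    fn fill_bytes(&self, dest: &mut [u8]) {
        // Fill 8 bytes at a time; the final chunk may be shorter.
        for chunk in dest.chunks_mut(8) {
            let bytes = self.next_u64().to_le_bytes();
            chunk.copy_from_slice(&bytes[..chunk.len()]);
        }
    }
}

/// Hypothetical caller in the spirit of `exec_randomblob`: it asks the IO
/// layer for bytes instead of reaching for a global RNG.
fn random_blob(io: &dyn Io, len: usize) -> Vec<u8> {
    let mut blob = vec![0u8; len];
    io.fill_bytes(&mut blob);
    blob
}
```

With this shape, two simulator runs started from the same seed produce the same blobs, while a production IO implementation is free to back `fill_bytes` with a faster or stronger source.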

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3799
2025-10-22 09:10:36 +03:00
Pekka Enberg
9005458aa2 Merge 'antithesis-tests: Don't fail tests on ProgrammingError' from Pekka Enberg
The ProgrammingError exception is thrown when tables, indexes, or
columns are dropped in parallel. Let's not fail the Antithesis test
drivers when that happens.

Closes #3803
2025-10-22 09:10:30 +03:00
PThorpe92
2f401c0bcc Add regression tcl test for #3796 default bool col constraints 2025-10-21 21:22:09 -04:00
PThorpe92
892fcc881d Handle TRUE|FALSE literal case for default column constraint in the parser 2025-10-21 21:15:35 -04:00
Pere Diaz Bou
3227caaa1d Merge 'core: move BTreeCursor under MVCC cursor' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3756
2025-10-21 19:20:49 +02:00
Nikita Sivukhin
10ead9f3b6 one more clippy fix 2025-10-21 21:10:58 +04:00
pedrocarlo
72baf48863 add random generation in simulator IO 2025-10-21 14:10:38 -03:00
pedrocarlo
8c0b9c6979 add additional fill_bytes method to IO to deterministically generate
random bytes and modify random functions to use them
2025-10-21 14:10:38 -03:00
pedrocarlo
8501bc930a use workspace rand version 2025-10-21 14:10:05 -03:00
Pekka Enberg
4805c6b06b antithesis-tests: Don't fail tests on ProgrammingError
The ProgrammingError exception is thrown when tables, indexes, or
columns are dropped in parallel. Let's not fail the Antithesis test
drivers when that happens.
2025-10-21 20:08:11 +03:00
Pekka Enberg
eb835c39d4 Merge 'core/vdbe: fix ALTER COLUMN to propagate constraints to other table references' from Preston Thorpe
with fix to ensure `PRIMARY KEY` isn't written twice

Closes #3798
2025-10-21 20:05:41 +03:00
Nikita Sivukhin
792e0033ae fix tests and clippy 2025-10-21 21:03:45 +04:00
Pekka Enberg
177e4f39a8 Merge 'Move completion code to separate file' from Pedro Muniz
We have almost 1000 lines of code related to Completions in
`io/mod.rs`. My intention here is to declutter the file so that we can
focus more on what is relevant for the `IO` and `File` traits.

Closes #3800
2025-10-21 20:03:40 +03:00
Pekka Enberg
edea108037 Merge 'avoid unnecessary allocations' from Nikita Sivukhin
Closes #3801
2025-10-21 20:03:25 +03:00
Pere Diaz Bou
128b426681 core/mvcc/cursor: imports 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
92c0e74458 core/mvcc/cursor: implement seek_to_last 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
8857604161 core/mvcc/cursor: fix rewind 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
3f41a092f2 core/mvcc/cursor: add next rowid lock 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
0fee588bca core/mvcc/cursor: add record cursor 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
790859c62f core/mvcc/cursor: fix exists 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
edac1ff256 core/mvcc/cursor: set null flag 2025-10-21 18:22:37 +02:00
Pere Diaz Bou
ea04e9033a core/mvcc: add btree_cursor under MVCC cursor 2025-10-21 18:22:37 +02:00
Nikita Sivukhin
00e382a7c7 avoid unnecessary allocations 2025-10-21 20:13:39 +04:00
Pekka Enberg
d2d995a9c0 Merge 'Make sure explicit column aliases have binding precedence in orderby' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3684

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3709
2025-10-21 19:04:42 +03:00
pedrocarlo
a327747531 organize completion code in a separate file 2025-10-21 12:43:49 -03:00
Pekka Enberg
51c5d6a66c Merge 'tests/integration: Reduce collation fuzz test iterations' from Pekka Enberg
The collation fuzz test case takes up to 4 minutes to run, making it the
slowest of all the test cases. Let's reduce iteration count a bit to
make this more CI friendly.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3797
2025-10-21 18:41:10 +03:00
PThorpe92
08197e345a Fix cdc test to assert for correct schema output 2025-10-21 11:22:29 -04:00
PThorpe92
c48d7a0963 Add tcl tests for alter column fixes 2025-10-21 10:47:08 -04:00
PThorpe92
2fbd4b7cec Ensure op_alter_column and Func::AlterColumn are fixing table references to columns with fk's 2025-10-21 10:46:52 -04:00
PThorpe92
06e3b9611b Add helpers to rewrite REFERENCES from foreign keys in ColumnDefinition 2025-10-21 10:40:31 -04:00
Pekka Enberg
05bd75275f tests/integration: Reduce collation fuzz test iterations
The collation fuzz test case takes up to 4 minutes to run, making it the
slowest of all the test cases. Let's reduce iteration count a bit to
make this more CI friendly.
2025-10-21 17:29:11 +03:00
PThorpe92
7c746e476c Fix to_sql method on BTreeTable to not double write primary keys 2025-10-21 09:45:42 -04:00
Pekka Enberg
16d1398586 Merge 'Switch random blob creation to get_random' from Pedro Muniz
Also added a benchmark for inserting randomblobs. On my machine, I get
something like a 20% increase in performance.

Closes #3787
2025-10-21 16:04:55 +03:00
Pekka Enberg
74d4dd53dc Merge 'Fix git directory resolution in simulator to support worktrees' from Jussi Saurio
sim cannot be run in a git worktree on main

Closes #3794
2025-10-21 16:03:58 +03:00
Pekka Enberg
6139dde081 Revert "Merge 'core/translate: fix ALTER COLUMN to propagate other constraint references' from Preston Thorpe"
This reverts commit 1151f49ff4, reversing
changes made to f4da2194f4.
2025-10-21 16:00:04 +03:00
Nikita Sivukhin
2483d08bca do not allocate if possible 2025-10-21 16:28:00 +04:00
Jussi Saurio
b67fabdd62 Fix git directory resolution in simulator to support worktrees
sim cannot be run in a git worktree on main
2025-10-21 14:34:27 +03:00
Nikita Sivukhin
948bd557cd use simsimd for dense operations 2025-10-21 14:59:01 +04:00
Pekka Enberg
f764f3061d Merge 'Add Miri support for turso_stress, with bash scripts to run' from Bob Peterson
It was mentioned in https://github.com/tursodatabase/turso/pull/3720
that adding Miri support for `turso_stress` would be useful, and that a
bash script to start Miri with the right config would be a big help.
Notable changes:
- `antithesis_sdk`'s default features are disabled at the workspace
level, and only enabled as needed with the `antithesis` feature flag in
the various turso crates. Miri needs the noop version of
`antithesis_sdk` to run `turso_stress`, and feature unification
previously prevented this. I'm not able to ensure locally that all the
Antithesis stuff is still happy with these changes.
- Bash script to run `turso_stress` - this is barebones for now, see
below
- Bash script to run `simulator` - this passes any args to the `cargo
run` invocation inside, intercepting `--seed` if it's present, and
generating one from `/dev/random` if it's not. The seed is passed to
both Miri and the simulator to keep the overall execution reproducible.
(I checked this with a simple case)
- A `const fn`, `normal_or_miri` to supply different defaults in things
like CLI args for normal operation and Miri, since it's so slow. (An
idea I stole from tokio.) Right now the relevant values are 100x smaller
for Miri, although Miri is probably 1000 to 10,000x slower overall from
a rough estimation.
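
A helper like `normal_or_miri` could look roughly like the sketch below. Only the name and the fact that it is a `const fn` come from the description above; the `usize` signature, the constant `DEFAULT_ITERATIONS`, and its values are assumptions for illustration.

```rust
/// Pick a default depending on whether the code is running under Miri.
/// `cfg!(miri)` evaluates to true only when compiled by Miri.
const fn normal_or_miri(normal: usize, miri: usize) -> usize {
    if cfg!(miri) {
        miri
    } else {
        normal
    }
}

/// Hypothetical default for an iteration-count CLI argument:
/// 100x smaller under Miri, matching the ratio mentioned above.
const DEFAULT_ITERATIONS: usize = normal_or_miri(100_000, 1_000);
```

Because the function is `const`, the result can be used directly in constants and in attribute-style default values without any runtime check.
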
Caught UB from running `turso_stress` with Miri:
- An unsafe cast of a `*u8` to `*u32` inside the BTree implementation
resulted in the `*u32` making an unaligned read: `read()` ->
`read_unaligned()` fixes this
Future work - Making `turso_stress` reproducible under Miri:
- Right now `turso_stress` is plugged in to Antithesis, which is great!
But, `antithesis_sdk`'s noop mode (`default-features = false`) turns
`antithesis_sdk::random::get_random()` into `rand::random<u64>()`, which
isn't seedable/reproducible. It's more work than I wanted to take on in
this PR, but I'd like to instead conditionally replace `get_random` with
a seedable `ChaCha8Rng` like in the simulator, if Miri is being used.
Comment:
- On a machine without all necessary dependencies, running the bash
scripts fails in a way that cargo prompts you through installing the
nightly toolchain, Miri, etc. until it works
- Below is a snippet of the output from Miri on the Btree alignment
issue. Because turso_stress isn't yet deterministic/reproducible under
Miri, I can't always reproduce it. (It doesn't always happen like the
ones in my last MR)
```text
error: Undefined Behavior: accessing memory based on pointer with alignment 1, but alignment 4 is required
    --> /home/rwp/git/turso/core/storage/btree.rs:2860:50
     |
2860 |                     let mut pgno: u32 = unsafe { right_pointer.cast::<u32>().read().swap_bytes() };
     |                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
     |
     = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
     = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

```
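
The class of fix described above can be sketched in isolation. The function names and the `page`/`offset` parameters are hypothetical and not taken from `btree.rs`; the sketch also uses `u32::from_be` rather than `swap_bytes()`, so it does not assume a little-endian host.

```rust
/// Read a big-endian u32 starting at `offset`, which need not be 4-byte aligned.
fn read_be_u32_unaligned(page: &[u8], offset: usize) -> u32 {
    assert!(offset + 4 <= page.len(), "read would go out of bounds");
    // SAFETY: the four bytes are in bounds (checked above), and
    // `read_unaligned` places no alignment requirement on the pointer.
    // Plain `read()` here would be undefined behavior for unaligned offsets.
    let raw = unsafe { page.as_ptr().add(offset).cast::<u32>().read_unaligned() };
    u32::from_be(raw)
}

/// Safe equivalent that avoids raw pointers entirely.
fn read_be_u32_safe(page: &[u8], offset: usize) -> u32 {
    u32::from_be_bytes([
        page[offset],
        page[offset + 1],
        page[offset + 2],
        page[offset + 3],
    ])
}
```

Both functions return the same value; the safe version is usually preferable unless profiling shows the bounds checks matter.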

Closes #3790
2025-10-21 11:53:49 +03:00
Pekka Enberg
1151f49ff4 Merge 'core/translate: fix ALTER COLUMN to propagate other constraint references' from Preston Thorpe
closes #3666
and probably other issues; I'll have to go digging to see if
there are any others.
<img width="948" height="445" alt="image" src="https://github.com/user-attachments/assets/2844e09b-109a-4a70-bd18-d8a814e49ea0" />
Any ALTER COLUMN stmt will now update the constraints on the table
(primary key, foreign key, unique)

Closes #3776
2025-10-21 11:53:42 +03:00
Pekka Enberg
f4da2194f4 Merge 'Shared WAL lock scoping' from Pedro Muniz
After https://github.com/tursodatabase/turso/pull/3759 was merged, I
went back to the code to see if we could try and avoid this problem in
the future. One way I tried to achieve this is with scoped locking by
forcing all operations on the `SharedWalFile` to go through
`with_shared` and `with_shared_mut`. Also, I noticed some functions
still held locks across IO calls, so I fixed that as well.
If Rust already had `negative_impls` or custom auto traits in stable, I
could try to create a marker trait where you can mark any closure that
contains a `Completion` as IO related and throw a compile error when you
try to execute IO inside it.
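
The scoped-locking pattern described above can be sketched with a plain `std::sync::RwLock`. Only the method names `with_shared` and `with_shared_mut` come from the description; the types `WalShared` and `SharedWal`, the `max_frame` field, and the exact signatures are assumptions for illustration.

```rust
use std::sync::RwLock;

/// Hypothetical shared WAL state.
#[derive(Default)]
struct WalShared {
    max_frame: u64,
}

/// Wrapper that never hands out a lock guard; callers only get a reference
/// for the duration of a closure.
struct SharedWal {
    inner: RwLock<WalShared>,
}

impl SharedWal {
    fn new() -> Self {
        Self { inner: RwLock::new(WalShared::default()) }
    }

    /// Run `f` under the read lock; the guard is dropped when `f` returns.
    fn with_shared<R>(&self, f: impl FnOnce(&WalShared) -> R) -> R {
        let guard = self.inner.read().expect("lock poisoned");
        f(&*guard)
    }

    /// Run `f` under the write lock; the guard is dropped when `f` returns.
    fn with_shared_mut<R>(&self, f: impl FnOnce(&mut WalShared) -> R) -> R {
        let mut guard = self.inner.write().expect("lock poisoned");
        f(&mut *guard)
    }
}
```

A caller copies what it needs out of the closure, for example `let frame = wal.with_shared(|s| s.max_frame);`, and only then issues IO, so the lock cannot stay held across the IO call by accident. Nothing here stops someone from doing IO inside the closure, which is the gap the marker-trait idea would close.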

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3781
2025-10-21 09:05:40 +03:00
Bob Peterson
b92f4cb9c4 Make Miri easier to run 2025-10-20 23:48:19 -05:00
pedrocarlo
a614b51ebf change randomblob generation to use thread_rng 2025-10-21 01:41:40 -03:00
Bob Peterson
2cb0a9b34b Use read_unaligned with *u8 cast to *u32
Avoids undefined behavior due to unaligned read caught with Miri
2025-10-20 22:50:57 -05:00
Bob Peterson
5d7b057b8a Enable turso_stress to run in Miri
antithesis_sdk needs to have default features disabled in the workspace
so turso_stress is free to select the noop implementation for Miri
2025-10-20 22:50:44 -05:00
pedrocarlo
baf649affb add insert randomblob benchmark 2025-10-20 14:47:47 -03:00