Commit Graph

5509 Commits

Author SHA1 Message Date
Jussi Saurio
d2f5e67b25 Merge 'Fix COLLATE' from Jussi Saurio
Fixes the following problems with COLLATE:
- Fix: incorrectly used e.g. `x COLLATE NOCASE = 'fOo'` as index
constraint on an index whose column was not case-insensitively collated
- Fix: various ephemeral indexes (in GROUP BY, ORDER BY, DISTINCT) and
subqueries did not retain proper collation information of columns
- Fix: collation of a given expression was not determined properly
according to SQLite's rules
Adds TCL tests and fuzz test
Closes #3476
Closes #1524
Closes #3305
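
For illustration only, the precedence that SQLite documents for comparison operands can be sketched as below. The `Collation`, `Operand`, `comparison_collation` and `index_usable` names are hypothetical and are not the types or functions touched by this PR; treat it as a simplified sketch that assumes only the three built-in collations.

```rust
/// Collation sequences built into SQLite.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Collation {
    Binary,
    NoCase,
    Rtrim,
}

/// Hypothetical, simplified comparison operand (not Turso's `Expr`).
#[derive(Debug, Clone, Copy)]
pub enum Operand {
    /// `expr COLLATE <name>`: an explicit collation.
    Explicit(Collation),
    /// A plain column reference carrying its declared collation.
    Column(Collation),
    /// Anything else, for example a literal.
    Other,
}

/// Picks the collation for `lhs <op> rhs`.
pub fn comparison_collation(lhs: Operand, rhs: Operand) -> Collation {
    // 1. An explicit COLLATE wins, with precedence to the left operand.
    for op in [lhs, rhs] {
        if let Operand::Explicit(c) = op {
            return c;
        }
    }
    // 2. Otherwise a column's declared collation, left operand first.
    for op in [lhs, rhs] {
        if let Operand::Column(c) = op {
            return c;
        }
    }
    // 3. Otherwise fall back to BINARY.
    Collation::Binary
}

/// An index may serve the constraint only if its collation matches
/// the collation the comparison itself would use.
pub fn index_usable(index_collation: Collation, lhs: Operand, rhs: Operand) -> bool {
    index_collation == comparison_collation(lhs, rhs)
}
```

Under these assumptions, `x COLLATE NOCASE = 'fOo'` resolves to `NoCase`, so `index_usable(Collation::Binary, ...)` returns `false` for an index on a binary-collated `x`.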

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3538
2025-10-03 09:34:24 +03:00
Pekka Enberg
33e727ce8f Merge 'core/mvcc: Return completions from logical log methods' from Pedro Muniz
`IOResult` implies we have a state machine that needs to be polled to
`Completion`, which is not the case here. We are just emitting the IO
operation in this case. As a result, we never reached the
`IOResult::Done` branch that actually fsynced the logical log in
`Checkpoint`.
I also sprinkled some
```rust
if c.is_completed() {
   Ok(TransitionResult::Continue)
} else {
   Ok(TransitionResult::Io(IOCompletions::Single(c)))
}
```
just to be more efficient with sync IO, but it is not strictly necessary
here.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3549
2025-10-03 09:29:31 +03:00
pedrocarlo
131a5b8048 adjust logical log IO functions to return Completions and not IOResult 2025-10-03 01:44:41 -03:00
Preston Thorpe
990740ab73 Merge 'core/vdbe: Don't clear cursors in ProgramState::reset()' from Pekka Enberg
We don't need to clear the cursors explicitly because OpenRead and
OpenWrite will anyway replace them.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3526
2025-10-02 17:32:05 -04:00
Jussi Saurio
58ea9e4c3c clippy 2025-10-02 21:49:33 +03:00
Jussi Saurio
8e2e557da4 Collate: fix Insn::Compare to use collation seq of each compared column 2025-10-02 21:49:33 +03:00
Jussi Saurio
edd4651b97 Collate: add proper collation info for GROUP BY sorter columns 2025-10-02 21:49:33 +03:00
Jussi Saurio
f02757fe11 Collate: add proper collation to FROM-clause subquery result cols 2025-10-02 21:49:33 +03:00
Jussi Saurio
edfe0cb4fe Collate: prevent using an index if collation sequences don't match 2025-10-02 21:49:33 +03:00
Jussi Saurio
d42f3c7cbb Collate: compute collations properly for ORDER BY 2025-10-02 21:49:33 +03:00
Jussi Saurio
5a5f49933d Collate: add proper collation info to DISTINCT indexes 2025-10-02 21:49:33 +03:00
Jussi Saurio
f4ee0457b2 Collate: add proper collation info to compound select deduplication indexes 2025-10-02 21:49:33 +03:00
Jussi Saurio
e1fcd7b5e9 Collate: add get_collseq_from_expr()
Determines collation sequence to use for a given Expr
based on SQLite collation rules.
2025-10-02 21:49:33 +03:00
PThorpe92
43aba0ee95 Fix integer affinity for rowid expr type 2025-10-02 14:29:53 -04:00
Pekka Enberg
dc1463c70d Merge 'Improve error handling for cyclic views' from Duy Dang
The cycle is detected by marking each view as seen; if a seen view is processed
again, that's a cycle and we throw an error.
Close #3404
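
A minimal sketch of the marking idea, assuming view dependencies are available as a name-to-names map. The `check_view_cycles` function, the map layout and the error text are hypothetical, and unmarking on the way out is an assumption added here so that a view referenced twice without a cycle is not rejected.

```rust
use std::collections::{HashMap, HashSet};

/// Walks the views reachable from `root` and returns an error if one
/// of them is reached again while it is still being processed.
pub fn check_view_cycles(
    deps: &HashMap<String, Vec<String>>,
    root: &str,
) -> Result<(), String> {
    fn visit(
        deps: &HashMap<String, Vec<String>>,
        name: &str,
        seen: &mut HashSet<String>,
    ) -> Result<(), String> {
        // `insert` returns false if the view is already marked as seen.
        if !seen.insert(name.to_string()) {
            return Err(format!("view {name} is circularly defined"));
        }
        if let Some(children) = deps.get(name) {
            for child in children {
                visit(deps, child, seen)?;
            }
        }
        // Unmark once the view is fully expanded (assumption, see above).
        seen.remove(name);
        Ok(())
    }

    let mut seen = HashSet::new();
    visit(deps, root, &mut seen)
}
```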

Closes #3467
2025-10-02 19:33:12 +03:00
Pekka Enberg
b11246278f Merge 'Enable encryption properly in Rust bindings, whopper, and throughput tests' from Avinash Sajjanshetty
This is a follow-up to PR #3457, which requires users to opt in to
enable encryption. This patch:
- Makes appropriate changes to Whopper and encryption throughput tests
- Updates Rust bindings to pass the encryption options properly
- Adds a test for Rust bindings
To use encryption in Rust bindings, one needs to do:
```rust
let opts = EncryptionOpts {
    hexkey: "b1bbfda...02a5669fc76327".to_string(),
    cipher: "aegis256".to_string(),
};

let builder = Builder::new_local(&db_file)
    .experimental_encryption(true)
    .with_encryption(opts.clone());
let db = builder.build().await.unwrap();
```
We will remove the `experimental_encryption` once the feature is stable.

Closes #3532
2025-10-02 18:32:06 +03:00
Pekka Enberg
d3b6adfb2d Merge 'Enable checksums only if its opted in via feature flag' from Avinash Sajjanshetty
Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: bit-aloo (@Shourya742)

Closes #3523
2025-10-02 17:26:14 +03:00
Pekka Enberg
3378afe8c6 Merge 'Fix MVCC drop table' from Jussi Saurio
MVCC shouldn't try to destroy btrees in the pager because pager operations
are only done in checkpoint

Closes #3524
2025-10-02 17:26:00 +03:00
Pekka Enberg
78e3311c3b Merge 'Sync engine defered sync' from Nikita Sivukhin
This PR makes the sync client completely autonomous, as it can now defer
the initial sync.
This opens the possibility of asynchronously creating the DB in the Turso Cloud
while letting the user interact with the local DB straight away.

Closes #3531
2025-10-02 17:25:11 +03:00
Avinash Sajjanshetty
3653c1a853 clear page cache when the encryption context is set 2025-10-02 19:50:12 +05:30
Nikita Sivukhin
c0b6210756 add missed method in the core 2025-10-02 16:19:52 +04:00
Pekka Enberg
16f1c1ac8b core/vdbe: Don't clear cursors in ProgramState::reset()
We don't need to clear the cursors explicitly because OpenRead and
OpenWrite will anyway replace them.
2025-10-02 14:36:46 +03:00
Pekka Enberg
7bfb4dc203 Merge 'Fix MVCC startup infinite loop when using existing DB' from Jussi Saurio
MVCC bootstrap connection got stuck in an infinite statement reparsing
loop because the bootstrap procedure happened before the on-disk schema
was deserialized.
closes #3518

Closes #3522
2025-10-02 14:20:42 +03:00
Jussi Saurio
0e3132d24b Dont try to destroy btree in mvcc mode 2025-10-02 14:12:12 +03:00
Avinash Sajjanshetty
09ba4615ba return appropriate error if checksum was not compiled 2025-10-02 16:11:18 +05:30
Avinash Sajjanshetty
6d7dc6d183 enable checksums only if its opted in via feature flag 2025-10-02 16:01:56 +05:30
Jussi Saurio
3a1851ec06 Fix MVCC startup infinite loop when using existing DB
MVCC bootstrap connection got stuck in an infinite statement
reparsing loop because the bootstrap procedure happened before the
on-disk schema was deserialized.
2025-10-02 13:21:44 +03:00
Pekka Enberg
4c5a7cda08 core/vdbe: Avoid cloning Arc<MvStore> on every VDBE step
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.

Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This eliminates
the unnecessary atomic operations while maintaining the same semantics.
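
A small sketch of the difference, using a stand-in `MvStore` and two hypothetical functions (`step_by_value`, `step_by_ref`) rather than the real `step()` signature. It only shows what happens to the reference count under each calling convention.

```rust
use std::sync::Arc;

/// Stand-in for the real store type (hypothetical).
pub struct MvStore;

/// Taking the `Arc` by value forces every caller to clone it:
/// one atomic increment on the call and one decrement on drop.
pub fn step_by_value(store: Option<Arc<MvStore>>) -> usize {
    store.map_or(0, |s| Arc::strong_count(&s))
}

/// Borrowing the `Arc` leaves the reference count untouched.
pub fn step_by_ref(store: Option<&Arc<MvStore>>) -> usize {
    store.map_or(0, |s| Arc::strong_count(s))
}
```

With a single owner, `step_by_value(Some(store.clone()))` observes a count of 2 for the duration of the call, while `step_by_ref(Some(&store))` observes 1.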
2025-10-02 12:28:11 +03:00
Jussi Saurio
fa6ee6b850 Merge 'Fix: JOIN USING should pick columns from left table, not right' from Jussi Saurio
Closes #3468
Closes #3479

Closes #3485
2025-10-02 10:16:38 +03:00
Jussi Saurio
7360edc169 Merge 'mvcc: dont try to end pager tx on connection close' from Jussi Saurio
closes #3487

Closes #3491
2025-10-02 10:06:23 +03:00
Jussi Saurio
f48165eb72 fix/vdbe: reset op_transaction state properly
essentially after the first runthrough of `op_transaction` per a
given `ProgramState`, we weren't resetting the instruction state
to `Start` at all, which means we didn't do any transaction state
checking/updating.
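
A rough sketch of the pattern, not the actual VDBE code: only the `Start` state name comes from the description above, while the second variant, the `tx_checks` counter and the struct layout are hypothetical.

```rust
/// Hypothetical per-instruction state; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpTransactionState {
    Start,
    CheckSchemaCookie,
}

pub struct ProgramState {
    pub op_transaction_state: OpTransactionState,
    /// Counts how often the transaction checks ran (illustration only).
    pub tx_checks: u32,
}

impl ProgramState {
    pub fn new() -> Self {
        Self {
            op_transaction_state: OpTransactionState::Start,
            tx_checks: 0,
        }
    }

    /// One full pass through the instruction.
    pub fn op_transaction(&mut self) {
        loop {
            match self.op_transaction_state {
                OpTransactionState::Start => {
                    // Transaction state checking/updating happens here.
                    self.tx_checks += 1;
                    self.op_transaction_state = OpTransactionState::CheckSchemaCookie;
                }
                OpTransactionState::CheckSchemaCookie => {
                    // Reset so the next execution begins at `Start` again;
                    // without this line the checks would run only once.
                    self.op_transaction_state = OpTransactionState::Start;
                    return;
                }
            }
        }
    }
}
```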

PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:

https://github.com/tursodatabase/turso-go/issues/28
2025-10-02 08:40:41 +03:00
Jussi Saurio
a9d782e319 Merge 'Add encryption internals docs' from Avinash Sajjanshetty
preview - https://github.com/tursodatabase/turso/blob/8d2ef700c9b087a7e2904c25052e4365395b33b3/docs/manual.md#encryption-1

Closes #3461
2025-10-02 07:04:16 +03:00
Jussi Saurio
3c9a6993e3 Merge 'core/storage: Apple platforms support' from Charly Delaroche
Closes #3507
2025-10-02 07:01:56 +03:00
Jussi Saurio
e65eae764c Merge 'Resolve appropriate column name for rowid alias/PK' from Preston Thorpe
closes https://github.com/tursodatabase/turso/issues/3512

Closes #3513
2025-10-02 06:59:18 +03:00
Jussi Saurio
9e4ea6ea34 Merge 'core/mvcc/logical-log: fail in read_more_data if couldn't read enough' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3500
2025-10-02 06:58:28 +03:00
Jussi Saurio
bb4e54ca73 Merge 'fix/mvcc: deserialize table_id as i64' from Jussi Saurio
Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3492
2025-10-02 06:58:01 +03:00
Jussi Saurio
30e6524c4e Fix: JOIN USING should pick columns from left table, not right
Closes #3468
Closes #3479
2025-10-02 06:56:52 +03:00
Jussi Saurio
c0da38e24a Merge 'Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN' from Jussi Saurio
Closes #3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.
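
As a simplified sketch of the fix, assuming a term records the index of the outer join it came from. The struct below is a hypothetical stand-in for `WhereTerm`, and `clear_outer_join_marker` is an illustrative name, not the function changed in this PR.

```rust
/// Hypothetical, simplified WHERE term.
#[derive(Debug, Clone, PartialEq)]
pub struct WhereTerm {
    pub expr: String,
    /// Index of the outer join this term originates from, if any.
    /// While set, the term must be evaluated at that join position.
    pub from_outer_join: Option<usize>,
}

/// Called once the join at `join_idx` has been rewritten from
/// LEFT JOIN to INNER JOIN: its terms no longer need to be pinned
/// to the original join position and may follow the reordered plan.
pub fn clear_outer_join_marker(terms: &mut [WhereTerm], join_idx: usize) {
    for term in terms.iter_mut() {
        if term.from_outer_join == Some(join_idx) {
            term.from_outer_join = None;
        }
    }
}
```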

Closes #3475
2025-10-02 06:56:07 +03:00
Jussi Saurio
78cccdd87a Merge 'Substr fix UTF-8' from Pedro Muniz
Fixes:
- `start_value` and `length_value` should be cast to integers
- proper handling of UTF-8 characters
- no need to cast blobs to strings, as substr on blobs refers to byte
indexes and not char indexes
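
The text/blob distinction can be sketched as follows. `substr_text` and `substr_blob` are hypothetical helpers that assume a 1-based `start` of at least 1 and a non-negative length; SQLite's handling of zero and negative arguments is deliberately left out.

```rust
/// Text: `start` and `len` count characters, so multi-byte UTF-8
/// sequences are never split.
pub fn substr_text(s: &str, start: usize, len: usize) -> String {
    s.chars().skip(start.saturating_sub(1)).take(len).collect()
}

/// Blob: `start` and `len` count raw bytes.
pub fn substr_blob(b: &[u8], start: usize, len: usize) -> Vec<u8> {
    b.iter()
        .skip(start.saturating_sub(1))
        .take(len)
        .copied()
        .collect()
}
```

For the string `"h\u{e9}llo"`, `substr_text(s, 2, 3)` yields three characters starting at the accented letter, whereas `substr_blob(s.as_bytes(), 2, 2)` yields exactly the two bytes that encode it.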

Closes #3465
2025-10-02 06:55:38 +03:00
PThorpe92
efac598232 Resolve appropriate column name for rowid alias/PK 2025-10-01 21:49:42 -04:00
Preston Thorpe
b310411997 Merge 'printf should truncates floats' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3308

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3415
2025-10-01 19:31:39 -04:00
Avinash Sajjanshetty
ca0d738f4d Add encryption internals docs 2025-10-02 00:14:28 +05:30
Mikaël Francoeur
6307774201 reject FROM clauses 2025-10-01 14:20:23 -04:00
Charly Delaroche
5856dc8733 core/storage: Apple platforms support 2025-10-01 09:59:22 -07:00
Pekka Enberg
02023ce821 Merge 'core/storage: Switch page cache queue to linked list' from Pekka Enberg
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive
doubly-linked list. This eliminates the page cache initialization overhead from
connection establishment, but also reduces memory usage to entries that
are actually used. Furthermore, the approach allows us to grow the page
cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```

Closes #3456
2025-10-01 16:39:47 +03:00
Pere Diaz Bou
b89df44339 fmt 2025-10-01 14:31:30 +02:00
Pere Diaz Bou
dc0245d758 core/mvcc/logical-log: fail in read_more_data if couldn't read enough 2025-10-01 14:25:22 +02:00
Pekka Enberg
2b168cf7b0 core/storage: Switch page cache queue to linked list
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).

Therefore, replace the pre-allocated vector with an intrusive
doubly-linked list. This eliminates the page cache initialization
overhead from connection establishment, but also reduces memory usage to
entries that are actually used. Furthermore, the approach allows us to
grow the page cache with much less overhead.

The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.

Before:

```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```

After:

```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```
2025-10-01 14:41:35 +03:00
Jussi Saurio
ee6b943586 Merge 'fix/mvcc: set log offset to end of file after recovery finishes' from Jussi Saurio
otherwise we start overwriting existing log entries
Closes #3495
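
A toy illustration of the failure mode, using an in-memory buffer instead of a file. The `LogicalLog` struct and its methods here are hypothetical and unrelated to the real logical log implementation.

```rust
/// Hypothetical in-memory stand-in for an append-only log file.
pub struct LogicalLog {
    pub data: Vec<u8>,
    /// Position at which the next entry will be written.
    pub offset: usize,
}

impl LogicalLog {
    /// Rebuilds state from existing log contents. After replaying,
    /// the write offset must point at the end of the file.
    pub fn recover(data: Vec<u8>) -> Self {
        // ... replaying the entries in `data` would happen here ...
        let offset = data.len();
        Self { data, offset }
    }

    /// Writes `entry` at the current offset and advances it.
    pub fn append(&mut self, entry: &[u8]) {
        let end = self.offset + entry.len();
        if self.data.len() < end {
            self.data.resize(end, 0);
        }
        self.data[self.offset..end].copy_from_slice(entry);
        self.offset = end;
    }
}
```

If the offset were left at 0 after recovery, the first `append` would overwrite the bytes of the oldest entry instead of extending the log.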

Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3496
2025-10-01 13:52:12 +03:00
Jussi Saurio
480d066147 Merge 'mvcc: dont use mv store for ephemeral tables' from Jussi Saurio
not sure how these would even work with mvcc - either way, an ephemeral
table uses an ephemeral database file and pager so i don't think putting
its writes into MV store makes sense
TBH i have no idea if there are any weird interactions here but the code
we have now for sure does not work
Closes #3486

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3490
2025-10-01 13:50:30 +03:00