This is a follow up from PR - #3457 which requires users to opt in to
enable encryption. This patch
- Makes appropriate changes to Whopper and Encryption throughput tests
- Updated Rust bindings to pass the encryption options properly
- Added a test for rust bindings
To use encryption in Rust bindings, one needs to do:
```rust
let opts = EncryptionOpts {
hexkey: "b1bbfda...02a5669fc76327".to_string(),
cipher: "aegis256".to_string(),
};
let builder = Builder::new_local(&db_file).experimental_encryption(true).with_encryption(opts.clone());
let db = builder.build().await.unwrap();
```
We will remove the `experimental_encryption` once the feature is stable.
Closes#3532
This PR makes sync client completely autonomous as now it can defer
initial sync.
This can open possibility to asynchronously create DB in the Turso Cloud
while giving user ability to interact with local DB straight away.
Closes#3531
MVCC bootstrap connection got stuck into an infinite statement reparsing
loop because the bootstrap procedure happened before the on-disk schema
was deserialized.
closes#3518Closes#3522
MVCC bootstrap connection got stuck into an infinite statement
reparsing loop because the bootstrap procedure happened before the
on-disk schema was deserialized.
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.
Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This eliminates
the unnecessary atomic operations while maintaining the same semantics.
essentially after the first runthrough of `op_transaction` per a
given `ProgramState`, we weren't resetting the instruction state
to `Start´ at all, which means we didn't do any transaction state
checking/updating.
PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:
https://github.com/tursodatabase/turso-go/issues/28
Closes#3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.
Closes#3475
Fixes:
- `start_value` and `length_value` should be casted to integers
- proper handling of utf-8 characters
- do not need to cast blob to string, as substr in blobs refers to byte
indexes and not char-indexes
Closes#3465
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive doubly-
linked list. This eliminates the page cache initialization overhead from
connection establishment, but also reduces memory usage to entries that
are actually used. Furthermore, the approach allows us to grow the page
cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```
Closes#3456
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive
doubly-linked list. This eliminates the page cache initialization
overhead from connection establishment, but also reduces memory usage to
entries that are actually used. Furthermore, the approach allows us to
grow the page cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```
not sure how these would even work with mvcc - either way, an ephemeral
table use an ephemeral database file and pager so i don't think putting
its writes into MV store makes sense
TBH i have no idea if there are any weird interactions here but the code
we have now for sure does not work
Closes#3486
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#3490
Consolidates the `exec_trim`, `exec_rtrim`, `exec_ltrim` code and only
pattern matches on whitespace character.
Fixes#3319
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3437
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3436
Sqlite has a crazy easter egg where a 1 Gib file offset, it creates a
`PENDING_BYTE_PAGE` that is used only by the VFS layer, and is never
read or written into.
To properly test this, I took inspiration from SQLITE testing framework,
and defined a helper method, that is conditionally compiled with the
`test_helper` feature enabled.
https://github.com/sqlite/sqlite/blob/7e38287da43ea3b661da3d8c1f431aa907
d648c9/src/main.c#L4327
As the `PENDING_BYTE` is normally at the 1 Gib mark, I created a
function that modifies the static `PENDING_BYTE` atomic to whatever
value we want. This means we can test this unusual behaviours at any DB
file size we want.
`fuzz_pending_byte_database` is the test that fuzzes different pending
byte offsets and does an integrity check at the end to confirm, we are
compatible with SQLITE
Closes#2749
<img width="1100" height="740" alt="image" src="https://github.com/user-
attachments/assets/06eb258f-b4b4-47bf-85f9-df1cf411e1df" />
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3431
**Handle table ID / rootpages properly for both checkpointed and non-
checkpointed tables**
Table ID is an opaque identifier that is only meaningful to the MV
store.
Each checkpointed MVCC table corresponds to a single B-tree on the
pager,
which naturally has a root page.
**We cannot use root page as the MVCC table ID directly because:**
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
**Hence:**
- MVCC table ids are always negative
- sqlite_schema rows will have a negative rootpage column if the
table has not been checkpointed yet.
- on checkpoint when the table is allocated a real root page, we update
the row in sqlite_schema and in MV store's internal mapping
**On recovery:**
- All sqlite_schema tables are read directly from disk and assigned
`table_id = -1 * root_page` -- root_page on disk must be positive
- Logical log is deserialized and inserted into MV store
- Schema changes from logical_log are captured into the DB's global
schema
**Note about recovery:**
I changed MVCC recovery to happen on DB initialization which should
prevent any races, so no need for `recover_lock`, right @pereman2 ?
Closes#3419
This PR implements support for `ON CONFLICT` clause chain, e.g.
```
INSERT INTO ct(id, x, y) VALUES (4, 'x', 'y1'), (5, 'a1', 'b'), (3, '_', '_')
ON CONFLICT(x) DO UPDATE SET x = excluded.x || '-' || x, y = excluded.y || '@' || y, z = 'x'
ON CONFLICT(y) DO UPDATE SET x = excluded.x || '+' || x, y = excluded.y || '!' || y, z = 'y'
ON CONFLICT DO UPDATE SET x = excluded.x || '#' || x, y = excluded.y || '%' || y, z = 'fallback';
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3453
Closes#2470
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can
remove the LEFT JOIN because NULL values will be equal to 'foo'. In fact, we have
this optimization already.
However, there was a dumb bug where `WhereTerm`s involving this join still retained
their `from_outer_join` state, resulting in forcing the evaluation of those terms
at the original join index, which results in completely wrong bytecode if the join
optimizer decides to reorder the join as `s JOIN t` instead. Effectively it will
evaluate `t.a=s.a` after table `s` is open but table `t` is not open yet.
This PR fixes that issue by clearing `from_outer_join` properly from the relevant
`WhereTerm`s.