Closes#2478
```console
turso>create table t(a,b);
turso>insert into t select 1,2 from generate_series(1,20);
turso>create index idxa on t(a);
turso>create index idxb on t(b);
# segfault
```
The issue was the `turso_ext::Conn` pointer was stored on the
`VirtualTable`, which lives longer than necessary and the underlying
core `Connection` is not guaranteed pinned. Moving the `turso_ext::Conn`
on to the cursor fixes this issue because it's independent of the
Schema.
This also fixes the issue that comes up when that issue is fixed, and
some of the relevant code for this hadn't been updated when other
surrounding changes had been made.
Closes#2479
This PR rewrites `turso-sync` package introduced in the #2334 and
renames it to the `turso-sync-engine` (anyway the diff will be
unreadable).
The purpose of rewrite is to get rid of Tokio because this makes things
harder when we wants to export bindings to WASM.
In order to achieve "runtime"-agnostic sync core but still be able to
use async/await machiner - this PR introduce usage of `genawaiter` crate
which allows to transfer async/await Rust state machines to the
generators. So, sync operations are just generators which can yield `IO`
command in case where there is a need for it.
Also, this PR introduces separate `ProtocolIo` in the `turso-sync-
engine` which defines extra IO methods:
1. HTTP interaction
2. Atomic read/writes to the file. This is not strictly necessary and
`turso_core::IO` methods can be extended to support few more things
(like `delete`/`rename`) - but I decided that it will be simpler to just
expose 2 more methods for sync protocol for the sake of atomic metadata
update (which is very small - dozens of bytes).
* As a bonus, we can store metadata for browser in the
`LocalStorage` which may be more natural thing to do(?) (user can reset
everything by just clearing local storage)
The `ProtocolIo` works similarly to the `IO` in a sense that it gives
the caller `Completion` which it can check periodically for new data.
Closes#2457
**86%** performance improvement. We are **25x** faster than SQLite.
<img width="953" height="511" alt="image" src="https://github.com/user-
attachments/assets/fd717d1e-bbbe-4959-ae48-41afc73e5e9f" />
```
ALTER TABLE _ DROP COLUMN _`/limbo_drop_column/
time: [1.8821 ms 1.8929 ms 1.9047 ms]
change: [-86.850% -86.733% -86.614%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
Benchmarking `ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/: Warming up for 3.0000 s
`ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/
time: [46.227 ms 46.258 ms 46.291 ms]
change: [-1.3202% -1.0505% -0.8109%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
10 (10.00%) high mild
5 (5.00%) high severe
```
Closes#2452
## Problem:
Currently `WriteState`, usually triggered by an insert operation, "owns"
the balancing state machine, even though a delete operation (tracked by
a separate `DeleteState`) can also trigger balancing, which results in
awkward back-and-forth switching between `CursorState::Write` and
`CursorState::Delete` during balancing.
## Fix:
1. Extract `balance_state` as a separate state machine, since its state
transitions are exactly the same regardless of whether an insert or a
delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as the
delete&insert states become just simple enums now. Each of them now has
a substate called `Balancing` which just delegates work to the balancing
state machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.
Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes#2468
This PR emit CDC entries as changes in `sqlite_schema` table for DDL
statements: `CREATE TABLE` / `CREATE INDEX` / etc.
The logic is a bit tricky as under the hood `turso` can do some implicit
DDL operations like:
1. Creating auto-indexes in case of `CREATE TABLE`
2. Deletion of all attached indices in case of `DROP TABLE`
```
turso> PRAGMA unstable_capture_data_changes_conn('full');
turso> CREATE TABLE t(x, y, z UNIQUE, q, PRIMARY KEY (x, y));
turso> CREATE INDEX t_xy ON t(x, y);
turso> CREATE TABLE q(a, b, c);
turso> ALTER TABLE q DROP COLUMN b;
turso> SELECT
change_id,
id,
change_type,
table_name,
bin_record_json_object(table_columns_json_array(table_name), before) AS before,
bin_record_json_object(table_columns_json_array(table_name), after) AS after
FROM turso_cdc;
┌───────────┬────┬─────────────┬───────────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ change_id │ id │ change_type │ table_name │ before │ after │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│ 1 │ 2 │ 1 │ sqlite_schema │ │ {"type":"table","name":"t","tbl_name":"t","rootpage":3,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│ 2 │ 5 │ 1 │ sqlite_schema │ │ {"type":"index","name":"t_xy","tbl_name":"t","rootpage":6,"sql":"C… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│ 3 │ 6 │ 1 │ sqlite_schema │ │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│ 4 │ 6 │ 0 │ sqlite_schema │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
└───────────┴────┴─────────────┴───────────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘
```
For now, CDC capture only all explicit operations and ignore all
implicit operations. The reasoning for that is that one use case for CDC
is to apply logical changes as is with simple SQL statements - but if
implicit operations will be logged to the CDC table too - we can have
hard times using simple SQL statement (for example, creation of
`autoindices` will always work; implicit deletion of indices for `DROP
TABLE` also can lead to some troubles and force us to is `DROP INDEX IF
EXISTS ...` statements + we will need to filter out autoindices in this
case too).
Also, to simplify PR, for now `DatabaseTape` from `turso-sync` package
just ignore all schema changes from CDC table.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2426
Problem:
Currently `WriteState` "owns" the balancing state machine, even
though a separate `DeleteState` can also trigger balancing, which
results in awkward back-and-forth switching between `CursorState::Write`
and `CursorState::Delete` during balancing.
Fix:
1. Extract `balance_state` as a separate state machine, since its
state transitions are exactly the same regardless of whether an
insert or a delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as
those states become just simple enums now. Each of them now has a state
called `Balancing` which just delegates work to the balancing state
machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.
The built-in `unreachable!` macro, believe it or not is just an alias
for `panic!` and does not actually provide the compiler with a hint that
the path is not reachable.
This provides a wrapper around the actual
`std::hint::unreachable_unchecked()`, to be used only in the very hot
path of `execute` where it is not possible to be the incorrect variant.
Closes#2459
## Beef
Removes cloning of `WriteState` in `balance_non_root()` and removes the
`Clone` implementation from `WriteState`. Not only is it unnecessary to
clone the state, but even having the `Clone` implementation available is
a real regression risk in `insert_into_page()` as it will deep clone
vectors that we require stable raw pointers to in `fill_cell_payload()`
## Details
In removing the cloning of `WriteState` in `balance_non_root()`, this
replaces the `next_write_state` logic with a plain loop, which ends up
looking like a lot of changes, but it's really not. Main things that
were changed:
- Loop and mutate state instead of create new state and return
- Make `WriteInfo` an `Option` in `WriteState` for ergonomics so it can
be `.take()`:n instead of `.clone()`:d
- Clone a `Rc<Pager>` in `balance_non_root()` so that we can call
`pager.do_allocate_page()` directly instead of `self.allocate_page()`
which is just a facade for the pager method, and would violate borrowing
rules.
- assign `let usable_space = self.usable_space()` at the beginning of
the balance loop for similar reasons.
- Make the balance debug validation methods static so they don't require
a reference to self, for similar reasons
Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes#2466
## Problem
We currently clone `WriteState` in every loop iteration of
`insert_into_page()`, which was probably done for borrow checker
reasons, but since `WriteState` has expanded to include buffers that
must not be moved in memory or dropped, it has necessitated a really
annoying workaround of wrapping the buffers in `Arc<Mutex>>` which is
just completely wasteful.
## Fix
Do not clone `WriteState` in `insert_into_page()`, and instead work with
the borrow checker a bit more. Note that `WriteState` still _implements_
`Clone` because it's also cloned in `balance_non_root()` - that can be a
separate refactor.
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#2464