remote protocol can require client to perform checkpoint of the WAL.
This PR supports that need in the sync-engine implementation
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2565
in #2521, I messed up and introduced improper calculation of the current
checkpoint's max safe frame (mostly due to incorrect comments that I had
left on the method).
The confusion partially stems from our lack of Busy handling at the
moment, but essentially when determining the max safe frame for all
readers, for passive mode we cannot simply `break` out of the loop when
we find a reader with a lower read mark than we have, because _another_
reader might have an even _lower_ read mark, and we could proceed with
the first mark < shared_max.
And for !passive modes, we still attempt to backfill with the same lower
frame, we just return `Busy` at the end, after backfilling what we can
(we just don't reset the log for restart/truncate).
Most of the changes in this PR is just the renaming the fields of
Checkpoint Result, because the names were confusing
Closes#2560
We have to update the Transaction State before checking for the Schema
Cookie so that we can rollback the transaction later on correctly.
Closes#2535Closes#2549
Small contribution to my current work on making checkpointing efficient.
We hold a write lock, and especially here on main there is no reason to
mark the pages as dirty in the cache, so we can do away with that
Vec<u64> and just track whether it's `Done`
Closes#2545
Mainly the performance impact here comes from removing some unnecessary
checks and inlining `read_integer_fast()` directly into `op_column()`,
but I also added some fiddly nano-optimizations for fun
On main, we are roughly 3.4x slower than sqlite on `SELECT * FROM users
LIMIT 100`, and here we are roughly 3.2x slower, which ain't much, but
it's honest work.
A more impactful optimization, but a much more annoying refactor, would
be #2304Closes#2516
This PR adds new `updates` column to the CDC table. This column holds
updated fields of the row in the following format:
```
[C boolean values where true set for changed columns]
[C values with updates where NULL is set for not-changed columns]
```
For example:
```
turso> UPDATE t SET y = 'turso', q = 'db' WHERE rowid = 1;
turso> SELECT bin_record_json_object('["x","y","z","q","x","y","z","q"]', updates) as updates FROM turso_cdc;
┌──────────────────────────────────────────────────────────────────┐
│ updates │
├──────────────────────────────────────────────────────────────────┤
│ {"x":0,"y":1,"z":0,"q":1,"x":null,"y":"turso","z":null,"q":"db"} │
└──────────────────────────────────────────────────────────────────┘
```
Also, this column works differently for `ALTER TABLE` statements where
update value for `sql` will be equal to the original `ALTER TABLE`:
```
turso> ALTER TABLE t ADD COLUMN t;
turso> SELECT bin_record_json_object('["type","name","tbl_name","rootpage","sql","type","name","tbl_name","rootpage","sql"]', updates) as updates FROM turso_cdc WHERE rowid = 2;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ updates │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"type":0,"name":0,"tbl_name":0,"rootpage":0,"sql":1,"type":null,"name":null,"tbl_name":null,"rootpage":null,"sql":"ALTER TABLE t ADD COLUMN t;"} │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
This will help turso-db to implement logical replication which supports
both column-level updates and schema changes
Closes#2538
1. Introduce state machines to insert and delete to make sure IO is
handled properly
2. When `rowid` is changed in `UPDATE`, it is handled as a combination
of delete and insert, so `op_insert` doesn't need to update the
incremental view with the deleted old column since the preceding
`op_delete` instruction should already do it.
Closes#2542
populate now has its own code path to apply changes to the view. It was
okay until now because all we do is filter. But now that we are also
applying aggregations, we'll end up with two disjoint code paths.
A better approach is to just apply the results of our select to the
delta set, and apply it.