This PR rewrites `turso-sync` package introduced in the #2334 and
renames it to the `turso-sync-engine` (anyway the diff will be
unreadable).
The purpose of rewrite is to get rid of Tokio because this makes things
harder when we wants to export bindings to WASM.
In order to achieve "runtime"-agnostic sync core but still be able to
use async/await machiner - this PR introduce usage of `genawaiter` crate
which allows to transfer async/await Rust state machines to the
generators. So, sync operations are just generators which can yield `IO`
command in case where there is a need for it.
Also, this PR introduces separate `ProtocolIo` in the `turso-sync-
engine` which defines extra IO methods:
1. HTTP interaction
2. Atomic read/writes to the file. This is not strictly necessary and
`turso_core::IO` methods can be extended to support few more things
(like `delete`/`rename`) - but I decided that it will be simpler to just
expose 2 more methods for sync protocol for the sake of atomic metadata
update (which is very small - dozens of bytes).
* As a bonus, we can store metadata for browser in the
`LocalStorage` which may be more natural thing to do(?) (user can reset
everything by just clearing local storage)
The `ProtocolIo` works similarly to the `IO` in a sense that it gives
the caller `Completion` which it can check periodically for new data.
Closes#2457
Problem:
Currently `WriteState` "owns" the balancing state machine, even
though a separate `DeleteState` can also trigger balancing, which
results in awkward back-and-forth switching between `CursorState::Write`
and `CursorState::Delete` during balancing.
Fix:
1. Extract `balance_state` as a separate state machine, since its
state transitions are exactly the same regardless of whether an
insert or a delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as
those states become just simple enums now. Each of them now has a state
called `Balancing` which just delegates work to the balancing state
machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.
We currently clone WriteState on every iteration of `insert_into_page()`,
presumably for Borrow Checker Reasons (tm).
There was a bug in `WriteState::Insert` handling where if `fill_cell_payload()`
returned IO, the `fill_cell_payload_state` was not updated in
`write_info.state`, leading to an infinite loop of allocating new pages.
This bug was surfaced by, but not caused by, #2400.
In preparation for tracking IO Completions, we need to start to return
IO in places where completions are created. Doing some more plumbing now
to avoid bigger PRs for the future
Closes#2438
Implement sqlite's fast path defragment algorithm. This path is taken
when:
1. There are 1-2 freeblocks
2. There are at most `max_frag_bytes` fragmented free bytes (-1..=4)
Instead of reconstructing the entire page, it merges the two freeblocks
and then moves the merged freeblock to the left, effectively turning it
into free space in the unallocated region, instead of a freeblock.
`max_frag_bytes` is particularly important when jnserting a new cell,
because if the page contains (in total) ~just enough space for the new
cell, then there can be hardly any fragmented free space because
otherwise, merging the 1-2 freeblocks won't produce enough contiguous
free space to fit the cell.
## Benchmark
```sql
Insert rows in batches/limbo_insert_1_rows
time: [26.692 µs 27.153 µs 27.695 µs]
change: [-9.9033% -2.9097% +1.6336%] (p = 0.55 > 0.05)
No change in performance detected.
Insert rows in batches/limbo_insert_10_rows
time: [38.618 µs 40.022 µs 42.201 µs]
change: [-8.9137% -6.6405% -4.2299%] (p = 0.00 < 0.05)
Performance has improved.
Insert rows in batches/limbo_insert_100_rows
time: [168.94 µs 169.58 µs 170.31 µs]
change: [-22.520% -17.669% -12.790%] (p = 0.00 < 0.05)
Performance has improved.
```
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2411
Closes#1967
To support this I had to change how we did `epilogue` similarly to how
SQLite does it. SQLIte first declares a `beginWriteOperation` when some
statement is going to necessitate a Write Transaction. And as we now
need to pass the current schema cookie to `epilogue` it was easier to
call epilogue only in one location (like we do with prologue), and just
have each statement declare their intentions separately. This allows us
to not have to pass the Schema around just to do the epilogue. I believe
this is something that @jussisaurio would be interested in.
~Also had to disable the MVCC test, as it was extremely buggy for me.~
Just disabled reprepare statements for MVCC
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2214
More changes. I want to avoid big PRs, so doing these changes in small
increments. I think in like 2 PRs after this one, I will be able make
the change effectively.
Closes#2400
This PR extends raw WAL API with few methods which will be helpful for
offline-sync:
1. `try_wal_watermark_read_page` - try to read page from the DB with
given WAL watermark value\
* Usually, WAL max_frame is set automatically to the latest value
(`shared.max_frame`) when transaction is started and then this
"watermark" is preserved throughout whole transaction
* New method allows to simulate "read from the past" by controlling
frame watermark explicitly
* There is an alternative to implement some API like
`start_read_session(frame_watermark: u64)` - but I decided to expose
just single method to simplify the logic and reduce "surface" of actions
which can be executed in this "controllable" manner
* Also, for simplicity, now `try_wal_watermark_read_page` always
read data from disk and bypass any cached values (and also do not
populate the cache)
2. `wal_changed_pages_after` - return set of unique pages changed after
watermark WAL position in the current WAL session
With these 2 methods we can implement `REVERT frame_watermark` logic
which will just fetch all changed pages first, and then revert them to
the previous value by using `try_wal_watermark_read_page` and
`wal_insert_frame` methods (see `test_wal_api_revert_pages` test).
Note, that if there were schema changes - than `REVERT` logic described
above can bring connection to the inconsistent state, as it will
preserve schema information in memory and will still think that table
exist (while it can be reverted). This should be considered by any
consumer of this new methods.
Closes#2433
Closes#2421
## Background
We have some kind of transaction-local hack (`start_pages_in_frames`)
for bookkeeping how many pages are currently in the in-memory WAL frame
cache, I assume for performance reasons or whatever.
`wal.rollback()` clears all the frames from `shared.frame_cache` that
the rollbacking tx is allowed to clear, and then truncates
`shared.pages_in_frames` to however much its local
`start_pages_in_frames` value was.
## Problem
In `complete_append_frame`, we check if `frame_cache` has that key
(page) already, and if not, we add it to `pages_in_frames`.
However, `wal.rollback()` never _removes_ the key (page) if its value is
empty, so we can end up in a scenario where the `frame_cache` key for
`page P` exists but has no frames, and so `page P` does not get added to
`pages_in_frames` in `complete_append_frame`.
This leads to a checkpoint data loss scenario:
- transaction rolls back, has start_pages_in_frames=0, so truncates
shared pages_in_frames to an empty vec. let's say `page P` key in
`frame_cache` still remains but it has no frames.
- The next time someone commits a frame for `page P`, it does NOT get
added to `pages_in_frames` because `frame_cache` has that key (although
the value vector is empty)
- At some point, a checkpoint checkpoints `n` frames, but since
`pages_in_frames` does not have `page P`, it doesn't actually checkpoint
it and all the "checkpointed" frames are simply thrown away
- very similar to the scenario in #2366
## Fix
Remove the `start_pages_in_frames` hack entirely and just make
`pages_in_frames` effectively the same as `frame_cache.keys`. I think we
could also just get rid of `pages_in_frames` and just use
`frame_cache.contains_key(p)` but maybe Pere can chime in here
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2422