- Instead of using a confusing CheckpointStatus for many different things,
introduce the following statuses:
* PagerCacheflushStatus - cacheflush can result in either:
- the WAL being written to disk and fsynced, or
- additionally a checkpoint into the main DB file, followed by an fsync of the DB file.
Reflect this in the type (see the sketch after this list).
* WalFsyncStatus - previously CheckpointStatus was also used for this, even
though fsyncing the WAL doesn't checkpoint.
* CheckpointStatus/CheckpointResult is now used only for actual checkpointing.
- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
* This gets rid of a lot of if let Some(...) boilerplate
* For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
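A hedged sketch of what the new types might look like; variant names beyond those listed above are assumptions, not necessarily the exact definitions:
```rust
pub struct CheckpointResult; // stub for illustration

/// A cacheflush either stops at the WAL, or goes all the way to the DB file.
pub enum PagerCacheflushStatus {
    /// Frames were appended to the WAL and the WAL was fsynced.
    WalWritten,
    /// A checkpoint also ran: pages were written back into the main DB
    /// file, and the DB file was fsynced.
    Checkpointed(CheckpointResult),
}

/// Fsyncing the WAL is not a checkpoint, so it gets a status of its own.
pub enum WalFsyncStatus {
    Done,
    /// Async I/O still in flight; the caller should re-poll.
    IO,
}
```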
Currently we are simply unable to read any WAL frames from disk
once a fresh process with Limbo is opened, because we never read
anything from disk unless it is already in our in-memory frame cache.
This commit implements a crude way of reading the entire WAL into memory
as a single buffer and reconstructing the frame cache from it.
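A minimal sketch of that crude recovery pass, assuming the standard SQLite WAL layout (a 32-byte file header, then frames of a 24-byte frame header plus `page_size` bytes of page data); `rebuild_frame_cache` is a hypothetical helper, not Limbo's actual API:
```rust
use std::collections::HashMap;

const WAL_HEADER_SIZE: usize = 32;
const WAL_FRAME_HEADER_SIZE: usize = 24;

/// Scan a WAL file that has been read into memory as one buffer and
/// map each page number to the WAL frames that contain it.
fn rebuild_frame_cache(buf: &[u8], page_size: usize) -> HashMap<u32, Vec<u64>> {
    let mut frame_cache: HashMap<u32, Vec<u64>> = HashMap::new();
    let frame_size = WAL_FRAME_HEADER_SIZE + page_size;
    let mut offset = WAL_HEADER_SIZE;
    let mut frame_id: u64 = 1; // frame IDs are 1-based
    while offset + frame_size <= buf.len() {
        // The first 4 bytes of a frame header hold the page number (big-endian).
        let pgno = u32::from_be_bytes(buf[offset..offset + 4].try_into().unwrap());
        frame_cache.entry(pgno).or_default().push(frame_id);
        offset += frame_size;
        frame_id += 1;
    }
    frame_cache
}
```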
WalShared state can now be shared without wrapping everything in a single
lock; instead we use atomics in some places and an RwLock in others -- for
now.
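As a rough illustration of that mix (field names here are assumptions, not Limbo's actual layout): counters that change on every commit become atomics, while larger structures stay behind an RwLock.
```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

pub struct WalShared {
    /// Highest frame number currently in the WAL.
    pub max_frame: AtomicU64,
    /// Frames already copied back into the main DB file.
    pub nbackfills: AtomicU64,
    /// page number -> WAL frames containing that page.
    pub frame_cache: RwLock<HashMap<u64, Vec<u64>>>,
}

impl WalShared {
    /// Reserve the next frame number without taking a lock.
    pub fn bump_max_frame(&self) -> u64 {
        self.max_frame.fetch_add(1, Ordering::SeqCst) + 1
    }
}
```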
## Results:

From:
```
Execute `SELECT 1`/limbo_execute_select_1
time: [34.125 ns 34.218 ns 34.324 ns]
Execute `SELECT 1`/sqlite_execute_select_1
time: [28.124 ns 28.254 ns 28.385 ns]
```

To:
```
Execute `SELECT 1`/limbo_execute_select_1
time: [31.919 ns 32.113 ns 32.327 ns]
Execute `SELECT 1`/sqlite_execute_select_1
time: [29.662 ns 29.900 ns 30.139 ns]
```
- Use `CheckpointResult::default()` instead of `CheckpointResult::new()`
- Correct WAL frame header salt and checksum handling
- Ensure frame ID is 1-based and adjust frame offset calculation
- Add `Default` implementation for `CheckpointResult`
- Use random values for WAL header salts
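For the `Default` bullet above, the shape is roughly this; the field names are assumptions, not the actual struct:
```rust
/// Counters reported by a checkpoint; deriving Default replaces the
/// old hand-written CheckpointResult::new().
#[derive(Default)]
pub struct CheckpointResult {
    /// Number of frames in the WAL when the checkpoint ran.
    pub num_wal_frames: u64,
    /// Number of frames written back to the main DB file.
    pub num_checkpointed_frames: u64,
}
```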
We really need to make the WAL lock less expensive, and switching to
`parking_lot` is something we should do anyway.
Before:
```
Execute `SELECT 1`/Limbo
time: [56.230 ns 56.463 ns 56.688 ns]
```
After:
```
Execute `SELECT 1`/Limbo
time: [52.003 ns 52.132 ns 52.287 ns]
```
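For context, the swap looks roughly like this (the struct and field are illustrative): `parking_lot` locks are cheaper to take and don't poison, so the `.unwrap()` on every acquisition also disappears.
```rust
use parking_lot::RwLock;

struct WalShared {
    // With std::sync::RwLock, reading this would be
    // `shared.frame_cache.read().unwrap().len()`.
    frame_cache: RwLock<Vec<u64>>,
}

fn frame_count(shared: &WalShared) -> usize {
    shared.frame_cache.read().len()
}
```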
Since the page cache is now shared by default, we need to key cached pages
by page number plus something else. I chose the connection's max_frame,
because a connection keeps the same max_frame from the start of a
transaction until the end of it.
With (pgno, max_frame) key pairs we make sure each connection caches based
on the snapshot it is reading, since two different connections might be
using the same page number but at different frames. If both have the same
max_frame, they will share the same page.
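A minimal sketch of that composite key (the struct name is assumed):
```rust
/// Pages are cached per (page number, connection max_frame), so two
/// connections at the same snapshot share an entry, while a connection
/// at a newer snapshot gets its own copy of the same page number.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct PageCacheKey {
    pub pgno: usize,
    pub max_frame: u64,
}
```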
This adds an inner struct in Page, wrapped in an UnsafeCell, for access to
the page's mutable state. This is done intentionally because concurrency
control of pages is handled by the pager, not by the page itself.
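The pattern is roughly the following (a sketch; `PageInner`'s fields are illustrative):
```rust
use std::cell::UnsafeCell;

pub struct PageInner {
    pub flags: usize,
    pub contents: Option<Vec<u8>>,
}

pub struct Page {
    inner: UnsafeCell<PageInner>,
    pub id: usize,
}

// SAFETY: the pager serializes access to a page, so handing out mutable
// references from a shared &Page is sound as long as it upholds that.
unsafe impl Sync for Page {}

impl Page {
    pub fn get_contents(&self) -> &mut PageInner {
        unsafe { &mut *self.inner.get() }
    }
}
```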
Since we expect to ensure thread safety between multiple threads in the
future, we extract the parts of WAL state that need to be shared between
multiple connections.
This is WIP, so for now I put whatever seems important behind an RwLock,
but expect this to change to atomics as needed. These locks might even
disappear entirely, because they may be better served by transaction
locks.
Implemented checksums so that sqlite3 is able to read our WAL. This also
helps with future work on proper WAL recovery.
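For reference, SQLite's WAL checksum is a cumulative Fletcher-style sum over 32-bit words taken two at a time; the word byte order depends on the WAL magic (0x377f0682 for little-endian checksums, 0x377f0683 for big-endian). A sketch assuming the little-endian case:
```rust
/// Fold `data` (length must be a multiple of 8) into a running
/// checksum pair, as sqlite3's walChecksumBytes() does.
fn wal_checksum(data: &[u8], init: (u32, u32)) -> (u32, u32) {
    assert!(data.len() % 8 == 0);
    let (mut s1, mut s2) = init;
    for chunk in data.chunks_exact(8) {
        let x0 = u32::from_le_bytes(chunk[0..4].try_into().unwrap());
        let x1 = u32::from_le_bytes(chunk[4..8].try_into().unwrap());
        s1 = s1.wrapping_add(x0).wrapping_add(s2);
        s2 = s2.wrapping_add(x1).wrapping_add(s1);
    }
    (s1, s2)
}
```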
Create some frames with CREATE TABLE and kill the process so that there
is no checkpoint.
```
Limbo v0.0.6
Enter ".help" for usage hints.
limbo> create table x(x);
limbo> [1] 15910 killed cargo run xlimbo.db
```
Now sqlite3 is able to recover from this WAL created in limbo:
```
sqlite3 xlimbo.db
SQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE x (x);
```