This patch adds support for per page encryption. The code is of alpha
quality, was to test my hypothesis. All the encryption code is gated
behind a `encryption` flag. To play with it, you can do:
```sh
cargo run --features encryption -- database.db
turso> PRAGMA key='turso_test_encryption_key_123456';
turso> CREATE TABLE t(v);
```
Right now, most stuff is hard coded. We use AES GCM 256. This
information is not stored anywhere, but in future versions we will start
saving this info in the file. When writing to disk, we will generate a
cryptographically secure random salt, use that to encrypt the page. Then
we will store the authentication tag and the salt in the page itself. To
accommodate this encryption hardcodes reserved space of 28 bytes.
Once the key is set in the connection, we propagate that information to
pager and the WAL, to encrypt / decrypt when reading from disk.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2567
1. We spend a lot of time in `cell_get_raw_region` in the balancing
routine, and especially calling `contents.page_type()` there a lot, so
extract a version that can take some precomputed arguments so those
don't have to be redundantly computed multiple times for successive
calls where those values are going to be the same
2. Avoid calling `self.usable_space()` in a loop in
`insert_into_page()`.
3. Avoid accessing `pages_in_frames` lock if we're not going to modify
it
main improvement is to the "insert 100 rows" bench which ends up doing
balancing a lot:
```
Insert rows in batches/limbo_insert_1_rows
time: [22.856 µs 24.342 µs 27.496 µs]
change: [-3.3579% +15.495% +67.671%] (p = 0.62 > 0.05)
No change in performance detected.
Benchmarking Insert rows in batches/limbo_insert_10_rows: Collecting 100 samples in estim
Insert rows in batches/limbo_insert_10_rows
time: [32.196 µs 32.604 µs 32.981 µs]
change: [+1.3253% +2.9177% +4.5863%] (p = 0.00 < 0.05)
Performance has regressed.
Insert rows in batches/limbo_insert_100_rows
time: [89.425 µs 92.105 µs 96.304 µs]
change: [-18.317% -13.605% -9.1022%] (p = 0.00 < 0.05)
Performance has improved.
```
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2483
Problem:
A very easy source of bugs is to mistakenly use e.g. PageContent::read_u16()
instead of PageContent::read_u16_no_offset(). The difference between the two
is that `read_u16()` adds 100 bytes to the requested byte offset if and only if
the page in question is page 1, which contains a 100-byte database header.
Case in point: see #2491.
Observation:
In all of the cases where we want to read from or write to a page "header-sensitively",
those reads/writes are to so-called "well known offsets", e.g. specific bytes in a btree
page header.
In all other cases, the "no-offset" versions, i.e. the ones taking the absolute byte offset
as parameter, should be used.
Solution:
1. Make all the offset-sensitive versions (read_u16() and friends) private methods of
`PageContent`.
2. Expose dedicated methods for things like updating rightmost pointer, updating fragmented
bytes count and so on, and use them instead of the plain read/write methods universally.
- try_wal_watermark_read_page - try to read page from the DB with given WAL watermark value
- wal_changed_pages_after - return set of unique pages changed after watermark WAL position
This PR introduces two methods to pager. Very much inspired by
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` will give to the closure a shared and unique
reference respectively that are transmuted references from the `PageRef`
buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
let buf = self.as_ptr();
buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but also can be used on its own when there are multiple reads and writes
to the header, where putting everything in a closure would add too much
nesting.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2234