If a connection does e.g. CREATE TABLE, it will start a "child statement"
to reparse the schema. That statement does not start its own transaction,
and so should not try to end the existing one either.
We had a logic bug where these steps would happen:
- `CREATE TABLE` executed successfully
- pread fault happens inside `ParseSchema` child stmt
- `handle_program_error()` is called
- `pager.end_tx()` returns immediately because `is_nested_stmt` is true
and we correctly no-op it.
- however, crucially: `handle_program_error()` then sets tx state to None
- parent statement now catches error from nested stmt and calls
`handle_program_error()`, which calls `pager.end_tx()` again, and since
txn state is None, when it calls `rollback()` we panic on the assertion
`"dirty pages should be empty for read txn"`
Solution:
Do not do _any_ error processing in `handle_program_error()` inside a nested
stmt. This means that the parent write txn is still active when it processes
the error from the child and we avoid this panic.
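The fix can be sketched roughly as follows. This is a simplified model, not the actual code: `Connection`, `Statement`, `TxState` and the `dirty_pages` counter are hypothetical stand-ins, and only the names `handle_program_error` and `is_nested_stmt` come from the description above.

```rust
// Hypothetical, simplified model of the transaction state on a connection.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TxState {
    None,
    Write,
}

// Stand-in for the connection; `dirty_pages` mimics the pager's dirty set.
struct Connection {
    tx_state: TxState,
    dirty_pages: usize,
}

// Stand-in for a prepared statement / program.
struct Statement {
    is_nested_stmt: bool,
}

impl Statement {
    fn handle_program_error(&self, conn: &mut Connection) {
        if self.is_nested_stmt {
            // A nested stmt did not start the transaction, so it must not
            // end it or reset the tx state. The parent does the cleanup.
            return;
        }
        // Parent statement: roll back while the write txn is still active,
        // then clear the tx state.
        conn.dirty_pages = 0;
        conn.tx_state = TxState::None;
    }
}
```

With this guard, an error in the child leaves the connection in `TxState::Write`, so the parent's rollback still sees a write txn when it discards the dirty pages.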
Added `from_hex_string`, which builds an `EncryptionKey` from a
hex string. Now we can use securely generated keys, e.g. from openssl:
$ openssl rand -hex 32
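For illustration, decoding such a hex string into a 32-byte key could look roughly like this. It is a standalone sketch using only the standard library: `key_from_hex` and `hex_val` are hypothetical helpers, not the actual `from_hex_string` implementation, and the fixed 32-byte key size is an assumption based on the `openssl rand -hex 32` example.

```rust
// Hypothetical helper: map one ASCII hex digit to its 4-bit value.
fn hex_val(b: u8) -> Result<u8, String> {
    match b {
        b'0'..=b'9' => Ok(b - b'0'),
        b'a'..=b'f' => Ok(b - b'a' + 10),
        b'A'..=b'F' => Ok(b - b'A' + 10),
        _ => Err(format!("invalid hex character: {:?}", b as char)),
    }
}

// Hypothetical helper: decode 64 hex characters into a 32-byte key.
// Surrounding whitespace (e.g. the trailing newline from openssl) is ignored.
fn key_from_hex(s: &str) -> Result<[u8; 32], String> {
    let s = s.trim();
    if s.len() != 64 {
        return Err(format!("expected 64 hex characters, got {}", s.len()));
    }
    let mut key = [0u8; 32];
    for (i, pair) in s.as_bytes().chunks_exact(2).enumerate() {
        // Each pair of hex digits forms one byte: high nibble, then low nibble.
        key[i] = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
    }
    Ok(key)
}
```

Rejecting any input that is not exactly 64 hex digits keeps a truncated or mistyped key from being silently accepted.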
Depends on #2722
This builds upon the #2722 PR which let me configure the encryption
algorithm. So this adds AEGIS and uses it as default.
Note that choosing the cipher through the higher-level APIs is still not
possible. I have a follow-up PR which updates the PRAGMAs.
AEGIS is way too damn fast, here are some numbers:
```
* MACs:
aegis128x4-mac : 223.91 Gb/s
aegis128x2-mac : 270.87 Gb/s
aegis128l-mac : 229.35 Gb/s
sthash : 83.60 Gb/s
hmac-sha256 (boring): 27.46 Gb/s
blake3 : 21.41 Gb/s
* Encryption:
aegis128x4 : 104.19 Gb/s
aegis128x2 : 182.46 Gb/s
aegis128l : 181.62 Gb/s
aegis256x2 : 133.45 Gb/s
aegis256x4 : 125.23 Gb/s
aegis256 : 102.12 Gb/s
aes128-gcm (aes-gcm): 2.16 Gb/s
aes128-gcm (boring) : 63.25 Gb/s
aes256-gcm (aes-gcm): 1.70 Gb/s
aes256-gcm (boring) : 59.14 Gb/s
chacha20-poly1305 : 2.39 Gb/s
ascon128a : 5.84 Gb/s
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2742
Previously, the encryption module had a lot of things hardcoded. This
refactor makes it slightly nicer and makes it configurable.
Right now the cipher algorithm is still assumed and hardcoded. I will make
that configurable in an upcoming PR.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2722
Problems:
1. fill_cell_payload() is not re-entrant because it can yield IO
on allocating a new overflow page, resulting in losing some of the
input data.
2. fill_cell_payload() in its current form is not safe for cache spilling
because the previous overflow page in the chain of allocated overflow pages
can be evicted by a spill caused by the next overflow page allocation,
invalidating the page pointer and causing corruption.
3. fill_cell_payload() uses raw pointers and `unsafe` as a workaround from a previous time when we used to clone `WriteState`, resulting in hard-to-read code.
Solutions:
1. Introduce a new substate to the fill_cell_payload state machine to handle
re-entrancy wrt. allocating overflow pages.
2. Always pin the current overflow page so that it cannot be evicted during the
overflow chain construction. Also pin the regular page the overflow chain is
attached to, because it is immediately accessed after fill_cell_payload is done.
3. Remove all explicit usages of `unsafe` from `fill_cell_payload` (although our pager is ofc still extremely unsafe under the hood :] )
Note that solution 2 addresses a problem that arose in the development of page cache
spilling, which is not yet implemented, but will be soon.
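As a rough illustration of solution 1, the sketch below keeps the write progress inside the state machine, so a yield during page allocation loses no input. All names here (`IoResult`, `Allocator`, `FillState`, `FillCellPayload`, `step`) are hypothetical and heavily simplified. The toy allocator yields once before every allocation, and page pinning (solution 2) is not modeled.

```rust
// Result of an operation that may have to wait for IO.
enum IoResult<T> {
    Done(T),
    Yield,
}

// Toy allocator (assumption): yields once before every successful allocation.
struct Allocator {
    pending: bool,
    next_id: u32,
}

impl Allocator {
    fn allocate_page(&mut self) -> IoResult<u32> {
        if !self.pending {
            self.pending = true;
            return IoResult::Yield;
        }
        self.pending = false;
        self.next_id += 1;
        IoResult::Done(self.next_id)
    }
}

enum FillState {
    Start,
    // Substate that survives a yield: how many payload bytes are written.
    AllocatingOverflow { written: usize },
    Finished,
}

struct FillCellPayload {
    state: FillState,
    // (page id, contents) for each overflow page in the chain.
    pages: Vec<(u32, Vec<u8>)>,
}

impl FillCellPayload {
    // Call repeatedly with the same arguments until it returns `Done`.
    fn step(&mut self, payload: &[u8], page_size: usize, alloc: &mut Allocator) -> IoResult<()> {
        loop {
            match self.state {
                FillState::Start => {
                    self.state = FillState::AllocatingOverflow { written: 0 };
                }
                FillState::AllocatingOverflow { written } => {
                    if written >= payload.len() {
                        self.state = FillState::Finished;
                        continue;
                    }
                    // On yield, `written` stays in `self.state`, so re-entry
                    // resumes at the same offset instead of dropping data.
                    let id = match alloc.allocate_page() {
                        IoResult::Done(id) => id,
                        IoResult::Yield => return IoResult::Yield,
                    };
                    let end = (written + page_size).min(payload.len());
                    self.pages.push((id, payload[written..end].to_vec()));
                    self.state = FillState::AllocatingOverflow { written: end };
                }
                FillState::Finished => return IoResult::Done(()),
            }
        }
    }
}
```

The caller drives `step` in a loop and retries after each `Yield`. Because the offset lives in the state rather than in a local variable, every retry continues exactly where the previous call stopped.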
Miscellanea:
1. Renamed a bunch of variables to be clearer
2. Added more comments about what is happening in fill_cell_payload
I'm working on ANALYZE. I'm using EXPLAIN. The lack of highlighting
for them in the CLI annoyed me a bit.
I don't think there are any tests for this? I'm mostly at "it seems to
work for me". I double checked that `EXPLAIN SELECT CASE 0 WHEN 0 THEN
0 ELSE 1` syntax highlights, to make sure I didn't break the parsing of
longer statements (which I had at one point).
Closes #2741