We use relaxed ordering in a lot of places where we really need to
ensure all CPUs see the write. Switch to sequential consistency, unless
acquire/release is explicitly used. If there are places that can be
optimized, we can switch to relaxed case-by-case, but have a comment
explaning *why* it is safe.
The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
closes#1419
When submitting a `pwritev` for flushing dirty pages, in the case that
it's a commit frame, we use a new completion type which tells io_uring
to add a flag, which ensures the following:
1. If any operation in the chain fails, subsequent operations get
cancelled with -ECANCELED
2. All operations in the chain complete in order
If there is an ongoing chain of `IO_LINK`, it ends at the `fsync`
barrier, and ensures everything submitted before it has completed.
for 99% of the cases, the syscall that immediately proceeds the
`pwritev` is going to be the fsync, but just in case, this
implementation links everything that comes between the final commit
`pwritev` and the next `fsync`
In the event that we get a partial write, if it was linked, then we
submit an additional fsync after the partial write completes, with an
`IO_DRAIN` flag after forcing a `submit`, which will mean durability is
maintained, as that fsync will flush/drain everything in the squeue
before submission.
The other option in the event of partial writes on commit frames/linked
writes is to error.. not sure which is the right move here. I guess it's
possible that since the fsync completion fired, than the commit could be
over without us being durable ondisk. So maybe it's an assertion
instead? Thoughts?
Closes#2909
- otherwise, in multi-threading environment, other thread can think that completion is finished
and start execution
- this can lead to violated assertions (for example, page must be loaded, but as callback is not executed yet
assert will be fired)
Because we can abort a read_page completion, this means a page can be in
the cache but be unloaded and unlocked. However, if we do not evict that
page from the page cache, we will return an unloaded page later which
will trigger assertions later on. This is worsened by the fact that page
cache is not per `Statement`, so you can abort a completion in one
Statement, and trigger some error in the next one if we don't evict the
page in these circumstances.
Also, to propagate IO errors we need to return the Error from
IOCompletions on step.
Closes#2785
Commit ebe6aa0d28 ("adjust cfg for unix
and linux IO") adjusted the I/O conditional compilation, but forgot that
Android and iOS are also part of Unix target family.
Fixes#2500
The Unix backend is a syscall()-based, blocking implementation. The
O_NONBLOCK adds nothing.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2708
This patch adds support for per page encryption. The code is of alpha
quality, was to test my hypothesis. All the encryption code is gated
behind a `encryption` flag. To play with it, you can do:
```sh
cargo run --features encryption -- database.db
turso> PRAGMA key='turso_test_encryption_key_123456';
turso> CREATE TABLE t(v);
```
Right now, most stuff is hard coded. We use AES GCM 256. This
information is not stored anywhere, but in future versions we will start
saving this info in the file. When writing to disk, we will generate a
cryptographically secure random salt, use that to encrypt the page. Then
we will store the authentication tag and the salt in the page itself. To
accommodate this encryption hardcodes reserved space of 28 bytes.
Once the key is set in the connection, we propagate that information to
pager and the WAL, to encrypt / decrypt when reading from disk.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2567
The completion callback can be invoked only once via `OnceLock`, let's not
crash if we e.g. call `Completion::abort()` on an already finished completion.
Closes#2673
bringing #1127 back to life, except better because this doesn't add
Tokio as a dependency for extension lib just for tests.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2418