Commit Graph

299 Commits

Author SHA1 Message Date
Pekka Enberg
8337e86794 core: Use sequential consistency for atomics by default
We use relaxed ordering in a lot of places where we really need to
ensure all CPUs see the write. Switch to sequential consistency, unless
acquire/release is explicitly used. If there are places that can be
optimized, we can switch to relaxed case-by-case, but have a comment
explaning *why* it is safe.
2025-09-18 13:38:13 +03:00
pedrocarlo
3d265489dc modify semantics of busy_timeout to be more on par with sqlite 2025-09-15 02:20:32 -03:00
pedrocarlo
a56680f79e implement Busy Handler in Turso statements 2025-09-15 02:16:18 -03:00
PThorpe92
6098bca211 Handle partial writes in unix IO for pwrite and pwritev 2025-09-12 18:13:02 -04:00
Pekka Enberg
2131a04b7d core: Rename IO::run_once() to IO::step()
The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
2025-09-10 14:36:02 +03:00
PThorpe92
ccae3ab0f2 Change callsites to cancel any further IO when an error occurs and drain 2025-09-08 13:18:40 -04:00
PThorpe92
a750505762 Impl cancel and drain methods for io_uring 2025-09-08 13:18:38 -04:00
PThorpe92
02df372811 Add cancel and drain methods to IO trait 2025-09-08 13:18:03 -04:00
Pekka Enberg
6d80d862ee Merge 'io_uring: prevent out of order operations that could interfere with durability' from Preston Thorpe
closes #1419
When submitting a `pwritev` for flushing dirty pages, in the case that
it's a commit frame, we use a new completion type which tells io_uring
to add a flag, which ensures the following:
1. If any operation in the chain fails, subsequent operations get
cancelled with -ECANCELED
2. All operations in the chain complete in order
If there is an ongoing chain of `IO_LINK`, it ends at the `fsync`
barrier, and ensures everything submitted before it has completed.
for 99% of the cases, the syscall that immediately proceeds the
`pwritev` is going to be the fsync, but just in case, this
implementation links everything that comes between the final commit
`pwritev` and the next `fsync`
In the event that we get a partial write, if it was linked, then we
submit an additional fsync after the partial write completes, with an
`IO_DRAIN` flag after forcing a `submit`, which will mean durability is
maintained, as that fsync will flush/drain everything in the squeue
before submission.
The other option in the event of partial writes on commit frames/linked
writes is to error.. not sure which is the right move here. I guess it's
possible that since the fsync completion fired, than the commit could be
over without us being durable ondisk. So maybe it's an assertion
instead? Thoughts?

Closes #2909
2025-09-05 08:34:35 +03:00
Nikita Sivukhin
4a3d3b3b8c mark completion as done only after callback will be executed
- otherwise, in multi-threading environment, other thread can think that completion is finished
  and start execution
- this can lead to violated assertions (for example, page must be loaded, but as callback is not executed yet
  assert will be fired)
2025-09-04 23:48:08 +04:00
PThorpe92
d894a62132 Add plumbing in io_uring to handle linked writes to ensure consistency 2025-09-03 16:01:16 -04:00
PThorpe92
3831218f6c Add linked to completion types in io/mod 2025-09-03 16:01:16 -04:00
PThorpe92
c5b6df4249 Use mutex in place of spinlock for io_uring 2025-09-03 11:12:33 -04:00
PThorpe92
30454336a6 Make io_uring sound for connections across multiple threads 2025-09-03 10:54:42 -04:00
Pekka Enberg
87d3f74e6e Merge 'Evict page from cache if page is unlocked and unloaded' from Pedro Muniz
Because we can abort a read_page completion, this means a page can be in
the cache but be unloaded and unlocked. However, if we do not evict that
page from the page cache, we will return an unloaded page later which
will trigger assertions later on. This is worsened by the fact that page
cache is not per `Statement`, so you can abort a completion in one
Statement, and trigger some error in the next one if we don't evict the
page in these circumstances.
Also, to propagate IO errors we need to return the Error from
IOCompletions on step.

Closes #2785
2025-09-02 09:08:12 +03:00
pedrocarlo
53cfae1db4 return Error from step if IO failed 2025-09-01 11:10:39 -03:00
PThorpe92
a0e5536360 Fix clippy warnings and remove self casts 2025-08-28 09:45:19 -04:00
PThorpe92
0a56d23402 Use u64 for file offsets in IO and calculate such offsets in u64 2025-08-28 09:44:00 -04:00
Pekka Enberg
bf7f80a937 core/io: Switch Unix I/O to use libc::pwrite()
We use libc elsewhere for fault injection reasons, so let's do this
call-site too.
2025-08-27 17:56:23 +03:00
PThorpe92
8c64b772e7 Use previous WindowsIO impl as generic IO 2025-08-25 19:04:14 -04:00
PThorpe92
177c717f25 Remove windows IO in place of Generic IO 2025-08-25 18:47:21 -04:00
Pekka Enberg
5fe5e1548b core/io: Fix build on Android and iOS
Commit ebe6aa0d28 ("adjust cfg for unix
and linux IO") adjusted the I/O conditional compilation, but forgot that
Android and iOS are also part of Unix target family.

Fixes #2500
2025-08-25 11:21:46 +03:00
rajajisai
9068a29380 Use unsafe block 2025-08-24 18:56:05 -04:00
rajajisai
84d20ba60f Use F_FULLSYNC in darwin based operating systems 2025-08-24 18:45:46 -04:00
Jussi Saurio
14873c76fb unixio: use Mutex::lock() instead of Mutex::try_lock()
we should wait to obtain the lock, not immediately fail if we cant.
2025-08-22 10:47:50 +03:00
Pekka Enberg
ae8b1eb00d Merge 'core/io: Don't open file as non-blocking in Unix backend' from Pekka Enberg
The Unix backend is a syscall()-based, blocking implementation. The
O_NONBLOCK adds nothing.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2708
2025-08-21 19:13:39 +03:00
Pekka Enberg
ce8b4c20f6 core/io: Don't open file as non-blocking in Unix backend
The Unix backend is a syscall()-based, blocking implementation. The
O_NONBLOCK adds nothing.
2025-08-21 14:43:00 +03:00
Nikita Sivukhin
f99843cc9e fix windows io 2025-08-21 14:57:07 +04:00
Nikita Sivukhin
c771487933 add remove_file method to the IO 2025-08-21 14:51:02 +04:00
Pekka Enberg
9233f48e08 core/io: Switch Unix I/O operations to use libc
We need it for LD_PRELOAD fault injection to work.
2025-08-20 13:43:47 +03:00
Pekka Enberg
c2208a542a Merge 'Initial pass to support per page encryption' from Avinash Sajjanshetty
This patch adds support for per page encryption. The code is of alpha
quality, was to test my hypothesis. All the encryption code is gated
behind a `encryption` flag. To play with it, you can do:
```sh
cargo run --features encryption -- database.db

turso> PRAGMA key='turso_test_encryption_key_123456';

turso> CREATE TABLE t(v);
```
Right now, most stuff is hard coded. We use AES GCM 256. This
information is not stored anywhere, but in future versions we will start
saving this info in the file. When writing to disk, we will generate a
cryptographically secure random salt, use that to encrypt the page. Then
we will store the authentication tag and the salt in the page itself. To
accommodate this encryption hardcodes reserved space of 28 bytes.
Once the key is set in the connection, we propagate that information to
pager and the WAL, to encrypt / decrypt when reading from disk.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2567
2025-08-20 11:11:24 +03:00
Avinash Sajjanshetty
bd9b4bbfd2 encrypt/decrypt when writing/reading from DB 2025-08-20 11:47:23 +05:30
pedrocarlo
f27d4d14f2 remove polling code in UnixIO so we can implement it correctly later and so we do not fool ourselves that we have any async code there that actually runs 2025-08-20 01:36:08 -03:00
Jussi Saurio
b5439dd068 Remove assertions from Completion::complete() and Completion::error()
The completion callback can be invoked only once via `OnceLock`, let's not
crash if we e.g. call `Completion::abort()` on an already finished completion.

Closes #2673
2025-08-19 22:02:02 +03:00
pedrocarlo
66171527b4 thread safely store the result of completion 2025-08-19 10:48:21 -03:00
pedrocarlo
de1811dea7 abort completions on error 2025-08-19 10:48:21 -03:00
pedrocarlo
4dca1c00db fix merge conflict 2025-08-19 10:48:21 -03:00
pedrocarlo
ab3b68e360 change completion callbacks to take a Result param + create separate functions to declare a completion errored 2025-08-19 10:48:21 -03:00
pedrocarlo
71ca221390 clippy 2025-08-19 10:48:21 -03:00
pedrocarlo
2d6fad5ea3 nit: adjust order of struct completions 2025-08-19 10:48:21 -03:00
pedrocarlo
fadf78fe67 use a dedicated Error enum for Completion Error 2025-08-19 10:48:21 -03:00
pedrocarlo
7bc0545442 default impl for get_memory_io 2025-08-19 10:48:21 -03:00
pedrocarlo
d5a59c6bee default impl for generate_random_number 2025-08-19 10:48:21 -03:00
pedrocarlo
f72bcbc5da default impl for wait_for_completion + check for errors in completion there 2025-08-19 10:48:21 -03:00
pedrocarlo
002390b5a5 store error inside Completion 2025-08-19 10:48:21 -03:00
pedrocarlo
d0c13f0104 remove IOError from Parser + store only ErrorKind in LimboError 2025-08-19 10:48:21 -03:00
PThorpe92
d3d01cefc8 Add to_system_time for our io::clock::Instant type 2025-08-18 19:27:37 -04:00
PThorpe92
cc2fed3297 Remove copy_to API from file IO trait 2025-08-14 21:31:13 -04:00
PThorpe92
55f09a01c4 Update copy_to method in file trait to separate source and destination IO 2025-08-14 21:31:13 -04:00
Preston Thorpe
5cea0f572e Merge 'Revive async io extension PR' from Preston Thorpe
bringing #1127 back to life, except better because this doesn't add
Tokio as a dependency for extension lib just for tests.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2418
2025-08-14 16:10:09 -04:00