Commit Graph

294 Commits

Author SHA1 Message Date
Bob Peterson
74ef9ad5ca Drop weak in TursoRwLock::read's compare_exchange
compare_exchange_weak can spuriously fail, which Miri duly exercises,
causing a read deadlock.
2025-10-13 14:54:16 -05:00
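A minimal sketch of why the strong CAS matters here, assuming a simple reader-count lock (names hypothetical, not the actual TursoRwLock layout): `compare_exchange_weak` may fail even when the current value matches, so a single, non-retrying read attempt must use `compare_exchange`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical reader-count lock; u32::MAX marks an exclusive writer.
struct SketchRwLock {
    readers: AtomicU32,
}

impl SketchRwLock {
    fn try_read(&self) -> bool {
        let cur = self.readers.load(Ordering::SeqCst);
        if cur == u32::MAX {
            return false; // writer holds the lock
        }
        // Strong CAS: fails only if another thread actually changed `readers`.
        // With `compare_exchange_weak`, a spurious failure here would be
        // reported as "lock busy" and could deadlock a caller that never retries.
        self.readers
            .compare_exchange(cur, cur + 1, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}

fn main() {
    let lock = SketchRwLock { readers: AtomicU32::new(0) };
    assert!(lock.try_read());
}
```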
Pekka Enberg
a72b07e949 Merge 'Fix VDBE program abort' from Nikita Sivukhin
This PR adds proper program abort in case of unfinished statement reset
and interruption.
It also makes rollback methods non-failing, because otherwise the
situation for their callers is unclear (if a rollback fails, what is the
state of the statement/connection/transaction?).

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3591
2025-10-07 09:07:07 +03:00
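A hedged sketch of the API-shape change described above (names hypothetical): rollback no longer returns a Result, so callers always know the transaction ends up closed.

```rust
/// Hypothetical transaction type illustrating the non-failing rollback shape.
struct Transaction {
    open: bool,
}

impl Transaction {
    // Before: `fn rollback(&mut self) -> Result<(), Error>` left callers
    // guessing about the statement/connection/transaction state on error.

    // After: rollback is infallible; the transaction is always left closed.
    fn rollback(&mut self) {
        self.open = false;
    }
}

fn main() {
    let mut tx = Transaction { open: true };
    tx.rollback();
    assert!(!tx.open);
}
```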
pedrocarlo
5a7390735d rename Completion functions 2025-10-06 11:07:06 -03:00
Nikita Sivukhin
8dae601fac make rollback non-failing method 2025-10-06 13:21:45 +04:00
Pekka Enberg
be6f3d09ea core/storage: Switch checkpoint_inner() to completion group 2025-10-06 07:33:31 +03:00
pedrocarlo
f3dc0bef5d remove some explicit Arc<dyn File> references 2025-10-03 16:39:57 -03:00
Pere Diaz Bou
8f103f7c35 core/wal: introduce transaction_count, same as iChange in sqlite 2025-10-03 13:02:47 +02:00
Pere Diaz Bou
b5a969933c core/wal: remove dbg! 2025-10-03 12:17:35 +02:00
Pere Diaz Bou
fe29fcbb09 core/wal: update checkpoint_seq and last_checkpoint on begin_read_tx 2025-10-01 16:17:40 +02:00
Pere Diaz Bou
e84f960516 core/wal: check index header on begin_write_tx 2025-10-01 16:03:17 +02:00
Pekka Enberg
d3abeb6281 core/storage: Wrap WalFile::{max,min}_frame with AtomicU64 2025-09-28 16:47:54 +03:00
Pekka Enberg
aba596441c core/storage: Wrap WalFile::max_frame_read_lock_index with AtomicUsize 2025-09-28 13:42:32 +03:00
Pekka Enberg
8d9d2dad1d core/storage: Wrap WalFile::syncing with AtomicBool 2025-09-27 14:07:26 +03:00
Pekka Enberg
aa454a6637 core: Wrap Connection::pager in RwLock 2025-09-22 17:02:08 +03:00
Pekka Enberg
372daef656 core: Wrap Pager::io_ctx in RwLock 2025-09-22 15:00:29 +03:00
Nikita Sivukhin
b106220743 The main thread in the browser can't park, so we use parking_lot in spin-lock style for that target 2025-09-19 13:21:00 +04:00
Samuel Marks
e333f151ba [*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing") 2025-09-18 22:47:43 -05:00
Pekka Enberg
8337e86794 core: Use sequential consistency for atomics by default
We use relaxed ordering in a lot of places where we really need to
ensure all CPUs see the write. Switch to sequential consistency, unless
acquire/release is explicitly used. If there are places that can be
optimized, we can switch to relaxed case-by-case, with a comment
explaining *why* it is safe.
2025-09-18 13:38:13 +03:00
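A small sketch of that policy, assuming a hypothetical WAL counter field: SeqCst by default, Relaxed only where a comment justifies it.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical struct; the real WalFile fields differ.
struct WalCounters {
    max_frame: AtomicU64,
}

impl WalCounters {
    fn publish_max_frame(&self, frame: u64) {
        // Default: sequential consistency, so every CPU observes the write.
        self.max_frame.store(frame, Ordering::SeqCst);
    }

    fn approximate_max_frame(&self) -> u64 {
        // Relaxed only case-by-case, with a comment explaining why it is safe;
        // here, a purely advisory read used for logging/metrics.
        self.max_frame.load(Ordering::Relaxed)
    }
}

fn main() {
    let counters = WalCounters { max_frame: AtomicU64::new(0) };
    counters.publish_max_frame(42);
    assert_eq!(counters.approximate_max_frame(), 42);
}
```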
Nikita Sivukhin
ed819c9865 Merge branch 'main' into more-async 2025-09-18 10:48:54 +04:00
Pekka Enberg
d2376a239a Merge 'core/mvcc: introduce with_header for MVCC header update tracking' from Pere Diaz Bou
Currently header changes are tracked through the pager by reading page 1.
MVCC has its own layer to track changes during a txn, so this commit
makes headers tracked by each txn separately.
On commit we update the _global_ header, which is used to update
`database_size` because pager commits require it to be up to date. This
also makes it _simpler_ to keep track of header updates and update the
pager's header accordingly.
This PR is needed to make the logical log work, because we want to rely
on the pager as little as possible!

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3156
2025-09-18 08:13:14 +03:00
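A minimal sketch of per-transaction header tracking, assuming hypothetical names; the real MVCC layer is considerably more involved.

```rust
/// Hypothetical, stripped-down database header.
#[derive(Clone, Copy, Default)]
struct DbHeader {
    database_size: u64,
}

/// Hypothetical MVCC transaction tracking its own header changes.
struct MvccTx {
    local_header: Option<DbHeader>,
}

impl MvccTx {
    /// Mutate the transaction-local header, initializing it from the global
    /// header on first use.
    fn with_header<F: FnOnce(&mut DbHeader)>(&mut self, global: &DbHeader, f: F) {
        let header = self.local_header.get_or_insert(*global);
        f(header);
    }

    /// On commit, fold the local header back into the global one so the pager
    /// sees an up-to-date `database_size`.
    fn commit(self, global: &mut DbHeader) {
        if let Some(local) = self.local_header {
            *global = local;
        }
    }
}

fn main() {
    let mut global = DbHeader::default();
    let mut tx = MvccTx { local_header: None };
    tx.with_header(&global, |h| h.database_size += 1);
    tx.commit(&mut global);
    assert_eq!(global.database_size, 1);
}
```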
Nikita Sivukhin
c1176356f7 small fixes 2025-09-17 19:20:42 +04:00
Nikita Sivukhin
d16d86b85d fix blocking ensure_header_if_needed implementation 2025-09-17 19:09:55 +04:00
Nikita Sivukhin
5c4d8aa10b fix bug after making checkpoint async 2025-09-17 19:09:54 +04:00
Nikita Sivukhin
b2afdb8d29 fix comment 2025-09-17 19:06:07 +04:00
Nikita Sivukhin
2c09d17dfe make checkpoint async 2025-09-17 19:06:07 +04:00
Nikita Sivukhin
9cc1d0fcc2 make append_frames fully async and non-blocking 2025-09-17 19:06:06 +04:00
Jussi Saurio
8bf52de94b Merge 'Remove serialization of normal write/commit path' from Preston Thorpe
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3089
2025-09-17 17:30:45 +03:00
Pere Diaz Bou
64616dc2ca core/mvcc: introduce with_header for MVCC header update tracking
Currently header changes are tracked through the pager by reading page 1.
MVCC has its own layer to track changes during a txn, so this commit
makes headers tracked by each txn separately.

On commit we update the _global_ header, which is used to update
`database_size` because pager commits require it to be up to date. This
also makes it _simpler_ to keep track of header updates and update the
pager's header accordingly.
2025-09-17 11:42:44 +02:00
Jussi Saurio
dc103da2ed Remove LimboResult
This is only used for returning LimboResult::Busy, and we already
have LimboError::Busy, so it only adds confusion.

Moreover, the current busy handler was not handling LimboError::Busy,
because it's returned as an error, not as Ok. So this may fix the
"busy handler not working" issue in the perf throughput benchmark.
2025-09-17 11:04:44 +03:00
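A sketch of the failure mode described above, with hypothetical names: a busy handler that only inspects the Ok path never sees Busy, because Busy arrives as an Err.

```rust
#[derive(Debug)]
enum LimboError {
    Busy,
}

fn step() -> Result<(), LimboError> {
    // Stand-in for a VDBE step that hits a locked database.
    Err(LimboError::Busy)
}

fn step_with_busy_handler(mut retries: u32) -> Result<(), LimboError> {
    loop {
        match step() {
            Ok(()) => return Ok(()),
            // Busy comes back as an Err, not as an Ok variant, so the busy
            // handler has to match on the error path or it never runs.
            Err(LimboError::Busy) if retries > 0 => retries -= 1,
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // With the lock never released, the handler retries and then gives up.
    assert!(matches!(step_with_busy_handler(3), Err(LimboError::Busy)));
}
```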
Jussi Saurio
32cd01a615 fix deadlock 2025-09-15 14:48:26 +03:00
PThorpe92
703cb4a70f Link all writes to the fsync barrier, not just the commit frame 2025-09-14 10:39:52 -04:00
Avinash Sajjanshetty
4b59cf19e5 use checksums when reading/writing from wal 2025-09-13 11:00:39 +05:30
PThorpe92
b04c364981 Fix clippy error 2025-09-12 11:43:38 -04:00
PThorpe92
7a14c7394f Remove the header copy stored on the WalFile, fix fast_path 2025-09-12 11:29:43 -04:00
PThorpe92
25e7c719f1 Update checkpoint_seq on each checkpoint, not just when log restarts
This was causing checkpoint_seq to be 0 when we had already successfully
run a passive checkpoint, causing us to use incorrect pages from the
cache.
2025-09-12 11:29:42 -04:00
Pere Diaz Bou
9b6d181be4 wal: add hacky update max frame for mvcc use
When multiple tx writes happen concurrently in MVCC, the max frame will
be updated. From the point of view of the other transaction, this new
max_frame makes it return Busy because its current WAL snapshot is
outdated.
2025-09-12 13:49:14 +00:00
PThorpe92
f60ca3970f Remove old comment from wal 2025-09-12 06:39:59 -04:00
PThorpe92
faf3531a4e Fix checkpoint fast-path, don't use cached pages w/o write lock
closes #3024
Also, we snapshot the page when we determine that it is eligible, paying
a memcpy instead of a read from disk; this further prevents any in-memory
changes to the page (TOCTOU issues).
2025-09-12 06:38:02 -04:00
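A sketch of that fast-path rule under hypothetical names: only take a cached page's contents when the write lock is held, snapshotting it with one memcpy; otherwise fall back to the disk read.

```rust
/// Hypothetical helper: returns a snapshot of the cached page if it is
/// eligible, or None to signal "read this frame from disk instead".
fn frame_source(cached: Option<&[u8]>, holds_write_lock: bool) -> Option<Vec<u8>> {
    match cached {
        // Snapshot the eligible page: one memcpy up front instead of a disk
        // read, and later in-memory changes (TOCTOU) can no longer affect us.
        Some(buf) if holds_write_lock => Some(buf.to_vec()),
        // Without the write lock the cached copy may change under us.
        _ => None,
    }
}

fn main() {
    let page = [1u8, 2, 3];
    assert!(frame_source(Some(&page), true).is_some());
    assert!(frame_source(Some(&page), false).is_none()); // fall back to disk
}
```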
Pekka Enberg
7d8a1a0d5f Merge 'whopper: A new DST with concurrency' from Pekka Enberg
Our simulator is currently limited to a concurrency of one. This
introduces a much less sophisticated DST with a focus on finding
concurrency bugs.

Closes #2985
2025-09-11 18:42:45 +03:00
Jussi Saurio
c30d320cab Fix: read transaction cannot be allowed to start with a stale max frame
If both of the following are true:

1. All read locks are already held
2. The highest readmark of any read lock is less than the committed max frame

Then we must return Busy to the reader, because otherwise they would begin a
transaction with a stale local max frame, and thus not see some committed
changes.
2025-09-11 15:58:13 +03:00
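A sketch of the stale-readmark check described above, with hypothetical names and a simplified lock model.

```rust
/// Returns whether a read transaction may start; false means return Busy.
fn can_begin_read_tx(read_marks: &[u64], all_slots_held: bool, committed_max_frame: u64) -> bool {
    if !all_slots_held {
        return true; // a free slot can be set to the committed max frame
    }
    // All read locks are held: we may only piggy-back on an existing readmark
    // if it is not behind the committed max frame; otherwise the reader would
    // start with a stale local max frame and miss committed changes.
    read_marks.iter().any(|&mark| mark >= committed_max_frame)
}

fn main() {
    // Every slot is taken and every readmark is stale -> the reader gets Busy.
    assert!(!can_begin_read_tx(&[3, 5], true, 7));
    // One readmark already matches the committed max frame -> reads may start.
    assert!(can_begin_read_tx(&[3, 7], true, 7));
}
```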
Pekka Enberg
ca51a60b3c core/storage: Demote restart_log() logging to debug 2025-09-11 08:35:18 +03:00
PThorpe92
2f4f67efa8 Remove some unused attributes 2025-09-09 16:17:49 -04:00
PThorpe92
02bebf02a5 Remove read_entire_wal_dumb in favor of reading chunks 2025-09-09 16:06:27 -04:00
PThorpe92
37ec77eec2 Fix read_entire_wal_dumb to prefer streaming read if over 32mb wal file 2025-09-09 13:12:58 -04:00
Pekka Enberg
6d80d862ee Merge 'io_uring: prevent out of order operations that could interfere with durability' from Preston Thorpe
closes #1419
When submitting a `pwritev` for flushing dirty pages, in the case that
it's a commit frame, we use a new completion type which tells io_uring
to add a flag, which ensures the following:
1. If any operation in the chain fails, subsequent operations get
cancelled with -ECANCELED
2. All operations in the chain complete in order
If there is an ongoing chain of `IO_LINK`, it ends at the `fsync`
barrier, and ensures everything submitted before it has completed.
For 99% of the cases, the syscall that immediately follows the
`pwritev` is going to be the fsync, but just in case, this
implementation links everything that comes between the final commit
`pwritev` and the next `fsync`.
In the event that we get a partial write, if it was linked, then we
submit an additional fsync after the partial write completes, with an
`IO_DRAIN` flag after forcing a `submit`, which will mean durability is
maintained, as that fsync will flush/drain everything in the squeue
before submission.
The other option in the event of partial writes on commit frames/linked
writes is to error out; not sure which is the right move here. I guess it's
possible that since the fsync completion fired, the commit could be
over without us being durable on disk. So maybe it's an assertion
instead? Thoughts?

Closes #2909
2025-09-05 08:34:35 +03:00
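A Linux-only sketch of the linking idea, assuming the `io-uring` crate as a dependency (e.g. `io-uring = "0.7"`); this is not the project's actual completion code, it uses a single `Write` instead of `pwritev`, and buffer/offset handling is simplified.

```rust
use std::fs::OpenOptions;
use std::os::unix::io::AsRawFd;

use io_uring::{opcode, squeue::Flags, types, IoUring};

fn main() -> std::io::Result<()> {
    let file = OpenOptions::new()
        .create(true)
        .write(true)
        .open("/tmp/wal-link-sketch")?;
    let fd = types::Fd(file.as_raw_fd());
    let buf = [0u8; 4096];

    let mut ring = IoUring::new(8)?;

    // IO_LINK: if this write fails, the linked fsync is cancelled (-ECANCELED),
    // and the two operations complete in order.
    let write = opcode::Write::new(fd, buf.as_ptr(), buf.len() as u32)
        .offset(0)
        .build()
        .flags(Flags::IO_LINK)
        .user_data(1);
    // The fsync terminates the link chain and acts as the durability barrier.
    let fsync = opcode::Fsync::new(fd).build().user_data(2);

    unsafe {
        let mut sq = ring.submission();
        sq.push(&write).expect("submission queue full");
        sq.push(&fsync).expect("submission queue full");
    }
    ring.submit_and_wait(2)?;

    for cqe in ring.completion() {
        println!("user_data={} result={}", cqe.user_data(), cqe.result());
    }
    Ok(())
}
```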
Pekka Enberg
5950003eaf core: Simplify WalFileShared life cycle
Create one WalFileShared for a Database and update its state
accordingly. Also support the case where the WAL is disabled.
2025-09-04 21:09:12 +03:00
PThorpe92
e3f366963d Compute the final db page or make the commit frame submit a linked pwritev completion 2025-09-03 16:01:16 -04:00
Pekka Enberg
87d3f74e6e Merge 'Evict page from cache if page is unlocked and unloaded' from Pedro Muniz
Because we can abort a read_page completion, this means a page can be in
the cache but be unloaded and unlocked. However, if we do not evict that
page from the page cache, we will return an unloaded page later, which
will trigger assertions. This is worsened by the fact that the page
cache is not per `Statement`, so you can abort a completion in one
Statement, and trigger some error in the next one if we don't evict the
page in these circumstances.
Also, to propagate IO errors we need to return the Error from
IOCompletions on step.

Closes #2785
2025-09-02 09:08:12 +03:00
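A sketch of that eviction rule under hypothetical types; the real page cache keys pages differently and handles locking and completions elsewhere.

```rust
use std::collections::HashMap;

struct Page {
    loaded: bool,
    locked: bool,
}

struct PageCache {
    pages: HashMap<u64, Page>,
}

impl PageCache {
    /// Return a cached page, evicting it instead if an aborted read_page
    /// completion left it unloaded and unlocked.
    fn get(&mut self, page_no: u64) -> Option<&Page> {
        let evict = matches!(
            self.pages.get(&page_no),
            Some(page) if !page.loaded && !page.locked
        );
        if evict {
            self.pages.remove(&page_no);
            return None; // caller must re-read the page from disk
        }
        self.pages.get(&page_no)
    }
}

fn main() {
    let mut cache = PageCache { pages: HashMap::new() };
    cache.pages.insert(1, Page { loaded: false, locked: false });
    assert!(cache.get(1).is_none()); // aborted read: entry evicted
    assert!(!cache.pages.contains_key(&1));
}
```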
Pekka Enberg
d959319b42 Merge 'Use u64 for file offsets in I/O and calculate such offsets in u64' from Preston Thorpe
Using `usize` to compute file offsets caps us at 4 GiB on 32-bit
systems. For example, with 4 KiB pages we can only address up to 1048576
pages; attempting the next page overflows a 32-bit usize and can wrap
the write offset, corrupting data. Switching our I/O APIs and offset
math to u64 avoids this overflow on 32-bit targets.

Closes #2791
2025-09-02 09:06:49 +03:00
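A sketch of the offset math change: widen to u64 before multiplying so a large page number cannot overflow on 32-bit targets (the 1-based page numbering mirrors SQLite; the function name is hypothetical).

```rust
/// Compute a file offset for a 1-based page number entirely in u64.
fn page_offset(page_no: u32, page_size: u32) -> u64 {
    // (page_no - 1) * page_size in a 32-bit usize wraps once the file grows
    // past 4 GiB; widening to u64 first keeps the arithmetic exact.
    (u64::from(page_no) - 1) * u64::from(page_size)
}

fn main() {
    // Page 2,000,000 with 4 KiB pages sits well past the 32-bit limit.
    assert_eq!(page_offset(2_000_000, 4096), 8_191_995_904);
}
```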
pedrocarlo
4618df9d1a Because we can abort a read_page completion, a page can be in the cache but be unloaded and unlocked. However, if we do not evict that page from the page cache, we will return an unloaded page later 2025-09-01 11:10:39 -03:00