Commit Graph

1030 Commits

Author SHA1 Message Date
pedrocarlo
1abe8fd70c state machine seek_to_last 2025-07-31 11:51:17 -03:00
pedrocarlo
543cdb3e2c underscoring completions and IOResult to avoid warning messages 2025-07-31 11:51:17 -03:00
pedrocarlo
6bfba2518e state machine for move_to_rightmost 2025-07-31 11:49:12 -03:00
pedrocarlo
966b96882e move_to_root should return completion 2025-07-31 11:49:12 -03:00
pedrocarlo
cf951e24cd add state machine for is_empty_table in preparation for IO Completion refactor 2025-07-31 11:49:12 -03:00
pedrocarlo
7012860800 create separate state machines file 2025-07-31 11:49:12 -03:00
Jussi Saurio
eeceefe49d Merge 'fix/wal: only rollback WAL if txn was write + fix start state for WalFile' from Jussi Saurio
Closes #2363
## What
The following sequence of actions is possible:
```
Some committed frames already exist in the WAL. shared.pages_in_frames.len() > 0.

Brand new connection does this:
BEGIN
^-- deferred, no read tx started yet, so its `self.start_pages_in_frames` is `0`
       because it's a brand new WalFile instance

ROLLBACK   <-- calls `wal.rollback()` and truncates `shared.pages_in_frames` to length `0`

PRAGMA wal_checkpoint();
^-- because `pages_in_frames` is empty, it doesn't actually
checkpoint anything but still sets shared.max_frame to 0, effectively causing data loss
```
## Fix
- Only call `wal.rollback()` for write transactions
- Set `start_pages_in_frames` correctly so that this doesn't happen even
if a regression starts calling `wal.rollback()` again
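
The two fixes above can be sketched with a toy model. This is only an illustration under stated assumptions: `SharedWal`, `TxState`, and `end_tx_with_rollback` are hypothetical names invented for the sketch, and the real `WalFile` tracks far more state than a single length.

```rust
/// Toy stand-in for the WAL state shared by all connections (hypothetical type).
#[derive(Debug, Default)]
struct SharedWal {
    pages_in_frames: Vec<u32>,
}

/// Toy stand-in for a per-connection WAL handle.
#[derive(Debug)]
struct WalFile {
    start_pages_in_frames: usize,
}

impl WalFile {
    /// Second fix: start from the current shared length rather than 0, so a
    /// stray rollback on a fresh connection cannot drop committed frames.
    fn new(shared: &SharedWal) -> Self {
        WalFile {
            start_pages_in_frames: shared.pages_in_frames.len(),
        }
    }

    /// Discard everything appended since this connection's starting point.
    fn rollback(&self, shared: &mut SharedWal) {
        shared.pages_in_frames.truncate(self.start_pages_in_frames);
    }
}

/// Hypothetical transaction states for the sketch.
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq)]
enum TxState {
    None,
    Read,
    Write,
}

/// First fix: only a write transaction may roll back the WAL.
fn end_tx_with_rollback(wal: &WalFile, shared: &mut SharedWal, tx: TxState) {
    if tx == TxState::Write {
        wal.rollback(shared);
    }
}
```

In this model, a deferred `BEGIN` followed by `ROLLBACK` ends in `TxState::None`, so the committed entries in `pages_in_frames` are left alone. Even a direct `rollback()` call truncates only to the length recorded at construction.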

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2366
2025-07-31 16:16:20 +03:00
Jussi Saurio
62e804480e fix/wal: make db_changed check detect cases where max frame happens to be the same 2025-07-31 14:37:33 +03:00
Jussi Saurio
e88707c6fd fix/wal: only rollback WAL if txn was write 2025-07-31 14:18:43 +03:00
Jussi Saurio
39dec647a7 fix/wal: reset page cache when another connection checkpointed in between 2025-07-31 12:44:22 +03:00
Jussi Saurio
7d082ab614 small fix after header accessor refactor 2025-07-31 10:05:52 +03:00
Jussi Saurio
f619556344 Merge 'Direct DatabaseHeader reads and writes – with_header and with_header_mut' from Levy A.
This PR introduces two methods on the pager, closely modeled on
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` pass the closure a shared and a unique
reference, respectively, each transmuted from the `PageRef`
buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
    let buf = self.as_ptr();
    buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but can also be used on their own when there are multiple reads and writes
to the header, where putting everything in a closure would add too much
nesting.
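
The closure-based access pattern described above might look roughly like the following. This is a minimal sketch, not the PR's implementation: it assumes a plain two-field header stored in a `RefCell`, whereas the PR hands out references transmuted from the `PageRef` buffer. The field names and the `Pager` layout are hypothetical.

```rust
use std::cell::RefCell;

/// Hypothetical, heavily reduced header; the real one has many more fields.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DatabaseHeader {
    page_size: u16,
    change_counter: u32,
}

/// Hypothetical pager that owns the header directly instead of a page buffer.
struct Pager {
    header: RefCell<DatabaseHeader>,
}

impl Pager {
    /// Run `f` with a shared reference to the header and return its result.
    fn with_header<T>(&self, f: impl FnOnce(&DatabaseHeader) -> T) -> T {
        let guard = self.header.borrow();
        f(&*guard)
    }

    /// Run `f` with a unique reference to the header and return its result.
    fn with_header_mut<T>(&self, f: impl FnOnce(&mut DatabaseHeader) -> T) -> T {
        let mut guard = self.header.borrow_mut();
        f(&mut *guard)
    }
}
```

A caller would write `pager.with_header(|h| h.page_size)` to read a field and `pager.with_header_mut(|h| h.change_counter += 1)` to update one. The borrow ends when the closure returns.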

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2234
2025-07-31 10:02:47 +03:00
Jussi Saurio
62d79e8c16 Merge 'refactor/btree: simplify get_next_record()/get_prev_record()' from Jussi Saurio
When traversing, we are only interested in the following things:
- Is the page a leaf or not
- Is the page an index or table page
- If not a leaf, what is the left child page
This means we don't have to read the entire cell, just the left child
page.
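
As a rough illustration of why the full cell is not needed, here is a sketch based on the SQLite file format: the page-type flag byte identifies leaf vs. interior and index vs. table, and an interior cell begins with a 4-byte big-endian left child page number. The function and type names are hypothetical and are not taken from the refactored code.

```rust
/// The four b-tree page kinds, identified by the flag byte in the page header.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PageType {
    InteriorIndex,
    InteriorTable,
    LeafIndex,
    LeafTable,
}

impl PageType {
    /// Leaf pages have no children to descend into.
    fn is_leaf(self) -> bool {
        matches!(self, PageType::LeafIndex | PageType::LeafTable)
    }

    /// Index pages, as opposed to table pages.
    fn is_index(self) -> bool {
        matches!(self, PageType::InteriorIndex | PageType::LeafIndex)
    }
}

/// Decode the page-type flag byte (2, 5, 10, or 13 in the SQLite format).
fn page_type(flag: u8) -> Option<PageType> {
    match flag {
        2 => Some(PageType::InteriorIndex),
        5 => Some(PageType::InteriorTable),
        10 => Some(PageType::LeafIndex),
        13 => Some(PageType::LeafTable),
        _ => None,
    }
}

/// Read only the left child page number from the start of an interior cell,
/// ignoring the key or payload that follows it.
fn left_child_page(cell: &[u8]) -> Option<u32> {
    let b = cell.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}
```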

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2317
2025-07-31 10:02:08 +03:00
PThorpe92
2e741641e6 Add test to assert we are backfilling all the rows properly with vectored writes 2025-07-30 19:42:54 -04:00
PThorpe92
ade1c182de Add is_full method to checkpoint batch 2025-07-30 19:42:54 -04:00
PThorpe92
693b71449e Clean up writev batching and apply suggestions 2025-07-30 19:42:53 -04:00
PThorpe92
ef69df7258 Apply review suggestions 2025-07-30 19:42:53 -04:00
PThorpe92
73882b97d6 Remove unnecessary collecting CQEs into an array in run_once, comments 2025-07-30 19:42:53 -04:00
PThorpe92
efcffd380d Clean up io_uring writev implementation, add iovec and cqe cache 2025-07-30 19:42:52 -04:00
PThorpe92
b8e6cd5ae2 Fix taking page content from cached pages in checkpoint loop 2025-07-30 19:42:51 -04:00
PThorpe92
b04128b585 Fix write_pages_vectored to properly track completion 2025-07-30 19:42:50 -04:00
PThorpe92
0f94cdef03 Fix io_uring pwritev to properly handle partial writes 2025-07-30 19:42:50 -04:00
PThorpe92
88445328a5 Handle partial writes for pwritev calls in io_uring and fix JS bindings 2025-07-30 19:42:50 -04:00
PThorpe92
62f004c898 Fix write counter for writev batching in checkpoint 2025-07-30 19:42:49 -04:00
PThorpe92
7b2163208b batch backfilling pages when checkpointing 2025-07-30 19:42:48 -04:00
Levy A.
2bde1dbd42 fix: PageSize bounds check 2025-07-30 17:33:59 -03:00
Levy A.
fe66c61ff5 add usable_space to DatabaseHeader
we already have the `DatabaseHeader`, we don't need the cached result
2025-07-30 17:33:59 -03:00
Levy A.
e35fdb8263 feat: zero-copy DatabaseHeader 2025-07-30 17:33:59 -03:00
Jussi Saurio
ac8a123e38 refactor/btree: simplify get_next_record()/get_prev_record()
When traversing, we are only interested in the following things:

- Is the page a leaf or not
- Is the page an index or table page
- If not a leaf, what is the left child page

This means we don't have to read the entire cell, just the left child
page.
2025-07-30 21:29:14 +03:00
Jussi Saurio
7240d7903c fmt 2025-07-30 18:22:17 +03:00
Jussi Saurio
c00d1fcfc0 fmt 2025-07-30 17:21:29 +03:00
Jussi Saurio
66c4b44c55 pager: call rollback() after ending txn so that read lock info is not lost when ending txn 2025-07-30 17:21:19 +03:00
Jussi Saurio
7b1f04dc5e pager: only ROLLBACK your own transaction, not if someone else is writing 2025-07-30 17:00:38 +03:00
Jussi Saurio
b1aa13375d call pager.end_tx() everywhere instead of pager.rollback() 2025-07-30 16:39:38 +03:00
Jussi Saurio
975b7b5434 wal: fix test incorrect expectation 2025-07-30 15:53:13 +03:00
Jussi Saurio
af660326d8 finish_append_frames_commit: revert bumping readmark incorrectly 2025-07-30 15:53:01 +03:00
Jussi Saurio
43d1321033 ignore completion result of self.read_frame 2025-07-30 14:58:03 +03:00
Jussi Saurio
9a63425b43 clippy 2025-07-30 14:58:03 +03:00
Jussi Saurio
772b71963e finish_append_frames_commit: properly increase readmark on commit 2025-07-30 14:58:03 +03:00
Jussi Saurio
1562c1df10 begin_read_tx: better assertion failure message 2025-07-30 14:58:03 +03:00
PThorpe92
4dc15492d8 Integrate changes from tx isolation commits from @jussisaurio 2025-07-30 14:10:12 +03:00
PThorpe92
2c3a9fe5ef Finish wal transaction handling and add more wal and chkpt testing 2025-07-30 14:10:10 +03:00
PThorpe92
8806b77d26 Clear snapshot and readmark/lock index flags on failure 2025-07-30 14:09:18 +03:00
PThorpe92
d702e6a80c Polish checkpointing and fix tests, add documentation 2025-07-30 14:08:53 +03:00
PThorpe92
8ec99a9143 Remove assert for !NO_LOCK_HELD, properly handle writing header if reset 2025-07-30 14:08:51 +03:00
PThorpe92
529cc14e29 Fix wal tests remove unwrap from previous Result return val 2025-07-30 14:08:33 +03:00
PThorpe92
7640535ba4 Fix transaction read0 shortcut in WAL and track whether we have snapshot 2025-07-30 14:08:33 +03:00
PThorpe92
ff1987a45c Temporarily remove optimization for new read tx to grab read mark 0 and skip db file 2025-07-30 14:08:33 +03:00
PThorpe92
318bfa9590 Change incorrect comments and rename guard 2025-07-30 14:08:33 +03:00
PThorpe92
1490a586b1 Apply suggestions/fixes and add extensive comments to wal chkpt 2025-07-30 14:08:33 +03:00