turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-28 13:34:24 +01:00

Author	SHA1	Message	Date
Jussi Saurio	64c8587f27	Merge 'IO More State Machine' from Pedro Muniz I swear we just need one more state machine. Just more state machine until we achieve IO tracking. (PS: this is a meme) Closes #2462	2025-08-06 21:26:19 +03:00
pedrocarlo	7b746ccc65	adjust state machine for `ptrmap_get`	2025-08-06 11:32:21 -03:00
pedrocarlo	b529305b82	state machine for `ptrmap_put`	2025-08-06 11:32:21 -03:00
pedrocarlo	931384afb6	state machine fix for btree create for AutoVacuum::Full	2025-08-06 11:32:21 -03:00
pedrocarlo	f656d0bc20	header ref state machine	2025-08-06 11:32:21 -03:00
Jussi Saurio	c8d2a1a480	btree: add a few more assertions about balance state	2025-08-06 13:39:20 +03:00
Jussi Saurio	a86a0e194d	refactor/btree: cleanup write/delete/balancing states Problem: Currently `WriteState` "owns" the balancing state machine, even though a separate `DeleteState` can also trigger balancing, which results in awkward back-and-forth switching between `CursorState::Write` and `CursorState::Delete` during balancing. Fix: 1. Extract `balance_state` as a separate state machine, since its state transitions are exactly the same regardless of whether an insert or a delete triggered the balancing. 2. This allows to remove the different 'Balance-xxx' variants from `WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as those states become just simple enums now. Each of them now has a state called `Balancing` which just delegates work to the balancing state machine. 3. This further allows us to remove the awkward switching between `CursorState::Delete` and `CursorState::Write` during a balance that happens as a result of a deletion.	2025-08-06 13:37:35 +03:00
Jussi Saurio	5f3cfaac60	refactor/btree: don't clone WriteState in balance_non_root()	2025-08-06 11:30:09 +03:00
Jussi Saurio	a15d7dd2e7	refactor/btree: don't clone WriteState in balance()	2025-08-06 11:30:09 +03:00
Jussi Saurio	1c1f55fdfb	refactor/btree: remove cloning of WriteState in insert_into_page()	2025-08-06 08:50:56 +03:00
Jussi Saurio	c3a32b63bf	refactor/btree: remove unnecessary ref of self in overwrite_content()	2025-08-06 08:45:34 +03:00
Jussi Saurio	6dd08c21e4	refactor/btree: remove unnecessary mut ref of self in rowid()	2025-08-06 08:44:52 +03:00
Jussi Saurio	839d428e36	core/btree: fix re-entrancy bug in insert_into_page() We currently clone WriteState on every iteration of `insert_into_page()`, presumably for Borrow Checker Reasons (tm). There was a bug in `WriteState::Insert` handling where if `fill_cell_payload()` returned IO, the `fill_cell_payload_state` was not updated in `write_info.state`, leading to an infinite loop of allocating new pages. This bug was surfaced by, but not caused by, #2400.	2025-08-06 08:01:49 +03:00
PThorpe92	f6a68cffc2	Remove RefCell from IO and Page apis	2025-08-05 16:24:49 -04:00
PThorpe92	914c10e095	Remove Clone impl for Buffer and PageContent	2025-08-05 14:26:53 -04:00
Jussi Saurio	cde8567b1d	Merge 'More state machine + Return IO in places where completions are created' from Pedro Muniz In preparation for tracking IO Completions, we need to start to return IO in places where completions are created. Doing some more plumbing now to avoid bigger PRs for the future Closes #2438	2025-08-05 15:47:51 +03:00
Jussi Saurio	a28e64bfdd	cleanup: remove unused page uptodate flag	2025-08-05 14:25:42 +03:00
Pekka Enberg	d2fea25fef	Merge 'perf/btree: implement fast algorithm for defragment_page' from Jussi Saurio Implement sqlite's fast path defragment algorithm. This path is taken when: 1. There are 1-2 freeblocks 2. There are at most `max_frag_bytes` fragmented free bytes (-1..=4) Instead of reconstructing the entire page, it merges the two freeblocks and then moves the merged freeblock to the left, effectively turning it into free space in the unallocated region, instead of a freeblock. `max_frag_bytes` is particularly important when inserting a new cell, because if the page contains (in total) ~just enough space for the new cell, then there can be hardly any fragmented free space because otherwise, merging the 1-2 freeblocks won't produce enough contiguous free space to fit the cell. ## Benchmark ```sql Insert rows in batches/limbo_insert_1_rows time: [26.692 µs 27.153 µs 27.695 µs] change: [-9.9033% -2.9097% +1.6336%] (p = 0.55 > 0.05) No change in performance detected. Insert rows in batches/limbo_insert_10_rows time: [38.618 µs 40.022 µs 42.201 µs] change: [-8.9137% -6.6405% -4.2299%] (p = 0.00 < 0.05) Performance has improved. Insert rows in batches/limbo_insert_100_rows time: [168.94 µs 169.58 µs 170.31 µs] change: [-22.520% -17.669% -12.790%] (p = 0.00 < 0.05) Performance has improved. ``` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2411	2025-08-05 12:44:48 +03:00
Pekka Enberg	e355fc4c65	Merge 'core/mvcc: implement seeking operations with rowid' from Pere Diaz Bou Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2429	2025-08-05 12:40:48 +03:00
Jussi Saurio	ad35cf07eb	Add extra illustrative doodle for pere	2025-08-05 11:24:15 +03:00
Jussi Saurio	a5330aa6fb	perf/btree: implement fast algorithm for defragment_page	2025-08-05 11:24:14 +03:00
Jussi Saurio	5b84ad6b0f	Merge 'Update defragment page to defragment in-place' from João Severo Change original code from doing a full copy of the original buffer to modify the buffer in-place using a temporary vector with offsets. Closes #2258	2025-08-05 11:22:22 +03:00
pedrocarlo	aa8d17cbf1	state machine for `ptrmap_get`	2025-08-05 01:38:42 -03:00
pedrocarlo	0ac040cc87	return IO in some other functions in Pager	2025-08-04 23:28:57 -03:00
pedrocarlo	a4a2425ffd	return IO in places where completions are created	2025-08-04 23:28:57 -03:00
Jussi Saurio	a66b56678d	Merge 'Reprepare Statements when Schema changes' from Pedro Muniz Closes #1967 To support this I had to change how we did `epilogue` similarly to how SQLite does it. SQLite first declares a `beginWriteOperation` when some statement is going to necessitate a Write Transaction. And as we now need to pass the current schema cookie to `epilogue` it was easier to call epilogue only in one location (like we do with prologue), and just have each statement declare their intentions separately. This allows us to not have to pass the Schema around just to do the epilogue. I believe this is something that @jussisaurio would be interested in. ~Also had to disable the MVCC test, as it was extremely buggy for me.~ Just disabled reprepare statements for MVCC Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2214	2025-08-05 00:01:14 +03:00
Jussi Saurio	1e59165ea6	Merge 'More State Machines in preparation for tracking IO Completions' from Pedro Muniz More changes. I want to avoid big PRs, so doing these changes in small increments. I think in like 2 PRs after this one, I will be able make the change effectively. Closes #2400	2025-08-05 00:00:09 +03:00
Jussi Saurio	13219dbf87	Merge 'extend raw WAL API with few more methods' from Nikita Sivukhin This PR extends raw WAL API with few methods which will be helpful for offline-sync: 1. `try_wal_watermark_read_page` - try to read page from the DB with given WAL watermark value\ * Usually, WAL max_frame is set automatically to the latest value (`shared.max_frame`) when transaction is started and then this "watermark" is preserved throughout whole transaction * New method allows to simulate "read from the past" by controlling frame watermark explicitly * There is an alternative to implement some API like `start_read_session(frame_watermark: u64)` - but I decided to expose just single method to simplify the logic and reduce "surface" of actions which can be executed in this "controllable" manner * Also, for simplicity, now `try_wal_watermark_read_page` always read data from disk and bypass any cached values (and also do not populate the cache) 2. `wal_changed_pages_after` - return set of unique pages changed after watermark WAL position in the current WAL session With these 2 methods we can implement `REVERT frame_watermark` logic which will just fetch all changed pages first, and then revert them to the previous value by using `try_wal_watermark_read_page` and `wal_insert_frame` methods (see `test_wal_api_revert_pages` test). Note, that if there were schema changes - than `REVERT` logic described above can bring connection to the inconsistent state, as it will preserve schema information in memory and will still think that table exist (while it can be reverted). This should be considered by any consumer of this new methods. Closes #2433	2025-08-04 23:53:46 +03:00
PThorpe92	6cbc8ff868	Replace values with constants	2025-08-04 15:14:06 -04:00
PThorpe92	73d1fdef14	Fix and change bitmap, apply suggestions and add some optimizations	2025-08-04 14:58:58 -04:00
PThorpe92	f4197f1eb5	change debug assertions to turso asserts	2025-08-04 14:55:48 -04:00
PThorpe92	54696d2f0d	Add additional test for edge cases	2025-08-04 14:55:48 -04:00
PThorpe92	5378195ad6	Add page bitmap to storage mod.rs	2025-08-04 14:55:48 -04:00
PThorpe92	3e30335ea5	Add tests for PageBitmap	2025-08-04 14:55:48 -04:00
PThorpe92	7b1f908c00	Add PageBitmap for use with arena page allocator	2025-08-04 14:55:48 -04:00
pedrocarlo	f2d84a534c	adjust `clear_overflow_pages`	2025-08-04 15:28:06 -03:00
pedrocarlo	718ad5e7fd	`btree_destroy` return IO	2025-08-04 14:12:51 -03:00
pedrocarlo	e0978844e6	adjust `integrity_check`	2025-08-04 14:12:50 -03:00
Jussi Saurio	7045d44fdc	Merge 'fix/wal: remove start_pages_in_frames_hack to prevent checkpoint data loss' from Jussi Saurio Closes #2421 ## Background We have some kind of transaction-local hack (`start_pages_in_frames`) for bookkeeping how many pages are currently in the in-memory WAL frame cache, I assume for performance reasons or whatever. `wal.rollback()` clears all the frames from `shared.frame_cache` that the rollbacking tx is allowed to clear, and then truncates `shared.pages_in_frames` to however much its local `start_pages_in_frames` value was. ## Problem In `complete_append_frame`, we check if `frame_cache` has that key (page) already, and if not, we add it to `pages_in_frames`. However, `wal.rollback()` never _removes_ the key (page) if its value is empty, so we can end up in a scenario where the `frame_cache` key for `page P` exists but has no frames, and so `page P` does not get added to `pages_in_frames` in `complete_append_frame`. This leads to a checkpoint data loss scenario: - transaction rolls back, has start_pages_in_frames=0, so truncates shared pages_in_frames to an empty vec. let's say `page P` key in `frame_cache` still remains but it has no frames. - The next time someone commits a frame for `page P`, it does NOT get added to `pages_in_frames` because `frame_cache` has that key (although the value vector is empty) - At some point, a checkpoint checkpoints `n` frames, but since `pages_in_frames` does not have `page P`, it doesn't actually checkpoint it and all the "checkpointed" frames are simply thrown away - very similar to the scenario in #2366 ## Fix Remove the `start_pages_in_frames` hack entirely and just make `pages_in_frames` effectively the same as `frame_cache.keys`. I think we could also just get rid of `pages_in_frames` and just use `frame_cache.contains_key(p)` but maybe Pere can chime in here Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2422	2025-08-04 19:49:55 +03:00
pedrocarlo	aa05616845	fix tests	2025-08-04 13:08:30 -03:00
pedrocarlo	5f52d9b6b4	state machine for `count`	2025-08-04 13:00:43 -03:00
pedrocarlo	1585d5cbee	state machine for 'next' and `prev`	2025-08-04 13:00:43 -03:00
pedrocarlo	f1df9a909e	state machine for 'rewind'	2025-08-04 12:59:52 -03:00
pedrocarlo	266a7e1c66	do not error in `op_transaction` if page 1 was not allocated	2025-08-04 12:32:34 -03:00
Nikita Sivukhin	9694366645	add one more assert	2025-08-04 17:23:34 +04:00
Nikita Sivukhin	76bdf0c1ab	small fixes	2025-08-04 17:02:53 +04:00
Nikita Sivukhin	2e23230e79	extend raw WAL API with few more methods - try_wal_watermark_read_page - try to read page from the DB with given WAL watermark value - wal_changed_pages_after - return set of unique pages changed after watermark WAL position	2025-08-04 16:55:50 +04:00
Pere Diaz Bou	662da34e7d	core/mvcc: implement seeking operations with rowid	2025-08-04 13:52:54 +02:00
Pekka Enberg	e4accdc29d	Merge 'hide dangerous methods behind conn_raw_api feature' from Nikita Sivukhin WAL API shouldn't be exposed by default because this is a relatively dangerous API which we use internally and ordinary users shouldn't be interested in it. Reviewed-by: Pekka Enberg <penberg@iki.fi> Closes #2424	2025-08-04 14:52:40 +03:00
Pere Diaz Bou	f26e442597	core/mvcc: fix new rowid next rowid was being tracked globally for all tables and restarted to 0 every time database was opened	2025-08-04 12:31:17 +02:00
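
The checkpoint data loss scenario described in 7045d44fdc (Closes #2422) can be reproduced with a toy model. The sketch below is not the project's code: the field names `frame_cache` and `pages_in_frames` come from the commit message, while the types, the method names, and the simplification that a rollback clears every frame are assumptions made for illustration. The "fixed" rollback is just one possible way to keep `pages_in_frames` equal to the keys of `frame_cache`, which is the invariant the commit message asks for.

```rust
use std::collections::HashMap;

/// Toy model of the shared WAL bookkeeping (types are assumptions).
#[derive(Default)]
struct SharedWal {
    /// page id -> ids of the frames holding a version of that page
    frame_cache: HashMap<u64, Vec<u64>>,
    /// pages a checkpoint will look at
    pages_in_frames: Vec<u64>,
}

impl SharedWal {
    /// As described in the commit message: a page is recorded in
    /// `pages_in_frames` only when its key is new in `frame_cache`.
    fn append_frame(&mut self, page: u64, frame: u64) {
        if !self.frame_cache.contains_key(&page) {
            self.pages_in_frames.push(page);
        }
        self.frame_cache.entry(page).or_default().push(frame);
    }

    /// Buggy rollback: frames are dropped but the empty key stays, and
    /// `pages_in_frames` is truncated to the transaction-local start value.
    fn rollback_buggy(&mut self, start_pages_in_frames: usize) {
        for frames in self.frame_cache.values_mut() {
            frames.clear();
        }
        self.pages_in_frames.truncate(start_pages_in_frames);
    }

    /// Hypothetical fix: drop empty keys too, so that `pages_in_frames`
    /// always mirrors the keys of `frame_cache`.
    fn rollback_fixed(&mut self) {
        for frames in self.frame_cache.values_mut() {
            frames.clear();
        }
        self.frame_cache.retain(|_, frames| !frames.is_empty());
        let cache = &self.frame_cache;
        self.pages_in_frames.retain(|page| cache.contains_key(page));
    }
}

/// Runs the scenario from the commit message and reports whether a
/// checkpoint would still see page P after a rollback and a new commit.
fn page_visible_to_checkpoint(fixed: bool) -> bool {
    const P: u64 = 7;
    let mut wal = SharedWal::default();

    // Transaction 1 writes page P and rolls back (start_pages_in_frames = 0).
    wal.append_frame(P, 1);
    if fixed {
        wal.rollback_fixed();
    } else {
        wal.rollback_buggy(0);
    }

    // Transaction 2 commits a new frame for page P.
    wal.append_frame(P, 2);
    wal.pages_in_frames.contains(&P)
}
```

Calling `page_visible_to_checkpoint(false)` models the stale empty key hiding page P from the checkpoint, while `page_visible_to_checkpoint(true)` models the behaviour once the key and the page list are kept in sync.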


1102 Commits