turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-16 13:34:33 +01:00

Author	SHA1	Message	Date
Nikita Sivukhin	05931f70ce	add optional upper_bound_inclusive parameter to some checkpoint modes - will be used in sync-engine protocol	2025-08-21 14:12:11 +04:00
Preston Thorpe	306bc7e264	Merge 'Improve WAL checkpointing performance' from Preston Thorpe

### General idea:
(outside of other optimizations made mostly around concurrency): When checkpointing, use pages from the PageCache if we can determine that they are exactly the page/frame that we want.

e.g. if the frame_cache has an entry: `Page ID: 104 -> Frame ID's: [1001, 1002]` and the OngoingCheckpoint has min_frame of 999 and max_frame of 1020, we should be able to check the PageCache and see if it has page 104, and only if it is tagged with frame_id = 1002, can we use that page to backfill the DB file.

Since using a cached page during checkpoint is purely an optimization, we can be conservative in terms of when we accept that a cached page is valid to use. I came up with a `wal_tag` which is the frame_id + checkpoint_seq, which is set only in the two following places:

1. When explicitly reading a frame from the WAL. (inside Wall::read_frame) - read_frame is perhaps the most obvious path of ensuring it's the exact page + frame combination that we want.
2. When appending a frame to the log during the normal process of writing (during `[Pager::cacheflush]`) - cacheflush calls append_frame, and inside the Completion, the dirty flag is cleared, and the wal_tag flag is set to the frame_id.

Inside `finish_read_page` (which is called for every page we read from either the DB file or WAL), the `wal_tag` is cleared along with the `dirty` flag, so that any re-used `PageRef's` don't contain wal_tag's from any previous or stale pages.

#### Proposal:
(In order to merge and simultaneously be able to sleep at night) there is this debug assertion:

```rust
#[cfg(debug_assertions)]
{
    let mut raw = vec![0u8; self.page_size() as usize + WAL_FRAME_HEADER_SIZE];
    self.io.wait_for_completion(self.read_frame_raw(target_frame, &mut raw)?)?;
    let (_, wal_page) = sqlite3_ondisk::parse_wal_frame_header(&raw);
    let cached = cached_page.get_contents().buffer.as_slice();
    // while being horrible for performance, we can ensure that the bytes are identical
    // when using the cached page vs what we would otherwise have read from disk.
    turso_assert!(wal_page == cached, "cache fast-path returned wrong content for page {page_id} frame {target_frame}");
}
```

Performance
=====================================
Average latency for a checkpoint on my local machine:

#### Before: `7-12ms`
#### After: `2-5ms`

Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes #2568	2025-08-20 18:57:14 -04:00
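The `wal_tag` check described in this merge message can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `WalTag`, `CachedPage`, `target_frame`, and `can_use_cached` are hypothetical names, treating `min_frame` and `max_frame` as inclusive bounds is an assumption, and rejecting dirty pages is an extra conservative assumption.

```rust
/// Hypothetical tag: the WAL frame a cached page corresponds to, plus the
/// checkpoint sequence that was current when the tag was set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WalTag {
    frame_id: u64,
    checkpoint_seq: u32,
}

/// Hypothetical, stripped-down view of a page held in the page cache.
struct CachedPage {
    dirty: bool,
    wal_tag: Option<WalTag>,
}

/// Newest frame for a page that falls inside the checkpoint range.
/// Assumption: both bounds are inclusive.
fn target_frame(frames: &[u64], min_frame: u64, max_frame: u64) -> Option<u64> {
    frames
        .iter()
        .copied()
        .filter(|f| (min_frame..=max_frame).contains(f))
        .max()
}

/// Accept the cached page only when its tag matches the exact frame and
/// checkpoint sequence we would otherwise read from the WAL.
fn can_use_cached(page: &CachedPage, target: u64, checkpoint_seq: u32) -> bool {
    !page.dirty
        && page.wal_tag
            == Some(WalTag {
                frame_id: target,
                checkpoint_seq,
            })
}
```

With the numbers from the message (frames `[1001, 1002]` for page 104, range 999 to 1020), `target_frame` selects 1002, and a cached copy tagged with frame 1001 is rejected. Because the fast path is only an optimization, any mismatch simply falls back to reading the frame from the WAL.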
Preston Thorpe	a943dd9dc7	Merge 'Fix: normalize table name in DELETE' from Jussi Saurio Closes #2696 Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2697	2025-08-20 18:56:27 -04:00
PThorpe92	4a2da6c262	Remove assertion for checkpoint seq in favor of selectively using cached pages	2025-08-20 18:26:55 -04:00
PThorpe92	7082086061	Remove ENV var and enable cache by default, track which pages were cached	2025-08-20 17:42:17 -04:00
PThorpe92	345b80d14c	Change env var to ENABLE instead of DISABLE so its disabled by default	2025-08-20 17:36:00 -04:00
PThorpe92	51e4cd0f1d	Add debug assertion for cached pages used during checkpoint	2025-08-20 17:35:59 -04:00
PThorpe92	e28a38abc5	Fix wal tag safety issues, and add debug assertion that we are reading the proper frames	2025-08-20 17:28:48 -04:00
PThorpe92	4100737358	remove page entries without frames in frame cache in WAL rollback method	2025-08-20 17:28:19 -04:00
PThorpe92	d2c3ba14c8	Remove inefficient vec in WAL for tracking pages present in frame cache	2025-08-20 17:28:18 -04:00
PThorpe92	d6d72d2966	Update Page to carry epoch of frame + checkpoint seq to ensure proper cached page for chkpt	2025-08-20 17:28:17 -04:00
PThorpe92	00f2a0f216	Performance improvements to checkpointing. prevent serializing I/O	2025-08-20 17:26:54 -04:00
PThorpe92	fe7a5e98b8	Track frame_ids on PageInner and use the page cache for reading pages to checkpoint	2025-08-20 17:24:10 -04:00
Jussi Saurio	b0b66114c3	Fix: normalize table name in DELETE	2025-08-21 00:03:52 +03:00
Pekka Enberg	9b22026eda	Merge 'Add libc fault injection to Antithesis' from Pekka Enberg Fixes #2644 Reviewed-by: Preston Thorpe <preston@turso.tech> Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2647	2025-08-20 18:13:32 +03:00
Pekka Enberg	1dc6fb97c0	Merge 'core/mvcc: store txid in conn and reset transaction state on commit ' from Pere Diaz Bou We were storing `txid` in `ProgramState`, this meant it was impossible to track interactive transactions. This was extracted to `Connection` instead. Moreover, transaction state for mvcc now is reset on commit. Closes #2689	2025-08-20 16:51:41 +03:00
Pekka Enberg	72a5de3551	Merge 'core/mvcc: support for MVCC' from Pere Diaz Bou This PR tries to add simple support for delete, with limited testing for now. Moreover, there was an error with `forward`, which wasn't obvious without delete, which didn't skip deleted rows. Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2672	2025-08-20 16:51:31 +03:00
Pere Diaz Bou	a4d282874f	Merge 'core/mvcc: start first rowid at 1' from Pere Diaz Bou Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2688	2025-08-20 12:51:54 +02:00
Pekka Enberg	9233f48e08	core/io: Switch Unix I/O operations to use libc We need it for LD_PRELOAD fault injection to work.	2025-08-20 13:43:47 +03:00
Pere Diaz Bou	ccbbe0a6b3	clippy	2025-08-20 12:41:27 +02:00
Pere Diaz Bou	636a3e76e6	clippy mvcc tests	2025-08-20 12:34:11 +02:00
Pere Diaz Bou	9e3b7b0c98	core/mvcc: store txid in conn and reset transaction state on commit	2025-08-20 12:23:28 +02:00
Pere Diaz Bou	ffaf8580e0	mvcc/core: simple interactive transaction tests for mvcc	2025-08-20 12:22:31 +02:00
Pere Diaz Bou	3927aa037c	core/mvcc: start first rowid at 1	2025-08-20 11:22:51 +02:00
Pekka Enberg	9998834d3d	Merge 'Fix column nullability constraint' from Closes #2553 . Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2681	2025-08-20 11:24:21 +03:00
Jussi Saurio	e5f04ae100	Merge 'refactor/vdbe: move insert-related seeking to VDBE from BTreeCursor' from Jussi Saurio This gets rid of `InsertState` in `BTreeCursor` plus the `moved_before` parameter to `BTreeCursor::insert` -- instead, seek logic is now in the existing state machines for `op_insert` and `op_idx_insert` Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2639	2025-08-20 11:15:09 +03:00
Pekka Enberg	c2208a542a	Merge 'Initial pass to support per page encryption' from Avinash Sajjanshetty

This patch adds support for per page encryption. The code is of alpha quality and was written to test my hypothesis. All the encryption code is gated behind an `encryption` flag. To play with it, you can do:

```sh
cargo run --features encryption -- database.db
turso> PRAGMA key='turso_test_encryption_key_123456';
turso> CREATE TABLE t(v);
```

Right now, most stuff is hard coded. We use AES GCM 256. This information is not stored anywhere, but in future versions we will start saving this info in the file.

When writing to disk, we will generate a cryptographically secure random salt and use that to encrypt the page. Then we will store the authentication tag and the salt in the page itself. To accommodate this, encryption hardcodes a reserved space of 28 bytes.

Once the key is set in the connection, we propagate that information to the pager and the WAL, to encrypt / decrypt when reading from disk.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2567	2025-08-20 11:11:24 +03:00
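The 28 reserved bytes match the usual AES-GCM sizes: a 12-byte nonce (called a salt in the merge message) plus a 16-byte authentication tag. The sketch below only illustrates how such a page could be split into its parts. The function and struct names are hypothetical, and placing the tag before the nonce at the end of the page is an assumption; the message does not say how the reserved space is ordered.

```rust
/// Reserved bytes per page, as stated in the commit message.
const RESERVED_SPACE: usize = 28;
/// Standard AES-GCM nonce length (96 bits).
const NONCE_SIZE: usize = 12;
/// Standard AES-GCM authentication tag length (128 bits).
const TAG_SIZE: usize = 16;

// Compile-time check that nonce + tag exactly fill the reserved space.
const _: () = assert!(NONCE_SIZE + TAG_SIZE == RESERVED_SPACE);

/// Hypothetical view over one encrypted page.
struct EncryptedPageParts<'a> {
    ciphertext: &'a [u8],
    tag: &'a [u8],
    nonce: &'a [u8],
}

/// Split a page into ciphertext, tag, and nonce.
/// Assumption: the reserved area sits at the end of the page, tag first.
fn split_encrypted_page(page: &[u8]) -> Option<EncryptedPageParts<'_>> {
    if page.len() <= RESERVED_SPACE {
        return None; // no room for any payload
    }
    let (ciphertext, reserved) = page.split_at(page.len() - RESERVED_SPACE);
    let (tag, nonce) = reserved.split_at(TAG_SIZE);
    Some(EncryptedPageParts {
        ciphertext,
        tag,
        nonce,
    })
}
```

Under these assumptions, a 4096-byte page leaves 4068 bytes for encrypted content.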
Avinash Sajjanshetty	40a209c000	simplify feature flag usage for encryption	2025-08-20 12:49:38 +05:30
Avinash Sajjanshetty	bd9b4bbfd2	encrypt/decrypt when writing/reading from DB	2025-08-20 11:47:23 +05:30
Avinash Sajjanshetty	657daeded3	encrypt/decrypt when writing/reading from WAL	2025-08-20 11:44:08 +05:30
Avinash Sajjanshetty	201262b3dd	Update `DatabaseStorage` to pass encryption context	2025-08-20 11:41:08 +05:30
Avinash Sajjanshetty	94d38be1a2	Set reserved_space to 28 for encrypted databases We will use this space to store nonce and tag	2025-08-20 11:39:09 +05:30
Avinash Sajjanshetty	a6e9237c94	Set encryption key in pager and WAL	2025-08-20 11:39:09 +05:30
Avinash Sajjanshetty	93774ffc3b	Add `PRAGMA key` to set the encryption key If set, set the key for the connection	2025-08-20 11:39:07 +05:30
Avinash Sajjanshetty	100a0d8e97	Add encryption module Let's add an encryption module, hard coded to use AES 256 GCM. Other required parameters are also hard coded and will be made configurable in future PRs. The module is behind an `encryption` feature flag.	2025-08-20 11:38:11 +05:30
pedrocarlo	d61d6c0872	when `run_once` fails we abort the current IOCompletions	2025-08-20 01:36:08 -03:00
pedrocarlo	f27d4d14f2	remove polling code in UnixIO so we can implement it correctly later and so we do not fool ourselves that we have any async code there that actually runs	2025-08-20 01:36:08 -03:00
pedrocarlo	7e98a464a7	check if completion finished instead of completed for step	2025-08-20 00:38:16 -03:00
rajajisai	89cd3fe196	`notnull` is now set based on the `nullable` field instead of being hardcoded.	2025-08-19 21:49:04 -04:00
pedrocarlo	46c756b130	clear locked on pages when completion errors	2025-08-19 17:29:57 -03:00
Jussi Saurio	b5439dd068	Remove assertions from Completion::complete() and Completion::error() The completion callback can be invoked only once via `OnceLock`, let's not crash if we e.g. call `Completion::abort()` on an already finished completion. Closes #2673	2025-08-19 22:02:02 +03:00
Pere Diaz Bou	73b19c66e4	core/mvcc: fix forward with deleted non-visible rows We need to make sure we skip rows that are not visible; if a row is not visible, it means we deleted it, so we need to skip it.	2025-08-19 19:49:40 +02:00
Pere Diaz Bou	c8f59a352b	core/mvcc: test delete	2025-08-19 19:48:51 +02:00
Pere Diaz Bou	4314bc13e6	core/mvcc: delete support	2025-08-19 19:48:36 +02:00
Jussi Saurio	a82930d641	Merge 'Completion Error' from Pedro Muniz Completions can now carry errors inside of them. This allows us to wait for a completion to complete or to error. When it errors we can properly tell the caller of `wait_for_completion` that we errored. This will also allow us to abort completions. Currently, this just creates the scaffold for us to store the error in the completion. But to correctly achieve this, it will require some refactor of our IO implementations to store the `run_once` error for a particular completion inside of it instead of short circuiting. This would also allow us to check for an error in `program.step` and properly rollback. Also creates default impls for some common IO methods; this is especially important for `wait_for_completion`, as we want to check the error in the `Completion` before returning `Ok`. Maybe we could also accept a Result type in the completion callback so that we can execute some sort of compensating action on error, like unlocking a page so it can be evicted by the page cache later. EDIT: actually implemented this in this PR. We store a `Result` object inside `CompletionInner` behind a `OnceLock` for thread-safety. We also pass a result object to Completion callbacks to execute compensating actions. Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2589	2025-08-19 19:07:57 +03:00
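The idea of storing a `Result` behind a `OnceLock` and handing it to the callback can be sketched as below. This is a simplified illustration, not the project's actual types: `CompletionError`, the `usize` payload, the `Callback` alias, and the boolean return values are all assumptions made for the example. Only `std::sync::OnceLock` and `Arc` are real standard-library APIs here.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical error type carried by a completion.
#[derive(Debug, Clone, PartialEq)]
struct CompletionError(String);

/// Callback receives the final result so it can run a compensating action
/// on error (for example, unlocking a page).
type Callback = Box<dyn Fn(&Result<usize, CompletionError>) + Send + Sync>;

struct CompletionInner {
    // Set at most once; later attempts are ignored instead of panicking.
    result: OnceLock<Result<usize, CompletionError>>,
    callback: Callback,
}

#[derive(Clone)]
struct Completion(Arc<CompletionInner>);

impl Completion {
    fn new(callback: Callback) -> Self {
        Completion(Arc::new(CompletionInner {
            result: OnceLock::new(),
            callback,
        }))
    }

    /// Store the result and invoke the callback, but only for the first caller.
    /// Returns false if the completion had already finished.
    fn finish(&self, result: Result<usize, CompletionError>) -> bool {
        if self.0.result.set(result).is_ok() {
            let stored = self.0.result.get().expect("result was just set");
            (self.0.callback)(stored);
            true
        } else {
            false
        }
    }

    fn complete(&self, bytes: usize) -> bool {
        self.finish(Ok(bytes))
    }

    fn error(&self, err: CompletionError) -> bool {
        self.finish(Err(err))
    }

    /// None while the operation is still pending.
    fn result(&self) -> Option<&Result<usize, CompletionError>> {
        self.0.result.get()
    }
}
```

Because `OnceLock::set` succeeds only once, calling `error` or `complete` on an already finished completion is a harmless no-op in this sketch, which is the behavior the later commit b5439dd068 describes.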
Jussi Saurio	c2855cb0db	refactor/idx_insert: move seeking to VDBE instead of BTreeCursor Also removes `InsertState` and `moved_before` since neither are needed anymore.	2025-08-19 19:04:42 +03:00
Jussi Saurio	d191c7d98b	refactor/insert: move seeking to VDBE instead of BTreeCursor	2025-08-19 19:04:20 +03:00
pedrocarlo	66171527b4	thread safely store the result of completion	2025-08-19 10:48:21 -03:00
pedrocarlo	de1811dea7	abort completions on error	2025-08-19 10:48:21 -03:00
pedrocarlo	4dca1c00db	fix merge conflict	2025-08-19 10:48:21 -03:00


4350 Commits