turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-26 03:14:23 +01:00

Author	SHA1	Message	Date
Pekka Enberg	247d4c06c6	Merge 'Fix MVCC update' from Jussi Saurio Based on #3126 Closes #3029 Closes #3030 Closes #3065 Closes #3083 Closes #3084 Closes #3085 simple reason why mvcc update didn't work: it didn't try to update. Closes #3127	2025-09-15 14:24:59 +03:00
Jussi Saurio	59f18e2dc8	fix mvcc update simple reason why mvcc update didn't work: it didn't try to update.	2025-09-15 11:27:56 +03:00
Nikita Sivukhin	3bcac441e4	reduce log level of some very frequent logs	2025-09-15 11:35:41 +04:00
Jussi Saurio	db3428a7a9	remove unused pager parameter	2025-09-14 23:44:24 +03:00
Avinash Sajjanshetty	c2c1ec2dba	Use `usable_space()` instead of hardcoding the value	2025-09-13 11:00:38 +05:30
Preston Thorpe	b1420904bb	Merge 'fix(btree): advance cursor after interior node replacement in delete' from Jussi Saurio ## Problem When a delete replaces an index interior cell, the replacement key is LT the deleted key. Currently on the main branch, after the deletion happens, the following call to BTreeCursor::next() stops at the replaced interior cell. This is incorrect - imagine the following sequence: - We are executing a query that deletes all keys WHERE key > 5 - We delete <key=6> from an interior node, and take a replacement <key=5> from the left subtree of that interior page - next() is called, and we land on the interior node again, which now has <key=5>, and we incorrectly delete it even though our WHERE condition is key > 5. ## Solution This PR: - Tracks `interior_node_was_replaced` in CheckNeedsBalancing - If no balancing is needed and a replacement occurred, advances once so the next invocation of next() will skip the replaced cell properly i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record. ## Details This problem only became apparent once we started using indexes as valid iteration cursors for DELETE operations in #2981 Closes #3045 Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3049	2025-09-12 17:37:01 -04:00
Pekka Enberg	2bc8c0c850	core/storage: Remove unused import warning	2025-09-12 21:09:38 +03:00
Pekka Enberg	14da283e36	Merge 'MVCC: remove reliance on BTreeCursor::has_record()' from Jussi Saurio Closes #3051 Closes #3032 Closes #3056	2025-09-12 17:31:15 +03:00
Jussi Saurio	305b2f55ae	MVCC: remove reliance on BTreeCursor::has_record()	2025-09-12 16:03:55 +03:00
Jussi Saurio	9f6e1a2e7c	fix(btree): advance cursor after interior node replacement in delete When a delete replaces an interior cell, the replacement key is LT the deleted key. Currently on the main branch, after the deletion happens, the following call to BTreeCursor::next() stops at the replaced interior cell. This is incorrect - imagine the following sequence: - We are executing a query that deletes all keys WHERE key > 5 - We delete <key=6> from an interior node, and take a replacement <key=5> from the left subtree of that interior page - next() is called, and we land on the interior node again, which now has <key=5>, and we incorrectly delete it even though our WHERE condition is key > 5. This PR: - Tracks `interior_node_was_replaced` in CheckNeedsBalancing - If no balancing is needed and a replacement occurred, advances once so the next invocation of next() will skip the replaced cell properly i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record. Closes #3045	2025-09-12 10:49:44 +03:00
Jussi Saurio	9b14c0022d	Implement the balance_quick algorithm Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows (= an append). In this case we just add a new leaf page as the right sibling of that page, and insert a new divider cell into the parent. The high level steps are: 1. Allocate a new leaf page and insert the overflow cell payload in it. 2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page. 3. Update the rightmost pointer of the parent to point to the new leaf page. 4. Continue balance from the parent page (inserting the new divider cell may have overflowed	2025-09-12 00:42:27 +03:00
PThorpe92	b93ad749a9	Remove some traces in super hot paths in btree	2025-09-10 09:54:32 -04:00
Pekka Enberg	bb3fbb7962	Merge 'check freelist count in integrity check' from Jussi Saurio Closes #3003	2025-09-10 16:15:39 +03:00
Jussi Saurio	d7ce781a2a	Merge 'Enable the use of indexes in DELETE statements' from Jussi Saurio Closes #1714 This PR enables the use of an index as the iteration cursor for a point or range deletion operation. Main changes: - Use `Delete` opcode for the index that is iterating the rows - avoids unnecessary seeking on that index, since it's already positioned correctly - Fix delete balancing; details below: ### current state - a deletion may cause a btree rebalancing operation - to get the cursor back to the right place after a rebalancing, we must remember what the deleted key was and seek to it - right now we are using `SeekOp::LT` to move to one slot BEFORE the deleted key, so that if we delete rows in a loop, the following `Next()` call will put us back into the right place ### problem - When we delete multiple rows, we always iterate forwards. Using `SeekOp::LT` implies backwards iteration, but it works OK for table btrees since the cursor never remains on an internal node, because table internal cells do not have payloads. However: this behavior is problematic for indexes because we can effectively end up skipping visiting a page entirely. Honestly: despite spending some time debugging the _old_ code, I still don't remember what exactly causes this to happen. :) It's one of the `iter_dir` specific behaviors in `indexbtree_move_to` or `get_prev_record()`, but I'm too tired to spend more time figuring it out. I had the reason in my head before going on vacation, but it was evicted from the cache it seems... ### solution use `SeekOp::GE { eq_only: true }` instead and make the next call to `Next()` a no-op instead. This has the same effect as SeekOp::LT + next(), but without introducing bugs due to `LT` implying backwards iteration. Reviewed-by: Nikita Sivukhin (@sivukhin) Closes #2981	2025-09-10 16:00:54 +03:00
Jussi Saurio	e3594d0ae0	make the comment for skip_advance more accurate	2025-09-10 15:38:57 +03:00
Jussi Saurio	618f51330a	advance despite skip_advance flag if cursor not pointing at record	2025-09-10 14:54:51 +03:00
Jussi Saurio	80f8794fda	add comments	2025-09-10 14:54:51 +03:00
Jussi Saurio	36ec654631	Seek with GE after delete balancing and skip next advance	2025-09-10 14:54:51 +03:00
Jussi Saurio	df83b56083	check freelist count in integrity check	2025-09-10 14:53:28 +03:00
Pekka Enberg	2131a04b7d	core: Rename IO::run_once() to IO::step() The `run_once()` name is just a historical accident. Furthermore, it now started to appear elsewhere as well, so let's just call it IO::step() as we should have from the beginning.	2025-09-10 14:36:02 +03:00
PThorpe92	ccae3ab0f2	Change callsites to cancel any further IO when an error occurs and drain	2025-09-08 13:18:40 -04:00
Pekka Enberg	081a7b563b	Merge 'Fix crash in Next opcode if cursor stack has no pages' from Jussi Saurio Closes #2924 Unsure if this fix is that great, but it does fix the issue described in #2924 -- added minimal regression test to illustrate the behavior This crash requires a pretty specific set of circumstances: - 3-way join with two innermost being left joins - nullable seek key on the innermost table: * middle table gets nulled out because no matches with the outermost table * hence when we seek the innermost table using middle table values, the seek key is null, so `Insn::IsNull` entirely skips the innermost table Perhaps a bytecode plan illustrates this better:
```sql
turso> explain select a.x, b.x, c.x from a left join b on a.y=b.x left join c on b.y=c.x;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     34    0                    0   Start at 34
1     OpenRead           0     2     0                    0   table=a, root=2, iDb=0
2     OpenRead           1     4     0                    0   table=b, root=4, iDb=0
3     OpenRead           2     5     0                    0   index=sqlite_autoindex_b_1, root=5, iDb=0
4     OpenRead           3     7     0                    0   index=sqlite_autoindex_c_1, root=7, iDb=0
5     Rewind             0     33    0                    0   Rewind table a
6     Integer            0     4     0                    0   r[4]=0
7     Column             0     1     6                    0   r[6]=a.y
8     IsNull             6     28    0                    0   if (r[6]==NULL) goto 28
9     SeekGE             2     28    6                    0   key=[6..6]
10    IdxGT              2     28    6                    0   key=[6..6]
11    DeferredSeek       2     1     0                    0
12    Integer            1     4     0                    0   r[4]=1
13    Integer            0     5     0                    0   r[5]=0
14    Column             1     1     7                    0   r[7]=b.y
-- if b.y is NULL, we skip the entire table loop between insns 16-23
-- except when we call NullRow and then Goto to re-enter that loop in order to
-- return NULL values for the table
15    IsNull             7     24    0                    0   if (r[7]==NULL) goto 24
16    SeekGE             3     24    7                    0   key=[7..7]
17    IdxGT              3     24    7                    0   key=[7..7]
18    Integer            1     5     0                    0   r[5]=1
19    Column             0     0     1                    0   r[1]=a.x
20    Column             1     0     2                    0   r[2]=b.x
21    Column             3     0     3                    0   r[3]=sqlite_autoindex_c_1.x
22    ResultRow          1     3     0                    0   output=r[1..3]
23    Next               3     17    0                    0
24    IfPos              5     27    0                    0   r[5]>0 -> r[5]-=0, goto 27
25    NullRow            3     0     0                    0   Set cursor 3 to a (pseudo) NULL row
26    Goto               0     18    0                    0
27    Next               2     10    0                    0
28    IfPos              4     32    0                    0   r[4]>0 -> r[4]-=0, goto 32
29    NullRow            1     0     0                    0   Set cursor 1 to a (pseudo) NULL row
30    NullRow            2     0     0                    0   Set cursor 2 to a (pseudo) NULL row
31    Goto               0     12    0                    0
32    Next               0     6     0                    0
33    Halt               0     0     0                    0
34    Transaction        0     0     3                    0   iDb=0 write=false
35    Goto               0     1     0                    0
```
Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2967	2025-09-08 17:45:29 +03:00
Jussi Saurio	5820f691af	fix: do not crash in Next if cursor stack has no pages	2025-09-08 16:54:35 +03:00
TcMits	3aa4650f06	make mr.clippy happy	2025-09-08 18:24:50 +07:00
TcMits	a6ff568530	reduce cloning 'Arc<Page>'	2025-09-08 18:00:18 +07:00
Nikita Sivukhin	cd627c2368	remove unnecessary changes	2025-09-07 19:56:06 +04:00
Nikita Sivukhin	5b9fe0cdf3	fix	2025-09-07 19:56:06 +04:00
Nikita Sivukhin	0b6a6e7713	remove comma	2025-09-07 19:56:06 +04:00
Nikita Sivukhin	9aed831f2f	format	2025-09-07 19:56:05 +04:00
Nikita Sivukhin	db7c6b3370	try to speed up count(*) where 1 = 1	2025-09-07 19:55:42 +04:00
Nikita Sivukhin	c374cf0c93	remove Cell/RefCell from PageStack	2025-09-07 19:54:50 +04:00
PThorpe92	03d5598cfb	Use sieve algorithm in page cache in place of full LRU	2025-09-05 16:13:26 -04:00
Jussi Saurio	ce860b7ec9	clippy	2025-08-28 21:48:29 +03:00
Jussi Saurio	9aae3fa859	refactor: remove BTreePageInner it wasn't used for anything. no more `page.get().get().id`.	2025-08-28 21:44:54 +03:00
Pekka Enberg	2ea4354afe	Merge 'Improve integrity check' from Nikita Sivukhin - check free list trunk and pages - use shared hash map to check for duplicate references for pages - properly check overflow pages Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2816	2025-08-28 16:06:15 +03:00
Nikita Sivukhin	1c0efcfbff	fix clippy	2025-08-27 23:22:21 +04:00
Nikita Sivukhin	09d4590ece	fix compilation	2025-08-27 23:19:26 +04:00
Nikita Sivukhin	ae705445bf	improve integrity check - check free list trunk and pages - use shared hash map to check for duplicate references for pages - properly check overflow pages	2025-08-27 23:14:21 +04:00
Avinash Sajjanshetty	2c0842ff52	Set and propagate `IOContext` as required	2025-08-27 22:05:01 +05:30
Pekka Enberg	3176df64a2	Merge 'Fix: return NULL for rowid() when cursor's null flag is on' from Jussi Saurio Fixes TPC-H query 13 from returning an incorrect result. In this specific case, we were returning non-null `IdxRowid` values for the right-hand side table even when there was no match with the left-hand side table, meaning the join produced matches even in cases where there shouldn't have been any. Closes #2794 Closes #2795	2025-08-26 09:33:49 +03:00
Jussi Saurio	e52f807c7d	Fix: return NULL for rowid() when cursor's null flag is on Fixes TPC-H query 13 from returning an incorrect result. In this specific case, we were returning non-null `IdxRowid` values for the right-hand side table even when there was no match with the left-hand side table, meaning the join produced matches even in cases where there shouldn't have been any. Closes #2794	2025-08-26 09:08:48 +03:00
Pekka Enberg	114ece0375	Merge 'Make fill_cell_payload() safe for async IO and cache spilling' from Jussi Saurio ## Make fill_cell_payload() safe for async IO and cache spilling ### Problems: 1. fill_cell_payload() is not re-entrant because it can yield IO on allocating a new overflow page, resulting in losing some of the input data. 2. fill_cell_payload() in its current form is not safe for cache spilling because the previous overflow page in the chain of allocated overflow pages can be evicted by a spill caused by the next overflow page allocation, invalidating the page pointer and causing corruption. 3. fill_cell_payload() uses raw pointers and `unsafe` as a workaround from a previous time when we used to clone `WriteState`, resulting in hard-to-read code. ### Solutions: 1. Introduce a new substate to the fill_cell_payload state machine to handle re-entrancy wrt. allocating overflow pages. 2. Always pin the current overflow page so that it cannot be evicted during the overflow chain construction. Also pin the regular page the overflow chain is attached to, because it is immediately accessed after fill_cell_payload is done. 3. Remove all explicit usages of `unsafe` from `fill_cell_payload` (although our pager is ofc still extremely unsafe under the hood :] ) Note that solution 2 addresses a problem that arose in the development of page cache spilling, which is not yet implemented, but will be soon. ### Miscellanea: 1. Renamed a bunch of variables to be clearer 2. Added more comments about what is happening in fill_cell_payload Closes #2737	2025-08-26 08:36:46 +03:00
Jussi Saurio	8cae10f744	Fix several issues with integrity_check Things that were just wrong: 1. No pages other than the root page were checked, because no looping was done. Add a loop. 2. Rightmost child page was never added to page stack. Add it. New integrity check features: - Add overflow pages to stack as well - Check that no page is referenced more than once in the tree	2025-08-25 16:51:57 +03:00
Pekka Enberg	3f5878243f	Merge 'Remove unnecessary argument from Pager::end_tx()' from Nikita Sivukhin No need to pass `disable` flag to the `end_tx` method as it has that info from connection itself Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2777	2025-08-25 15:34:41 +03:00
Jussi Saurio	16b1ae4a9f	Handle unpinning btree page in case of allocate overflow page error	2025-08-25 15:12:37 +03:00
Jussi Saurio	c6553d82b8	Clarify expected behavior with assertion	2025-08-25 15:05:04 +03:00
Jussi Saurio	42c8a77bb7	use existing payload_overflows() utility in local space calculation	2025-08-25 15:03:10 +03:00
Nikita Sivukhin	f7ad55b680	remove unnecessary argument	2025-08-25 12:24:39 +04:00
Jussi Saurio	dc6bcd4d41	refactor/btree: rewrite find_free_cell()	2025-08-25 10:08:39 +03:00
Jussi Saurio	4ea8cd0007	refactor/btree: rewrite the free_cell_range() function i had a rough time reading this function earlier and trying to understand it, so rewrote it in a way that, to me, is much more readable.	2025-08-25 09:41:44 +03:00
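
For readers skimming the log, the four balance_quick steps listed in commit 9b14c0022d can be sketched roughly as below. This is a toy in-memory model, not the code from the repository: the `Leaf`, `Interior`, and `Pages` types, the `allocate_leaf` helper, and the function signature are all hypothetical names made up for illustration, and payloads, page sizes, and IO are left out entirely.

```rust
use std::collections::HashMap;

/// Toy leaf page (hypothetical): only the sorted rowids, payloads omitted.
#[derive(Debug, Default)]
struct Leaf {
    rowids: Vec<i64>,
}

/// Toy interior page of a table b-tree (hypothetical).
#[derive(Debug)]
struct Interior {
    /// Divider cells: (left child page number, largest rowid on that child).
    dividers: Vec<(u32, i64)>,
    /// Page number of the rightmost child.
    rightmost: u32,
}

/// Hypothetical stand-in for a pager: page number -> leaf page.
#[derive(Debug, Default)]
struct Pages {
    leaves: HashMap<u32, Leaf>,
    next_page_no: u32,
}

impl Pages {
    /// Allocates an empty leaf and returns its page number.
    fn allocate_leaf(&mut self) -> u32 {
        let page_no = self.next_page_no;
        self.next_page_no += 1;
        self.leaves.insert(page_no, Leaf::default());
        page_no
    }
}

/// Sketch of the append-only fast path: the rightmost leaf overflowed with a
/// single cell whose rowid is larger than everything already on that leaf.
/// Returns the page number of the newly allocated leaf.
fn balance_quick(pages: &mut Pages, parent: &mut Interior, overflow_rowid: i64) -> u32 {
    let old_rightmost = parent.rightmost;
    let largest = *pages.leaves[&old_rightmost]
        .rowids
        .last()
        .expect("an overflowing leaf is never empty");
    assert!(overflow_rowid > largest, "fast path only applies to appends");

    // 1. Allocate a new leaf page and put the overflow cell in it.
    let new_leaf = pages.allocate_leaf();
    pages
        .leaves
        .get_mut(&new_leaf)
        .expect("leaf was just allocated")
        .rowids
        .push(overflow_rowid);

    // 2. New divider cell in the parent: old rightmost leaf + its largest rowid.
    parent.dividers.push((old_rightmost, largest));

    // 3. The parent's rightmost pointer now refers to the new leaf.
    parent.rightmost = new_leaf;

    // 4. The caller would continue balancing from the parent; not modelled here.
    new_leaf
}
```

In this model, a parent whose only child holds rowids 1 to 3 ends up with one divider cell pointing at that child with key 3, and a rightmost pointer to a fresh leaf holding rowid 4. No existing cells are moved, which is presumably why the commit describes this as the fast path for appends.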


766 Commits