Commit Graph

796 Commits

Author SHA1 Message Date
Pere Diaz Bou
bc05497d99 core/mvcc: implement CursorTrait on MVCC cursor 2025-10-13 19:26:18 +02:00
Pere Diaz Bou
d0d6db301b core/btree: CursorTrait 2025-10-10 15:04:15 +02:00
Pekka Enberg
13566e5cad Merge 'Integrity check enhancements' from Jussi Saurio
- add index root pages to list of root pages to check
- check for dangling (unused) pages
```sql
$ cargo run wut.db 
turso> .mode list
turso> pragma integrity_check;
Page 3: never used
Page 4: never used
Page 7: never used
Page 8: never used
```
```sql
$ sqlite3 wut.db 'pragma integrity_check;'
*** in database main ***
Page 3: never used
Page 4: never used
Page 7: never used
Page 8: never used
```

Closes #3613
2025-10-08 08:57:18 +03:00
Levy A.
cf53ecb7e3 refactor: remove TextRef and RawSlice and fix tests 2025-10-07 10:43:45 -03:00
Levy A.
77a412f6af refactor: remove unsafe reference semantics from RefValue
also renames `RefValue` to `ValueRef`, to align with rusqlite and other
crates
2025-10-07 10:43:44 -03:00
Pere Diaz Bou
3e508a4b42 core/io: remove new_dummy in place of new_yield
Yield is a completion that does not allocate any inner state. By design
it is completed from the start and has no errors. This allows lightly
yield without allocating any locks nor heap allocate inner state.
2025-10-07 12:00:33 +02:00
Jussi Saurio
5941c03a4f integrity check: check for dangling (unused) pages 2025-10-07 11:35:38 +03:00
Nikita Sivukhin
8dae601fac make rollback non-failing method 2025-10-06 13:21:45 +04:00
pedrocarlo
e93add6c80 remove dyn DatabaseStorage and replace it with DatabaseFile 2025-10-03 14:14:15 -03:00
Pere Diaz Bou
9c9d4d147e core/btree: fuzz tests force page 1 allocation with a transaction 2025-10-03 13:28:28 +02:00
Jussi Saurio
a52dbb7842 Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.

We cannot use root page as the MVCC table ID directly because:
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.

Hence, we:

- store the mapping between table id and btree rootpage
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
2025-09-30 16:53:12 +03:00
Pere Diaz Bou
2fff6bb119 core: page id to usize 2025-09-30 11:35:06 +02:00
Pere Diaz Bou
0f631101df core: change page idx type from usize to i64
MVCC is like the annoying younger cousin (I know because I was him) that
needs to be treated differently. MVCC requires us to use root_pages that
might not be allocated yet, and the plan is to use negative root_pages
for that case. Therefore, we need i64 in order to fit this change.
2025-09-29 18:38:43 +02:00
Pekka Enberg
931cf2658e core/storage: Display page category for rowid integrity check failure
Let's add more hints to hunt down the reason for #2896.
2025-09-26 18:25:49 +03:00
Pekka Enberg
b857f94fe4 Merge 'core: Wrap Connection::pager in RwLock' from Pekka Enberg
Closes #3247
2025-09-23 07:29:09 +03:00
Pekka Enberg
aa454a6637 core: Wrap Connection::pager in RwLock 2025-09-22 17:02:08 +03:00
PThorpe92
21f6455190 Fix clippy warnings and tests 2025-09-20 14:38:50 -04:00
Pekka Enberg
9dda5a6263 Merge 'bugfix: clear reserved space for a reused page' from Avinash Sajjanshetty
fixes #3184

Closes #3198
2025-09-19 14:16:24 +03:00
Avinash Sajjanshetty
d5295fb45c Put the unused variable behind a flag as intended 2025-09-19 14:55:02 +05:30
Samuel Marks
e333f151ba [*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing") 2025-09-18 22:47:43 -05:00
Avinash Sajjanshetty
91e2a679b9 bugfix: clear reserved space for a reused page 2025-09-18 19:00:03 +05:30
Preston Thorpe
ec79a9063d Merge 'remove io.blocks from btree balancing code' from Nikita Sivukhin
This PR removes `io.block` usage from B-Tree balancing code (similarly
as in the #3179)

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3194
2025-09-18 07:24:51 -04:00
Nikita Sivukhin
1e0fb143f6 remove io.blocks from btree balancing code 2025-09-18 14:28:53 +04:00
Pekka Enberg
2a5284afb9 core/storage: Use AtomicU32 for Pager::page_size 2025-09-18 11:33:32 +03:00
Pekka Enberg
182565fe0c core: Wrap MvCursor in Arc<RwLock<>>
Make it Send and Sync.
2025-09-17 12:46:55 +03:00
Pekka Enberg
17e9f05ea4 core: Convert Rc<Pager> to Arc<Pager> 2025-09-17 09:32:49 +03:00
Jussi Saurio
cae234818b Merge 'Inital support for window functions' from Piotr Rżysko
This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented:
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3079
2025-09-17 08:29:16 +03:00
pedrocarlo
7021386f86 move divider_cell_is_overflow_cell to debug assertions so it stops appearing in release builds 2025-09-15 11:11:28 -03:00
Pekka Enberg
247d4c06c6 Merge 'Fix MVCC update' from Jussi Saurio
Based on #3126
Closes #3029
Closes #3030
Closes #3065
Closes #3083
Closes #3084
Closes #3085
simple reason why mvcc update didn't work: it didn't try to update.

Closes #3127
2025-09-15 14:24:59 +03:00
Jussi Saurio
59f18e2dc8 fix mvcc update
simple reason why mvcc update didn't work: it didn't try to update.
2025-09-15 11:27:56 +03:00
Nikita Sivukhin
3bcac441e4 reduce log level of some very frequent logs 2025-09-15 11:35:41 +04:00
Jussi Saurio
db3428a7a9 remove unused pager parameter 2025-09-14 23:44:24 +03:00
Piotr Rzysko
867bef55d8 Add ResetSorter instruction
This instruction isn't used yet, but it will be needed for window
functions, since they heavily rely on ephemeral tables.
2025-09-13 10:44:56 +02:00
Piotr Rzysko
ea9599681e Add OpenDup instruction
The instruction isn’t used yet, but it’ll be needed for window functions,
since they heavily rely on ephemeral tables.
2025-09-13 10:35:33 +02:00
Avinash Sajjanshetty
c2c1ec2dba Pass use usable_space() instead of hardcoding the value 2025-09-13 11:00:38 +05:30
Preston Thorpe
b1420904bb Merge 'fix(btree): advance cursor after interior node replacement in delete' from Jussi Saurio
## Problem
When a delete replaces an index interior cell, the replacement key is LT
the deleted key. Currently on the main branch, after the deletion
happens, the following call to BTreeCursor::next() stops at the replaced
interior cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which now
has <key=5>, and we incorrectly delete it even though our WHERE
condition is key > 5.
## Solution
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once so
the next invocation of next() will skip the replaced cell properly
i.e. we prevent next() from landing on the replaced content and ensures
iteration continues with the next logical record.
## Details
This problem only became apparent once we started using indexes as valid
iteration cursors for DELETE operations in #2981
Closes #3045

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3049
2025-09-12 17:37:01 -04:00
Pekka Enberg
2bc8c0c850 core/storage: Remove unused import warning 2025-09-12 21:09:38 +03:00
Pekka Enberg
14da283e36 Merge 'MVCC: remove reliance on BTreeCursor::has_record()' from Jussi Saurio
Closes #3051
Closes #3032

Closes #3056
2025-09-12 17:31:15 +03:00
Jussi Saurio
305b2f55ae MVCC: remove reliance on BTreeCursor::has_record() 2025-09-12 16:03:55 +03:00
Jussi Saurio
9f6e1a2e7c fix(btree): advance cursor after interior node replacement in delete
When a delete replaces an interior cell, the replacement key is LT the
deleted key. Currently on the main branch, after the deletion happens,
the following call to BTreeCursor::next() stops at the replaced interior
cell.

This is incorrect - imagine the following sequence:

- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
  <key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which
  now has <key=5>, and we incorrectly delete it even though our
  WHERE condition is key > 5.

This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once
  so the next invocation of next() will skip the replaced cell properly

i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record.

Closes #3045
2025-09-12 10:49:44 +03:00
Jussi Saurio
9b14c0022d Implement the balance_quick algorithm
Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows (= an append).
In this case we just add a new leaf page as the right sibling of that page, and insert a new divider cell into the parent.
The high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf page.
4. Continue balance from the parent page (inserting the new divider cell may have overflowedImplement the balance_quick algorithm
2025-09-12 00:42:27 +03:00
PThorpe92
b93ad749a9 Remove some traces in super hot paths in btree 2025-09-10 09:54:32 -04:00
Pekka Enberg
bb3fbb7962 Merge 'check freelist count in integrity check' from Jussi Saurio
Closes #3003
2025-09-10 16:15:39 +03:00
Jussi Saurio
d7ce781a2a Merge 'Enable the use of indexes in DELETE statements' from Jussi Saurio
Closes #1714
This PR enables the use of an index as the iteration cursor for a point
or range deletion operation. Main changes:
- Use `Delete` opcode for the index that is iterating the rows - avoids
unnecessary seeking on that index, since it's already positioned
correctly
- Fix delete balancing; details below:
### current state
- a deletion may cause a btree rebalancing operation
- to get the cursor back to the right place after a rebalancing, we must
remember what the deleted key was and seek to it
- right now we are using `SeekOp::LT` to move to one slot BEFORE the
deleted key, so that if we delete rows in a loop, the following `Next()`
call will put us back into the right place
### problem
- When we delete multiple rows, we always iterate forwards. Using
`SeekOp::LT` implies backwards iteration, but it works OK for table
btrees since the cursor never remains on an internal node, because table
internal cells do not have payloads. However: this behavior is
problematic for indexes because we can effectively end up skipping
visiting a page entirely. Honestly: despite spending some debugging the
_old_ code, I still don't remember what exactly causes this to happen.
:) It's one of the `iter_dir` specific behaviors in `indexbtree_move_to`
or `get_prev_record()`, but I'm too tired to spend more time figuring it
out. I had the reason in my head before going on vacation, but it was
evicted from the cache it seems...
### solution
use `SeekOp::GE { eq_only: true }` instead and make the next call to
`Next()` a no-op instead. This has the same effect as SeekOp::LT +
next(), but without introducing bugs due to `LT` being implied backwards
iteration.

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2981
2025-09-10 16:00:54 +03:00
Jussi Saurio
e3594d0ae0 make the comment for skip_advance more accurate 2025-09-10 15:38:57 +03:00
Jussi Saurio
618f51330a advance despite skip_advance flag if cursor not pointing at record 2025-09-10 14:54:51 +03:00
Jussi Saurio
80f8794fda add comments 2025-09-10 14:54:51 +03:00
Jussi Saurio
36ec654631 Seek with GE after delete balancing and skip next advance 2025-09-10 14:54:51 +03:00
Jussi Saurio
df83b56083 check freelist count in integrity check 2025-09-10 14:53:28 +03:00
Pekka Enberg
2131a04b7d core: Rename IO::run_once() to IO::step()
The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
2025-09-10 14:36:02 +03:00