776 Commits

Author SHA1 Message Date
Samuel Marks
e333f151ba [*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing") 2025-09-18 22:47:43 -05:00
Preston Thorpe
ec79a9063d Merge 'remove io.blocks from btree balancing code' from Nikita Sivukhin
This PR removes `io.block` usage from the B-Tree balancing code (similar
to #3179).

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3194
2025-09-18 07:24:51 -04:00
Nikita Sivukhin
1e0fb143f6 remove io.blocks from btree balancing code 2025-09-18 14:28:53 +04:00
Pekka Enberg
2a5284afb9 core/storage: Use AtomicU32 for Pager::page_size 2025-09-18 11:33:32 +03:00
Pekka Enberg
182565fe0c core: Wrap MvCursor in Arc<RwLock<>>
Make it Send and Sync.
2025-09-17 12:46:55 +03:00
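To illustrate what "Send and Sync" buys here, the following is a minimal sketch, not the actual `MvCursor` code. `ToyCursor`, `assert_send_sync`, and `shared_increment` are hypothetical names introduced only for this example. It relies on the standard-library guarantee that `Arc<RwLock<T>>` is `Send + Sync` whenever `T` is `Send + Sync`, so the wrapped value can be moved into other threads.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a cursor; not the real MvCursor.
struct ToyCursor {
    row: u64,
}

/// Compiles only if `T` can be sent to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

/// Spawns four threads that each advance the shared cursor once.
fn shared_increment() -> u64 {
    let cursor = Arc::new(RwLock::new(ToyCursor { row: 0 }));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&cursor);
            thread::spawn(move || {
                // Exclusive access through the write lock.
                c.write().unwrap().row += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // Bind to a local so the read guard is dropped before `cursor`.
    let row = cursor.read().unwrap().row;
    row
}
```

Calling `assert_send_sync::<Arc<RwLock<ToyCursor>>>()` type-checks, whereas the same call with `Rc<ToyCursor>` would be rejected by the compiler.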
Pekka Enberg
17e9f05ea4 core: Convert Rc<Pager> to Arc<Pager> 2025-09-17 09:32:49 +03:00
Jussi Saurio
cae234818b Merge 'Initial support for window functions' from Piotr Rżysko
This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented:
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3079
2025-09-17 08:29:16 +03:00
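The default frame named above can be sketched for a running `SUM` over one partition. This is an illustration of the standard SQL semantics, not the project's implementation: with `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, the frame of a row extends from the partition start through the last peer of that row, where peers are rows with equal ORDER BY values. The function name and the `(order_key, value)` tuple layout are assumptions made for the example.

```rust
/// Running SUM over one partition under the default frame
/// `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.
/// `rows` holds (order_key, value) pairs already sorted by `order_key`.
/// Rows with equal keys are peers and receive the same result.
fn running_sum_default_frame(rows: &[(i64, i64)]) -> Vec<i64> {
    let mut out = Vec::with_capacity(rows.len());
    let mut total = 0i64;
    let mut i = 0;
    while i < rows.len() {
        // Accumulate the whole peer group that starts at `i`.
        let mut j = i;
        while j < rows.len() && rows[j].0 == rows[i].0 {
            total += rows[j].1;
            j += 1;
        }
        // Every peer sees the sum through the last peer of its group.
        for _ in i..j {
            out.push(total);
        }
        i = j;
    }
    out
}
```

For the input `[(1, 10), (2, 20), (2, 30), (3, 40)]` the two rows with key 2 are peers, so both receive 60 rather than 30 and 60.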
pedrocarlo
7021386f86 move divider_cell_is_overflow_cell to debug assertions so it stops appearing in release builds 2025-09-15 11:11:28 -03:00
Pekka Enberg
247d4c06c6 Merge 'Fix MVCC update' from Jussi Saurio
Based on #3126
Closes #3029
Closes #3030
Closes #3065
Closes #3083
Closes #3084
Closes #3085
simple reason why mvcc update didn't work: it didn't try to update.

Closes #3127
2025-09-15 14:24:59 +03:00
Jussi Saurio
59f18e2dc8 fix mvcc update
simple reason why mvcc update didn't work: it didn't try to update.
2025-09-15 11:27:56 +03:00
Nikita Sivukhin
3bcac441e4 reduce log level of some very frequent logs 2025-09-15 11:35:41 +04:00
Jussi Saurio
db3428a7a9 remove unused pager parameter 2025-09-14 23:44:24 +03:00
Piotr Rzysko
867bef55d8 Add ResetSorter instruction
This instruction isn't used yet, but it will be needed for window
functions, since they heavily rely on ephemeral tables.
2025-09-13 10:44:56 +02:00
Piotr Rzysko
ea9599681e Add OpenDup instruction
The instruction isn’t used yet, but it’ll be needed for window functions,
since they heavily rely on ephemeral tables.
2025-09-13 10:35:33 +02:00
Avinash Sajjanshetty
c2c1ec2dba Use usable_space() instead of hardcoding the value 2025-09-13 11:00:38 +05:30
Preston Thorpe
b1420904bb Merge 'fix(btree): advance cursor after interior node replacement in delete' from Jussi Saurio
## Problem
When a delete replaces an index interior cell, the replacement key is LT
the deleted key. Currently on the main branch, after the deletion
happens, the following call to BTreeCursor::next() stops at the replaced
interior cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which now
has <key=5>, and we incorrectly delete it even though our WHERE
condition is key > 5.
## Solution
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once so
the next invocation of next() will skip the replaced cell properly
i.e. we prevent next() from landing on the replaced content and ensure
iteration continues with the next logical record.
## Details
This problem only became apparent once we started using indexes as valid
iteration cursors for DELETE operations in #2981
Closes #3045

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3049
2025-09-12 17:37:01 -04:00
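The sequence described in this PR can be reproduced on a toy two-level index. This is a sketch under stated assumptions, not the real `BTreeCursor`: `ToyCursor`, `Pos`, `delete_interior`, and `key_after_interior_delete` are hypothetical, the tree has exactly one interior key with one leaf on each side, and balancing is never needed.

```rust
/// Cursor positions in the toy tree.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Pos {
    /// Past the last cell of the left leaf (where the replacement came from).
    LeftEnd,
    /// On the single interior cell.
    Interior,
    /// On cell `i` of the right leaf.
    Right(usize),
}

/// Toy index: keys in `left` < `interior` < keys in `right`.
struct ToyCursor {
    left: Vec<i64>,
    interior: i64,
    right: Vec<i64>,
    pos: Pos,
}

impl ToyCursor {
    /// Advances in key order and returns the key landed on, if any.
    fn next(&mut self) -> Option<i64> {
        self.pos = match self.pos {
            Pos::LeftEnd => Pos::Interior,
            Pos::Interior => Pos::Right(0),
            Pos::Right(i) => Pos::Right(i + 1),
        };
        match self.pos {
            Pos::LeftEnd => None,
            Pos::Interior => Some(self.interior),
            Pos::Right(i) => self.right.get(i).copied(),
        }
    }

    /// Deletes the interior key by replacing it with the largest key of
    /// the left leaf. With `advance_after_replacement` set, the cursor
    /// steps once so the following next() skips the replaced cell.
    fn delete_interior(&mut self, advance_after_replacement: bool) {
        let replacement = self.left.pop().expect("toy left leaf must not be empty");
        self.interior = replacement;
        self.pos = Pos::LeftEnd;
        if advance_after_replacement {
            self.pos = Pos::Interior;
        }
    }
}

/// Deletes key 6 from the interior cell and reports where next() lands.
fn key_after_interior_delete(advance: bool) -> Option<i64> {
    let mut c = ToyCursor {
        left: vec![4, 5],
        interior: 6,
        right: vec![7, 8],
        pos: Pos::Interior,
    };
    c.delete_interior(advance);
    c.next()
}
```

Without the extra advance the cursor lands on the replacement key 5, which a `WHERE key > 5` delete loop would then remove incorrectly. With it, the cursor lands on 7.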
Pekka Enberg
2bc8c0c850 core/storage: Remove unused import warning 2025-09-12 21:09:38 +03:00
Pekka Enberg
14da283e36 Merge 'MVCC: remove reliance on BTreeCursor::has_record()' from Jussi Saurio
Closes #3051
Closes #3032

Closes #3056
2025-09-12 17:31:15 +03:00
Jussi Saurio
305b2f55ae MVCC: remove reliance on BTreeCursor::has_record() 2025-09-12 16:03:55 +03:00
Jussi Saurio
9f6e1a2e7c fix(btree): advance cursor after interior node replacement in delete
When a delete replaces an interior cell, the replacement key is LT the
deleted key. Currently on the main branch, after the deletion happens,
the following call to BTreeCursor::next() stops at the replaced interior
cell.

This is incorrect - imagine the following sequence:

- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
  <key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which
  now has <key=5>, and we incorrectly delete it even though our
  WHERE condition is key > 5.

This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once
  so the next invocation of next() will skip the replaced cell properly

i.e. we prevent next() from landing on the replaced content and ensure iteration continues with the next logical record.

Closes #3045
2025-09-12 10:49:44 +03:00
Jussi Saurio
9b14c0022d Implement the balance_quick algorithm
Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows (= an append).
In this case we just add a new leaf page as the right sibling of that page, and insert a new divider cell into the parent.
The high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf page.
4. Continue balance from the parent page (inserting the new divider cell may have overflowed
2025-09-12 00:42:27 +03:00
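Steps 1 to 3 of the commit message above can be sketched on an in-memory toy tree. Everything here is an assumption made for illustration (`Leaf`, `Parent`, `Tree`, page numbers as indices into a `Vec`); it is not the pager-backed implementation, and step 4 (continuing the balance from the parent) is not modeled.

```rust
/// Toy leaf page holding only rowids.
struct Leaf {
    rowids: Vec<i64>,
}

/// Toy interior page: divider cells plus a rightmost child pointer.
struct Parent {
    /// (child page number, largest rowid on that child)
    dividers: Vec<(usize, i64)>,
    rightmost: usize,
}

/// Page numbers are indices into `leaves`.
struct Tree {
    leaves: Vec<Leaf>,
    parent: Parent,
}

/// Handles an append that overflowed the rightmost leaf.
fn balance_quick(tree: &mut Tree, overflow_rowid: i64) {
    let old_rightmost = tree.parent.rightmost;

    // 1. Allocate a new leaf page and put the overflow cell in it.
    tree.leaves.push(Leaf { rowids: vec![overflow_rowid] });
    let new_leaf = tree.leaves.len() - 1;

    // 2. New divider cell: page number of the old rightmost leaf plus
    //    the largest rowid on that page.
    let largest = *tree.leaves[old_rightmost]
        .rowids
        .last()
        .expect("old rightmost leaf must not be empty");
    tree.parent.dividers.push((old_rightmost, largest));

    // 3. Point the parent's rightmost pointer at the new leaf.
    tree.parent.rightmost = new_leaf;
}
```

No existing cells are moved between siblings, which is what makes this path cheaper than a general rebalance for append-heavy workloads.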
PThorpe92
b93ad749a9 Remove some traces in super hot paths in btree 2025-09-10 09:54:32 -04:00
Pekka Enberg
bb3fbb7962 Merge 'check freelist count in integrity check' from Jussi Saurio
Closes #3003
2025-09-10 16:15:39 +03:00
Jussi Saurio
d7ce781a2a Merge 'Enable the use of indexes in DELETE statements' from Jussi Saurio
Closes #1714
This PR enables the use of an index as the iteration cursor for a point
or range deletion operation. Main changes:
- Use `Delete` opcode for the index that is iterating the rows - avoids
unnecessary seeking on that index, since it's already positioned
correctly
- Fix delete balancing; details below:
### current state
- a deletion may cause a btree rebalancing operation
- to get the cursor back to the right place after a rebalancing, we must
remember what the deleted key was and seek to it
- right now we are using `SeekOp::LT` to move to one slot BEFORE the
deleted key, so that if we delete rows in a loop, the following `Next()`
call will put us back into the right place
### problem
- When we delete multiple rows, we always iterate forwards. Using
`SeekOp::LT` implies backwards iteration, but it works OK for table
btrees since the cursor never remains on an internal node, because table
internal cells do not have payloads. However: this behavior is
problematic for indexes because we can effectively end up skipping
visiting a page entirely. Honestly: despite spending some time debugging the
_old_ code, I still don't remember what exactly causes this to happen.
:) It's one of the `iter_dir` specific behaviors in `indexbtree_move_to`
or `get_prev_record()`, but I'm too tired to spend more time figuring it
out. I had the reason in my head before going on vacation, but it was
evicted from the cache it seems...
### solution
use `SeekOp::GE { eq_only: true }` instead and make the next call to
`Next()` a no-op. This has the same effect as SeekOp::LT +
next(), but without introducing bugs due to `LT` implying backwards
iteration.

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2981
2025-09-10 16:00:54 +03:00
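The claimed equivalence between "seek LT, then advance" and "seek GE, then skip the next advance" can be checked on a flat sorted list. This is a sketch only: `VecCursor` and its methods are hypothetical, the `eq_only` flag is ignored, and a flat vector has no interior nodes, so it cannot reproduce the index bug itself. It only shows that both strategies reposition the cursor on the same key.

```rust
/// Toy cursor over sorted keys. `pos == None` means "before the first
/// key"; `Some(len)` means "past the last key".
struct VecCursor {
    keys: Vec<i64>,
    pos: Option<usize>,
    skip_advance: bool,
}

impl VecCursor {
    /// Old strategy: land one slot before `key`.
    fn seek_lt(&mut self, key: i64) {
        let first_ge = self.keys.partition_point(|&k| k < key);
        self.pos = first_ge.checked_sub(1);
        self.skip_advance = false;
    }

    /// New strategy: land on the first key >= `key` and make the
    /// following next() a no-op.
    fn seek_ge_and_skip(&mut self, key: i64) {
        self.pos = Some(self.keys.partition_point(|&k| k < key));
        self.skip_advance = true;
    }

    /// Advances (unless told to skip once) and returns the current key.
    fn next(&mut self) -> Option<i64> {
        if self.skip_advance {
            self.skip_advance = false;
        } else {
            self.pos = Some(self.pos.map_or(0, |i| i + 1));
        }
        self.pos.and_then(|i| self.keys.get(i).copied())
    }
}

/// Deletes `deleted`, repositions the cursor, and returns the key that
/// the delete loop would visit next.
fn next_key_after_delete(mut keys: Vec<i64>, deleted: i64, use_ge: bool) -> Option<i64> {
    keys.retain(|&k| k != deleted);
    let mut cursor = VecCursor { keys, pos: None, skip_advance: false };
    if use_ge {
        cursor.seek_ge_and_skip(deleted);
    } else {
        cursor.seek_lt(deleted);
    }
    cursor.next()
}
```

Both strategies return the same key for a deletion at the start, middle, or end of the list; the GE variant gets there without the backwards-iteration semantics that `LT` carries.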
Jussi Saurio
e3594d0ae0 make the comment for skip_advance more accurate 2025-09-10 15:38:57 +03:00
Jussi Saurio
618f51330a advance despite skip_advance flag if cursor not pointing at record 2025-09-10 14:54:51 +03:00
Jussi Saurio
80f8794fda add comments 2025-09-10 14:54:51 +03:00
Jussi Saurio
36ec654631 Seek with GE after delete balancing and skip next advance 2025-09-10 14:54:51 +03:00
Jussi Saurio
df83b56083 check freelist count in integrity check 2025-09-10 14:53:28 +03:00
Pekka Enberg
2131a04b7d core: Rename IO::run_once() to IO::step()
The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
2025-09-10 14:36:02 +03:00
PThorpe92
ccae3ab0f2 Change callsites to cancel any further IO when an error occurs and drain 2025-09-08 13:18:40 -04:00
Pekka Enberg
081a7b563b Merge 'Fix crash in Next opcode if cursor stack has no pages' from Jussi Saurio
Closes #2924
Unsure if this fix is that great, but it does fix the issue described in
#2924 -- added minimal regression test to illustrate the behavior
This crash requires a pretty specific set of circumstances:
- 3-way join with two innermost being left joins
- nullable seek key on the innermost table:
    * middle table gets nulled out because no matches with the outermost
table
    * hence when we seek the innermost table using middle table values,
the seek key is null, so `Insn::IsNull` entirely skips the innermost
table
Perhaps a bytecode plan illustrates this better:
```sql
turso> explain select a.x, b.x, c.x from a left join b on a.y=b.x left join c on b.y=c.x;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     34    0                    0   Start at 34
1     OpenRead           0     2     0                    0   table=a, root=2, iDb=0
2     OpenRead           1     4     0                    0   table=b, root=4, iDb=0
3     OpenRead           2     5     0                    0   index=sqlite_autoindex_b_1, root=5, iDb=0
4     OpenRead           3     7     0                    0   index=sqlite_autoindex_c_1, root=7, iDb=0
5     Rewind             0     33    0                    0   Rewind table a
6       Integer          0     4     0                    0   r[4]=0
7       Column           0     1     6                    0   r[6]=a.y
8       IsNull           6     28    0                    0   if (r[6]==NULL) goto 28
9       SeekGE           2     28    6                    0   key=[6..6]
10        IdxGT          2     28    6                    0   key=[6..6]
11        DeferredSeek   2     1     0                    0   
12        Integer        1     4     0                    0   r[4]=1
13        Integer        0     5     0                    0   r[5]=0
14        Column         1     1     7                    0   r[7]=b.y
-- if b.y is NULL, we skip the entire table loop between insns 16-23
-- except when we call NullRow and then Goto to re-enter that loop in order to
-- return NULL values for the table
15        IsNull         7     24    0                    0   if (r[7]==NULL) goto 24
16        SeekGE         3     24    7                    0   key=[7..7]
17          IdxGT        3     24    7                    0   key=[7..7]
18          Integer      1     5     0                    0   r[5]=1
19          Column       0     0     1                    0   r[1]=a.x
20          Column       1     0     2                    0   r[2]=b.x
21          Column       3     0     3                    0   r[3]=sqlite_autoindex_c_1.x
22          ResultRow    1     3     0                    0   output=r[1..3]
23        Next           3     17    0                    0   
24        IfPos          5     27    0                    0   r[5]>0 -> r[5]-=0, goto 27
25        NullRow        3     0     0                    0   Set cursor 3 to a (pseudo) NULL row
26        Goto           0     18    0                    0   
27      Next             2     10    0                    0   
28      IfPos            4     32    0                    0   r[4]>0 -> r[4]-=0, goto 32
29      NullRow          1     0     0                    0   Set cursor 1 to a (pseudo) NULL row
30      NullRow          2     0     0                    0   Set cursor 2 to a (pseudo) NULL row
31      Goto             0     12    0                    0   
32    Next               0     6     0                    0   
33    Halt               0     0     0                    0   
34    Transaction        0     0     3                    0   iDb=0 write=false
35    Goto               0     1     0                    0
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2967
2025-09-08 17:45:29 +03:00
Jussi Saurio
5820f691af fix: do not crash in Next if cursor stack has no pages 2025-09-08 16:54:35 +03:00
TcMits
3aa4650f06 make mr.clippy happy 2025-09-08 18:24:50 +07:00
TcMits
a6ff568530 reduce cloning 'Arc<Page>' 2025-09-08 18:00:18 +07:00
Nikita Sivukhin
cd627c2368 remove unnecessary changes 2025-09-07 19:56:06 +04:00
Nikita Sivukhin
5b9fe0cdf3 fix 2025-09-07 19:56:06 +04:00
Nikita Sivukhin
0b6a6e7713 remove comma 2025-09-07 19:56:06 +04:00
Nikita Sivukhin
9aed831f2f format 2025-09-07 19:56:05 +04:00
Nikita Sivukhin
db7c6b3370 try to speed up count(*) where 1 = 1 2025-09-07 19:55:42 +04:00
Nikita Sivukhin
c374cf0c93 remove Cell/RefCell from PageStack 2025-09-07 19:54:50 +04:00
PThorpe92
03d5598cfb Use sieve algorithm in page cache in place of full LRU 2025-09-05 16:13:26 -04:00
Jussi Saurio
ce860b7ec9 clippy 2025-08-28 21:48:29 +03:00
Jussi Saurio
9aae3fa859 refactor: remove BTreePageInner
it wasn't used for anything. no more `page.get().get().id`.
2025-08-28 21:44:54 +03:00
Pekka Enberg
2ea4354afe Merge 'Improve integrity check' from Nikita Sivukhin
- check free list trunk and pages
- use shared hash map to check for duplicate references for pages
- properly check overflow pages

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2816
2025-08-28 16:06:15 +03:00
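The "shared hash map" idea from the list above can be sketched as follows. This is an assumption-laden illustration rather than the project's integrity-check code: `find_duplicate_refs`, the `(page, owner)` input shape, and the error text are all hypothetical. The point is that one map shared across the B-tree, freelist, and overflow walks flags any page that is referenced more than once.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Records every page reference in one shared map and reports pages
/// that are referenced more than once.
/// `refs` is a list of (page number, name of the referencing structure).
fn find_duplicate_refs(refs: &[(u32, &str)]) -> Vec<String> {
    let mut seen: HashMap<u32, &str> = HashMap::new();
    let mut errors = Vec::new();
    for &(page, owner) in refs {
        match seen.entry(page) {
            // The page was already claimed by an earlier reference.
            Entry::Occupied(first) => errors.push(format!(
                "page {page} referenced by {} and {owner}",
                first.get()
            )),
            // First reference to this page: remember who owns it.
            Entry::Vacant(slot) => {
                slot.insert(owner);
            }
        }
    }
    errors
}
```

A page that appears both as a B-tree page and as an overflow page, for example, produces exactly one error naming both referrers.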
Nikita Sivukhin
1c0efcfbff fix clippy 2025-08-27 23:22:21 +04:00
Nikita Sivukhin
09d4590ece fix compilation 2025-08-27 23:19:26 +04:00
Nikita Sivukhin
ae705445bf improve integrity check
- check free list trunk and pages
- use shared hash map to check for duplicate references for pages
- properly check overflow pages
2025-08-27 23:14:21 +04:00
Avinash Sajjanshetty
2c0842ff52 Set and propagate IOContext as required 2025-08-27 22:05:01 +05:30
Pekka Enberg
3176df64a2 Merge 'Fix: return NULL for rowid() when cursor's null flag is on' from Jussi Saurio
Fixes TPC-H query 13 returning an incorrect result. In this
specific case, we were returning non-null `IdxRowid` values for the
right-hand side table even when there was no match with the left-hand
side table, meaning the join produced matches even in cases where there
shouldn't have been any.
Closes #2794

Closes #2795
2025-08-26 09:33:49 +03:00
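The behavior described in this fix can be sketched with a toy cursor. The names `ToyIndexCursor` and `idx_rowid` are hypothetical and only mirror the description: once a cursor's null flag is set (as happens for the unmatched side of an outer join), asking for its rowid yields NULL instead of a stale value.

```rust
/// Hypothetical stand-in for an index cursor.
struct ToyIndexCursor {
    /// Set when the cursor was turned into a NULL row.
    null_flag: bool,
    /// Rowid of the entry the cursor last pointed at.
    rowid: i64,
}

/// Returns the rowid, or `None` (SQL NULL) when the null flag is on.
fn idx_rowid(cursor: &ToyIndexCursor) -> Option<i64> {
    if cursor.null_flag {
        None
    } else {
        Some(cursor.rowid)
    }
}
```

Returning the stale rowid in the null-flag case is what let the join report matches that did not exist.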