This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented:
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3079
Based on #3126Closes#3029Closes#3030Closes#3065Closes#3083Closes#3084Closes#3085
simple reason why mvcc update didn't work: it didn't try to update.
Closes#3127
## Problem
When a delete replaces an index interior cell, the replacement key is LT
the deleted key. Currently on the main branch, after the deletion
happens, the following call to BTreeCursor::next() stops at the replaced
interior cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which now
has <key=5>, and we incorrectly delete it even though our WHERE
condition is key > 5.
## Solution
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once so
the next invocation of next() will skip the replaced cell properly
i.e. we prevent next() from landing on the replaced content and ensures
iteration continues with the next logical record.
## Details
This problem only became apparent once we started using indexes as valid
iteration cursors for DELETE operations in #2981Closes#3045
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3049
When a delete replaces an interior cell, the replacement key is LT the
deleted key. Currently on the main branch, after the deletion happens,
the following call to BTreeCursor::next() stops at the replaced interior
cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which
now has <key=5>, and we incorrectly delete it even though our
WHERE condition is key > 5.
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once
so the next invocation of next() will skip the replaced cell properly
i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record.
Closes#3045
Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows (= an append).
In this case we just add a new leaf page as the right sibling of that page, and insert a new divider cell into the parent.
The high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf page.
4. Continue balance from the parent page (inserting the new divider cell may have overflowedImplement the balance_quick algorithm
Closes#1714
This PR enables the use of an index as the iteration cursor for a point
or range deletion operation. Main changes:
- Use `Delete` opcode for the index that is iterating the rows - avoids
unnecessary seeking on that index, since it's already positioned
correctly
- Fix delete balancing; details below:
### current state
- a deletion may cause a btree rebalancing operation
- to get the cursor back to the right place after a rebalancing, we must
remember what the deleted key was and seek to it
- right now we are using `SeekOp::LT` to move to one slot BEFORE the
deleted key, so that if we delete rows in a loop, the following `Next()`
call will put us back into the right place
### problem
- When we delete multiple rows, we always iterate forwards. Using
`SeekOp::LT` implies backwards iteration, but it works OK for table
btrees since the cursor never remains on an internal node, because table
internal cells do not have payloads. However: this behavior is
problematic for indexes because we can effectively end up skipping
visiting a page entirely. Honestly: despite spending some debugging the
_old_ code, I still don't remember what exactly causes this to happen.
:) It's one of the `iter_dir` specific behaviors in `indexbtree_move_to`
or `get_prev_record()`, but I'm too tired to spend more time figuring it
out. I had the reason in my head before going on vacation, but it was
evicted from the cache it seems...
### solution
use `SeekOp::GE { eq_only: true }` instead and make the next call to
`Next()` a no-op instead. This has the same effect as SeekOp::LT +
next(), but without introducing bugs due to `LT` being implied backwards
iteration.
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#2981
The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
- check free list trunk and pages
- use shared hash map to check for duplicate references for pages
- properly check overflow pages
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2816
Fixes TPC-H query 13 from returning an incorrect result. In this
specific case, we were returning non-null `IdxRowid` values for the
right-hand side table even when there was no match with the left-hand
side table, meaning the join produced matches even in cases where there
shouldn't have been any.
Closes#2794Closes#2795