this is only used for returning LimboResult::Busy, and we already
have LimboError::Busy, so it only adds confusion.
Moreover, the current busy handler was not handling LimboError::Busy,
because it's returned as an error, not as Ok. So this may fix the
"busy handler not working" issue in the perf thrpt benchmark.
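A minimal sketch of why the distinction matters, assuming a hypothetical `with_busy_retry` helper and a stand-in `LimboError` enum (neither is taken from the codebase): because busy is reported through `Err`, a handler has to match on the error path to ever see it.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical stand-in for the engine's error type; only the `Busy`
// variant matters for this sketch.
#[derive(Debug, PartialEq)]
enum LimboError {
    Busy,
    Other(String),
}

/// Retries `op` while it fails with `LimboError::Busy`, allowing up to
/// `max_retries` additional attempts. Success or any other error is
/// returned immediately.
fn with_busy_retry<T>(
    max_retries: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, LimboError>,
) -> Result<T, LimboError> {
    let mut attempt = 0;
    loop {
        match op() {
            // Busy arrives on the error path, so it has to be matched here
            // and not as a value wrapped in `Ok`.
            Err(LimboError::Busy) if attempt < max_retries => {
                attempt += 1;
                sleep(backoff);
            }
            other => return other,
        }
    }
}
```

A handler that only inspected `Ok` values would never enter the retry arm, which matches the symptom described above.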
This PR extends the existing encryption support to include the database
header page (page 1).
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes #3040
This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented:
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.
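To illustrate what the default frame means for an aggregate such as `sum`, here is a small sketch that is independent of the engine. The function name and the `(key, value)` row layout are assumptions made for the example. With a `RANGE` frame ending at `CURRENT ROW`, the frame includes all peers of the current row (rows with an equal `ORDER BY` key), so peers share one result.

```rust
/// Computes `SUM(value) OVER (ORDER BY key)` for rows already sorted by
/// `key`, using the default frame: every row from the start of the
/// partition up to and including the current row's peers.
/// Each row is a `(key, value)` pair.
fn running_sum_default_frame(rows: &[(i64, i64)]) -> Vec<i64> {
    let mut out = Vec::with_capacity(rows.len());
    let mut total = 0i64;
    let mut i = 0;
    while i < rows.len() {
        // Accumulate the whole peer group (rows with an equal key).
        let mut j = i;
        while j < rows.len() && rows[j].0 == rows[i].0 {
            total += rows[j].1;
            j += 1;
        }
        // Every peer sees the same frame, hence the same result.
        for _ in i..j {
            out.push(total);
        }
        i = j;
    }
    out
}
```

For the rows `(1, 10), (2, 20), (2, 30), (3, 40)` this yields `10, 60, 60, 100`: the two rows with key `2` are peers and both see the sum up to and including the pair.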
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3079
current logic can lead to a situation where:
- we call read_page(trunk_page_id)
- we assign trunk_page in the FreePageState state machine
- the page read fails and cache marks it as !locked && !loaded
- next call to Pager::free_page() asserts that the page is loaded and panics
Based on #3126
Closes #3029
Closes #3030
Closes #3065
Closes #3083
Closes #3084
Closes #3085
simple reason why mvcc update didn't work: it didn't try to update.
Closes #3127
This patch adds checksums to Turso DB. You may check the design here in
the [RFC](https://github.com/tursodatabase/turso/issues/2178).
1. We use reserved bytes (8 bytes) to store the checksums. On every IO
read, we verify that the checksum matches.
2. We use twox hash for checksums.
3. Checksums work only on 4K pages for now. Enabling them for other page
sizes is a small change that I will send in a separate PR.
4. Right now, it's not possible to switch to a different algorithm or to
turn checksums off altogether. That will be added in future PRs.
5. Checksums can be enabled only for new DBs. For existing DBs, they are
disabled.
6. Adding checksums to an existing DB requires a vacuum, since the whole
DB has to be rewritten.
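The scheme in points 1 to 3 can be sketched roughly as follows. This is not the PR's code: the function names are hypothetical, the little-endian encoding is an assumption, and FNV-1a stands in for twox hash purely to keep the example dependency-free. The checksum lives in the last 8 bytes of the page, which is where the file format places reserved bytes.

```rust
const PAGE_SIZE: usize = 4096;
const CHECKSUM_SIZE: usize = 8;

// Stand-in 64-bit hash (FNV-1a). The PR uses twox hash; FNV-1a is used
// here only so the sketch needs no external crate.
fn fnv1a64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Writes the checksum of the page payload into the last 8 (reserved)
/// bytes. Little-endian encoding is an assumption of this sketch.
fn add_checksum(page: &mut [u8; PAGE_SIZE]) {
    let split = PAGE_SIZE - CHECKSUM_SIZE;
    let sum = fnv1a64(&page[..split]);
    page[split..].copy_from_slice(&sum.to_le_bytes());
}

/// Recomputes the checksum after a read and compares it with the stored one.
fn verify_checksum(page: &[u8; PAGE_SIZE]) -> Result<(), String> {
    let split = PAGE_SIZE - CHECKSUM_SIZE;
    let expected = fnv1a64(&page[..split]);
    let mut stored = [0u8; CHECKSUM_SIZE];
    stored.copy_from_slice(&page[split..]);
    let actual = u64::from_le_bytes(stored);
    if actual == expected {
        Ok(())
    } else {
        Err(format!(
            "checksum mismatch: stored {:#x}, computed {:#x}",
            actual, expected
        ))
    }
}
```

Calling `verify_checksum` on every IO read, as point 1 describes, turns silent corruption of the payload into an explicit error.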
Closes #2840
Retrying fsync() on error was historically not safe ("fsyncgate"), and
Postgres still defaults to panicking on fsync() failure. Therefore, add a
"data_sync_retry" pragma (disabled by default) and use it to determine
whether to panic on an fsync() error or not.
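The decision the pragma controls can be sketched like this. The function name and the way the flag is passed in are assumptions for illustration, not the actual pager API.

```rust
use std::io;

/// Decides what to do with the result of an fsync() call.
/// `data_sync_retry` mirrors the pragma: when false (the default), an
/// fsync failure panics; when true, the error is handed back so the
/// caller may retry.
fn handle_fsync_result(result: io::Result<()>, data_sync_retry: bool) -> io::Result<()> {
    match result {
        Ok(()) => Ok(()),
        Err(e) if data_sync_retry => Err(e),
        Err(e) => panic!("fsync failed and data_sync_retry is off: {}", e),
    }
}
```

Panicking by default is the conservative choice: after a failed fsync() it is not generally safe to assume that a later successful call has persisted the earlier writes.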
## Problem
When a delete replaces an index interior cell, the replacement key is LT
the deleted key. Currently on the main branch, after the deletion
happens, the following call to BTreeCursor::next() stops at the replaced
interior cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which now
has <key=5>, and we incorrectly delete it even though our WHERE
condition is key > 5.
## Solution
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once so
the next invocation of next() will skip the replaced cell properly
i.e. we prevent next() from landing on the replaced content and ensure
that iteration continues with the next logical record.
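The effect of the extra advance can be shown on a toy model. Everything here (`ToyTree`, `ToyCursor`, `Pos`, the `advance_after_replace` flag) is hypothetical and far simpler than the real BTreeCursor. In particular, the assumption that the cursor is left in the leaf the replacement was taken from is part of the model, not a statement about the implementation.

```rust
// Toy model: one interior cell (`divider`) separating two leaves.
struct ToyTree {
    left: Vec<i64>,
    divider: i64,
    right: Vec<i64>,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Pos {
    Left(usize),  // index into the left leaf (may be one past the end)
    Divider,      // on the interior cell
    Right(usize), // index into the right leaf
}

struct ToyCursor {
    tree: ToyTree,
    pos: Pos,
}

impl ToyCursor {
    /// Returns the key under the cursor, if any.
    fn current(&self) -> Option<i64> {
        match self.pos {
            Pos::Left(i) => self.tree.left.get(i).copied(),
            Pos::Divider => Some(self.tree.divider),
            Pos::Right(i) => self.tree.right.get(i).copied(),
        }
    }

    /// Moves to the next key in key order and returns it.
    fn next(&mut self) -> Option<i64> {
        self.pos = match self.pos {
            Pos::Left(i) if i + 1 < self.tree.left.len() => Pos::Left(i + 1),
            Pos::Left(_) => Pos::Divider,
            Pos::Divider => Pos::Right(0),
            Pos::Right(i) => Pos::Right(i + 1),
        };
        self.current()
    }

    /// Deletes the interior cell under the cursor by replacing it with the
    /// largest key of the left leaf. In this model the cursor ends up just
    /// past the last cell of that leaf.
    fn delete_divider(&mut self, advance_after_replace: bool) {
        assert_eq!(self.pos, Pos::Divider);
        let replacement = self.tree.left.pop().expect("left leaf must not be empty");
        self.tree.divider = replacement;
        self.pos = Pos::Left(self.tree.left.len());
        if advance_after_replace {
            // The fix: step onto the replaced cell now, so the caller's
            // next() moves past it instead of landing on it.
            self.next();
        }
    }
}
```

With `left = [3, 5]`, `divider = 6`, `right = [7, 8]`, deleting the divider without the extra advance makes the following `next()` return `5`. With the advance it returns `7`.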
## Details
This problem only became apparent once we started using indexes as valid
iteration cursors for DELETE operations in #2981.
Closes #3045
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3049
This was causing checkpoint_seq to be 0 when we had already successfully
run a passive checkpoint, which led us to use improper pages from the
cache.
Fast balancing routine for the common special case where the rightmost
leaf page of a given subtree overflows such that the overflowing cell
would be the rightmost cell on the page -- i.e. an append. In this case
we just add a new leaf page as the right sibling of that page, put the
overflow cell there, and insert a new divider cell into the parent. The
high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number
of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf
page.
4. Continue balance from the parent page (inserting the new divider cell
may have overflowed the parent).
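The steps above can be sketched on a toy page model. The `Leaf` and `Parent` types and the `balance_quick` signature are hypothetical. Pages are identified by their index in a `Vec`, and step 4 is left to the caller.

```rust
// Toy model of a table leaf: just the rowids stored on the page.
struct Leaf {
    rowids: Vec<i64>,
}

struct Parent {
    // Divider cells: (left child page number, largest rowid on that page).
    dividers: Vec<(usize, i64)>,
    // Page number of the rightmost child.
    rightmost: usize,
}

/// Append fast path: `overflow_rowid` did not fit on the rightmost leaf
/// and is larger than every rowid already on it.
/// Returns the page number of the newly allocated leaf.
fn balance_quick(pages: &mut Vec<Leaf>, parent: &mut Parent, overflow_rowid: i64) -> usize {
    let old_rightmost = parent.rightmost;
    let largest = *pages[old_rightmost]
        .rowids
        .last()
        .expect("rightmost leaf must not be empty");
    assert!(overflow_rowid > largest, "fast path only applies to appends");

    // 1. Allocate a new leaf page and put the overflow cell in it.
    pages.push(Leaf { rowids: vec![overflow_rowid] });
    let new_page = pages.len() - 1;

    // 2. New divider cell in the parent: the old rightmost leaf's page
    //    number plus the largest rowid on that page.
    parent.dividers.push((old_rightmost, largest));

    // 3. The parent's rightmost pointer now refers to the new leaf.
    parent.rightmost = new_page;

    // 4. Not modelled here: the caller continues balancing from the
    //    parent, since the new divider cell may have overflowed it.
    new_page
}
```

No cells are moved between the existing siblings. That is what makes this path cheaper than a general rebalance for append-heavy workloads.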
Closes #3041
When multiple tx writes happen concurrently in mvcc, max frame will be
updated. From the point of view of another transaction, this new
max_frame makes it return busy because its current WAL snapshot is
outdated.
Closes #3024
We also snapshot the page once we determine that it is eligible, paying
for a memcpy instead of a read from disk; this additionally guards against
in-memory changes to the page and TOCTOU issues.