Commit Graph

823 Commits

Author SHA1 Message Date
Pekka Enberg
55cf9c8f02 Merge 'Add async header accessor functionality' from Zaid Humayun
This PR addresses https://github.com/tursodatabase/turso/issues/1828 in
a phased manner.
Making database header access async in one PR will be complicated. This
PR ports adds an async API to `header_accessor.rs` and ports over some
of `pager.rs` to use this API.
This will allow gradual porting over of all call sites. Once all call
sites are ported over, one mechanical rename will fix everything in the
repo so we don't have any `<header_name>_async` functions.
Also, porting header accessors over from sync to async would be a good
way to get introduced to the Limbo codebase for first time contributors.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1966
2025-07-14 13:08:29 +03:00
Pekka Enberg
90532eabdf Merge 'b-tree: fix bug in case when no matching rows was found in seek in the leaf page' from Nikita Sivukhin
Current table B-Tree seek code rely on the invariant that if key `K` is
present in interior page then it also must be present in the leaf page.
This is generally not true if data was ever deleted from the table
because leaf row which key was used as a divider in the interior pages
can be deleted. Also, SQLite spec says nothing about such invariant - so
`turso-db` implementation of B-Tree should not rely on it.
This PR introduce 3 options for B-Tree `seek` result: `Found` /
`NotFound` and `TryAdvance` which is generated when leaf page have no
match for `seek_op` but DB don't know if neighbor page can have matching
data.
There is an alternative approach where we can move cursor in the `seek`
itself to the neighbor page - but I was afraid to introduce such changes
because analogue `seek` function from SQLite works exactly like current
version of the code and I think some query planner internals (for
insertion) can rely on the fact that repositioning will leave cursor at
the position of insertion:
> ** If an exact match is not found, then the cursor is always
** left pointing at a leaf page which would hold the entry if it
** were present.  The cursor might point to an entry that comes
** before or after the key.
Also, this PR introduces new B-tree fuzz tests which generate table
B-tree from scratch and execute opreations over it. This can help to
reach some non trivial states and also generate huge DBs faster (that's
how this bug was discovered)

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2065
2025-07-14 12:57:09 +03:00
Pekka Enberg
1a0d618a41 Merge 'Assert I/O read and write sizes' from Pere Diaz Bou
Let's assert **for now** that we do not read/write less bytes than
expected. This should be fixed to retrigger several reads/writes if we
couldn't read/write enough but for now let's assert.

Closes #2078
2025-07-14 12:22:18 +03:00
Nikita Sivukhin
5bd3287826 add comments 2025-07-14 13:01:15 +04:00
Nikita Sivukhin
6e2ccdff20 add btree fuzz tests which generate seed file from scratch 2025-07-14 13:01:15 +04:00
Nikita Sivukhin
fc400906d5 handle case when target seek page has no matching entries 2025-07-14 13:01:15 +04:00
Nikita Sivukhin
03b2725cc7 return SeekResult from seek operation
- Apart from regular states Found/NotFound seek result has TryAdvance
  value which tells caller to advance the cursor in necessary direction
  because the leaf page which would hold the entry if it was present
  actually has no matching entry (but neighbouring page can have match)
2025-07-14 13:01:15 +04:00
Pere Diaz Bou
340391538a io: change comment for assert 2025-07-14 10:36:06 +02:00
Pere Diaz Bou
88ff218810 io: assert small I/O
Let's assert **for now** that we do not read/write less bytes than
expected. This should be fixed to retrigger several reads/writes if we
couldn't read/write enough but for now let's assert.
2025-07-14 10:19:41 +02:00
Nikita Sivukhin
f61d733dd3 make new functions dependend on "json" Cargo feature 2025-07-14 11:26:51 +04:00
Nikita Sivukhin
c9e7271eaf properly pass subtype 2025-07-14 11:20:49 +04:00
Nikita Sivukhin
81cd04dd65 add bin_record_json_object and table_columns_json_array functions 2025-07-14 11:19:45 +04:00
Krishna Vishal
ea4a4708ea - Address some review comments
- Add docs for `RecordCursor`
2025-07-14 03:28:55 +05:30
Krishna Vishal
b1f27cad94 chore: fix clippy 2025-07-14 03:28:55 +05:30
Krishna Vishal
d3368a28bc fix merge conflicts 2025-07-14 03:28:55 +05:30
Krishna Vishal
e7e5f28c0a chore: Clippy chill 2025-07-14 03:28:54 +05:30
Krishna Vishal
9b315d1d7e Manually inline the record deserialization code for performance.
This is done because the compiler is refusing to inline even after
adding inline hint.
- Get refvalues from directly from registers without using
`make_record`
2025-07-14 03:28:54 +05:30
Krishna Vishal
35ed279644 Clean up indexbtree_move_to 2025-07-14 03:28:54 +05:30
Krishna Vishal
f0e8e5871b Replace compare_immutable with compare_records_generic 2025-07-14 03:28:54 +05:30
Krishna Vishal
860de412d9 Add num_columns to BTreeCursor so we can initialize
`Vecs` inside `RecordCursor` to their appropriate to reduce
allocations.
2025-07-14 03:28:54 +05:30
Krishna Vishal
ef147181c2 Working version of the incremental column and "optimal" record
compare functions. Now we optimize them
2025-07-14 03:28:54 +05:30
Krishna Vishal
692f0413eb Stash 2025-07-14 03:28:54 +05:30
Krishna Vishal
0b1ed44c1f Add optimized index record compare methods
compare_records_int
compare_records_string
compare_records_generic

comapre_records_generic will still be more efficient than compare-
_immutable because it deserializes the record column by column
2025-07-14 03:28:54 +05:30
Krishna Vishal
c7aa3c3d93 Fix btree to invalidate RecordCursor
Use `read_value` instead of `deserialize_column_data`
Add `sqlite_int_float_compare` which takes care of out of range
floats
2025-07-14 03:28:54 +05:30
Krishna Vishal
2323763a4f Integrate incremental column parsing into btree.rs 2025-07-14 03:28:54 +05:30
Jussi Saurio
a48b6d049a Another post-rebase clippy round with 1.88.0 2025-07-12 19:10:56 +03:00
Nils Koch
1a91966c7e fix clippy errors for rust 1.88.0 (manual fix) 2025-07-12 18:58:55 +03:00
Nils Koch
828d4f5016 fix clippy errors for rust 1.88.0 (auto fix) 2025-07-12 18:58:41 +03:00
Zaid Humayun
2f0ff89e28 resolved jussi's comment https://github.com/tursodatabase/turso/pull/1966#discussion_r2201864782
this commit removes !page.is_loaded() from header_accessor cause it's not required
2025-07-12 09:44:31 +05:30
Zaid Humayun
90a5a53b0e Added Async Header Accessor API's
This commit introduces async header accessor API's in addition to the sync ones. Allows gradual porting instead of one big shot PR.
2025-07-12 09:38:18 +05:30
Jussi Saurio
0009683566 Merge 'btree/balance/validation: fix divider cell insert validation' from Jussi Saurio
Closes #2047
the validation code was assuming that:
- if the parent has overflow cells after inserting a divider cell
- the exact divider we are validating MUST be in those overflow cells
However, this is not necessarily the case. Imagine:
- First divider gets inserted at index `n`. It is too large to fit, so
it gets pushed to `parent.overflow_cells()`. Parent usable space does
not decrease.
- Second divider gets inserted at index `n+1`. It is smaller, so it
still fits in usable space.
Hence:
Provide information to the validation function about whether the
inserted cell overflowed, and use that to find the left pointer and
assert accordingly.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2050
2025-07-11 15:09:19 +03:00
Pekka Enberg
7b646da82c Merge 'Btree: more balance docs' from Jussi Saurio
Continuation from #2031 -- some variable extractions and comments

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2035
2025-07-11 15:06:45 +03:00
Jussi Saurio
c81522c20f btree/balance/validation: fix divider cell insert validation
the validation code was assuming that:

- if the parent has overflow cells after a inserting divider cell
- the exact divider we are validating MUST be in those overflow cells

However, this is not necessarily the case. Imagine:

- First divider gets inserted at index `n`. It is too large to fit,
  so it gets pushed to `parent.overflow_cells()`. Parent usable space
  does not decrease.
- Second divider gets inserted at index `n+1`. It is smaller, so it
  still fits in usable space.

Hence:

Provide information to the validation function about whether the inserted
cell overflowed, and use that to find the left pointer and assert accordingly.
2025-07-11 11:21:05 +03:00
Jussi Saurio
99df69f603 btree/balance/validation: fix use-after-free of rightmost ptr validation
We can use `right_page_id` directly to perform the validation instead of
carrying a raw pointer around which might be invalidated by the time we
do the validation.
2025-07-10 18:23:54 +03:00
Jussi Saurio
a403e55319 btree: add comment about when left-to-right size balancing is stopped 2025-07-10 15:52:18 +03:00
Jussi Saurio
bc328f9738 btree: one more is_last_sibling doc variable 2025-07-10 15:39:22 +03:00
Jussi Saurio
fc27c08e11 clippy 2025-07-10 15:36:46 +03:00
Jussi Saurio
fba05b1998 btree: add named range variables to make cell movement double-pass clearer 2025-07-10 15:34:41 +03:00
Jussi Saurio
8f1109692f btree: replace a bunch of 'count-1' conditions with is_last_sibling variables 2025-07-10 15:31:53 +03:00
Jussi Saurio
a81d81685f btree: rename divider_cells to divider_cell_payloads for clarity 2025-07-10 15:29:01 +03:00
Jussi Saurio
5d0b410e70 btree: rename constant to mention siblings 2025-07-10 15:28:11 +03:00
Jussi Saurio
78a249c6d0 btree: add MAX_SIBLING_PAGES_TO_BALANCE constant and use it 2025-07-10 15:27:37 +03:00
Jussi Saurio
6ff13113ce btree: add MAX_NEW_PAGES_AFTER_BALANCE constant and use it 2025-07-10 15:24:32 +03:00
Jussi Saurio
10d301c53c btree: use size-related new constants instead of literal numbers 2025-07-10 15:21:00 +03:00
Jussi Saurio
59b4f1310b sqlite3_ondisk: add some size constants 2025-07-10 15:16:37 +03:00
Jussi Saurio
925a252815 Merge 'btree: Improve balance non root docs' from Jussi Saurio
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2031
2025-07-10 15:07:18 +03:00
Jussi Saurio
0b8c5f7c91 btree/balance: extra doc context for CellArray::cell_payloads 2025-07-10 15:06:27 +03:00
Jussi Saurio
475bced4f7 btree/balance: remove obsolete todo 2025-07-10 14:58:00 +03:00
Jussi Saurio
0316b5a517 btree/balance: rename CellArray::cell_data to cell_payloads 2025-07-10 14:57:45 +03:00
Jussi Saurio
0d973d78a9 btree/balance: add a diagram about divider cell assignment and some comments 2025-07-10 14:56:59 +03:00