Different scan parameters are required for different table types.
Currently, index and iteration direction are only used by B-tree tables,
while the remaining table types don’t require any parameters. Planning
access to virtual tables, however, will require passing additional
information from the planner, such as the virtual table index (distinct
from a B-tree index) and the constraints that must be forwarded to the
`filter` method.
Previously, AccessMethod stored fields like `iter_dir`, `index`, and
`constraint_refs` directly, but these only applied to BTree tables.
Other table types (virtual tables, subqueries) either ignored these
fields or required different parameters entirely.
This change prepares the planner to handle virtual table access methods
with their own specialized parameters.
This matches SQLite’s behavior and will help in the future to
differentiate between an invalid function invocation (missing argument,
not provided by the user) and an invalid combination of constraints
proposed by the planner.
No new integration tests are added, since this case was already covered
by the `filter` method. With the ability to return result codes from
`best_index`, we can now detect this error earlier.
The `best_index` implementation now returns a ResultCode along with the
IndexInfo. This allows it to signal specific outcomes, such as errors or
constraint violations. This change aligns better with SQLite’s xBestIndex
contract, where cases like missing constraints or invalid combinations of
constraints must not result in a valid plan.
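To make the new contract concrete, here is a minimal sketch with simplified stand-in types; the real `ResultCode`, `ConstraintInfo`, and `IndexInfo` live in the extension API and differ in detail:
```rust
// Illustrative stand-ins only; the real types are richer.
#[derive(Debug)]
pub enum ResultCode {
    ConstraintViolation,
}

pub struct ConstraintInfo {
    pub usable: bool,
}

pub struct ConstraintUsage {
    // `None` means "do not pass this constraint to `filter`".
    pub argv_index: Option<u32>,
}

pub struct IndexInfo {
    // Must correspond 1:1 with the input constraints.
    pub constraint_usages: Vec<ConstraintUsage>,
}

// The new shape: on failure, return a ResultCode instead of a
// half-valid IndexInfo.
pub fn best_index(constraints: &[ConstraintInfo]) -> Result<IndexInfo, ResultCode> {
    // For simplicity, treat every constraint as required: an unusable
    // one means no valid plan exists, and this must not look valid.
    if constraints.iter().any(|c| !c.usable) {
        return Err(ResultCode::ConstraintViolation);
    }
    Ok(IndexInfo {
        constraint_usages: constraints
            .iter()
            .map(|_| ConstraintUsage { argv_index: None })
            .collect(),
    })
}
```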
The `filter` methods for extensions affected by this fix expect arguments
to be passed in a specific order. For example, `generate_series` assumes
that if the `start` argument exists, it is always passed to `filter`
first. If `start` does not exist, then `stop` is passed first — but
`stop` must never come before `start`.
Previously, this was not guaranteed: `best_index` simply assumed the
incoming constraints were already in the order that `filter` expects.
Previously, there were two ways to indicate that a constraint should not
be passed to the filter function: setting `argv_index` to `None` or to
a value less than 1. This was redundant, so now only `None` is used.
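A hedged sketch tying the ordering and the `None` convention together, with made-up constraint kinds standing in for `generate_series`'s real ones:
```rust
#[derive(Clone, Copy, PartialEq)]
enum Arg {
    Start,
    Stop,
    Step,
}

/// Hand out 1-based argv indices in the fixed order `filter` expects:
/// `start` first (if present), then `stop`, then `step`. `None` is now
/// the only marker for "do not pass this constraint to `filter`".
fn assign_argv_indices(constraints: &[Arg]) -> Vec<Option<u32>> {
    let mut next = 1u32;
    let mut out = vec![None; constraints.len()];
    for wanted in [Arg::Start, Arg::Stop, Arg::Step] {
        if let Some(pos) = constraints.iter().position(|&c| c == wanted) {
            out[pos] = Some(next);
            next += 1;
        }
    }
    out
}
```
With this scheme, `stop` can never receive a lower argv index than `start`, which is exactly what `filter` assumes.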
Additional changes:
- Update the IndexInfo documentation to clarify that constraint_usages must
have an exact 1:1 correspondence with the input ConstraintInfo array. The
code translating constraints into VFilter arguments relies heavily on this.
- Fix the best_index implementation in the test extension to comply with the
new validation requirements by returning a usage entry for each constraint.
Closes#2421
## Background
We have a transaction-local hack (`start_pages_in_frames`) that tracks
how many pages are currently in the in-memory WAL frame cache,
presumably for performance reasons.
`wal.rollback()` clears all the frames from `shared.frame_cache` that
the rolling-back transaction is allowed to clear, and then truncates
`shared.pages_in_frames` back down to its local
`start_pages_in_frames` value.
## Problem
In `complete_append_frame`, we check if `frame_cache` has that key
(page) already, and if not, we add it to `pages_in_frames`.
However, `wal.rollback()` never _removes_ the key (page) if its value is
empty, so we can end up in a scenario where the `frame_cache` key for
`page P` exists but has no frames, and so `page P` does not get added to
`pages_in_frames` in `complete_append_frame`.
This leads to a checkpoint data loss scenario:
- A transaction rolls back with `start_pages_in_frames=0`, so it truncates
the shared `pages_in_frames` to an empty vec. Say the `page P` key in
`frame_cache` still remains, but it has no frames.
- The next time someone commits a frame for `page P`, it does NOT get
added to `pages_in_frames`, because `frame_cache` already has that key
(even though the value vector is empty).
- At some point, a PASSIVE checkpoint checkpoints `n` frames, but since
`pages_in_frames` does not contain `page P`, it doesn't actually checkpoint
it, and all the "checkpointed" frames are simply thrown away.
- This is very similar to the scenario in #2366.
## Fix
Remove the `start_pages_in_frames` hack entirely and make
`pages_in_frames` effectively the same as `frame_cache.keys`. We could
probably also get rid of `pages_in_frames` altogether and use
`frame_cache.contains_key(p)` instead, but maybe Pere can chime in here.
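A minimal sketch of the invariant the fix enforces, with simplified stand-in types (the real WAL state is more involved):
```rust
use std::collections::HashMap;

// Simplified stand-in for the shared WAL state; real field types differ.
struct SharedWal {
    // page id -> WAL frames containing that page
    frame_cache: HashMap<u64, Vec<u64>>,
    // invariant after the fix: always exactly frame_cache's key set
    pages_in_frames: Vec<u64>,
}

impl SharedWal {
    fn complete_append_frame(&mut self, page: u64, frame: u64) {
        let frames = self.frame_cache.entry(page).or_default();
        if frames.is_empty() {
            // The page had no live frames, so it cannot already be in
            // pages_in_frames; an empty-but-present key no longer lies.
            self.pages_in_frames.push(page);
        }
        frames.push(frame);
    }

    fn rollback(&mut self, first_rolled_back_frame: u64) {
        // Drop rolled-back frames and, crucially, remove a page's key
        // entirely once it has no frames left, instead of truncating
        // pages_in_frames to a saved start_pages_in_frames count.
        self.frame_cache.retain(|_, frames| {
            frames.retain(|&f| f < first_rolled_back_frame);
            !frames.is_empty()
        });
        let frame_cache = &self.frame_cache;
        self.pages_in_frames.retain(|p| frame_cache.contains_key(p));
    }
}
```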
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2422
Resolves#2378.
```
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
time: [15.645 ms 15.741 ms 15.850 ms]
Found 12 outliers among 100 measurements (12.00%)
8 (8.00%) high mild
4 (4.00%) high severe
`ALTER TABLE _ RENAME TO _`/sqlite_rename_table/
time: [34.728 ms 35.260 ms 35.955 ms]
Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) high mild
7 (7.00%) high severe
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2399
Closes#2431
Discovered while fuzzing #2086
## What
We update `schema_version` whenever the schema changes.
## Problem
Probably unintentionally, we were calling `SetCookie` in a loop, once for
each row in the target table, instead of only once at the end. This means
two things:
- For large `n`, this is a lot of unnecessary instructions.
- For `n == 0`, `SetCookie` doesn't get called at all, so the schema won't
get marked as having been updated, and connections can operate on a stale schema.
## Fix
Lift `SetCookie` out of the loop
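Schematically, with plain Rust standing in for the emitted VDBE-style instructions:
```rust
// Schematic of the program's runtime effect; not real turso bytecode.
struct Row;
impl Row {
    fn apply_update(&mut self) {}
}

fn run_schema_changing_statement(rows: &mut [Row], schema_version: &mut u32) {
    for row in rows.iter_mut() {
        row.apply_update();
        // Before the fix, the SetCookie step ran here: once per row,
        // and never at all when `rows` was empty.
    }
    // After the fix, the cookie is bumped exactly once, after the loop,
    // so the schema is marked updated even for a zero-row table.
    *schema_version += 1;
}
```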
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2432
The WAL API shouldn't be exposed by default: it is a relatively
dangerous API that we use internally, and ordinary users shouldn't
need it.
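One common way to do this in Rust is a compile-time feature gate; a sketch assuming a hypothetical `wal_api` feature name (the crate's actual opt-in mechanism may differ):
```rust
// Hypothetical feature name for illustration only; check the crate's
// Cargo.toml for the real opt-in mechanism.
#[cfg(feature = "wal_api")]
pub mod wal_api {
    /// Placeholder: only compiled, and therefore only visible to users,
    /// when the crate is built with `--features wal_api`.
    pub fn wal_api_entry_point() {}
}
```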
Reviewed-by: Pekka Enberg <penberg@iki.fi>
Closes#2424
## Background
When we get a new rowid using `op_new_rowid()`, we move to the end of
the btree to look at what the maximum rowid currently is, and then
increment it by one.
This requires a btree seek.
## Problem
If we were already on the rightmost page, this is a lot of unnecessary
work, including potentially a few page reads from disk (although, to be
fair, the ancestor pages are very likely to be in cache at this point).
## Fix
Cache the rightmost page id whenever we enter it in
`move_to_rightmost()`, and invalidate it whenever we do a balancing
operation.
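A minimal sketch of the caching scheme, with illustrative names (the real cursor state differs):
```rust
// Illustrative sketch of the rightmost-page cache.
struct BTreeCursor {
    rightmost_page_id: Option<u64>,
}

impl BTreeCursor {
    fn move_to_rightmost(&mut self) -> u64 {
        if let Some(page_id) = self.rightmost_page_id {
            // Fast path: we already know the rightmost leaf, so no
            // root-to-leaf seek (and no page reads) is needed.
            return page_id;
        }
        let page_id = self.seek_rightmost_leaf();
        self.rightmost_page_id = Some(page_id);
        page_id
    }

    fn on_balance(&mut self) {
        // A balancing operation can move or split the rightmost leaf,
        // so the cached id may be stale: invalidate it.
        self.rightmost_page_id = None;
    }

    fn seek_rightmost_leaf(&self) -> u64 {
        // Stand-in for the real root-to-leaf descent.
        42
    }
}
```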
## Local benchmark results
```
Insert rows in batches/limbo_insert_1_rows
time: [23.333 µs 27.718 µs 35.801 µs]
change: [-7.7924% +0.8805% +12.841%] (p = 0.91 > 0.05)
No change in performance detected.
Insert rows in batches/limbo_insert_10_rows
time: [38.204 µs 38.381 µs 38.568 µs]
change: [-8.7188% -7.4786% -6.1955%] (p = 0.00 < 0.05)
Performance has improved.
Insert rows in batches/limbo_insert_100_rows
time: [158.39 µs 165.06 µs 178.37 µs]
change: [-21.000% -18.789% -15.666%] (p = 0.00 < 0.05)
Performance has improved.
```
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2409
This should be safe to do because:
1. page cache is private per connection
2. since this connection wrote the flushed pages/frames, they are up to
date from its perspective
3. multiple concurrent statements inside one connection are not
snapshot-transactional even in SQLite
Reviewed-by: Pekka Enberg <penberg@iki.fi>
Closes#2407
https://github.com/tursodatabase/turso/pull/1256 switched cargo-dist to
Astral's forked version, but the official repository recently got a new
maintainer and is actively maintained again.
Its latest release, [v0.29.0](https://github.com/axodotdev/cargo-dist/releases/tag/v0.29.0),
now includes the features originally added in Astral's fork, so it's
probably a good time to switch back to the official cargo-dist. That
said, since there are no significant changes from Astral's version, it's
also fine to stay on the current one.
Closes#2398
While working on #2151 I found myself forced to do things like:
```rust
assert_eq!(
6,
*result
.next()
.await?
.unwrap()
.get_value(0)?
.as_integer()
.unwrap()
);
```
just to get a simple value from a row. With this PR, users can now
simply do:
```rust
assert_eq!(6, result.get::<i32>(0)?);
```
(Thanks libsql devs, this is so much better!)
Closes#2377
This will save some work when yielding to IO. Previously, on every
invocation, if the record was a packed record, we parsed it and iterated
through the values to check for nulls. Now, the pre-seeking work is done
only once.
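A hedged sketch of the shape of this change, with made-up state names: the record parse and NULL scan run once, and re-entry after an IO yield resumes past them.
```rust
// Illustrative state machine; the real opcode keeps more state.
enum SeekState {
    // Pre-seek work (parsing the packed record and scanning it for
    // NULLs) has not run yet.
    Start,
    // Pre-seek work is done; re-entry after an IO yield resumes here
    // instead of re-parsing the record.
    Ready { has_null: bool },
}

// Re-entrant step function: the Start arm runs at most once.
fn step(state: &mut SeekState, record: &[Option<i64>]) -> bool {
    loop {
        match state {
            SeekState::Start => {
                // Done exactly once, no matter how many times the seek
                // later yields to IO and is re-entered.
                let has_null = record.iter().any(|v| v.is_none());
                *state = SeekState::Ready { has_null };
            }
            // Surface the cached result to the rest of the seek logic.
            SeekState::Ready { has_null } => return !*has_null,
        }
    }
}
```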
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2394