This PR implements basic support for partial sync. For now, the scope is limited to `:memory:` IO; it will be expanded to file-based IO later.
The main addition is `PartialDatabaseStorage`, which makes requests to the remote server for missing local pages on demand.
The other main change is that the tursodatabase JS bindings now accept an optional "external" IO event loop which, in the sync case, drives the `ProtocolIo` internal work associated with remote page-fetching tasks.
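As a rough sketch of the idea (names and types below are invented for illustration, not the PR's actual API), the storage layer answers page reads locally when it can and otherwise queues a fetch for the external IO loop to drive:

```rust
use std::collections::HashMap;

// Invented stand-ins for the real types; illustrative only.
struct PartialStorage {
    local_pages: HashMap<u32, Vec<u8>>, // pages already synced locally
    pending_fetches: Vec<u32>,          // page ids queued for the remote server
}

impl PartialStorage {
    /// Returns the page if present locally; otherwise queues a remote
    /// fetch and returns None so the caller can retry once it arrives.
    fn read_page(&mut self, page_id: u32) -> Option<&[u8]> {
        if !self.local_pages.contains_key(&page_id) {
            self.pending_fetches.push(page_id);
            return None;
        }
        self.local_pages.get(&page_id).map(Vec::as_slice)
    }
}
```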
Closes #3931
# Fix: Clean up DBSP state table when dropping materialized views
## Problem
When dropping a materialized view, the internal DBSP state table (e.g.,
`__turso_internal_dbsp_state_v1_view_name`) and its automatic primary
key index were not being properly cleaned up. This caused two issues:
1. **Persistent schema entries**: The DBSP table and index entries
remained in `sqlite_schema` after dropping the view
2. **In-memory schema inconsistency**: The DBSP table remained in the
in-memory schema's `tables` HashMap, causing "table already exists"
errors when trying to recreate a materialized view with the same name
## Root Cause
The issue had two parts:
1. **Missing sqlite_schema cleanup**: The `translate_drop_view` function
deleted the view entry from `sqlite_schema` but didn't delete the
associated DBSP state table and index entries
2. **Missing in-memory schema cleanup**: The `remove_view` function
removed the materialized view from the in-memory schema but didn't
remove the DBSP state table and its indexes
## Solution
### Changes in `core/translate/view.rs`
- Added a second-pass loop in `translate_drop_view` to scan `sqlite_schema` and delete the DBSP table and index entries
- The loop checks for entries matching the DBSP table name pattern (`__turso_internal_dbsp_state_v{version}_{view_name}`) and the automatic index name pattern (`sqlite_autoindex___turso_internal_dbsp_state_v{version}_{view_name}_1`); a name-derivation sketch follows this list
- Registers for comparison values are allocated outside the loop for
efficiency
- Column registers are reused across loop iterations
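As a hedged sketch (a hypothetical helper, not the PR's actual code), the two names the loop matches against relate like this:

```rust
// Hypothetical helper, for illustration only: derive the internal DBSP
// object names that the drop loop matches against.
fn dbsp_object_names(view_name: &str, version: u32) -> (String, String) {
    let table = format!("__turso_internal_dbsp_state_v{version}_{view_name}");
    // The automatic primary-key index prefixes the full table name.
    let index = format!("sqlite_autoindex_{table}_1");
    (table, index)
}

fn main() {
    let (table, index) = dbsp_object_names("my_view", 1);
    assert_eq!(table, "__turso_internal_dbsp_state_v1_my_view");
    assert_eq!(index, "sqlite_autoindex___turso_internal_dbsp_state_v1_my_view_1");
}
```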
### Changes in `core/schema.rs`
- Updated `remove_view` to also remove the DBSP state table and its indexes from the in-memory schema's `tables` HashMap and `indexes` collection (a simplified sketch follows this list)
- This ensures consistency between the persistent schema
(`sqlite_schema`) and the in-memory schema
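A minimal sketch of the shape of that cleanup, assuming a simplified `Schema` with `tables` and `indexes` maps (the real structures differ):

```rust
use std::collections::HashMap;

// Simplified stand-ins for the in-memory schema; illustrative only.
struct Table;
struct Index;
struct Schema {
    tables: HashMap<String, Table>,
    indexes: HashMap<String, Vec<Index>>, // keyed by table name here
}

impl Schema {
    fn remove_view(&mut self, view_name: &str, version: u32) {
        // Removing only the view would leave the DBSP state table behind,
        // causing "table already exists" on re-creation; drop it and its
        // indexes as well so memory stays consistent with sqlite_schema.
        let dbsp_table = format!("__turso_internal_dbsp_state_v{version}_{view_name}");
        self.tables.remove(&dbsp_table);
        self.indexes.remove(&dbsp_table);
    }
}
```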
### Tests Added
Added two new test cases in `testing/materialized_views.test`:
1. **`matview-drop-cleans-up-dbsp-table`**: Explicitly verifies that
after dropping a materialized view:
- The view entry is removed from `sqlite_schema`
- The DBSP state table entry is removed from `sqlite_schema`
- The DBSP state index entry is removed from `sqlite_schema`
2. **`matview-recreate-after-drop`**: Verifies that a materialized view
can be successfully recreated after being dropped, which implicitly
tests that all underlying resources (including DBSP tables) are properly
cleaned up
## Testing
- All existing materialized view tests pass
- New tests specifically verify the cleanup behavior
- Manual testing confirms that materialized views can be dropped and
recreated without errors
## Related
This fix ensures that materialized views can be safely dropped and
recreated, resolving issues where the DBSP state table would persist and
cause conflicts.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3928
## What
Rowsets are used in SQLite for two purposes:
1. for membership tests on a set of `i64`s,
2. for in-order iteration of a set of `i64`s.
Both apply in cases where we can just use rowids (which are `i64`) instead of building an entire ephemeral btree from a table's contents.
For example, in cases where a `DELETE FROM tbl WHERE ...` is performed
on a table that has any `BEFORE DELETE` triggers, SQLite collects the
table's rowids into a RowSet before actually performing the deletion.
This is similar to how an UPDATE that modifies rowids (or the index used
to iterate the UPDATE loop) will first collect the rows into an
ephemeral index, and same with `INSERT INTO ... SELECT`.
## Details
RowSet uses a "batch" concept: the caller must guarantee that insertions within a given batch contain no duplicates, so they can be pushed onto a vector in O(1). When a new batch is started, the previous batch is folded into a `BTreeSet` so that membership tests can be performed in O(log n). As far as I can tell, the "in-order iteration" use case doesn't use this batch logic at all.
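A minimal toy version of that batch scheme (invented names; the real RowSet differs):

```rust
use std::collections::BTreeSet;

// Invented names; a toy version of the batch scheme described above.
struct RowSet {
    batch: Vec<i64>,       // current batch: caller guarantees no duplicates
    folded: BTreeSet<i64>, // earlier batches, folded for O(log n) membership
    batch_id: i64,
}

impl RowSet {
    fn insert(&mut self, batch_id: i64, rowid: i64) {
        self.fold_if_new_batch(batch_id);
        self.batch.push(rowid); // O(1): duplicates are the caller's problem
    }

    fn contains(&mut self, batch_id: i64, rowid: i64) -> bool {
        self.fold_if_new_batch(batch_id);
        // Only the folded set is consulted: within the current batch the
        // caller has guaranteed uniqueness, so a hit can only be in there.
        self.folded.contains(&rowid)
    }

    fn fold_if_new_batch(&mut self, batch_id: i64) {
        if batch_id != self.batch_id {
            self.folded.extend(self.batch.drain(..));
            self.batch_id = batch_id;
        }
    }
}
```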
## AI disclosure
This entire PR description was written by me - no AIs were harmed in the
production of it. However, the code itself was mostly vibecoded using
two agents in Cursor:
- Composer 1: given the SQLite opcode documentation and rowset.c source
code, and asked to implement the VDBE instructions and the RowSet
module.
- GPT-5: given the same SQLite docs and source code, and asked to review
Composer 1's work and write feedback into a separate markdown file.
This loop was run for roughly 4-5 iterations, each time feeding GPT-5's feedback back to Composer 1, until GPT-5 had nothing left to comment on.
After this, I instructed Composer 1 to improve the documentation to be
less stupid.
After that, I made a manual editing pass over the runtime code to e.g. change boolean flags to a `RowSetMode` enum to make it clearer that the rowset has two distinct, mutually exclusive purposes (membership tests and in-order iteration), plus cleaned up some other dumb shit and added comments.
I am still not sure if this saved time or not.
Closes #3938
When building text values, we could not pass ownership of newly created strings, which meant we were often cloning strings twice: once to transform and once to build the `Value`.
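A minimal sketch of the pattern with a toy `Value` type (not the actual turso type):

```rust
// Toy Value type for illustration; the real Value is richer.
enum Value {
    Text(String),
}

// Before: borrowing forces a second allocation even when the caller
// already owns a freshly transformed String.
fn text_from_ref(s: &str) -> Value {
    Value::Text(s.to_string()) // clone number two
}

// After: accepting ownership lets the transformed String be moved in.
fn text_from_owned(s: impl Into<String>) -> Value {
    Value::Text(s.into()) // no extra clone when given an owned String
}

fn main() {
    let transformed = "hello".to_uppercase();    // allocation one: transform
    let _borrowed = text_from_ref(&transformed); // would clone again
    let _owned = text_from_owned(transformed);   // moved, not re-cloned
}
```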
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #3932
Right now tursodb treats parameters/variables as non-constant. But they are actually constant in the sense that a parameter/variable has a fixed value during query execution that never changes.
This PR makes tursodb treat parameters as constants and evaluate expressions involving them only once.
One real-world scenario where this can be helpful is a vector search query:
```sql
SELECT id, vector_distance_jaccard(embedding, vector32_sparse(?)) as distance
FROM vectors
ORDER BY distance ASC
LIMIT ?
```
Without this constant optimization, the `vector32_sparse` function is executed for every row, which is very inefficient: the query can be 100x slower because of it, even though there is no need to evaluate the function per row, as we can transform the text representation to binary just once.
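Schematically, the effect is the same as hoisting the parameter-only expression out of the row loop (toy code, not the engine's actual VDBE transformation):

```rust
// Toy illustration of the optimization's effect: the parameter-derived
// value is computed once before the scan instead of once per row.
fn scan_distances(rows: &[Vec<f32>], param_text: &str) -> Vec<f32> {
    // Stand-in for vector32_sparse(?): an expensive text -> binary parse.
    let query: Vec<f32> = param_text
        .split(',')
        .filter_map(|t| t.trim().parse().ok())
        .collect(); // evaluated once: the parameter can't change mid-query

    rows.iter()
        .map(|row| {
            // Stand-in for the per-row distance computation.
            row.iter().zip(&query).map(|(a, b)| (a - b).abs()).sum()
        })
        .collect()
}
```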
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3936
This PR makes `Completion` `Send` and also forces the internal callbacks to be `Send`.
The reasons are as follows:
1. `io_uring` can currently execute a completion at any moment, potentially on an arbitrary thread, so we already implicitly rely on this property of `Completion` and its callbacks.
2. In the case of partial sync (https://github.com/tursodatabase/turso/pull/3931), there will be an additional requirement for `Completion` to be `Send`, as it will be put in a separate queue associated with `DatabaseStorage` (which is `Send + Sync`) and processed in parallel with the main IO.
3. Generally, it is quite natural in the context of async IO for `Completion` to be `Send` so it can be safely transferred between threads.
The approach in this PR is hacky, as `Completion` is made `Send` in a pretty unsafe way. The main reasons Rust can't derive `Send` automatically are:
1. Many completions hold an `Arc<Buffer>` internally, which needs to be marked with the unsafe traits explicitly because it holds a `ptr: NonNull<u8>`.
2. `Completion` holds `CompletionInner` in an `Arc`, which internally holds the completion callback as a `Box<XXXComplete>`. Because it's guarded by an `Arc`, Rust forces the completion callback to also be `Sync` (not only `Send`), and as we usually move a `Completion` into the callback, we get a cycle here: with the current code, `Send` for `Completion` implies `Sync` for `Completion`.
So, to fix this, the PR marks `ArenaBuffer` as `Send + Sync` and forces completion callbacks to be `Send + Sync` too. The `Sync` requirement seems theoretically unnecessary, and `Send` should be enough, but with the current code organization `Send + Sync` looks like the simplest approach.
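To make point 2 concrete, a minimal sketch with stand-in types (not the real turso definitions):

```rust
use std::ptr::NonNull;
use std::sync::Arc;

// Stand-in for ArenaBuffer: the raw pointer makes it !Send + !Sync by
// default, which is what blocks the automatic derive.
struct ArenaBuffer {
    ptr: NonNull<u8>,
    len: usize,
}

// SAFETY: sound only if arena access is synchronized across threads;
// as the next paragraph notes, that synchronization is the worrying part.
unsafe impl Send for ArenaBuffer {}
unsafe impl Sync for ArenaBuffer {}

// Arc<T>: Send requires T: Send + Sync, so a callback stored behind an
// Arc must be Sync as well; Send alone is not enough.
struct CompletionInner {
    callback: Box<dyn FnOnce() + Send + Sync>,
}
struct Completion(Arc<CompletionInner>);
```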
Making `ArenaBuffer` `Sync` sounds almost correct, although I am worried about read/write access to it: internally, `ArenaBuffer` does not introduce any synchronization of its reads/writes, so we could potentially already hit multi-threading bugs with io_uring due to `ArenaBuffer` being used from different threads (or maybe there are implicit memory barriers in other parts of the code that guarantee we use `ArenaBuffer` properly, but that sounds like pure luck).
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3935
The current implementation is simple: we have a pointer called `CursorPosition::Loaded` that points to a rowid and records whether it is pointing into the btree or MVCC.
Moving with `next` will `peek` both the btree and MVCC to ensure we load the correct next value. This introduces some inefficiencies for now, as we could simply skip one or the other in different cases.
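A toy version of the peek-both merge step (invented signatures, not the actual cursor API):

```rust
use std::iter::Peekable;

// Toy merge step: peek both sorted rowid streams and advance the smaller,
// deduplicating when both sides hold the same rowid. The real cursor
// could skip one side entirely in some cases, hence the inefficiency.
fn merged_next(
    btree: &mut Peekable<impl Iterator<Item = i64>>,
    mvcc: &mut Peekable<impl Iterator<Item = i64>>,
) -> Option<i64> {
    match (btree.peek().copied(), mvcc.peek().copied()) {
        (Some(b), Some(m)) if b < m => btree.next(),
        (Some(b), Some(m)) if b > m => mvcc.next(),
        (Some(_), Some(_)) => {
            btree.next(); // equal rowids: consume both, yield one
            mvcc.next()
        }
        (Some(_), None) => btree.next(),
        (None, _) => mvcc.next(),
    }
}
```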
> [!NOTE]
> Combine the MVCC index with a BTree-backed lazy cursor (including rootpage mapping) and add row-version state checks, updating VDBE open paths and tests.
>
> - **MVCC Cursor (`core/mvcc/cursor.rs`)**:
>   - Introduce a hybrid cursor that merges the MVCC index with `BTreeCursor`; enhanced `CursorPosition` (tracks `in_btree`/`btree_consumed`).
>   - Implement a state machine for `next`, coordinating MVCC/BTree iteration and filtering via `RowVersionState`.
>   - `current_row()` now yields immutable records from BTree or MVCC; add `read_mvcc_current_row`.
>   - Update `rowid`, `seek`, `rewind`, `last`, `seek_to_last`, `exists`, `insert` to honor hybrid positioning.
> - **MVCC Store (`core/mvcc/database/mod.rs`)**:
>   - Add `RowVersionState` and `find_row_last_version_state`.
>   - Remove eager table initialization/scan helpers and `loaded_tables` tracking.
>   - Add `get_real_table_id` for mapping negative IDs to physical root pages.
> - **VDBE (`core/vdbe/execute.rs`)**:
>   - Route BTree cursor creation through `maybe_transform_root_page_to_positive` and promote to `MvCursor` without the pager arg.
>   - Apply the mapping in `OpenRead`, `OpenWrite`, `OpenDup`, and index open paths.
> - **Tests (`core/mvcc/database/tests.rs`)**:
>   - Adjust to the new cursor API; add coverage for BTree+MVCC iteration and gaps after checkpoint/restart.
Closes #3829