6076 Commits

Author SHA1 Message Date
Pavan-Nambi
8d2b06e6bf remove stupid files,clippy and tcl-syntax 2025-11-14 08:24:01 +05:30
Pavan-Nambi
eaa8edb6f7 don't overwrite col mappings 2025-11-14 07:41:41 +05:30
Pekka Enberg
1f79fbc22c Merge 'Partial sync basic' from Nikita Sivukhin
This PR implements basic support for partial sync. For now the scope
is limited to `:memory:` IO; support for file-based IO will be added
later.
The main addition is `PartialDatabaseStorage`, which requests missing
local pages from the remote server on demand.
The main change is that the tursodatabase JS bindings now accept an
optional "external" IO event loop which, in the case of sync, drives the
internal `ProtocolIo` work associated with remote page fetching tasks.
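As a rough illustration of fetching missing pages on demand, here is a minimal synchronous sketch. The names `RemotePages`, `PartialStorage`, `CountingRemote`, and `read_page` are hypothetical and are not taken from the PR; the real `PartialDatabaseStorage` is driven by an IO event loop rather than a blocking call.

```rust
use std::collections::HashMap;

/// Hypothetical remote interface; not the PR's actual `ProtocolIo` API.
trait RemotePages {
    fn fetch_page(&mut self, page_no: u64) -> Vec<u8>;
}

/// Hypothetical storage that keeps only the pages it has already seen.
struct PartialStorage<R: RemotePages> {
    local: HashMap<u64, Vec<u8>>,
    remote: R,
}

impl<R: RemotePages> PartialStorage<R> {
    /// Returns the page, asking the remote only when it is missing locally.
    fn read_page(&mut self, page_no: u64) -> &[u8] {
        if !self.local.contains_key(&page_no) {
            let page = self.remote.fetch_page(page_no);
            self.local.insert(page_no, page);
        }
        &self.local[&page_no]
    }
}

/// Toy remote that counts how many pages were requested.
struct CountingRemote {
    requests: usize,
}

impl RemotePages for CountingRemote {
    fn fetch_page(&mut self, page_no: u64) -> Vec<u8> {
        self.requests += 1;
        vec![page_no as u8; 4] // fake 4-byte page filled with the page number
    }
}
```

Reading the same page twice through `read_page` should hit the remote only once in this sketch, because the second read is served from the local map.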

Closes #3931
2025-11-13 16:38:04 +02:00
Pere Diaz Bou
7c96b6d9f9 Merge 'Fix: Drop internal DBSP table when dropping materialized view' from Martin Mauch
# Fix: Clean up DBSP state table when dropping materialized views
## Problem
When dropping a materialized view, the internal DBSP state table (e.g.,
`__turso_internal_dbsp_state_v1_view_name`) and its automatic primary
key index were not being properly cleaned up. This caused two issues:
1. **Persistent schema entries**: The DBSP table and index entries
remained in `sqlite_schema` after dropping the view
2. **In-memory schema inconsistency**: The DBSP table remained in the
in-memory schema's `tables` HashMap, causing "table already exists"
errors when trying to recreate a materialized view with the same name
## Root Cause
The issue had two parts:
1. **Missing sqlite_schema cleanup**: The `translate_drop_view` function
deleted the view entry from `sqlite_schema` but didn't delete the
associated DBSP state table and index entries
2. **Missing in-memory schema cleanup**: The `remove_view` function
removed the materialized view from the in-memory schema but didn't
remove the DBSP state table and its indexes
## Solution
### Changes in `core/translate/view.rs`
- Added a second pass loop in `translate_drop_view` to scan
`sqlite_schema` and delete DBSP table and index entries
- The loop checks for entries matching the DBSP table name pattern
(`__turso_internal_dbsp_state_v{version}_{view_name}`) and the automatic
index name pattern
(`sqlite_autoindex___turso_internal_dbsp_state_v{version}_{view_name}_1`)
- Registers for comparison values are allocated outside the loop for
efficiency
- Column registers are reused across loop iterations
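
The two name patterns above can be sketched in plain Rust as follows. The helper names are hypothetical; the actual change emits bytecode that scans `sqlite_schema`, which this sketch does not attempt to reproduce.

```rust
/// Hypothetical helper: DBSP state table name for a materialized view.
fn dbsp_state_table_name(version: u32, view_name: &str) -> String {
    format!("__turso_internal_dbsp_state_v{version}_{view_name}")
}

/// Hypothetical helper: name of the automatic primary key index on that table.
fn dbsp_state_index_name(version: u32, view_name: &str) -> String {
    format!("sqlite_autoindex_{}_1", dbsp_state_table_name(version, view_name))
}

/// True if a `sqlite_schema` entry name belongs to the view being dropped.
fn is_dbsp_entry_for_view(entry_name: &str, version: u32, view_name: &str) -> bool {
    entry_name == dbsp_state_table_name(version, view_name)
        || entry_name == dbsp_state_index_name(version, view_name)
}
```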
### Changes in `core/schema.rs`
- Updated `remove_view` to also remove the DBSP state table and its
indexes from the in-memory schema's `tables` HashMap and `indexes`
collection
- This ensures consistency between the persistent schema
(`sqlite_schema`) and the in-memory schema
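
A simplified model of the in-memory side may help show why the stale entry caused "table already exists" errors. The `Schema` struct, its fields, and both methods below are assumptions made for illustration; the real types in `core/schema.rs` hold much more than names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the in-memory schema (names and placeholder SQL only).
#[derive(Default)]
struct Schema {
    views: HashMap<String, String>,
    tables: HashMap<String, String>,
    indexes: HashMap<String, Vec<String>>,
}

/// Hypothetical helper mirroring the DBSP state table name pattern.
fn state_table_name(version: u32, view: &str) -> String {
    format!("__turso_internal_dbsp_state_v{version}_{view}")
}

impl Schema {
    /// Registers the view together with its state table and automatic index.
    fn create_materialized_view(&mut self, view: &str, version: u32) -> Result<(), String> {
        let table = state_table_name(version, view);
        if self.tables.contains_key(&table) {
            return Err(format!("table {table} already exists"));
        }
        let index = format!("sqlite_autoindex_{table}_1");
        self.indexes.insert(table.clone(), vec![index]);
        self.tables.insert(table, String::from("<state table sql>"));
        self.views.insert(view.to_string(), String::from("<view sql>"));
        Ok(())
    }

    /// Removes the view and, as in the fix, its state table and indexes.
    fn remove_view(&mut self, view: &str, version: u32) {
        self.views.remove(view);
        let table = state_table_name(version, view);
        self.tables.remove(&table);
        self.indexes.remove(&table);
    }
}
```

If `remove_view` skipped the last two removals in this model, recreating the view would fail in `create_materialized_view`.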
### Tests Added
Added two new test cases in `testing/materialized_views.test`:
1. **`matview-drop-cleans-up-dbsp-table`**: Explicitly verifies that
after dropping a materialized view:
   - The view entry is removed from `sqlite_schema`
   - The DBSP state table entry is removed from `sqlite_schema`
   - The DBSP state index entry is removed from `sqlite_schema`
2. **`matview-recreate-after-drop`**: Verifies that a materialized view
can be successfully recreated after being dropped, which implicitly
tests that all underlying resources (including DBSP tables) are properly
cleaned up
## Testing
- All existing materialized view tests pass
- New tests specifically verify the cleanup behavior
- Manual testing confirms that materialized views can be dropped and
recreated without errors
## Related
This fix ensures that materialized views can be safely dropped and
recreated, resolving issues where the DBSP state table would persist and
cause conflicts.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3928
2025-11-12 17:16:04 +01:00
Nikita Sivukhin
3d14092679 fix 2025-11-12 16:38:04 +04:00
Nikita Sivukhin
54cb7758ef fix formatting 2025-11-12 16:14:26 +04:00
Jussi Saurio
16097e7355 Merge 'Add RowSet<Add/Read/Test> instructions and rowset implementation' from Jussi Saurio
## What
Rowsets are used in SQLite for two purposes:
1. for membership tests on a set of `i64`s,
2. for in-order iteration of a set of `i64`s.
Both apply in cases where we can just use rowids (which are `i64`)
instead of building an entire ephemeral btree from a table's contents.
For example, in cases where a `DELETE FROM tbl WHERE ...` is performed
on a table that has any `BEFORE DELETE` triggers, SQLite collects the
table's rowids into a RowSet before actually performing the deletion.
This is similar to how an UPDATE that modifies rowids (or the index used
to iterate the UPDATE loop) will first collect the rows into an
ephemeral index, and same with `INSERT INTO ... SELECT`.
## Details
RowSet uses a "batch" concept: the caller must guarantee that the
insertions of a given batch contain no duplicates, and they are pushed
onto a vector in O(1). When a new batch is started, the previous batch is
folded into a `BTreeSet` so that membership tests can be performed in
O(log n). As far as I can tell, the "in-order iteration" use case doesn't
use this batch logic at all.
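The batch idea can be sketched roughly like this. The struct layout and method names are assumptions and not the PR's actual `RowSet` API; the sketch also assumes membership tests only consult rows from already-folded batches.

```rust
use std::collections::BTreeSet;

/// Minimal sketch of a batch-based rowset used for membership tests.
#[derive(Default)]
struct RowSet {
    /// Rowids of the current batch; the caller guarantees no duplicates.
    current_batch: Vec<i64>,
    /// Rowids of all earlier batches, searchable in O(log n).
    folded: BTreeSet<i64>,
}

impl RowSet {
    /// Amortized O(1) insert into the current batch.
    fn insert(&mut self, rowid: i64) {
        self.current_batch.push(rowid);
    }

    /// Starts a new batch by folding the previous one into the ordered set.
    fn start_new_batch(&mut self) {
        self.folded.extend(self.current_batch.drain(..));
    }

    /// Membership test against rows from earlier batches (sketch assumption).
    fn contains(&self, rowid: i64) -> bool {
        self.folded.contains(&rowid)
    }
}
```

Deferring the fold keeps inserts cheap while a batch is being filled, and the ordered set only pays its O(log n) cost once per row when the batch closes.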
## AI disclosure
This entire PR description was written by me - no AIs were harmed in the
production of it. However, the code itself was mostly vibecoded using
two agents in Cursor:
- Composer 1: given the SQLite opcode documentation and rowset.c source
code, and asked to implement the VDBE instructions and the RowSet
module.
- GPT-5: given the same SQLite docs and source code, and asked to review
Composer 1's work and write feedback into a separate markdown file.
This loop was run for roughly 4-5 iterations, each time feeding GPT-5's
feedback back to Composer 1, until GPT-5 had nothing left to comment
on.
After this, I instructed Composer 1 to improve the documentation to be
less stupid.
After that, I made a manual editing pass over the runtime code to e.g.
change boolean flags to a `RowSetMode` enum to make clearer that the
rowset has two distinct mutually exclusive purposes (membership tests
and in-order iteration), plus cleaned up some other dumb shit and added
comments.
I am still not sure if this saved time or not.

Closes #3938
2025-11-12 13:02:00 +02:00
Jussi Saurio
933c3112f9 Merge 'Use AsValueRef in more functions' from Pedro Muniz
Depends on #3932
Converting more functions to use `AsValueRef`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3934
2025-11-12 12:54:39 +02:00
Jussi Saurio
65a7dd40b3 Merge 'Change Value::Text and ValueRef::Text to use Cow<'static, str> and &str to avoid allocations' from Pedro Muniz
When building text values, we could not pass ownership of newly created
strings, which meant that much of the time we cloned strings twice:
once to transform them and once to build the `Value`.
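A minimal sketch of the idea, assuming a cut-down `Value` enum with only a text variant (the real enum has more variants, and the function names here are hypothetical):

```rust
use std::borrow::Cow;

/// Simplified stand-in for the real `Value` enum (text variant only).
#[derive(Debug, PartialEq)]
enum Value {
    Text(Cow<'static, str>),
}

/// The freshly built `String` is moved into the value; no second copy is made.
fn upper(input: &str) -> Value {
    Value::Text(Cow::Owned(input.to_uppercase()))
}

/// A string literal is borrowed for `'static`, so nothing is allocated.
fn type_name() -> Value {
    Value::Text(Cow::Borrowed("text"))
}
```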

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3932
2025-11-12 12:54:16 +02:00
Jussi Saurio
a63e12f793 Merge 'treat parameters as "constant" within a query' from Nikita Sivukhin
Right now tursodb treats parameters/variables as non-constant. But
they are actually constant in the sense that a parameter/variable has a
fixed value that never changes during query execution.
This PR makes tursodb treat parameters as constant and evaluate
expressions that depend only on them once.
One real-world scenario where this can be helpful is vector search
query:
```sql
    SELECT id, vector_distance_jaccard(embedding, vector32_sparse(?)) as distance
    FROM vectors
    ORDER BY distance ASC
    LIMIT ?
```
Without the constant optimization, the `vector32_sparse` function is
executed for every row, which is very inefficient and can make the query
100x slower. There is no need to evaluate this function for every row,
as we can transform the text representation to binary only once.
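The effect can be illustrated outside the query engine with a small sketch. `parse_vector` stands in for an expensive conversion such as `vector32_sparse(?)`, and `distance` is a simple stand-in rather than a Jaccard distance; all names are hypothetical.

```rust
/// Stand-in for an expensive parameter conversion; counts how often it runs.
fn parse_vector(text: &str, calls: &mut usize) -> Vec<f32> {
    *calls += 1;
    text.split(',')
        .filter_map(|s| s.trim().parse::<f32>().ok())
        .collect()
}

/// Stand-in distance (sum of absolute differences), not Jaccard.
fn distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Parameter treated as non-constant: the conversion runs once per row.
fn scan_naive(rows: &[Vec<f32>], param: &str, calls: &mut usize) -> Vec<f32> {
    rows.iter()
        .map(|row| distance(row, &parse_vector(param, calls)))
        .collect()
}

/// Parameter treated as constant: the conversion runs once, before the loop.
fn scan_hoisted(rows: &[Vec<f32>], param: &str, calls: &mut usize) -> Vec<f32> {
    let query = parse_vector(param, calls);
    rows.iter().map(|row| distance(row, &query)).collect()
}
```

Both scans return the same distances; only the number of conversions differs (one per row versus one per query).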

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3936
2025-11-12 11:46:10 +02:00
Jussi Saurio
cdf2f0d3c5 Fix comment about DELETE ... RETURNING 2025-11-12 11:43:06 +02:00
Jussi Saurio
da92982f41 Add RowSet<Add/Read/Test> instructions and rowset implementation
Rowsets are used in SQLite for two purposes:

1. for membership tests on a set of `i64`s,
2. for in-order iteration of a set of `i64`s.

Both apply in cases where we can just use rowids (which are `i64`) instead of building an entire ephemeral btree from a table's contents.

For example, in cases where a `DELETE FROM tbl WHERE ...` is performed on a table that has any `BEFORE DELETE` triggers, SQLite collects the table's rowids into a RowSet before actually performing the deletion. This is similar to how an UPDATE that modifies rowids (or the index used to iterate the UPDATE loop) will first collect the rows into an ephemeral index, and same with `INSERT INTO ... SELECT`.

This entire PR description was written by me - no AIs were harmed in the production of it. However, the code itself was mostly vibecoded using two agents in Cursor:

- Composer 1: given the SQLite opcode documentation and rowset.c source code, and asked to implement the VDBE instructions and the RowSet module.
- GPT-5: given the same SQLite docs and source code, and asked to review Composer 1's work and write feedback into a separate markdown file.

This loop was run for roughly 4-5 iterations, each time feeding GPT-5's feedback back to Composer 1, until GPT-5 had nothing left to comment on.

After this, I instructed Composer 1 to improve the documentation to be less stupid.

After that, I made a manual editing pass over the runtime code to e.g. change boolean flags to a `RowSetMode` enum to make clearer that the rowset has two distinct mutually exclusive purposes (membership tests and in-order iteration), plus cleaned up some other dumb shit and added comments.

I am still not sure if this saved time or not.
2025-11-12 11:39:40 +02:00
Nikita Sivukhin
be12ca01aa add is_hole / punch_hole optional methods to IO trait and remove is_hole method from Database trait 2025-11-12 12:04:42 +04:00
Nikita Sivukhin
d519945098 make ArenaBuffer unsafe Send + Sync 2025-11-12 10:54:40 +04:00
Nikita Sivukhin
f3dc19cb00 UNSAFE: make Completion to be Send + Sync 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
95f31067fa add has_hole API in the DatabaseStorage trait 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
e1f77d8776 do not treat registers as constant 2025-11-12 10:51:51 +04:00
Nikita Sivukhin
b3380bc398 treat parameters as "constant" within a query 2025-11-12 02:30:15 +04:00
pedrocarlo
bc06bb0415 have RecordCursor::get_values return an Iterator for actual lazy deserialization. Unfortunately we won't see much improvement yet as we do not store the RecordCursor when calling ImmutableRecord::get_values 2025-11-11 16:11:46 -03:00
pedrocarlo
60db10cc02 consolidate Value PartialEq and PartialOrd to use the same implementation as ValueRef 2025-11-11 16:11:46 -03:00
pedrocarlo
e1d36a2221 clippy fix 2025-11-11 16:11:46 -03:00
pedrocarlo
4a94ce89e3 Change ValueRef::Text to use a &str instead of &[u8] 2025-11-11 16:11:46 -03:00
pedrocarlo
84268c155b convert json functions to use AsValueRef 2025-11-11 16:11:46 -03:00
pedrocarlo
1db13889e3 Change Value::Text to use a Cow<'static, str> instead of Vec<u8> 2025-11-11 16:11:46 -03:00
pedrocarlo
98d268cdc6 change datetime functions to accept AsValueRef and not registers 2025-11-11 16:11:46 -03:00
pedrocarlo
505a6ba5ea convert vector functions to use AsValueRef 2025-11-11 16:11:46 -03:00
Nikita Sivukhin
78b6eeae80 cargo fmt 2025-11-11 22:47:25 +04:00
Nikita Sivukhin
5e09c4f0c0 make completion send + sync 2025-11-11 22:42:20 +04:00
Nikita Sivukhin
9a9aacaf32 fix compilation 2025-11-11 22:22:34 +04:00
Nikita Sivukhin
6e3b364bb5 make completion callbacks Send
- IO uring already uses this because it can invoke the callback on another thread
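
A small sketch of why the `Send` bound matters, using a hypothetical callback type and a plain thread in place of a real IO backend:

```rust
use std::thread;

/// Hypothetical callback type; `Send` allows another thread to invoke it.
type CompletionCallback = Box<dyn FnOnce(i32) + Send>;

/// Simulates a backend that finishes the IO on a different thread.
/// Without `+ Send` on the callback type, `thread::spawn` would not compile.
fn complete_on_other_thread(callback: CompletionCallback, result: i32) {
    thread::spawn(move || callback(result))
        .join()
        .expect("completion thread panicked");
}
```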
2025-11-11 21:44:12 +04:00
Pere Diaz Bou
b581519be4 more clippy 2025-11-10 17:20:15 +01:00
Pere Diaz Bou
32469bad10 clippy mvcc 2025-11-10 17:13:34 +01:00
Pere Diaz Bou
a08b5f2239 core/mvcc: next and rewind skip btree rows that are updated or deleted in mvcc 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
2fd4407a03 core/execute: map negative root page to positive if we can 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
9004d4f3f1 core/mvcc: remove initialize of mvcc table 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
58f5b9c018 core/mvcc: is_btree_allocated fix 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
420447d6bd core/mvcc/tests: fix use read_mvcc_current_row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
198e0434d0 core/mvcc/cursor: current_row return either btree or mvcc 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
e78590b948 core/mvcc: add is_btree_allocated to MvccId 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
4b616d1fd8 core/mvcc/cursor: next use both btree cursor and mvcc cursor to decide on row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7b7bf6738c core/mvcc/tests: test mixed btree mvcc cursor 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7d930e3df3 core/mvcc/test: add test for restart after checkpoint 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
724bc94f96 core/mvcc/cursor: rewind with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
a7614267af core/mvcc/cursor: next with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
38f6d20def core/mvcc/cursor: CursorPosition::Loaded include if points to btree 2025-11-10 16:48:13 +01:00
Jussi Saurio
d0da6b5d16 Merge 'Fix seek not applying correct affinity to seek expr' from Pedro Muniz
Depends on #3923.
To match the semantics of `op_compare`, we need to apply an
affinity to the values referenced in the `SeekKey` that is used for
seeking. This means keeping some affinity metadata for the `WhereTerms`
in the optimization phase and then emitting an affinity conversion
before seeking. I had to dig deep into the SQLite code to understand
this better.
Unfortunately, we cannot have just one compare function to rule them all
here, as we have specialized, optimized compare code to handle records
that have not yet been deserialized.
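A much-simplified sketch of applying an affinity to a seek key value before comparing. The `Value` and `Affinity` enums and `apply_affinity` are illustrative assumptions; the real conversion rules in SQLite and in this PR cover more types and edge cases.

```rust
/// Simplified value type (the real one also has real, blob and null variants).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
}

/// Simplified affinity set; `Blob` means "leave the value unchanged".
#[derive(Clone, Copy)]
enum Affinity {
    Integer,
    Text,
    Blob,
}

/// Converts the seek key value so it compares like the indexed column would.
fn apply_affinity(value: Value, affinity: Affinity) -> Value {
    match (affinity, value) {
        // Text that parses as an integer becomes an integer.
        (Affinity::Integer, Value::Text(s)) => match s.parse::<i64>() {
            Ok(i) => Value::Integer(i),
            Err(_) => Value::Text(s),
        },
        // Integers are rendered as text for TEXT columns.
        (Affinity::Text, Value::Integer(i)) => Value::Text(i.to_string()),
        // Everything else is passed through untouched.
        (_, v) => v,
    }
}
```

In this model, seeking an integer-affinity index with the text `'42'` would first turn the key into the integer `42`, so the seek compares like against like.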
Closes #3707

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3925
2025-11-10 11:28:29 +02:00
Jussi Saurio
b024fdb17d Merge 'core: update aegis' from Daeho Ro
It seems that the build on macOS ARM is failing with `aegis` v0.9.0,
so this PR updates `aegis`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3561
2025-11-10 11:27:01 +02:00
pedrocarlo
32535ef4ed only emit affinity check on index seek + check if affinity is necessary at all 2025-11-10 11:15:54 +02:00
pedrocarlo
27e234f949 add affinity of the expr in the seek key, and emit affinity instruction before seeking 2025-11-10 11:15:54 +02:00
Pekka Enberg
b74ddf30f9 Merge 'extensions/vtabs: implement remaining opcodes' from Preston Thorpe
The only real benefit right now is the ability to rename virtual
tables.
This also now properly calls `VBegin` at the start of a vtab write
transaction, even though none of our extensions need or implement
transactions at this point.
```console
explain insert into t values ('key','value');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     VOpen              0     0     0                    0    t
2     VBegin             0     0     0                    0   
3     Null               0     1     0                    0   r[1]=NULL
4     Null               0     3     0                    0   r[3]=NULL
5     String8            0     4     0     key            0   r[4]='key'
6     String8            0     5     0     value          0   r[5]='value'
7     VUpdate            0     5     1                    0   args=r[1..5]
8     Close              0     0     0                    0   
9     Halt               0     0     0                    0   
10    Transaction        0     2     1                    0   iDb=0 tx_mode=Write
11    Goto               0     1     0                    0   
Exiting Turso SQL Shell.
```

Closes #3930
2025-11-10 09:03:07 +02:00