Commit Graph

6068 Commits

Author SHA1 Message Date
Jussi Saurio
eee7fa5f95 Refactor RETURNING to support arbitrary expressions
Currently RETURNING was a bit of a hack since it had a special
translate_expr_for_returning() function that only supported a subset
of expressions.

Instead, we can store the columns of the target table of the INSERT/UPDATE/DELETE
we are RETURNING from in `Resolver::expr_to_reg_cache` and make those columns point
to the registers that hold the OLD/NEW column values (depending on the operation).
2025-11-13 10:32:38 +02:00
Jussi Saurio
50fbd9a3a2 Store owned strings in InsertEmitCtx for borrow-checker reasons 2025-11-13 09:35:09 +02:00
Jussi Saurio
34978d0fde Store Cow<&Expr> in expr_to_reg_cache
We will be storing owned expressions in it for RETURNING in a later commit.
2025-11-13 09:32:37 +02:00
Jussi Saurio
16097e7355 Merge 'Add RowSet<Add/Read/Test> instructions and rowset implementation' from Jussi Saurio
## What
Rowsets are used in SQLite for two purposes:
1. for membership tests on a set of `i64`s,
2. for in-order iteration of a set of `i64`s,
Both in cases where we can just use rowids (which are `i64`) instead of
building an entire ephemeral btree from a table's contents.
For example, in cases where a `DELETE FROM tbl WHERE ...` is performed
on a table that has any `BEFORE DELETE` triggers, SQLite collects the
table's rowids into a RowSet before actually performing the deletion.
This is similar to how an UPDATE that modifies rowids (or the index used
to iterate the UPDATE loop) will first collect the rows into an
ephemeral index, and same with `INSERT INTO ... SELECT`.
## Details
RowSet uses a "batch" concept where insertions of a given batch must be
guaranteed by the caller to contain no duplicates and are pushed into a
vector in O(1). When a new batch is started, the previous batch is
folded into a `BTreeSet` so that membership tests can be performed in
O(log n). As far as I can tell, the "in-order iteration" use case doesn't
use this batch logic at all.
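The batch idea can be sketched roughly as follows. This is only an illustration of the description above, not the code from this PR: `BatchRowSet` and its method names are hypothetical, and the real implementation may track batches differently.

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch of a batch-based rowset used for membership tests.
#[derive(Debug, Default)]
pub struct BatchRowSet {
    /// Rowids of the batch currently being filled. The caller guarantees
    /// there are no duplicates within a batch, so a plain push is enough.
    current_batch: Vec<i64>,
    /// Rowids of all previously completed batches.
    folded: BTreeSet<i64>,
}

impl BatchRowSet {
    pub fn new() -> Self {
        Self::default()
    }

    /// Amortized O(1): append the rowid to the current batch.
    pub fn insert(&mut self, rowid: i64) {
        self.current_batch.push(rowid);
    }

    /// Start a new batch by folding the previous one into the ordered set.
    pub fn start_new_batch(&mut self) {
        self.folded.extend(self.current_batch.drain(..));
    }

    /// O(log n) membership test against rowids from completed batches only.
    pub fn contains(&self, rowid: i64) -> bool {
        self.folded.contains(&rowid)
    }
}
```

In this sketch, a rowid inserted into the current batch becomes visible to `contains` only after `start_new_batch` has been called.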
## AI disclosure
This entire PR description was written by me - no AIs were harmed in the
production of it. However, the code itself was mostly vibecoded using
two agents in Cursor:
- Composer 1: given the SQLite opcode documentation and rowset.c source
code, and asked to implement the VDBE instructions and the RowSet
module.
- GPT-5: given the same SQLite docs and source code, and asked to review
Composer 1's work and write feedback into a separate markdown file.
This loop was run for roughly 4-5 iterations, where each time GPT-5's
feedback was given to Composer 1, until GPT-5 found nothing more to
comment on.
After this, I instructed Composer 1 to improve the documentation to be
less stupid.
After that, I made a manual editing pass over the runtime code to e.g.
change boolean flags to a `RowSetMode` enum to make clearer that the
rowset has two distinct mutually exclusive purposes (membership tests
and in-order iteration), plus cleaned up some other dumb shit and added
comments.
I am still not sure if this saved time or not.

Closes #3938
2025-11-12 13:02:00 +02:00
Jussi Saurio
933c3112f9 Merge 'Use AsValueRef in more functions' from Pedro Muniz
Depends on #3932
Converting more functions to use `AsValueRef`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3934
2025-11-12 12:54:39 +02:00
Jussi Saurio
65a7dd40b3 Merge 'Change Value::Text and ValueRef::Text to use Cow<'static, str> and &str to avoid allocations' from Pedro Muniz
When building text values, we could not pass ownership of newly created
strings, which meant that we often cloned strings twice: once to
transform them and once to build the `Value`.
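A minimal sketch of why `Cow<'static, str>` helps, assuming a simplified `Value` enum. The variants and the helpers `from_static`, `from_owned` and `upper` are hypothetical and only illustrate the ownership hand-off, not the actual type in the codebase.

```rust
use std::borrow::Cow;

/// Simplified stand-in for the real `Value` type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Text(Cow<'static, str>),
}

impl Value {
    /// Static strings are borrowed, so no allocation happens at all.
    pub fn from_static(s: &'static str) -> Self {
        Value::Text(Cow::Borrowed(s))
    }

    /// A freshly built `String` is moved in, so it is not cloned again.
    pub fn from_owned(s: String) -> Self {
        Value::Text(Cow::Owned(s))
    }
}

/// Transforming text allocates exactly one `String`, which is then handed
/// over to the `Value` without a second copy.
pub fn upper(input: &str) -> Value {
    Value::from_owned(input.to_uppercase())
}
```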

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3932
2025-11-12 12:54:16 +02:00
Jussi Saurio
a63e12f793 Merge 'treat parameters as "constant" within a query' from Nikita Sivukhin
Right now tursodb treats parameters/variables as non-constant. But
they are actually constant in the sense that a parameter/variable has a
fixed value that never changes during query execution.
This PR makes tursodb treat parameters as constant and evaluate
expressions involving them only once.
One real-world scenario where this can be helpful is a vector search
query:
```sql
    SELECT id, vector_distance_jaccard(embedding, vector32_sparse(?)) as distance
    FROM vectors
    ORDER BY distance ASC
    LIMIT ?
```
Without the constant optimization, the `vector32_sparse` function is
executed for every row, which is very inefficient and can make the query
100x slower (there is no need to re-evaluate it, since the text
representation can be converted to binary just once).
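The effect can be illustrated outside the query engine with a small sketch. Everything here is hypothetical: `parse_vector` stands in for an expensive parameter-only expression such as `vector32_sparse(?)`, and `distance` is a simple L1 distance rather than the Jaccard distance used in the query.

```rust
/// Stand-in for an expensive expression that depends only on a parameter.
/// `calls` counts how often it is evaluated.
fn parse_vector(text: &str, calls: &mut usize) -> Vec<f32> {
    *calls += 1;
    text.split(',')
        .map(|s| s.trim().parse::<f32>().unwrap_or(0.0))
        .collect()
}

/// Simple L1 distance, used only as a placeholder metric.
fn distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Parameter treated as non-constant: the expression runs once per row.
pub fn scan_naive(rows: &[Vec<f32>], param: &str) -> (Vec<f32>, usize) {
    let mut calls = 0;
    let out: Vec<f32> = rows
        .iter()
        .map(|row| {
            let query = parse_vector(param, &mut calls);
            distance(row, &query)
        })
        .collect();
    (out, calls)
}

/// Parameter treated as constant: the expression is hoisted out of the loop.
pub fn scan_hoisted(rows: &[Vec<f32>], param: &str) -> (Vec<f32>, usize) {
    let mut calls = 0;
    let query = parse_vector(param, &mut calls);
    let out: Vec<f32> = rows.iter().map(|row| distance(row, &query)).collect();
    (out, calls)
}
```

Both functions return the same distances; only the number of evaluations of the parameter-only expression differs.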

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3936
2025-11-12 11:46:10 +02:00
Jussi Saurio
cdf2f0d3c5 Fix comment about DELETE ... RETURNING 2025-11-12 11:43:06 +02:00
Jussi Saurio
da92982f41 Add RowSet<Add/Read/Test> instructions and rowset implementation
Rowsets are used in SQLite for two purposes:

1. for membership tests on a set of `i64`s,
2. for in-order iteration of a set of `i64`s,

Both in cases where we can just use rowids (which are `i64`) instead of building an entire ephemeral btree from a table's contents.

For example, in cases where a `DELETE FROM tbl WHERE ...` is performed on a table that has any `BEFORE DELETE` triggers, SQLite collects the table's rowids into a RowSet before actually performing the deletion. This is similar to how an UPDATE that modifies rowids (or the index used to iterate the UPDATE loop) will first collect the rows into an ephemeral index, and same with `INSERT INTO ... SELECT`.

This entire PR description was written by me - no AIs were harmed in the production of it. However, the code itself was mostly vibecoded using two agents in Cursor:

- Composer 1: given the SQLite opcode documentation and rowset.c source code, and asked to implement the VDBE instructions and the RowSet module.
- GPT-5: given the same SQLite docs and source code, and asked to review Composer 1's work and write feedback into a separate markdown file.

This loop was run for roughly 4-5 iterations, where each time GPT-5's feedback was given to Composer 1, until GPT-5 found nothing more to comment on.

After this, I instructed Composer 1 to improve the documentation to be less stupid.

After that, I made a manual editing pass over the runtime code to e.g. change boolean flags to a `RowSetMode` enum to make clearer that the rowset has two distinct mutually exclusive purposes (membership tests and in-order iteration), plus cleaned up some other dumb shit and added comments.

I am still not sure if this saved time or not.
2025-11-12 11:39:40 +02:00
Nikita Sivukhin
e1f77d8776 do not treat registers as constant 2025-11-12 10:51:51 +04:00
Nikita Sivukhin
b3380bc398 treat parameters as "constant" within a query 2025-11-12 02:30:15 +04:00
pedrocarlo
bc06bb0415 have RecordCursor::get_values return an Iterator for actual lazy deserialization. Unfortunately we won't see much improvement yet as we do not store the RecordCursor when calling ImmutableRecord::get_values 2025-11-11 16:11:46 -03:00
pedrocarlo
60db10cc02 consolidate Value PartialEq and PartialOrd to use the same implementation as ValueRef 2025-11-11 16:11:46 -03:00
pedrocarlo
e1d36a2221 clippy fix 2025-11-11 16:11:46 -03:00
pedrocarlo
4a94ce89e3 Change ValueRef::Text to use a &str instead of &[u8] 2025-11-11 16:11:46 -03:00
pedrocarlo
84268c155b convert json functions to use AsValueRef 2025-11-11 16:11:46 -03:00
pedrocarlo
1db13889e3 Change Value::Text to use a Cow<'static, str> instead of Vec<u8> 2025-11-11 16:11:46 -03:00
pedrocarlo
98d268cdc6 change datetime functions to accept AsValueRef and not registers 2025-11-11 16:11:46 -03:00
pedrocarlo
505a6ba5ea convert vector functions to use AsValueRef 2025-11-11 16:11:46 -03:00
Nikita Sivukhin
78b6eeae80 cargo fmt 2025-11-11 22:47:25 +04:00
Nikita Sivukhin
5e09c4f0c0 make completion send + sync 2025-11-11 22:42:20 +04:00
Nikita Sivukhin
9a9aacaf32 fix compilation 2025-11-11 22:22:34 +04:00
Nikita Sivukhin
6e3b364bb5 make completion callbacks Send
- io_uring already uses this because it can invoke the callback on another thread
2025-11-11 21:44:12 +04:00
Pere Diaz Bou
b581519be4 more clippy 2025-11-10 17:20:15 +01:00
Pere Diaz Bou
32469bad10 clippy mvcc 2025-11-10 17:13:34 +01:00
Pere Diaz Bou
a08b5f2239 core/mvcc: next and rewind skip btree rows that are updated/deleted in mvcc 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
2fd4407a03 core/execute: map negative root page to positive if we can 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
9004d4f3f1 core/mvcc: remove initialize of mvcc table 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
58f5b9c018 core/mvcc: is_btree_allocated fix 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
420447d6bd core/mvcc/tests: fix use read_mvcc_current_row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
198e0434d0 core/mvcc/cursor: current_row return either btree or mvcc 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
e78590b948 core/mvcc: add is_btree_allocated to MvccId 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
4b616d1fd8 core/mvcc/cursor: next use both btree cursor and mvcc cursor to decide on row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7b7bf6738c core/mvcc/tests: test mixed btree mvcc cursor 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7d930e3df3 core/mvcc/test: add test for restart after checkpoint 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
724bc94f96 core/mvcc/cursor: rewind with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
a7614267af core/mvcc/cursor: next with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
38f6d20def core/mvcc/cursor: CursorPosition::Loaded include if points to btree 2025-11-10 16:48:13 +01:00
Jussi Saurio
d0da6b5d16 Merge 'Fix seek not applying correct affinity to seek expr' from Pedro Muniz
Depends on #3923.
To have similar semantics to how `op_compare` works, we need to apply an
affinity to the values referenced in the `SeekKey` that is used for
seeking. This means keeping some affinity metadata for the `WhereTerms`
in the optimization phase, then emitting an affinity conversion before
seeking. Had to dig deep in the sqlite code to understand this
better.
Unfortunately, we cannot have just one compare function to rule them all
here, as we have specialized/optimized compare code to handle records
that have not yet been deserialized.
Closes #3707

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3925
2025-11-10 11:28:29 +02:00
Jussi Saurio
b024fdb17d Merge 'core: update aegis' from Daeho Ro
It seems that the build on macos arm is failing with `aegis` v0.9.0.
So, here I update `aegis`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3561
2025-11-10 11:27:01 +02:00
pedrocarlo
32535ef4ed only emit affinity check on index seek + check if affinity is necessary at all 2025-11-10 11:15:54 +02:00
pedrocarlo
27e234f949 add affinity of the expr in the seek key, and emit affinity instruction before seeking 2025-11-10 11:15:54 +02:00
Pekka Enberg
b74ddf30f9 Merge 'extensions/vtabs: implement remaining opcodes' from Preston Thorpe
The only real benefit of this right now is the ability to rename virtual
tables.
It also now properly calls `VBegin` at the start of a vtab write
transaction, even though none of our extensions need or implement
transactions at this point.
```console
explain insert into t values ('key','value');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     VOpen              0     0     0                    0    t
2     VBegin             0     0     0                    0   
3     Null               0     1     0                    0   r[1]=NULL
4     Null               0     3     0                    0   r[3]=NULL
5     String8            0     4     0     key            0   r[4]='key'
6     String8            0     5     0     value          0   r[5]='value'
7     VUpdate            0     5     1                    0   args=r[1..5]
8     Close              0     0     0                    0   
9     Halt               0     0     0                    0   
10    Transaction        0     2     1                    0   iDb=0 tx_mode=Write
11    Goto               0     1     0                    0   
Exiting Turso SQL Shell.
```

Closes #3930
2025-11-10 09:03:07 +02:00
Pekka Enberg
7891be96fd Merge 'Refactor affinity conversions for reusability' from Pedro Muniz
Depends on #3920
Moves some code around so it is easier to reuse and less cluttered in
`execute.rs`, and changes how `compare` works. Instead of mutating some
register, we now just return the possible `ValueRef` representation of
that affinity. This allows other parts of the codebase to reuse this
logic without needing to have an owned `Value` or a `&mut Register`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3923
2025-11-10 09:02:22 +02:00
Pekka Enberg
2be515247f Merge 'Create AsValueRef trait to allow us to be agnostic over ownership of Value or ValueRef' from Pedro Muniz
Depends on #3919
Also change `op_compare` to reuse the same compare_immutable logic
First step to finish #2304
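A rough sketch of what such a trait can look like, assuming simplified `Value` and `ValueRef` enums. The variants, the method name `as_value_ref` and the helper `text_len` are hypothetical and are not taken from the actual codebase.

```rust
/// Simplified owned value (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    Text(String),
}

/// Simplified borrowed view of a value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ValueRef<'a> {
    Null,
    Integer(i64),
    Text(&'a str),
}

/// Lets functions accept either an owned `Value` or a `ValueRef`.
pub trait AsValueRef {
    fn as_value_ref(&self) -> ValueRef<'_>;
}

impl AsValueRef for Value {
    fn as_value_ref(&self) -> ValueRef<'_> {
        match self {
            Value::Null => ValueRef::Null,
            Value::Integer(i) => ValueRef::Integer(*i),
            Value::Text(s) => ValueRef::Text(s.as_str()),
        }
    }
}

impl AsValueRef for ValueRef<'_> {
    fn as_value_ref(&self) -> ValueRef<'_> {
        // `ValueRef` is `Copy`, so the view is returned as-is.
        *self
    }
}

/// Example of a function that is agnostic over ownership: it returns the
/// number of characters of a text value, or `None` for non-text values.
pub fn text_len(v: &impl AsValueRef) -> Option<usize> {
    match v.as_value_ref() {
        ValueRef::Text(s) => Some(s.chars().count()),
        _ => None,
    }
}
```

With this shape, callers holding a `Value` do not need to convert it first, and callers holding a `ValueRef` do not need to allocate an owned copy.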

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3920
2025-11-10 09:01:59 +02:00
Pekka Enberg
4bb0edac5e Merge 'Move value functions to separate file' from Pedro Muniz
Makes it easier to visualize what is related to Value and what is
related to opcodes. This will also facilitate my next PR, which
generalizes certain functions over `Value` and `ValueRef` as listed in
#2304

Closes #3919
2025-11-10 09:01:29 +02:00
PThorpe92
5c207618a7 Fix extensions py test 2025-11-09 11:35:57 -05:00
PThorpe92
b443b09516 Remove VRollback and VCommit as they are unused opcodes in sqlite 2025-11-09 11:27:09 -05:00
PThorpe92
94b6d254a9 Fix comment on vtab_txn_states 2025-11-09 11:08:52 -05:00
PThorpe92
993c9d34b4 Rollback vtab txns when err code is present in Halt 2025-11-09 11:07:43 -05:00