Commit Graph

1560 Commits

Author SHA1 Message Date
Pekka Enberg
b74ddf30f9 Merge 'extensions/vtabs: implement remaining opcodes' from Preston Thorpe
The only real benefit right now is the ability to rename virtual
tables.
This also now properly calls `VBegin` at the start of a vtab write
transaction, even though none of our extensions need or implement
transactions at this point.
```console
explain insert into t values ('key','value');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     VOpen              0     0     0                    0    t
2     VBegin             0     0     0                    0   
3     Null               0     1     0                    0   r[1]=NULL
4     Null               0     3     0                    0   r[3]=NULL
5     String8            0     4     0     key            0   r[4]='key'
6     String8            0     5     0     value          0   r[5]='value'
7     VUpdate            0     5     1                    0   args=r[1..5]
8     Close              0     0     0                    0   
9     Halt               0     0     0                    0   
10    Transaction        0     2     1                    0   iDb=0 tx_mode=Write
11    Goto               0     1     0                    0   
Exiting Turso SQL Shell.
```

Closes #3930
2025-11-10 09:03:07 +02:00
Pekka Enberg
7891be96fd Merge 'Refactor affinity conversions for reusability' from Pedro Muniz
Depends on #3920
Moves some code around so it is easier to reuse and less cluttered in
`execute.rs`, and changes how `compare` works. Instead of mutating some
register, we now just return the possible `ValueRef` representation of
that affinity. This allows other parts of the codebase to reuse this
logic without needing to have an owned `Value` or a `&mut Register`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3923
2025-11-10 09:02:22 +02:00
Pekka Enberg
2be515247f Merge 'Create AsValueRef trait to allow us to be agnostic over ownership of Value or ValueRef' from Pedro Muniz
Depends on #3919
Also changes `op_compare` to reuse the same `compare_immutable` logic.
First step toward finishing #2304.
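As a rough illustration of what being agnostic over ownership can look like, here is a minimal sketch. The `Value` and `ValueRef` enums below are simplified stand-ins with only two variants, and the trait shape and the `compare` helper are assumptions made for this example, not the actual definitions in the codebase.

```rust
use std::cmp::Ordering;

// Simplified stand-ins for the real types; only two variants are modeled.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueRef<'a> {
    Integer(i64),
    Text(&'a str),
}

// Hypothetical trait shape: anything that can lend out a borrowed view.
trait AsValueRef {
    fn as_value_ref(&self) -> ValueRef<'_>;
}

impl AsValueRef for Value {
    fn as_value_ref(&self) -> ValueRef<'_> {
        match self {
            Value::Integer(i) => ValueRef::Integer(*i),
            Value::Text(s) => ValueRef::Text(s.as_str()),
        }
    }
}

impl<'a> AsValueRef for ValueRef<'a> {
    fn as_value_ref(&self) -> ValueRef<'_> {
        *self
    }
}

// One comparison routine serves owned and borrowed operands alike.
fn compare(lhs: &impl AsValueRef, rhs: &impl AsValueRef) -> Ordering {
    match (lhs.as_value_ref(), rhs.as_value_ref()) {
        (ValueRef::Integer(a), ValueRef::Integer(b)) => a.cmp(&b),
        (ValueRef::Text(a), ValueRef::Text(b)) => a.cmp(b),
        // Numeric values sort before text, as in SQLite's storage class order.
        (ValueRef::Integer(_), ValueRef::Text(_)) => Ordering::Less,
        (ValueRef::Text(_), ValueRef::Integer(_)) => Ordering::Greater,
    }
}

fn main() {
    let owned = Value::Text("abc".to_string());
    let borrowed = ValueRef::Text("abd");
    println!("{:?}", compare(&owned, &borrowed));
}
```

The point of the sketch is that callers holding a `Value` and callers holding a `ValueRef` can share one comparison path without cloning.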

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3920
2025-11-10 09:01:59 +02:00
Pekka Enberg
4bb0edac5e Merge 'Move value functions to separate file' from Pedro Muniz
Makes it easier to see what is related to `Value` and what is
related to opcodes. This will also make my next PR easier, which
generalizes certain functions over `Value` and `ValueRef`, as listed in
#2304

Closes #3919
2025-11-10 09:01:29 +02:00
PThorpe92
5c207618a7 Fix extensions py test 2025-11-09 11:35:57 -05:00
PThorpe92
b443b09516 Remove VRollback and VCommit as they are unused opcodes in sqlite 2025-11-09 11:27:09 -05:00
PThorpe92
993c9d34b4 Rollback vtab txns when err code is present in Halt 2025-11-09 11:07:43 -05:00
PThorpe92
f35ccfba17 Add support for renaming virtual tables 2025-11-09 11:07:42 -05:00
PThorpe92
e09d9eb720 Add VBegin, VRename, VRollback and VCommit opcodes 2025-11-09 11:07:42 -05:00
PThorpe92
a012e98bfa core/translate remove unused ParamState and some minor refactoring 2025-11-07 19:18:10 -05:00
pedrocarlo
9007340e99 change convert function to accept 1 value 2025-11-07 12:47:39 -03:00
pedrocarlo
9f350f7fd9 change Text variant in ValueRef to hold a TextRef that can automatically convert to &str avoiding string allocations everywhere 2025-11-07 12:47:39 -03:00
pedrocarlo
5cfc898049 clippy 2025-11-07 12:47:39 -03:00
pedrocarlo
af05d9ba10 move more affinity logic to separate file and avoid more clones 2025-11-07 12:47:39 -03:00
pedrocarlo
61036d5f51 move affinity handling to separate file 2025-11-07 12:47:39 -03:00
pedrocarlo
99c596d340 separate part of comparison logic for reuse later with seek operations 2025-11-07 12:47:39 -03:00
pedrocarlo
ce3527df40 change RecordCompare::compare to use an iterator 2025-11-07 12:47:39 -03:00
pedrocarlo
e5e97a5b0a for op_compare reuse compare_immutable 2025-11-07 12:44:57 -03:00
pedrocarlo
9c2324cbd8 move some more functions to be scoped under Value 2025-11-07 12:10:27 -03:00
pedrocarlo
44cab91722 move Value functions to separate file 2025-11-07 12:10:27 -03:00
Preston Thorpe
4e8b4c96d3 Merge 'use dyn DatabaseStorage instead of DatabaseFile' from Nikita Sivukhin
Partial sync for the sync engine will need to implement its own version of
`DatabaseStorage`, which will load database pages on demand.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3922
2025-11-06 15:11:13 -05:00
Nikita Sivukhin
da61fa32b4 use dyn DatabaseStorage instead of DatabaseFile 2025-11-06 17:42:03 +04:00
PThorpe92
c5a3e590f7 Fix rewriting sql to persist for foreign keys in alter table func 2025-11-03 09:47:28 -05:00
PThorpe92
ef24911824 Handle renaming child foreign keys on op_rename_table 2025-11-03 09:47:28 -05:00
PThorpe92
481d86f567 Optimize and refactor schema::Column type 2025-11-02 20:46:02 -05:00
PThorpe92
23496f0bea Fix incorrect unreachable precondition for affinity char in op_seek_rowid 2025-11-01 20:43:44 -04:00
Nikita Sivukhin
4c98861590 adjust logs 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
a2d11f9263 reset cursors when statement is reset 2025-10-29 15:13:00 +04:00
Jussi Saurio
ad723b615f Merge 'index_method: fully integrate into query planner' from Nikita Sivukhin
This PR fully integrates custom indices into the query planner.
To do that, a new `Cursor::IndexMethod` is introduced, along with a few
related changes in the VM implementation:
1. Added special `IndexMethod{Create,Destroy,Query}` opcodes to handle
index method creation, deletion, and querying
2. Updated the `Next`, `IdxRowid`, `IdxInsert`, and `IdxDelete` opcodes to
properly handle the new cursor case

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3827
2025-10-29 09:42:37 +02:00
Pekka Enberg
810ed8ad60 Merge 'Don't allow autovacuum to be flipped on non-empty databases' from Pavan Nambi
Turso incorrectly creates the first table of an autovacuumed database on
page 2.
(Note: this is in collaboration with @LeMikaelF)
SQLite does not allow enabling or disabling auto-vacuum after the first
table has been created
(https://sqlite.org/pragma.html#pragma_auto_vacuum). This is because the
sequence of the pages in the databases is different when auto-vacuum is
enabled, because the first b-tree page must be page 3 instead of 2, to
make room for the first [Pointer Map
page](https://sqlite.org/fileformat.html#pointer_map_or_ptrmap_pages).
But Turso doesn't currently consider this, which can lead to data loss.
The simplest way to reproduce this is to create an autovacuumed
database with `pragma auto_vacuum=full`, so that autovacuum runs
on each commit, and then create a table with some data. Turso will
incorrectly create the new table on page 2. After this, every time a new
page is created, either through a page split or because a new table is
created, Turso will write a 5-byte pointer in page 2, starting from the
top of the page, thereby overwriting existing data.
For example, let's start with a clean database and the first bytes of
page 2. It starts with `0d`, the discriminator for a leaf page
([source](https://www.sqlite.org/fileformat.html#b_tree_pages)). The
next interesting number is the number of cells contained in this page
(`00 01`, a two-byte value at offset 3).
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');

$ dbtotxt /tmp/a.db
| size 8192 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
| end a.db
```
Pointer map pages are located every N pages, starting from page 2, and
contain a list of 5-byte pointers that represent the parent page of a
certain page. So whenever Turso or SQLite needs to add a page, it will
overwrite 5 bytes of page 2. This means that for data loss to occur, it
is sufficient to add a single page to the database, for example by
creating a table. The first 5 bytes of page 2, including the cell
count, are then overwritten:
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');
turso> pragma auto_vacuum=full;
turso> create table tt(a);

$ dbtotxt /tmp/a.db
| size 12288 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 01 00 00 00 00 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
```
Creating more tables, or adding more B-tree pages, will keep overwriting
the rest of the page, until the cells themselves are also overwritten.
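The overwrite can be imitated on an in-memory buffer. This is a minimal sketch, assuming the layouts documented in the SQLite file format (a 2-byte big-endian cell count at offset 3 of a b-tree page header, and 5-byte ptrmap entries made of a type byte plus a 4-byte big-endian parent page number); the function names are hypothetical and do not come from the Turso codebase.

```rust
const PAGE_SIZE: usize = 4096;

/// Writes a 5-byte pointer-map entry (1 type byte followed by a 4-byte
/// big-endian parent page number) into slot `slot` of a ptrmap page.
fn write_ptrmap_entry(page: &mut [u8], slot: usize, entry_type: u8, parent: u32) {
    let off = slot * 5;
    page[off] = entry_type;
    page[off + 1..off + 5].copy_from_slice(&parent.to_be_bytes());
}

/// Reads the 2-byte big-endian cell count from a b-tree page header.
fn cell_count(page: &[u8]) -> u16 {
    u16::from_be_bytes([page[3], page[4]])
}

fn main() {
    // Page 2 as a table leaf page holding one cell, as in the first dump.
    let mut page2 = vec![0u8; PAGE_SIZE];
    page2[..8].copy_from_slice(&[0x0d, 0x00, 0x00, 0x00, 0x01, 0x0f, 0xf5, 0x00]);
    println!("cells before = {}", cell_count(&page2));

    // Treating the same page as a ptrmap page and recording a new root page
    // (type 1, parent 0) clobbers the first five header bytes.
    write_ptrmap_entry(&mut page2, 0, 1, 0);
    println!("header after = {:02x?}", &page2[..8]);
    println!("cells after  = {}", cell_count(&page2));
}
```

Each further entry lands 5 bytes later in the page, which is why the damage grows as more pages are added.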
## Reproducing the issue in the simulator
We have been unable to reproduce this exact corruption mode in the
simulator, but patching it reveals many failure modes, none of which
occur with the unpatched simulator. The following seeds show the issue
when the patched simulator is run against `main`:
- `11522841279124073062`, with "Assertion 'table inquisitive_graham_159
should contain all of its expected values' failed: table
inquisitive_graham_159 does not contain the expected values, the
simulator model has more rows than the database"
- `7057400018220918989`, `16028085350691325843`, `7721542713659053944`,
and `203017821863546118`, with "Failed to read ptrmap key=XXX"
- `12533694709304969540`, `18357088553315413457`, `3108945730906932377`,
with "Integrity Check Failed: Cell N in page 2 is out of range."
- `4757352625344646473`, with "dirty pages should be empty for read
txn"
- `7083498604824302257`, with "header_size: 6272, header_len_bytes: 2,
payload.len(): 13"
- `17881876827470741581`, with "ParseError("no such table:
focused_historians_416")"
- `2092231500503735693`, with "range end index 4789 out of range for
slice of length 4096"
- `7555257419378470845`, with "malformed database schema
(imaginative_ontivero\u{1})"
- `12905270229511147245`, with "index out of bounds: the len is 4096 but
the index is 4096"
## Fixing the issue
- When DB is opened, we read the `auto_vacuum` state, instead of
assuming `auto_vacuum=none`.
- Don't allow auto_vacuum to be flipped on non-empty databases: if we
allowed this, the ptrmap pages could overlap with, and overwrite,
existing data.
- Modify integrity check to avoid reporting that page 2 is orphaned in
auto-vacuumed databases.
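The first two points can be sketched roughly as follows. The header offsets come from the SQLite file format documentation (offset 52 holds the largest root b-tree page and is non-zero only in auto-vacuum modes; offset 64 is non-zero for incremental mode). The function names and the `page_count > 1` emptiness test are assumptions for illustration, not the code from this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AutoVacuum {
    None,
    Full,
    Incremental,
}

fn read_be_u32(buf: &[u8], off: usize) -> u32 {
    u32::from_be_bytes([buf[off], buf[off + 1], buf[off + 2], buf[off + 3]])
}

/// Derives the auto-vacuum mode from the 100-byte database header instead
/// of assuming `None`.
fn auto_vacuum_mode(header: &[u8; 100]) -> AutoVacuum {
    let largest_root_page = read_be_u32(header, 52);
    let incremental = read_be_u32(header, 64);
    match (largest_root_page, incremental) {
        (0, _) => AutoVacuum::None,
        (_, 0) => AutoVacuum::Full,
        _ => AutoVacuum::Incremental,
    }
}

/// Hypothetical guard: turning auto-vacuum on or off is rejected once the
/// database holds more than its first page (assumed emptiness criterion).
fn check_mode_change(
    page_count: u32,
    current: AutoVacuum,
    requested: AutoVacuum,
) -> Result<(), String> {
    let toggles = (current == AutoVacuum::None) != (requested == AutoVacuum::None);
    if toggles && page_count > 1 {
        return Err("cannot enable or disable auto_vacuum on a non-empty database".to_string());
    }
    Ok(())
}

fn main() {
    let mut header = [0u8; 100];
    header[55] = 3; // largest root b-tree page = 3
    let mode = auto_vacuum_mode(&header);
    println!("{:?}", mode);
    println!("{:?}", check_mode_change(3, mode, AutoVacuum::None));
}
```

Switching between full and incremental is left unrestricted in the sketch, since both modes already reserve the ptrmap pages.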
Fixes #3752

Closes #3830
2025-10-28 14:48:35 +02:00
Nikita Sivukhin
8ea733f917 fix bug with cursor allocation 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8acbe3de66 make query_start method to return bool - if result will have some rows or not 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
67c1855ba8 fix bug 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
6206294584 fix clippy 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
d6972a9cf3 fix explain 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8dd2644c07 add support for new cursor type in existing op codes and also implement new opcodes in the VM 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
b994e2cbd8 add new Cursor type 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
5af10e6ccb add IndexMethod specific VM instructions 2025-10-28 11:27:35 +04:00
Jussi Saurio
9c87b20cb2 Merge 'Where clause subquery support' from Jussi Saurio
Closes #1282
# Support for WHERE clause subqueries
This PR implements support for subqueries that appear in the WHERE
clause of SELECT statements.
## What are those lol
1. **EXISTS subqueries**: `WHERE EXISTS (SELECT ...)`
2. **Row value subqueries**: `WHERE x = (SELECT ...)` or `WHERE (x, y) =
(SELECT ...)`. The latter are not yet supported - only the single-column
("scalar subquery") case is.
3. **IN subqueries**: `WHERE x IN (SELECT ...)` or `WHERE (x, y) IN
(SELECT ...)`
## Correlated vs Uncorrelated Subqueries
- **Uncorrelated subqueries** reference only their own tables and can be
evaluated once.
- **Correlated subqueries** reference columns from the outer query
(e.g., `WHERE EXISTS (SELECT * FROM t2 WHERE t2.id = t1.id)`) and must
be re-evaluated for each row of the outer query
## Implementation
### Planning
During query planning, the WHERE clause is walked to find subquery
expressions (`Expr::Exists`, `Expr::Subquery`, `Expr::InSelect`). Each
subquery is:
1. Assigned a unique internal ID
2. Compiled into its own `SelectPlan` with outer query tables provided
as available references
3. Replaced in the AST with an `Expr::SubqueryResult` node that
references the subquery with its internal ID
4. Stored in a `Vec<NonFromClauseSubquery>` on the `SelectPlan`
For IN subqueries, an ephemeral index is created to store the subquery
results; for other kinds, the results are stored in register(s).
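A toy version of the rewrite step might look like the sketch below. The `Expr` enum is a drastically reduced stand-in (subqueries are kept as SQL strings, and compiling each one into its own `SelectPlan` is omitted), and `extract_subqueries` is a hypothetical name; only the replace-with-`SubqueryResult` idea is taken from the description above.

```rust
// Reduced stand-in for the real AST; subqueries are plain SQL strings here.
#[derive(Debug, PartialEq)]
enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Exists(String),
    Subquery(String),
    SubqueryResult { id: usize },
}

#[derive(Debug, PartialEq)]
struct NonFromClauseSubquery {
    id: usize,
    sql: String,
}

/// Walks the WHERE expression, swaps each subquery for a `SubqueryResult`
/// node, and records the extracted subquery under a fresh ID.
fn extract_subqueries(expr: &mut Expr, out: &mut Vec<NonFromClauseSubquery>) {
    match expr {
        Expr::Exists(_) | Expr::Subquery(_) => {
            let id = out.len(); // unique within this plan
            let old = std::mem::replace(expr, Expr::SubqueryResult { id });
            if let Expr::Exists(sql) | Expr::Subquery(sql) = old {
                out.push(NonFromClauseSubquery { id, sql });
            }
        }
        Expr::Eq(lhs, rhs) | Expr::And(lhs, rhs) => {
            extract_subqueries(lhs, out);
            extract_subqueries(rhs, out);
        }
        Expr::Column(_) | Expr::SubqueryResult { .. } => {}
    }
}

fn main() {
    // WHERE EXISTS (SELECT 1 FROM t2) AND x = (SELECT max(y) FROM t3)
    let mut where_clause = Expr::And(
        Box::new(Expr::Exists("SELECT 1 FROM t2".to_string())),
        Box::new(Expr::Eq(
            Box::new(Expr::Column("x".to_string())),
            Box::new(Expr::Subquery("SELECT max(y) FROM t3".to_string())),
        )),
    );
    let mut subqueries = Vec::new();
    extract_subqueries(&mut where_clause, &mut subqueries);
    println!("{:?}", where_clause);
    println!("{:?}", subqueries);
}
```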
### Translation
Before emitting bytecode, we need to determine when each subquery should
be evaluated:
- **Uncorrelated**: Evaluated once before opening any table cursors
- **Correlated**: Evaluated at the appropriate nested loop depth after
all referenced outer tables are in scope
This is calculated by examining which outer query tables the subquery
references and finding the right-most (innermost) loop that opens those
tables - using similar mechanisms that we use for figuring out when to
evaluate other `WhereTerm`s too.
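The placement rule can be approximated in a few lines. This sketch assumes the outer join order is available as a list of table names and that the set of outer tables a subquery references is already known; `EvalAt` and `eval_position` are hypothetical names, not the planner's real API.

```rust
/// Where a subquery's bytecode should be emitted relative to the outer loops.
#[derive(Debug, PartialEq)]
enum EvalAt {
    /// Uncorrelated: run once, before any table cursor is opened.
    BeforeLoops,
    /// Correlated: run inside the loop at this index of the join order.
    Loop(usize),
}

/// Picks the right-most (innermost) outer loop among the tables the
/// subquery references; with no references it is uncorrelated.
fn eval_position(join_order: &[&str], referenced_outer_tables: &[&str]) -> EvalAt {
    referenced_outer_tables
        .iter()
        .filter_map(|table| join_order.iter().position(|joined| joined == table))
        .max()
        .map_or(EvalAt::BeforeLoops, EvalAt::Loop)
}

fn main() {
    let join_order = ["t1", "t2", "t3"];
    println!("{:?}", eval_position(&join_order, &[]));
    println!("{:?}", eval_position(&join_order, &["t2", "t1"]));
}
```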
### Code Generation
- **EXISTS**: Sets a register to 1 if any row is produced, 0 otherwise.
Has new `QueryDestination::ExistsSubqueryResult` variant.
- **IN**: Results stored in an ephemeral index and the index is probed.
- **RowValue**: Results stored in a range of registers. Has new
`QueryDestination::RowValueSubqueryResult` variant.
## Annoying details
### Which cursor to read from in a subquery?
Sometimes a query will use a covering index, i.e. skip opening the table
cursor at all if the index contains All The Needed Stuff.
Correlated subqueries reading columns from outer tables is a bit
problematic in this regard: with our current translation code, the
subquery doesn't know whether the outer query opened a table cursor,
index cursor, or both. So, for now, we try to find a table cursor first,
then fall back to finding any index cursor for that table.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3847
2025-10-28 06:36:55 +02:00
Jussi Saurio
e7aa7ee2ff ProgramBuilder: add a few utility methods needed for correlated subqueries 2025-10-27 14:03:41 +02:00
Nikita Sivukhin
408ca235d1 small refactoring 2025-10-27 12:43:38 +04:00
Nikita Sivukhin
906bbdd1c4 support deep nestedness 2025-10-27 11:37:42 +04:00
Pekka Enberg
7d035f27d8 Merge 'Strict numeric cast for op_must_be_int' from bit-aloo
closes: #3302

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3771
2025-10-26 16:42:35 +02:00
Pekka Enberg
6603f5318a Merge 'core/vdbe: Reuse cursor in op_open_write()' from Pekka Enberg
This optimization reuses an existing cursor when op_open_write() is
called on the same table/index (same root_page). This is safe because
the cursor position doesn't matter - op_rewind() is always called after
op_open_write() to position the cursor at the beginning of the
table/index before any operations are performed.
This change speeds up op_open_write() by avoiding unnecessary cursor
re-initialization.
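The reuse check can be pictured with a small model. Everything here (`Cursor`, `ProgramState`, the `cursors_created` counter) is a hypothetical simplification for illustration; the real `op_open_write()` deals with pagers, b-trees, and index metadata that are left out.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified cursor: just a root page and a position.
#[derive(Debug)]
struct Cursor {
    root_page: u32,
    position: Option<usize>,
}

#[derive(Default)]
struct ProgramState {
    cursors: HashMap<usize, Cursor>,
    cursors_created: usize,
}

impl ProgramState {
    /// Opens a write cursor, keeping the existing one when it already
    /// points at the same root page.
    fn open_write(&mut self, cursor_id: usize, root_page: u32) {
        let reusable = self
            .cursors
            .get(&cursor_id)
            .map_or(false, |c| c.root_page == root_page);
        if !reusable {
            self.cursors.insert(cursor_id, Cursor { root_page, position: None });
            self.cursors_created += 1;
        }
    }

    /// Positions the cursor at the start, so any stale position left on a
    /// reused cursor does not matter.
    fn rewind(&mut self, cursor_id: usize) {
        if let Some(c) = self.cursors.get_mut(&cursor_id) {
            c.position = Some(0);
        }
    }
}

fn main() {
    let mut state = ProgramState::default();
    state.open_write(0, 2);
    state.rewind(0);
    state.open_write(0, 2); // same root page: the existing cursor is kept
    state.rewind(0);
    state.open_write(0, 5); // different root page: a fresh cursor is built
    println!("cursors created: {}", state.cursors_created);
}
```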

Closes #3815
2025-10-26 12:29:20 +02:00
Pekka Enberg
ca073b5ecd Merge 'core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager>' from Pekka Enberg
We don't actually need the RwLock locking capabilities, just the ability
to swap the instance.

Closes #3814
2025-10-26 12:29:11 +02:00
Pavan-Nambi
277a989a71 fmt 2025-10-24 21:34:17 +05:30
Pavan-Nambi
7dda783006 clippy - gotta feature autovaccuum n ptrmaps 2025-10-24 21:30:34 +05:30
Pavan-Nambi
8d0ae362da Merge branch 'main' of github.com:tursodatabase/turso into avcm 2025-10-24 18:58:30 +05:30
Pekka Enberg
c3fb867173 core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager>
We don't actually need the RwLock locking capabilities, just the ability
to swap the instance.
2025-10-24 14:10:08 +03:00
bit-aloo
64bbca9e12 Fix op_must_be_int to use strict numeric cast 2025-10-24 16:08:15 +05:30