Commit Graph

1880 Commits

Author SHA1 Message Date
pedrocarlo
98d268cdc6 change datetime functions to accept AsValueRef and not registers 2025-11-11 16:11:46 -03:00
pedrocarlo
1db13889e3 Change Value::Text to use a Cow<'static, str> instead of Vec<u8> 2025-11-11 16:11:46 -03:00
pedrocarlo
32535ef4ed only emit affinity check on index seek + check if affinity is necessary at all 2025-11-10 11:15:54 +02:00
pedrocarlo
27e234f949 add affinity of the expr in the seek key, and emit affinity instruction before seeking 2025-11-10 11:15:54 +02:00
Pekka Enberg
b74ddf30f9 Merge 'extensions/vtabs: implement remaining opcodes' from Preston Thorpe
The only real benefit here right now is the ability to rename virtual
tables.
This also now properly calls `VBegin` at the start of a vtab write
transaction, even though none of our extensions need or implement
transactions at this point.
```console
explain insert into t values ('key','value');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     VOpen              0     0     0                    0    t
2     VBegin             0     0     0                    0   
3     Null               0     1     0                    0   r[1]=NULL
4     Null               0     3     0                    0   r[3]=NULL
5     String8            0     4     0     key            0   r[4]='key'
6     String8            0     5     0     value          0   r[5]='value'
7     VUpdate            0     5     1                    0   args=r[1..5]
8     Close              0     0     0                    0   
9     Halt               0     0     0                    0   
10    Transaction        0     2     1                    0   iDb=0 tx_mode=Write
11    Goto               0     1     0                    0   
Exiting Turso SQL Shell.
```

Closes #3930
2025-11-10 09:03:07 +02:00
Pekka Enberg
7891be96fd Merge 'Refactor affinity conversions for reusability' from Pedro Muniz
Depends on #3920
Moves some code around so it is easier to reuse and less cluttered in
`execute.rs`, and changes how `compare` works. Instead of mutating some
register, we now just return the possible `ValueRef` representation of
that affinity. This allows other parts of the codebase to reuse this
logic without needing to have an owned `Value` or a `&mut Register`
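
A minimal sketch of the "return a borrowed representation instead of mutating a register" idea described above. `ValueRef`, `Affinity`, and `apply_affinity` here are simplified, hypothetical stand-ins and not the actual types or signatures in the codebase; the numeric parsing uses Rust's `str::parse`, which does not match SQLite's conversion rules exactly.

```rust
/// Simplified, hypothetical stand-in for a borrowed value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ValueRef<'a> {
    Null,
    Integer(i64),
    Float(f64),
    Text(&'a str),
}

/// Simplified, hypothetical subset of column affinities.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Affinity {
    Numeric,
    Text,
}

/// Returns the converted representation when applying `affinity` would
/// change the value, or `None` when the value should be used as-is.
/// The input is never mutated, so callers need neither an owned value
/// nor a mutable register.
pub fn apply_affinity<'a>(value: ValueRef<'a>, affinity: Affinity) -> Option<ValueRef<'a>> {
    match (affinity, value) {
        (Affinity::Numeric, ValueRef::Text(s)) => {
            let trimmed = s.trim();
            if let Ok(i) = trimmed.parse::<i64>() {
                Some(ValueRef::Integer(i))
            } else if let Ok(f) = trimmed.parse::<f64>() {
                Some(ValueRef::Float(f))
            } else {
                // Not numeric-looking text: keep the original value.
                None
            }
        }
        // Other combinations are left unchanged in this sketch.
        _ => None,
    }
}
```

A caller can then compare against `apply_affinity(v, aff).unwrap_or(v)` without touching the source of `v`.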

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3923
2025-11-10 09:02:22 +02:00
PThorpe92
f35ccfba17 Add support for renaming virtual tables 2025-11-09 11:07:42 -05:00
RS2007
3a562f734c feat: alter table disallow generated columns, support foreign keys for alter table 2025-11-08 13:45:17 +05:30
PThorpe92
dd2e3e8e16 Fix clippy warning 2025-11-07 20:04:57 -05:00
PThorpe92
a012e98bfa core/translate remove unused ParamState and some minor refactoring 2025-11-07 19:18:10 -05:00
pedrocarlo
61036d5f51 move affinity handling to separate file 2025-11-07 12:47:39 -03:00
Pekka Enberg
a6593d109e Merge 'Toy index improvements' from Nikita Sivukhin
This PR implements a more sophisticated algorithm in the toy sparse
vector index: we now enumerate components by frequency (in order to
check unpopular "features" first) and also estimate a length threshold,
which can give better results than the current top-k set.
Also, this PR adds an optional `delta` parameter that enables
approximate search, which returns results with a score no more than
`delta` away from the optimal.
In order to implement this index method, the index code was slightly
adjusted to allow storing some non-key payload in the index rows. So an
index can now hold N columns, where the first K <= N columns are used
as identity (previously K was always equal to N).
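
A minimal sketch of the "first K of N columns form the identity" idea, assuming integer columns for brevity. `IndexRow`, `key_len`, and `compare_identity` are hypothetical names for illustration and do not come from the PR.

```rust
use std::cmp::Ordering;

/// Hypothetical index row holding N columns, of which the first
/// `key_len` (K) identify the row and the rest are non-key payload.
#[derive(Debug, Clone)]
pub struct IndexRow {
    pub columns: Vec<i64>,
    pub key_len: usize,
}

impl IndexRow {
    /// The identity prefix (first K columns).
    pub fn key(&self) -> &[i64] {
        let k = self.key_len.min(self.columns.len());
        &self.columns[..k]
    }

    /// The non-key payload (remaining N - K columns).
    pub fn payload(&self) -> &[i64] {
        let k = self.key_len.min(self.columns.len());
        &self.columns[k..]
    }
}

/// Orders two rows by their identity prefix only; payload is ignored.
pub fn compare_identity(a: &IndexRow, b: &IndexRow) -> Ordering {
    a.key().cmp(b.key())
}
```

With K equal to N the payload is empty and the comparison covers every column, which corresponds to the previous behavior.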

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3862
2025-11-07 08:29:47 +02:00
Preston Thorpe
29c5271c44 Merge 'Prevent DROP TABLE when table is referenced by foreign keys' from Joao Faria
## Related issue
- closes #3885
## Description
Add a check to reject dropping a table when PRAGMA foreign_keys=ON and
the table is referenced by foreign keys
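
A minimal sketch of such a check, under stated assumptions: `ForeignKey`, `check_drop_table`, and the error text are hypothetical, and this version treats any reference to the table (including a self-reference) as blocking, which may differ from the actual check in `translate_drop_table`.

```rust
/// Hypothetical, simplified foreign key description.
#[derive(Debug, Clone)]
pub struct ForeignKey {
    pub child_table: String,
    pub parent_table: String,
}

/// Rejects the drop when foreign keys are enabled and some foreign key
/// references `table` as its parent. Table names are compared
/// case-insensitively (ASCII only in this sketch).
pub fn check_drop_table(
    table: &str,
    foreign_keys_enabled: bool,
    foreign_keys: &[ForeignKey],
) -> Result<(), String> {
    if !foreign_keys_enabled {
        return Ok(());
    }
    match foreign_keys
        .iter()
        .find(|fk| fk.parent_table.eq_ignore_ascii_case(table))
    {
        Some(fk) => Err(format!(
            "cannot drop table {}: referenced by foreign key in table {}",
            table, fk.child_table
        )),
        None => Ok(()),
    }
}
```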

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3913
2025-11-06 15:10:05 -05:00
Nikita Sivukhin
a64aef780d Merge branch 'main' into toy-index-improvements 2025-11-05 20:48:16 +04:00
joao.faria
2ba643cd68 fix: prevent DROP TABLE when table is referenced by foreign keys
Add foreign key constraint check in translate_drop_table to reject
dropping tables that are referenced by foreign keys when
PRAGMA foreign_keys=ON
2025-11-04 12:32:19 -03:00
Duy Dang
d4b874cc40 Fix EXISTS on LEFT JOIN null rows 2025-11-04 22:01:18 +07:00
Pekka Enberg
2bf5eb84cf Merge 'Prevent misuse of subqueries that return multiple columns' from Jussi Saurio
Closes #3892
Closes #3888
Stuff like:
```sql
turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select case (select y, z from t2) when 1 then 'one' else 'other' end from t1;
  × Parse error: base expression in CASE must return 1 value

turso>     create table t(x, y);
    insert into t values (1, 2);
    select (select x, y from t) as result;
  × Parse error: result column must return 1 value, got 2

turso>     create table t1(x,y);
    create table t2(y);
    insert into t1 values (1,1);
    insert into t2 values (1);
    select * from t2 where y = (select x,y from t1);
  × Parse error: all arguments to binary operator = must return the same number of
  │ values. Got: (1) = (2)

turso>     create table orders(customer_id, amount);
    create table thresholds(min_amount, max_amount);
    insert into orders values (100, 50), (100, 150);
    insert into thresholds values (100, 200);
    select customer_id, sum(amount) as total 
    from orders 
    group by customer_id 
    having total > (select min_amount, max_amount from thresholds);
  × Parse error: all arguments to binary operator > must return the same number of
  │ values. Got: (1) > (2)

turso>     create table items(id);
    create table config(max_results, other_col);
    insert into items values (1), (2), (3);
    insert into config values (2, 3);
    select * from items limit (select max_results, other_col from config);
  × Parse error: limit expression must return 1 value, got 2

turso>     create table items(id);
    create table config(skip_count, other_col);
    insert into items values (1), (2), (3);
    insert into config values (1, 2);
    select * from items limit 1 offset (select skip_count, other_col from config);
  × Parse error: offset expression must return 1 value, got 2

turso>     create table items(id, name);
    create table sort_order(priority, other_col);
    insert into items values (1, 'a'), (2, 'b');
    insert into sort_order values (1, 2);
    select * from items order by (select priority, other_col from sort_order);
  × Parse error: order by expression must return 1 value, got 2

turso>     create table sales(product_id, amount);
    create table grouping(category, other_col);
    insert into sales values (1, 100), (2, 200);
    insert into grouping values (1, 2);
    select sum(amount) from sales group by (select category, other_col from grouping);
  × Parse error: group by expression must return 1 value, got 2

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select case when (select y, z from t2) then 'yes' else 'no' end from t1;
  × Parse error: when expression in CASE must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select case when x = 1 then (select y, z from t2) else 0 end from t1;
  × Parse error: then expression in CASE must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select case when x = 2 then 0 else (select y, z from t2) end from t1;
  × Parse error: else expression in CASE must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select max((select y, z from t2)) from t1;
  × Parse error: argument 0 to function call max must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select x + (select y, z from t2) from t1;
  × Parse error: all arguments to binary operator + must return the same number of
  │ values. Got: (1) + (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (5);
    insert into t2 values (1, 2);
    select * from t1 where x between (select y, z from t2) and 10;
  × Parse error: all arguments to binary operator <= must return the same number of
  │ values. Got: (2) <= (1)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select cast((select y, z from t2) as integer) from t1;
  × Parse error: argument to CAST must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values ('a', 'b');
    select (select y, z from t2) collate nocase from t1;
  × Parse error: argument to COLLATE must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select * from t1 where (select y, z from t2) is null;
  × Parse error: all arguments to binary operator IS must return the same number of
  │ values. Got: (2) IS (1)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select * from t1 where (select y, z from t2) not null;
  × Parse error: argument to NOT NULL must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values ('a', 'b');
    select * from t1 where (select y, z from t2) like 'a%';
  × Parse error: left operand of LIKE must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select -(select y, z from t2) from t1;
  × Parse error: argument to unary operator - must return 1 value. Got: (2)

turso>     create table t1(x);
    create table t2(y, z);
    insert into t1 values (1);
    insert into t2 values (1, 2);
    select abs((select y, z from t2)) from t1;
  × Parse error: argument 0 to function call abs must return 1 value. Got: (2)
  ```
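
The checks above amount to computing how many values each expression yields and rejecting mismatches. A minimal sketch follows, using a hypothetical, heavily simplified `Expr` type rather than the real AST; the error strings mirror two of the messages shown above.

```rust
/// Hypothetical, simplified expression tree.
#[derive(Debug)]
pub enum Expr {
    Literal(i64),
    Column(String),
    /// A scalar subquery with the given number of result columns.
    Subquery { result_columns: usize },
    Unary { op: &'static str, operand: Box<Expr> },
    Binary { op: &'static str, lhs: Box<Expr>, rhs: Box<Expr> },
}

/// Returns the number of values `expr` yields, or a parse-error message
/// when an operand has the wrong number of values.
pub fn value_count(expr: &Expr) -> Result<usize, String> {
    match expr {
        Expr::Literal(_) | Expr::Column(_) => Ok(1),
        Expr::Subquery { result_columns } => Ok(*result_columns),
        Expr::Unary { op, operand } => {
            let n = value_count(operand)?;
            if n != 1 {
                return Err(format!(
                    "argument to unary operator {op} must return 1 value. Got: ({n})"
                ));
            }
            Ok(1)
        }
        Expr::Binary { op, lhs, rhs } => {
            let l = value_count(lhs)?;
            let r = value_count(rhs)?;
            if l != r {
                return Err(format!(
                    "all arguments to binary operator {op} must return the same number of values. Got: ({l}) {op} ({r})"
                ));
            }
            // A binary operator yields a single value.
            Ok(1)
        }
    }
}
```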

Closes #3906
2025-11-03 13:06:38 +02:00
Pekka Enberg
9aae220509 Merge 'Optimize and refactor schema::Column type' from Preston Thorpe
closes https://github.com/tursodatabase/turso/issues/3773
## Before
```rust
#[derive(Debug, Clone)]
pub struct Column {
    pub name: Option<String>,
    pub ty: Type,
    // many sqlite operations like table_info retain the original string
    pub ty_str: String,
    pub primary_key: bool,
    pub is_rowid_alias: bool,
    pub notnull: bool,
    pub default: Option<Box<Expr>>,
    pub unique: bool,
    pub collation: Option<CollationSeq>,
    pub hidden: bool,
}
```
Obviously not ideal, so let's pack `type`, `hidden`, `primary_key`,
`is_rowid_alias`, `notnull` and `collation` into a u16.
## After:
```rust
#[derive(Debug, Clone)]
pub struct Column {
    pub name: Option<String>,
    pub ty_str: String,
    pub default: Option<Box<Expr>>,
    raw: u16,
}
```
Also saw a place to replace a `Mutex<Enum>` with `AtomicEnum`, so I
snuck that in here too
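
A minimal sketch of how such packing can work. The bit layout, the `PackedColumn` name, the `Type` variants, and the small integer collation ids are all assumptions made for illustration; the actual layout behind `raw` in this PR may differ.

```rust
// Hypothetical bit layout, chosen for illustration only.
const TYPE_MASK: u16 = 0b111; // bits 0..=2: column type
const COLL_SHIFT: u16 = 3;
const COLL_MASK: u16 = 0b11 << COLL_SHIFT; // bits 3..=4: collation id, 0 = none
const HIDDEN: u16 = 1 << 5;
const PRIMARY_KEY: u16 = 1 << 6;
const ROWID_ALIAS: u16 = 1 << 7;
const NOTNULL: u16 = 1 << 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type { Null = 0, Text = 1, Numeric = 2, Integer = 3, Real = 4, Blob = 5 }

#[derive(Debug, Clone, Copy, Default)]
pub struct PackedColumn {
    raw: u16,
}

impl PackedColumn {
    pub fn ty(&self) -> Type {
        match self.raw & TYPE_MASK {
            1 => Type::Text,
            2 => Type::Numeric,
            3 => Type::Integer,
            4 => Type::Real,
            5 => Type::Blob,
            _ => Type::Null,
        }
    }

    pub fn set_ty(&mut self, ty: Type) {
        self.raw = (self.raw & !TYPE_MASK) | ty as u16;
    }

    /// Collation id in 1..=3, or `None` when no collation is set.
    pub fn collation_id(&self) -> Option<u8> {
        match (self.raw & COLL_MASK) >> COLL_SHIFT {
            0 => None,
            n => Some(n as u8),
        }
    }

    /// Ids above 3 are truncated to two bits in this sketch.
    pub fn set_collation_id(&mut self, id: Option<u8>) {
        let bits = id.map_or(0, |n| u16::from(n) & 0b11);
        self.raw = (self.raw & !COLL_MASK) | (bits << COLL_SHIFT);
    }

    fn flag(&self, bit: u16) -> bool {
        self.raw & bit != 0
    }

    fn set_flag(&mut self, bit: u16, on: bool) {
        if on { self.raw |= bit } else { self.raw &= !bit }
    }

    pub fn hidden(&self) -> bool { self.flag(HIDDEN) }
    pub fn set_hidden(&mut self, on: bool) { self.set_flag(HIDDEN, on) }
    pub fn primary_key(&self) -> bool { self.flag(PRIMARY_KEY) }
    pub fn set_primary_key(&mut self, on: bool) { self.set_flag(PRIMARY_KEY, on) }
    pub fn is_rowid_alias(&self) -> bool { self.flag(ROWID_ALIAS) }
    pub fn set_rowid_alias(&mut self, on: bool) { self.set_flag(ROWID_ALIAS, on) }
    pub fn notnull(&self) -> bool { self.flag(NOTNULL) }
    pub fn set_notnull(&mut self, on: bool) { self.set_flag(NOTNULL, on) }
}
```

Keeping `raw` private and exposing accessors means call sites read like the old boolean fields while the struct stays two bytes wide.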

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3905
2025-11-03 13:05:35 +02:00
Jussi Saurio
1c2a8e62ca Fix: return error on provided insert column count mismatch 2025-11-03 11:41:50 +02:00
Jussi Saurio
005d922ab4 Fix: prevent misuse of subqueries that return multiple columns 2025-11-03 11:04:09 +02:00
PThorpe92
481d86f567 Optimize and refactor schema::Column type 2025-11-02 20:46:02 -05:00
pedrocarlo
28c52cdf09 pass the left select in compound select to correctly choose the collation sequence 2025-11-02 11:26:48 -03:00
Pekka Enberg
913b7ac600 core: Disable autovacuum by default
People have discovered various bugs in autovacuum so let's disable it by
default for now.
2025-11-02 12:09:21 +02:00
Pekka Enberg
c091f94de8 Merge 'Fix INSERT UNION ALL' from Duy Dang
Close #3849
Close #3855

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3877
2025-11-01 11:12:38 +02:00
Pekka Enberg
7283f35a29 Merge 'Fix LEFT JOIN subqueries reusing stale right-side values' from Duy Dang
Close #3867

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3874
2025-11-01 11:12:24 +02:00
Duy Dang
e0f6b7cffe Fix INSERT handling for compound VALUES sources 2025-11-01 02:27:42 +07:00
Duy Dang
4b18e3bab5 Fix VALUES UNION ALL register reuse during INSERTs 2025-11-01 02:01:30 +07:00
Duy Dang
3ee47a2c3c Fix LEFT JOIN subqueries reusing stale right-side values 2025-11-01 01:24:31 +07:00
RS2007
7fff8daaa5 Fixing clippy error 2025-10-31 23:14:08 +05:30
RS2007
1f576593ec adding duplicate cte name checks in planner.rs 2025-10-31 23:14:08 +05:30
Nikita Sivukhin
20ba6990a9 ignore "virtual" index entries corresponding to the index_methods from integrity check 2025-10-31 14:25:59 +04:00
Duy Dang
733dc762ed Fix self-insert SUM when table uses INTEGER PRIMARY KEY 2025-10-31 03:34:10 +07:00
Jussi Saurio
6cf2072b51 translate: disallow correlated subqueries in HAVING and ORDER BY
These are supported by SQLite, but we cannot handle them correctly yet.
2025-10-29 15:37:19 +02:00
Jussi Saurio
4bf8ad8cfd Merge 'Support subqueries in all positions of a SELECT statement' from Jussi Saurio
Follow-up to #3847.
Adds support for subqueries in all other positions of a SELECT (the
result list, GROUP BY, ORDER BY, HAVING, LIMIT, OFFSET).
Turns out I am a sql noob and didn't realize that correlated subqueries
are supported in basically all positions except LIMIT/OFFSET, so added
support for those too + accompanying TCL tests.
Thankfully the abstractions introduced in #3847 carry over to this very
well so the code change is relatively small (over half of the diff is
tests and a lot of the remaining diff is just moving logic around).

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3852
2025-10-29 10:19:39 +02:00
Jussi Saurio
fcb927ed24 Merge 'Initialize LIMIT after ORDER BY / GROUP BY initialization' from Jussi Saurio
Closes #3853
Currently LIMIT 0 jumps to "after the main loop", and it is done before
ORDER BY and GROUP BY cursor have had a chance to be initialized, which
causes a panic.
Simplest fix for now is to delay the LIMIT initialization.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3854
2025-10-29 10:17:05 +02:00
Jussi Saurio
29fe3b585a Add more tests and disable correlated IN-subqueries in HAVING position
I discovered a flaw in our current translation that makes queries of type
HAVING foo IN (SELECT ...) not work properly - in these cases we need to
defer translation of the subquery until later.

I will fix this in a future PR because I suspect it's not trivial.
2025-10-29 09:57:55 +02:00
Jussi Saurio
ad723b615f Merge 'index_method: fully integrate into query planner' from Nikita Sivukhin
This PR completely integrates custom indices into the query planner.
To do that, a new `Cursor::IndexMethod` is introduced, with a few
related changes in the VM implementation:
1. Added special `IndexMethod{Create,Destroy,Query}` opcodes to handle
index method creation, deletion and query
2. `Next`, `IdxRowid`, `IdxInsert`, `IdxDelete` opcodes updated to
properly handle the new cursor case

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3827
2025-10-29 09:42:37 +02:00
Pekka Enberg
810ed8ad60 Merge 'Don't allow autovacuum to be flipped on non-empty databases' from Pavan Nambi
Turso incorrectly creates the first table in an autovacuumed database on
page 2.
(Note: this is in collaboration with @LeMikaelF)
SQLite does not allow enabling or disabling auto-vacuum after the first
table has been created
(https://sqlite.org/pragma.html#pragma_auto_vacuum). This is because the
sequence of the pages in the databases is different when auto-vacuum is
enabled, because the first b-tree page must be page 3 instead of 2, to
make room for the first [Pointer Map
page](https://sqlite.org/fileformat.html#pointer_map_or_ptrmap_pages).
But Turso doesn't currently consider this, which can lead to data loss.
The simplest way to reproduce this is to create an autovacuumed
database with `pragma auto_vacuum=full`, so that autovacuum runs
on each commit, and then create a table with some data. Turso will
incorrectly create the new table on page 2. After this, every time a new
page is created, either through a page split or because a new table is
created, Turso will write a 5-byte pointer in page 2, starting from the
top of the page, thereby overwriting existing data.
For example, let's start with a clean database and the first bytes of
page 2. It starts with `0d`, the discriminator for a leaf page
([source](https://www.sqlite.org/fileformat.html#b_tree_pages)). The
next interesting number is the number of cells contained in this page
(`01`) at offset 5.
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');

$ dbtotxt /tmp/a.db
| size 8192 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
| end a.db
```
Pointer map pages are located every N pages, starting from page 2, and
contain a list of 5-byte pointers that represent the parent page of a
certain page. So whenever Turso or SQLite needs to add a page, it will
overwrite 5 bytes of page 2. This means that for data loss to occur, it
is sufficient to add a single page to the database, for example by
creating a table. Offset 5 will then be zeroed out:
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');
turso> pragma auto_vacuum=full;
turso> create table tt(a);

$ dbtotxt /tmp/a.db
| size 12288 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 01 00 00 00 00 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
```
Creating more tables, or adding more B-tree pages, will keep overwriting
the rest of the page, until the cells themselves are also overwritten.
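
A minimal sketch of where a page's pointer map entry lives, based on the layout described in the SQLite file format documentation. The function names are hypothetical, and the sketch ignores the adjustment SQLite makes around the pending byte page. The entry type value 1 is SQLite's code for a b-tree root page.

```rust
/// One pointer map entry: 1 type byte + 4-byte big-endian parent page number.
const PTRMAP_ENTRY_SIZE: u32 = 5;

/// Page number of the ptrmap page that covers `page_no` (1-based, >= 2).
/// Simplification: no special handling of the pending byte page.
pub fn ptrmap_page_for(page_no: u32, usable_size: u32) -> u32 {
    assert!(page_no >= 2, "page 1 has no ptrmap entry");
    // Each group is one ptrmap page followed by the pages it describes.
    let pages_per_group = usable_size / PTRMAP_ENTRY_SIZE + 1;
    let group = (page_no - 2) / pages_per_group;
    group * pages_per_group + 2
}

/// Byte offset of the entry for `page_no` inside its ptrmap page,
/// or `None` if `page_no` is itself a ptrmap page.
pub fn ptrmap_entry_offset(page_no: u32, usable_size: u32) -> Option<u32> {
    let map_page = ptrmap_page_for(page_no, usable_size);
    if map_page == page_no {
        None
    } else {
        Some((page_no - map_page - 1) * PTRMAP_ENTRY_SIZE)
    }
}

/// Encodes a single ptrmap entry.
pub fn encode_ptrmap_entry(entry_type: u8, parent_page: u32) -> [u8; 5] {
    let p = parent_page.to_be_bytes();
    [entry_type, p[0], p[1], p[2], p[3]]
}
```

With a 4096-byte page, the entry for page 3 sits at offset 0 of page 2, and a root-page entry with no parent encodes as `01 00 00 00 00`, which matches the first five bytes of page 2 in the second dump above.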
## Reproducing the issue in the simulator
We have been unable to reproduce this exact corruption mode in the
simulator, but patching it shows many failure modes, none of which
occur with the unpatched simulator. The following seeds show the issue
when the patched simulator is run against `main`:
- `11522841279124073062`, with "Assertion 'table inquisitive_graham_159
should contain all of its expected values' failed: table
inquisitive_graham_159 does not contain the expected values, the
simulator model has more rows than the database"
- `7057400018220918989`, `16028085350691325843`, `7721542713659053944`,
and `203017821863546118`, with "Failed to read ptrmap key=XXX"
- `12533694709304969540`, `18357088553315413457`, `3108945730906932377`,
with "Integrity Check Failed: Cell N in page 2 is out of range."
- `4757352625344646473`, with "dirty pages should be empty for read
txn"
- `7083498604824302257`, with "header_size: 6272, header_len_bytes: 2,
payload.len(): 13"
- `17881876827470741581`, with "ParseError("no such table:
focused_historians_416")"
- `2092231500503735693`, with "range end index 4789 out of range for
slice of length 4096"
- `7555257419378470845`, with "malformed database schema
(imaginative_ontivero\u{1})"
- `12905270229511147245`, with "index out of bounds: the len is 4096 but
the index is 4096"
## Fixing the issue
- When DB is opened, we read the `auto_vacuum` state, instead of
assuming `auto_vacuum=none`.
- Don't allow auto_vacuum to be flipped on non-empty databases, because
allowing it could cause the ptrmap to overlap with, and overwrite,
existing data.
- Modify integrity check to avoid reporting that page 2 is orphaned in
auto-vacuumed databases.
Fixes #3752

Closes #3830
2025-10-28 14:48:35 +02:00
Jussi Saurio
4e48e1ffad Make an exception for Expr::SubqueryResult in collect_result_columns() 2025-10-28 13:11:12 +02:00
Jussi Saurio
c80cf2831d Support subqueries in all positions of a SELECT statement 2025-10-28 13:11:12 +02:00
Jussi Saurio
49ee5529cb Evaluate uncorrelated subqueries as early as possible
even LIMIT can reference an uncorrelated subquery, so we need to translate
them before we do anything with LIMIT.
2025-10-28 13:11:11 +02:00
Jussi Saurio
3294b78051 Initialize LIMIT after ORDER BY / GROUP BY initialization
Currently LIMIT 0 jumps to "after the main loop", and it is done
before ORDER BY and GROUP BY cursor have had a chance to be initialized,
which causes a panic.

Simplest fix for now is to delay the LIMIT initialization.
2025-10-28 13:08:05 +02:00
Nikita Sivukhin
0da3b4bfd3 fix after rebase 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
bec295f2c0 fix clippy 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8ea733f917 fix bug with cursor allocation 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8acbe3de66 make query_start method to return bool - if result will have some rows or not 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
e42ce24534 fix fmt 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
67c1855ba8 fix bug 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
180713d32a plug IndexMethod into optimizer 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
56796151bc support necessary helpers 2025-10-28 11:27:35 +04:00