Commit Graph

10620 Commits

Author SHA1 Message Date
Nikita Sivukhin
35c323730c add test to reproduce the bug with cached cursors for statement in between of different runs
thread 'query_processing::test_read_path::test_stmt_reset' panicked at core/storage/sqlite3_ondisk.rs:754:9:
assertion failed: self.page_type() == PageType::TableLeaf
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test query_processing::test_read_path::test_stmt_reset ... FAILED
2025-10-29 15:13:00 +04:00
Jussi Saurio
4bf8ad8cfd Merge 'Support subqueries in all positions of a SELECT statement' from Jussi Saurio
Follow-up to #3847.
Adds support for subqueries in all other positions of a SELECT (the
result list, GROUP BY, ORDER BY, HAVING, LIMIT, OFFSET).
Turns out I am a sql noob and didn't realize that correlated subqueries
are supported in basically all positions except LIMIT/OFFSET, so added
support for those too + accompanying TCL tests.
Thankfully the abstractions introduced in #3847 carry over to this very
well so the code change is relatively small (over half of the diff is
tests and a lot of the remaining diff is just moving logic around).

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3852
2025-10-29 10:19:39 +02:00
Jussi Saurio
fcb927ed24 Merge 'Initialize LIMIT after after ORDER BY / GROUP BY initialization' from Jussi Saurio
Closes #3853
Currently LIMIT 0 jumps to "after the main loop", and it does so before
the ORDER BY and GROUP BY cursors have had a chance to be initialized,
which causes a panic.
Simplest fix for now is to delay the LIMIT initialization.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3854
2025-10-29 10:17:05 +02:00
Jussi Saurio
29fe3b585a Add more tests and disable correlated IN-subqueries in HAVING position
I discovered a flaw in our current translation that makes queries of type
HAVING foo IN (SELECT ...) not work properly - in these cases we need to
defer translation of the subquery until later.

I will fix this in a future PR because I suspect it's not trivial.
2025-10-29 09:57:55 +02:00
Jussi Saurio
ad723b615f Merge 'index_method: fully integrate into query planner' from Nikita Sivukhin
This PR fully integrates custom indices into the query planner.
To do that, a new `Cursor::IndexMethod` is introduced, along with a few
related changes in the VM implementation:
1. Added special `IndexMethod{Create,Destroy,Query}` opcodes to handle
index method creation, deletion and querying
2. Updated the `Next`, `IdxRowid`, `IdxInsert` and `IdxDelete` opcodes to
properly handle the new cursor case

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3827
2025-10-29 09:42:37 +02:00
Pekka Enberg
067c4f624b Turso 0.3.0-pre.5 2025-10-28 14:49:34 +02:00
Pekka Enberg
dae2930743 Merge 'core: Switch to FxHash to improve performance' from Pekka Enberg
The default Rust hash map uses SipHash, which is slow for integer keys.
Switch to FxHash instead to reduce the number of executed instructions,
for example in the write-throughput benchmark (about 3.7% fewer
instructions in the runs below).
Before:
```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,106875.21

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,908.02 msec task-clock                       #    0.310 CPUs utilized
            30,508      context-switches                 #   10.491 K/sec
               261      cpu-migrations                   #   89.752 /sec
               813      page-faults                      #  279.572 /sec
    20,655,313,128      instructions                     #    1.73  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,930,088,949      cycles                           #    4.102 GHz
     2,845,040,381      stalled-cycles-frontend          #   23.85% frontend cycles idle
     3,814,652,892      branches                         #    1.312 G/sec
        54,760,600      branch-misses                    #    1.44% of all branches

       9.372979876 seconds time elapsed

       2.276835000 seconds user
       0.530135000 seconds sys
```
After:
```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,108663.84

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,838.65 msec task-clock                       #    0.308 CPUs utilized
            30,629      context-switches                 #   10.790 K/sec
               351      cpu-migrations                   #  123.650 /sec
               818      page-faults                      #  288.165 /sec
    19,887,102,451      instructions                     #    1.72  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,593,166,024      cycles                           #    4.084 GHz
     2,830,298,617      stalled-cycles-frontend          #   24.41% frontend cycles idle
     3,764,334,333      branches                         #    1.326 G/sec
        53,157,766      branch-misses                    #    1.41% of all branches

       9.218225731 seconds time elapsed

       2.231889000 seconds user
       0.508785000 seconds sys

```
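For readers unfamiliar with Fx-style hashing, here is a minimal sketch of the idea using only the standard library. It is not the code from this PR: `FxLikeHasher`, `FxLikeMap` and `page_cache_demo` are hypothetical names, and the constant and rotate-xor-multiply step are modeled on the scheme used by the `rustc-hash` crate, which is an assumption about what "FxHash" refers to here.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// 64-bit multiplier, assumed to match the classic Fx scheme.
const SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

/// Hypothetical Fx-style hasher: one rotate, one xor and one multiply per word.
#[derive(Default, Clone, Copy)]
struct FxLikeHasher {
    hash: u64,
}

impl FxLikeHasher {
    fn add_to_hash(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for FxLikeHasher {
    // Generic path: feed the input in 8-byte little-endian chunks, zero-padded.
    fn write(&mut self, bytes: &[u8]) {
        for chunk in bytes.chunks(8) {
            let mut buf = [0u8; 8];
            buf[..chunk.len()].copy_from_slice(chunk);
            self.add_to_hash(u64::from_le_bytes(buf));
        }
    }

    // Fast path taken by u64 keys: a single mixing step.
    fn write_u64(&mut self, i: u64) {
        self.add_to_hash(i);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// A HashMap that uses the hasher above instead of the default SipHash.
type FxLikeMap<K, V> = HashMap<K, V, BuildHasherDefault<FxLikeHasher>>;

/// Tiny usage example with integer keys (page numbers, say).
fn page_cache_demo() -> FxLikeMap<u64, &'static str> {
    let mut map = FxLikeMap::default();
    map.insert(1, "page 1");
    map.insert(2, "page 2");
    map
}
```

The trade-off is that such a hasher offers no protection against adversarially chosen keys, which is usually acceptable for internal integer keys such as page numbers.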

Closes #3837
2025-10-28 14:49:09 +02:00
Pekka Enberg
76da008bc2 Merge 'bindings/rust: Enable mimalloc as global allocator' from Pekka Enberg
This improves performance by using mimalloc for memory allocation in the
Rust bindings.

Closes #3839
2025-10-28 14:49:02 +02:00
Pekka Enberg
810ed8ad60 Merge 'Don't allow autovacuum to be flipped on non-empty databases' from Pavan Nambi
Turso incorrectly creates the first table of an auto-vacuumed database
on page 2.
(Note: this is in collaboration with @LeMikaelF)
SQLite does not allow enabling or disabling auto-vacuum after the first
table has been created
(https://sqlite.org/pragma.html#pragma_auto_vacuum). This is because the
sequence of the pages in the database is different when auto-vacuum is
enabled, because the first b-tree page must be page 3 instead of 2, to
make room for the first [Pointer Map
page](https://sqlite.org/fileformat.html#pointer_map_or_ptrmap_pages).
But Turso doesn't currently consider this, which can lead to data loss.
The simplest way to reproduce this is to create an auto-vacuumed
database with `pragma auto_vacuum=full`, so that autovacuum runs on
each commit, and then create a table with some data. Turso will
incorrectly create the new table on page 2. After this, every time a new
page is created, either through a page split or because a new table is
created, Turso will write a 5-byte pointer in page 2, starting from the
top of the page, thereby overwriting existing data.
For example, let's start with a clean database and look at the first
bytes of page 2. It starts with `0d`, the discriminator for a table leaf
page ([source](https://www.sqlite.org/fileformat.html#b_tree_pages)). The
next interesting number is the number of cells contained in this page
(`00 01`), a 2-byte big-endian integer at offset 3.
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');

$ dbtotxt /tmp/a.db
| size 8192 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
| end a.db
```
Pointer map pages are located every N pages, starting from page 2, and
contain a list of 5-byte pointers that represent the parent page of a
certain page. So whenever Turso or SQLite needs to add a page, it will
overwrite 5 bytes of page 2. This means that for data loss to occur, it
is sufficient to add a single page to the database, for example by
creating a table. The first 5 bytes of page 2, including the page type
and the cell count, are then overwritten:
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');
turso> pragma auto_vacuum=full;
turso> create table tt(a);

$ dbtotxt /tmp/a.db
| size 12288 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 01 00 00 00 00 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
```
Creating more tables, or adding more B-tree pages, will keep overwriting
the rest of the page, until the cells themselves are also overwritten.
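The overwrite above can be reproduced on paper with a small sketch. It assumes the entry layout described in the SQLite file format documentation (a 1-byte entry type followed by a 4-byte big-endian parent page number, with type 1 meaning "b-tree root page"); the function names are hypothetical and not taken from the Turso code base.

```rust
/// Size of one pointer-map entry: 1 type byte + 4-byte big-endian parent page.
const PTRMAP_ENTRY_SIZE: usize = 5;
/// Entry type for a b-tree root page (its parent page number is 0).
const PTRMAP_ROOTPAGE: u8 = 1;

/// Byte offset, inside the pointer-map page `ptrmap_pgno`, of the entry for `pgno`.
/// The first page after the pointer-map page gets offset 0.
fn ptrmap_entry_offset(ptrmap_pgno: u32, pgno: u32) -> usize {
    PTRMAP_ENTRY_SIZE * (pgno - ptrmap_pgno - 1) as usize
}

/// Write the 5-byte entry for `pgno` into the bytes of the pointer-map page.
fn write_ptrmap_entry(page: &mut [u8], ptrmap_pgno: u32, pgno: u32, entry_type: u8, parent: u32) {
    let off = ptrmap_entry_offset(ptrmap_pgno, pgno);
    page[off] = entry_type;
    page[off + 1..off + 5].copy_from_slice(&parent.to_be_bytes());
}

/// Start from the first 8 bytes of page 2 in the first dump (a table leaf page
/// holding one cell) and record page 3 as a new root page, as creating table
/// `tt` does once page 2 is treated as a pointer-map page.
fn page2_after_new_root() -> [u8; 8] {
    let mut page2 = [0x0d, 0x00, 0x00, 0x00, 0x01, 0x0f, 0xf5, 0x00];
    write_ptrmap_entry(&mut page2, 2, 3, PTRMAP_ROOTPAGE, 0);
    page2
}
```

`page2_after_new_root()` yields `01 00 00 00 00 0f f5 00`, which matches the first bytes of page 2 in the second dump: the page type and the cell count are gone, while the rest of the header is untouched for now.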
## Reproducing the issue in the simulator
We have been unable to reproduce this exact corruption mode in the
simulator, but patching it shows many failure modes, none of which
occur with the unpatched simulator. The following seeds show the issue
when the patched simulator is run against `main`:
- `11522841279124073062`, with "Assertion 'table inquisitive_graham_159
should contain all of its expected values' failed: table
inquisitive_graham_159 does not contain the expected values, the
simulator model has more rows than the database"
- `7057400018220918989`, `16028085350691325843`, `7721542713659053944`,
and `203017821863546118`, with "Failed to read ptrmap key=XXX"
- `12533694709304969540`, `18357088553315413457`, `3108945730906932377`,
with "Integrity Check Failed: Cell N in page 2 is out of range."
- `4757352625344646473`, with "dirty pages should be empty for read
txn"
- `7083498604824302257`, with "header_size: 6272, header_len_bytes: 2,
payload.len(): 13"
- `17881876827470741581`, with "ParseError("no such table:
focused_historians_416")"
- `2092231500503735693`, with "range end index 4789 out of range for
slice of length 4096"
- `7555257419378470845`, with "malformed database schema
(imaginative_ontivero\u{1})"
- `12905270229511147245`, with "index out of bounds: the len is 4096 but
the index is 4096"
## Fixing the issue
- When the DB is opened, we read the `auto_vacuum` state instead of
assuming `auto_vacuum=none`.
- Don't allow `auto_vacuum` to be flipped on non-empty databases:
allowing it would let the ptrmap overwrite existing data.
- Modify integrity check to avoid reporting that page 2 is orphaned in
auto-vacuumed databases.
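The second bullet can be sketched as a small guard. This is an illustration only: `AutoVacuumMode` and `effective_auto_vacuum` are hypothetical names, the rule shown mirrors what the SQLite documentation describes (switching between "none" and an auto-vacuum mode only works before any table exists), and silently keeping the current mode rather than returning an error is an assumption about behavior, not something this PR states.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AutoVacuumMode {
    None,
    Full,
    Incremental,
}

/// Returns the mode the database ends up in after `PRAGMA auto_vacuum = requested`.
/// `has_tables` is a stand-in for "the database is non-empty".
fn effective_auto_vacuum(
    current: AutoVacuumMode,
    requested: AutoVacuumMode,
    has_tables: bool,
) -> AutoVacuumMode {
    // A "flip" turns auto-vacuum on or off, which changes whether page 2
    // is a pointer-map page or an ordinary b-tree page.
    let flips = (current == AutoVacuumMode::None) != (requested == AutoVacuumMode::None);
    if has_tables && flips {
        current // refuse: the page layout is already fixed
    } else {
        requested
    }
}
```

With this rule, the reproduction above (create a table, then `pragma auto_vacuum=full`) would leave the database in `None` mode, so page 2 keeps being treated as a regular b-tree page.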
Fixes #3752

Closes #3830
2025-10-28 14:48:35 +02:00
Jussi Saurio
ec1eac2943 Include subqueries in all positions in subquery fuzz test 2025-10-28 14:32:55 +02:00
Jussi Saurio
ca70df21ac Update COMPAT.md 2025-10-28 13:11:12 +02:00
Jussi Saurio
5fa73679f3 Add TCL tests for subqueries in all positions of a SELECT 2025-10-28 13:11:12 +02:00
Jussi Saurio
4e48e1ffad Make an exception for Expr::SubqueryResult in collect_result_columns() 2025-10-28 13:11:12 +02:00
Jussi Saurio
c80cf2831d Support subqueries in all positions of a SELECT statement 2025-10-28 13:11:12 +02:00
Jussi Saurio
49ee5529cb Evaluate uncorrelated subqueries as early as possible
even LIMIT can reference an uncorrelated subquery, so we need to translate
them before we do anything with LIMIT.
2025-10-28 13:11:11 +02:00
Jussi Saurio
3294b78051 Initialize LIMIT after after ORDER BY / GROUP BY initialization
Currently LIMIT 0 jumps to "after the main loop", and it does so
before the ORDER BY and GROUP BY cursors have had a chance to be
initialized, which causes a panic.

Simplest fix for now is to delay the LIMIT initialization.
2025-10-28 13:08:05 +02:00
Nikita Sivukhin
0da3b4bfd3 fix after rebase 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
bec295f2c0 fix clippy 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
e7cab016d4 fix tests 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8ea733f917 fix bug with cursor allocation 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
6f62621b5e adjust test more 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
18989185d4 add simple fuzz test 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8acbe3de66 make query_start method to return bool - if result will have some rows or not 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
e42ce24534 fix fmt 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
67c1855ba8 fix bug 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
6206294584 fix clippy 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
180713d32a plug IndexMethod into optimizer 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
d6972a9cf3 fix explain 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
56796151bc support necessary helpers 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
212bcfe08f integrate IndexMethod into select main loop 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
61c9279a57 properly translate column which was covered by index method 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
d9ea3be4b8 forbid usage of IndexMethod in insert/delete loops 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
d65b7eddc0 add helper for simple binding of values in the AST 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
35b96ae8d8 fix few places which needs to be hooked into new types 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
8dd2644c07 add support for new cursor type in existing op codes and also implement new opcodes in the VM 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
e9b1ca12b6 add new access operation through IndexMethod 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
37de39e5d1 integrate IndexMethod to the insert/delete flow 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
b994e2cbd8 add new Cursor type 2025-10-28 11:27:35 +04:00
Nikita Sivukhin
5af10e6ccb add IndexMethod specific VM instructions 2025-10-28 11:27:35 +04:00
Jussi Saurio
f593fd1a8d remove deprecated flag from TempDatabase::new_empty() usage in fuzz test 2025-10-28 09:10:05 +02:00
Jussi Saurio
dae2441dd1 Fix compilation error after incompatible merges 2025-10-28 07:05:18 +02:00
Jussi Saurio
d993ac8157 Merge 'index_method: implement basic trait and simple toy index' from Nikita Sivukhin
This PR adds the `index_method` trait and an implementation of a toy
sparse vector index.
To keep the PR lightweight, index methods are not yet deeply integrated
into the query planner; only the components needed to make integration
tests that use the `index_method` API directly work are added.
Primary changes introduced in this PR are:
1. `SymbolTable` is extended with an `index_methods` field, and builtin
extensions are populated with 2 native indices: `backing_btree` and
`toy_vector_sparse_ivf`
2. The `Index` struct is extended with an `index_method` field which holds
the `IndexMethodAttachment` constructed for the table with given
parameters from the `IndexMethod` "factory" trait
The toy index implementation stores inverted index pairs `(dimension,
rowid)` in an auxiliary BTree index. This index uses the special
`backing_btree` index_method, which is marked as `backing_btree: true`
and treated in a special way by the db core: it is a real BTree index
which is not managed by the tursodb core and must be managed by the
index_method that created it (so that index_method is responsible for
data population, creation and destruction of this btree).
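To make the inverted-index idea concrete, here is a minimal in-memory sketch. It uses a `BTreeSet` as a stand-in for the backing BTree index; `ToySparseIndex` and its methods are hypothetical and do not reflect the actual `IndexMethod` trait or `toy_vector_sparse_ivf` implementation.

```rust
use std::collections::BTreeSet;

/// Ordered set of `(dimension, rowid)` pairs, one pair per non-zero
/// component of each indexed sparse vector.
struct ToySparseIndex {
    pairs: BTreeSet<(u32, i64)>,
}

impl ToySparseIndex {
    fn new() -> Self {
        ToySparseIndex { pairs: BTreeSet::new() }
    }

    /// Index a row: `dims` lists the dimensions where its vector is non-zero.
    fn insert(&mut self, rowid: i64, dims: &[u32]) {
        for &d in dims {
            self.pairs.insert((d, rowid));
        }
    }

    /// Remove a row; the caller supplies the same dimensions used on insert.
    fn delete(&mut self, rowid: i64, dims: &[u32]) {
        for &d in dims {
            self.pairs.remove(&(d, rowid));
        }
    }

    /// Rowids sharing at least one non-zero dimension with the query,
    /// found with one range scan per query dimension. Sorted, no duplicates.
    fn candidates(&self, query_dims: &[u32]) -> Vec<i64> {
        let mut out = BTreeSet::new();
        for &d in query_dims {
            for &(_, rowid) in self.pairs.range((d, i64::MIN)..=(d, i64::MAX)) {
                out.insert(rowid);
            }
        }
        out.into_iter().collect()
    }
}
```

Because the pairs are ordered by dimension first, all rows for one dimension sit next to each other, which is what makes a BTree a reasonable backing store for this kind of index.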

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3846
2025-10-28 07:01:36 +02:00
Jussi Saurio
9c87b20cb2 Merge 'Where clause subquery support' from Jussi Saurio
Closes #1282
# Support for WHERE clause subqueries
This PR implements support for subqueries that appear in the WHERE
clause of SELECT statements.
## What are those lol
1. **EXISTS subqueries**: `WHERE EXISTS (SELECT ...)`
2. **Row value subqueries**: `WHERE x = (SELECT ...)` or `WHERE (x, y) =
(SELECT ...)`. The latter are not yet supported - only the single-column
("scalar subquery") case is.
3. **IN subqueries**: `WHERE x IN (SELECT ...)` or `WHERE (x, y) IN
(SELECT ...)`
## Correlated vs Uncorrelated Subqueries
- **Uncorrelated subqueries** reference only their own tables and can be
evaluated once.
- **Correlated subqueries** reference columns from the outer query
(e.g., `WHERE EXISTS (SELECT * FROM t2 WHERE t2.id = t1.id)`) and must
be re-evaluated for each row of the outer query
## Implementation
### Planning
During query planning, the WHERE clause is walked to find subquery
expressions (`Expr::Exists`, `Expr::Subquery`, `Expr::InSelect`). Each
subquery is:
1. Assigned a unique internal ID
2. Compiled into its own `SelectPlan` with outer query tables provided
as available references
3. Replaced in the AST with an `Expr::SubqueryResult` node that
references the subquery with its internal ID
4. Stored in a `Vec<NonFromClauseSubquery>` on the `SelectPlan`
For IN subqueries, an ephemeral index is created to store the subquery
results; for other kinds, the results are stored in register(s).
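The rewrite step can be sketched on a toy AST. Only the names `Expr::Exists`, `Expr::SubqueryResult` and `NonFromClauseSubquery` come from the description above; the shapes of these types here are simplified assumptions (for instance, the subquery is kept as SQL text instead of a compiled `SelectPlan`), and `extract_subqueries` is a hypothetical helper.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    /// `EXISTS (SELECT ...)`; the SQL text stands in for a parsed SELECT.
    Exists(String),
    /// Placeholder left in the AST, pointing at a planned subquery by ID.
    SubqueryResult(usize),
}

#[derive(Debug, PartialEq)]
struct NonFromClauseSubquery {
    id: usize,
    sql: String,
}

/// Walk the WHERE expression, move every subquery into `out`, and leave a
/// `SubqueryResult` node carrying its ID in its place.
fn extract_subqueries(expr: Expr, out: &mut Vec<NonFromClauseSubquery>) -> Expr {
    match expr {
        Expr::Exists(sql) => {
            let id = out.len(); // IDs are unique within one plan
            out.push(NonFromClauseSubquery { id, sql });
            Expr::SubqueryResult(id)
        }
        Expr::Eq(l, r) => Expr::Eq(
            Box::new(extract_subqueries(*l, out)),
            Box::new(extract_subqueries(*r, out)),
        ),
        Expr::And(l, r) => Expr::And(
            Box::new(extract_subqueries(*l, out)),
            Box::new(extract_subqueries(*r, out)),
        ),
        other => other,
    }
}
```

After the walk, the rest of the planner only ever sees `SubqueryResult` nodes, so expression translation does not need to know how each subquery is evaluated.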
### Translation
Before emitting bytecode, we need to determine when each subquery should
be evaluated:
- **Uncorrelated**: Evaluated once before opening any table cursors
- **Correlated**: Evaluated at the appropriate nested loop depth after
all referenced outer tables are in scope
This is calculated by examining which outer query tables the subquery
references and finding the right-most (innermost) loop that opens those
tables - using similar mechanisms that we use for figuring out when to
evaluate other `WhereTerm`s too.
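A minimal sketch of that placement rule, assuming the join order is known as a list of table names and that each subquery comes with the list of outer tables it references. `evaluation_loop` is a hypothetical helper; the real code works with table masks rather than names.

```rust
/// Decide where a subquery should be evaluated.
///
/// Returns `None` for an uncorrelated subquery (evaluate once, before any
/// loop is opened), or `Some(i)` where `i` is the position in `join_order`
/// of the innermost loop that opens a referenced outer table.
/// Names that do not appear in `join_order` are ignored.
fn evaluation_loop(join_order: &[&str], referenced_outer_tables: &[&str]) -> Option<usize> {
    referenced_outer_tables
        .iter()
        .filter_map(|table| join_order.iter().position(|joined| joined == table))
        .max()
}
```

For a join order of `t1, t2, t3`, a subquery referencing only `t1` can run inside the outermost loop, while one referencing both `t1` and `t3` has to wait until the `t3` loop is open.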
### Code Generation
- **EXISTS**: Sets a register to 1 if any row is produced, 0 otherwise.
Has new `QueryDestination::ExistsSubqueryResult` variant.
- **IN**: Results stored in an ephemeral index and the index is probed.
- **RowValue**: Results stored in a range of registers. Has new
`QueryDestination::RowValueSubqueryResult` variant.
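The three result shapes can be modeled in plain Rust to show what the generated bytecode has to compute. This is a semantic sketch only: rows are vectors of integers, the ephemeral index is a `BTreeSet`, NULL handling in IN comparisons is ignored, and all function names are hypothetical.

```rust
use std::collections::BTreeSet;

type Row = Vec<i64>;

/// EXISTS: the destination register holds 1 if any row was produced, else 0.
fn exists_register(rows: &[Row]) -> i64 {
    i64::from(!rows.is_empty())
}

/// IN: materialize the subquery rows into an ordered set (standing in for
/// the ephemeral index), then probe it with the left-hand side value(s).
fn in_probe(rows: &[Row], probe: &Row) -> bool {
    let ephemeral_index: BTreeSet<Row> = rows.iter().cloned().collect();
    ephemeral_index.contains(probe)
}

/// Row value: copy the first row into `width` registers. With no rows the
/// registers stay NULL (modeled as `None`), as a scalar subquery does in SQLite.
fn row_value_registers(rows: &[Row], width: usize) -> Vec<Option<i64>> {
    match rows.first() {
        Some(row) => row.iter().take(width).map(|v| Some(*v)).collect(),
        None => vec![None; width],
    }
}
```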
## Annoying details
### Which cursor to read from in a subquery?
Sometimes a query will use a covering index, i.e. skip opening the table
cursor at all if the index contains All The Needed Stuff.
Correlated subqueries reading columns from outer tables is a bit
problematic in this regard: with our current translation code, the
subquery doesn't know whether the outer query opened a table cursor,
index cursor, or both. So, for now, we try to find a table cursor first,
then fall back to finding any index cursor for that table.
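The fallback described above amounts to a two-pass lookup. The sketch below assumes the translator can list the cursors the outer query has opened; `OpenCursor`, `CursorKind` and `cursor_for_outer_table` are hypothetical names used for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum CursorKind {
    Table,
    Index,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct OpenCursor {
    table: &'static str,
    kind: CursorKind,
    id: usize,
}

/// Pick the cursor a correlated subquery should read an outer column from:
/// prefer the table cursor, otherwise fall back to any index cursor opened
/// for the same table (the covering-index case).
fn cursor_for_outer_table(open: &[OpenCursor], table: &str) -> Option<usize> {
    let find = |kind: CursorKind| {
        open.iter()
            .find(|c| c.table == table && c.kind == kind)
            .map(|c| c.id)
    };
    find(CursorKind::Table).or_else(|| find(CursorKind::Index))
}
```

Note that falling back to "any" index cursor only works if that index actually contains the requested column, which is why the description calls this a "for now" solution.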

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3847
2025-10-28 06:36:55 +02:00
Preston Thorpe
ccaf39de93 Merge 'index method syntax extension' from Nikita Sivukhin
Add support for an index method syntax extension (similar to PostgreSQL)
and hide it for now behind the `--experimental-index-method` flag
```sh
$> cargo run --package turso_cli -- --experimental-index-method
turso> CREATE TABLE t(x);
turso> CREATE INDEX t_idx ON t USING index_method (x) WITH (a = 1, b = '2');
turso> SELECT * FROM sqlite_master;
┌───────┬───────┬──────────┬──────────┬───────────────────────────────────────────────────────────────────────┐
│ type  │ name  │ tbl_name │ rootpage │ sql                                                                   │
├───────┼───────┼──────────┼──────────┼───────────────────────────────────────────────────────────────────────┤
│ table │ t     │ t        │        2 │ CREATE TABLE t (x)                                                    │
├───────┼───────┼──────────┼──────────┼───────────────────────────────────────────────────────────────────────┤
│ index │ t_idx │ t        │        3 │ CREATE INDEX t_idx ON t USING index_method (x) WITH (a = 1, b = '2') │
└───────┴───────┴──────────┴──────────┴───────────────────────────────────────────────────────────────────────┘
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3842
2025-10-27 14:03:22 -04:00
Jussi Saurio
a0d6fcba23 Unignore those TPC-H tests that can be ignored 2025-10-27 16:23:38 +02:00
Jussi Saurio
0b08f006d3 Add subquery fuzz test 2025-10-27 16:23:38 +02:00
Jussi Saurio
82995b4264 Add subquery TCL tests 2025-10-27 16:10:49 +02:00
Jussi Saurio
f288dfd3d0 TableMask: take tables referenced in subqueries into account
This influences valid potential join orders.
2025-10-27 16:10:49 +02:00
Jussi Saurio
59363a1be3 Translate Expr::SubqueryResult into bytecode 2025-10-27 16:01:39 +02:00
Jussi Saurio
bc2a7c79f9 Add TODO comment about subquery positions we don't support yet 2025-10-27 16:01:39 +02:00