Commit Graph

714 Commits

Author SHA1 Message Date
Jussi Saurio
d993ac8157 Merge 'index_method: implement basic trait and simple toy index' from Nikita Sivukhin
This PR adds `index_method` trait and implementation of toy sparse
vector index.
In order to make PR more lightweight - for now index methods are not
deeply integrated into the query planner and only necessary components
are added in order to make integration tests which uses `index_method`
API directly to work.
Primary changes introduced in this PR are:
1. `SymbolTable` extended with `index_methods` field and builtin
extensions populated with 2 native indices: `backing_btree` and
`toy_vector_sparse_ivf`
2. `Index` struct extended with `index_method` field which holds
`IndexMethodAttachment` constructed for the table with given parameters
from `IndexMethod` "factory" trait
The toy index implementation store inverted index pairs `(dimension,
rowid)` in the auxilary BTree index. This index uses special
`backing_btree` index_method which marked as `backing_btree: true` and
treated in a special way by the db core: this is real BTree index which
is not managed by the tursodb core and must be managed by index_method
created it (so it responsible for data population, creation, destruction
of this btree).

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3846
2025-10-28 07:01:36 +02:00
Jussi Saurio
9c87b20cb2 Merge 'Where clause subquery support' from Jussi Saurio
Closes #1282
# Support for WHERE clause subqueries
This PR implements support for subqueries that appear in the WHERE
clause of SELECT statements.
## What are those lol
1. **EXISTS subqueries**: `WHERE EXISTS (SELECT ...)`
2. **Row value subqueries**: `WHERE x = (SELECT ...)` or `WHERE (x, y) =
(SELECT ...)`. The latter are not yet supported - only the single-column
("scalar subquery") case is.
3. **IN subqueries**: `WHERE x IN (SELECT ...)` or `WHERE (x, y) IN
(SELECT ...)`
## Correlated vs Uncorrelated Subqueries
- **Uncorrelated subqueries** reference only their own tables and can be
evaluated once.
- **Correlated subqueries** reference columns from the outer query
(e.g., `WHERE EXISTS (SELECT * FROM t2 WHERE t2.id = t1.id)`) and must
be re-evaluated for each row of the outer query
## Implementation
### Planning
During query planning, the WHERE clause is walked to find subquery
expressions (`Expr::Exists`, `Expr::Subquery`, `Expr::InSelect`). Each
subquery is:
1. Assigned a unique internal ID
2. Compiled into its own `SelectPlan` with outer query tables provided
as available references
3. Replaced in the AST with an `Expr::SubqueryResult` node that
references the subquery with its internal ID
4. Stored in a `Vec<NonFromClauseSubquery>` on the `SelectPlan`
For IN subqueries, an ephemeral index is created to store the subquery
results; for other kinds, the results are stored in register(s).
### Translation
Before emitting bytecode, we need to determine when each subquery should
be evaluated:
- **Uncorrelated**: Evaluated once before opening any table cursors
- **Correlated**: Evaluated at the appropriate nested loop depth after
all referenced outer tables are in scope
This is calculated by examining which outer query tables the subquery
references and finding the right-most (innermost) loop that opens those
tables - using similar mechanisms that we use for figuring out when to
evaluate other `WhereTerm`s too.
### Code Generation
- **EXISTS**: Sets a register to 1 if any row is produced, 0 otherwise.
Has new `QueryDestination::ExistsSubqueryResult` variant.
- **IN**: Results stored in an ephemeral index and the index is probed.
- **RowValue**: Results stored in a range of registers. Has new
`QueryDestination::RowValueSubqueryResult` variant.
## Annoying details
### Which cursor to read from in a subquery?
Sometimes a query will use a covering index, i.e. skip opening the table
cursor at all if the index contains All The Needed Stuff.
Correlated subqueries reading columns from outer tables is a bit
problematic in this regard: with our current translation code, the
subquery doesn't know whether the outer query opened a table cursor,
index cursor, or both. So, for now, we try to find a table cursor first,
then fall back to finding any index cursor for that table.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3847
2025-10-28 06:36:55 +02:00
Nikita Sivukhin
bdbfac20fb resolve index method parameters 2025-10-27 16:39:22 +04:00
Nikita Sivukhin
97dcc0869e register index_methods as db builtin extensions 2025-10-27 16:31:31 +04:00
Jussi Saurio
de81af29e5 find_table_by_internal_id() returns whether table is an outer query reference
Unfortunately, our current translation machinery is unable to know for sure
whether a subquery reference to an outer table 't1' has opened a table cursor,
an index cursor, or both.

For this reason, return a flag from `TableReferences::find_table_by_internal_id()`
that tells the caller whether the table is an outer query reference, and further
commits will have some additional logic to decide which cursor a subquery will
read from when referencing a table from the outer query.
2025-10-27 13:47:49 +02:00
Nikita Sivukhin
8a80e8b743 rename custom modules to index_method like in postgresql 2025-10-27 13:18:18 +04:00
Nikita Sivukhin
299533b7b6 hide custom modules syntax behind --experimental-custom-modules flag 2025-10-27 12:29:05 +04:00
Nikita Sivukhin
f178daa373 update comment 2025-10-27 11:47:25 +04:00
Nikita Sivukhin
906bbdd1c4 support deep nestedness 2025-10-27 11:37:42 +04:00
Pekka Enberg
c3fb867173 core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager>
We don't actually need the RwLock locking capabilities, just the ability
to swap the instance.
2025-10-24 14:10:08 +03:00
PThorpe92
a8b257c664 Replace several RwLock<Enum> values with new AtomicEnums 2025-10-22 09:35:26 -04:00
Pekka Enberg
bf5de920f2 core: Unsafe Send and Sync pushdown
This patch pushes unsafe Send and Sync to individual components instead
of doing it at Database level. This makes it easier for us to
incrementally fix thread-safety, but avoid developers adding more thread
unsafe code.
2025-10-16 11:26:50 +03:00
pedrocarlo
23380a58d7 make next truly async and non blocking 2025-10-14 12:33:36 -03:00
pedrocarlo
0d95a2924a pass optional waker to step 2025-10-14 12:33:36 -03:00
Bob Peterson
cd56f52bd6 Add cfg attributes for running under Miri 2025-10-13 14:54:16 -05:00
Jussi Saurio
acb3c97fea Merge 'When pwritev fails, clear the dirty pages' from Pedro Muniz
If we don't clear the dirty pages, we will initiate a rollback. In the
rollback, we will attempt to clear the whole page cache, but it will
then panic because there will still be dirty pages from the failed
writev

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3189
2025-10-09 10:38:47 +03:00
Pekka Enberg
3c525219a2 Merge 'mvcc: Disable automatic checkpointing by default' from Pekka Enberg
MVCC checkpointing currently prevents concurrent writes so disable it by
default while we work on it.

Closes #3631
2025-10-08 17:09:37 +03:00
Pekka Enberg
94c343770d mvcc: Disable automatic checkpointing by default
MVCC checkpointing currently prevents concurrent writes so disable it by
default while we work on it.
2025-10-08 09:14:55 +03:00
PThorpe92
7e9277958b Fix deferred FK in vdbe 2025-10-07 16:45:23 -04:00
PThorpe92
a232e3cc7a Implement proper handling of deferred foreign keys 2025-10-07 16:45:23 -04:00
PThorpe92
fa23cedbbe Add helper to pragma to parse enabled opts and fix schema parsing for foreign key constraints 2025-10-07 16:45:22 -04:00
PThorpe92
346e6fedfa Create ForeignKey, ResolvedFkRef types and FK resolution 2025-10-07 16:27:49 -04:00
Levy A.
77a412f6af refactor: remove unsafe reference semantics from RefValue
also renames `RefValue` to `ValueRef`, to align with rusqlite and other
crates
2025-10-07 10:43:44 -03:00
Nikita Sivukhin
bd1013d62f emit proper column information for explain prepared statements 2025-10-07 12:28:55 +04:00
Pekka Enberg
a72b07e949 Merge 'Fix VDBE program abort' from Nikita Sivukhin
This PR add proper program abort in case of unfinished statement reset
and interruption.
Also, this PR makes rollback methods non-failing because otherwise of
their callers usually unclear (if rollback failed - what is the state of
statement/connection/transaction?)

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3591
2025-10-07 09:07:07 +03:00
Pekka Enberg
dacb8e3350 Merge 'Fix attach I/O error with in-memory databases' from Preston Thorpe
closes #3540

Closes #3602
2025-10-07 09:00:02 +03:00
bit-aloo
fb5f5d9a90 Add MVCC checkpoint threshold APIs to Connection 2025-10-07 10:17:04 +05:30
PThorpe92
20d2ca55fe fix clippy warning 2025-10-06 21:43:48 -04:00
PThorpe92
17da71ee3c Open db with proper IO when attaching database to fix #3540 2025-10-06 21:33:20 -04:00
Nikita Sivukhin
e2f7310617 add explicit tracker for Txn cleanup necessary for statement 2025-10-06 17:51:43 +04:00
Nikita Sivukhin
0ace1f9d90 fix code in order to not reset internal prepared statements created during DDL execution 2025-10-06 15:11:23 +04:00
Nikita Sivukhin
4877180784 fix clippy 2025-10-06 13:34:16 +04:00
Nikita Sivukhin
a3ca5f6bf2 implement Drop for Statement 2025-10-06 13:27:42 +04:00
Nikita Sivukhin
48ca3864b8 properly abort statement in case of reset (when statement wasn't executed till completion) and interrupt 2025-10-06 13:22:26 +04:00
Nikita Sivukhin
8dae601fac make rollback non-failing method 2025-10-06 13:21:45 +04:00
Nikita Sivukhin
38d2630969 remove unnecessary SchemaLocked error
- lock() return error in case when another thread panicked while holding the same lock
- we better to just panic too in any such case
2025-10-06 12:15:15 +04:00
Pekka Enberg
c27b167c6d core/io: Add completion group API for managing multiple I/O operations
Introduces a completion group abstraction that allows grouping multiple
I/O completions together for coordinated tracking and error handling.
This enables:

- Tracking completion status of multiple I/O operations as a group
- Detecting when all operations in a group have finished
- Aborting all operations in a group atomically
- Retrieving errors from any completion in the group

The implementation uses intrusive linked lists for efficient membership
tracking and atomic counters for outstanding operation counts. Each
completion can be linked to a group using the new .link() method.

This lays the groundwork for batch I/O operations and coordinated
transaction handling in the storage layer.
2025-10-06 07:33:31 +03:00
pedrocarlo
911b6791b9 when pwritev fails, clear the dirty pages
add flag to `clear_page_cache`
2025-10-05 20:02:21 -03:00
pedrocarlo
e93add6c80 remove dyn DatabaseStorage and replace it with DatabaseFile 2025-10-03 14:14:15 -03:00
Jussi Saurio
ec6731de0a Disallow unexpected interop between WAL mode and MVCC mode
1. DB cannot be opened with MVCC if non-zero WAL file exists
2. DB cannot be opened without MVCC if non-zero logical log file exists
2025-10-03 12:00:35 +03:00
Pekka Enberg
297aaf4887 core/mvcc: Rename "-lg" to "-log"
The "lg" name is just weird.
2025-10-03 10:08:02 +03:00
Pekka Enberg
78e3311c3b Merge 'Sync engine defered sync' from Nikita Sivukhin
This PR makes sync client completely autonomous as now it can defer
initial sync.
This can open possibility to asynchronously create DB in the Turso Cloud
while giving user ability to interact with local DB straight away.

Closes #3531
2025-10-02 17:25:11 +03:00
Nikita Sivukhin
c0b6210756 add missed method in the core 2025-10-02 16:19:52 +04:00
Pekka Enberg
7bfb4dc203 Merge 'Fix MVCC startup infinite loop when using existing DB' from Jussi Saurio
MVCC bootstrap connection got stuck into an infinite statement reparsing
loop because the bootstrap procedure happened before the on-disk schema
was deserialized.
closes #3518

Closes #3522
2025-10-02 14:20:42 +03:00
Jussi Saurio
3a1851ec06 Fix MVCC startup infinite loop when using existing DB
MVCC bootstrap connection got stuck into an infinite statement
reparsing loop because the bootstrap procedure happened before the
on-disk schema was deserialized.
2025-10-02 13:21:44 +03:00
Pekka Enberg
4c5a7cda08 core/vdbe: Avoid cloning Arc<MvStore> on every VDBE step
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.

Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This eliminates
the unnecessary atomic operations while maintaining the same semantics.
2025-10-02 12:28:11 +03:00
Jussi Saurio
7360edc169 Merge 'mvcc: dont try to end pager tx on connection close' from Jussi Saurio
closes #3487

Closes #3491
2025-10-02 10:06:23 +03:00
Jussi Saurio
c395e051cb mvcc: dont try to end pager tx on connection close 2025-10-01 10:17:41 +03:00
Jussi Saurio
28c1ebc128 Add Database::indexes_enabled() 2025-10-01 10:14:05 +03:00
Jussi Saurio
8a08f085e8 Merge 'Fix SQLite database file pending byte page' from Pedro Muniz
Sqlite has a crazy easter egg where a 1 Gib file offset, it creates a
`PENDING_BYTE_PAGE` that is used only by the VFS layer, and is never
read or written into.
To properly test this, I took inspiration from SQLITE testing framework,
and defined a helper method, that is conditionally compiled with the
`test_helper` feature enabled.
https://github.com/sqlite/sqlite/blob/7e38287da43ea3b661da3d8c1f431aa907
d648c9/src/main.c#L4327
As the `PENDING_BYTE` is normally at the 1 Gib mark, I created a
function that modifies the static `PENDING_BYTE` atomic to whatever
value we want. This means we can test this unusual behaviours at any DB
file size we want.
`fuzz_pending_byte_database` is the test that fuzzes different pending
byte offsets and does an integrity check at the end to confirm, we are
compatible with SQLITE
Closes #2749
<img width="1100" height="740" alt="image" src="https://github.com/user-
attachments/assets/06eb258f-b4b4-47bf-85f9-df1cf411e1df" />

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3431
2025-10-01 08:55:44 +03:00