turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-06 17:54:20 +01:00

Author	SHA1	Message	Date
Jussi Saurio	d993ac8157	Merge 'index_method: implement basic trait and simple toy index' from Nikita Sivukhin This PR adds the `index_method` trait and an implementation of a toy sparse vector index. To keep the PR lightweight, index methods are for now not deeply integrated into the query planner; only the components needed to make integration tests that use the `index_method` API directly work are added. Primary changes introduced in this PR are: 1. `SymbolTable` extended with an `index_methods` field, and builtin extensions populated with 2 native indices: `backing_btree` and `toy_vector_sparse_ivf` 2. `Index` struct extended with an `index_method` field which holds the `IndexMethodAttachment` constructed for the table with the given parameters from the `IndexMethod` "factory" trait The toy index implementation stores inverted index pairs `(dimension, rowid)` in an auxiliary BTree index. This index uses the special `backing_btree` index_method, which is marked as `backing_btree: true` and treated in a special way by the db core: it is a real BTree index which is not managed by the tursodb core and must be managed by the index_method that created it (so that index_method is responsible for data population, creation, and destruction of this btree). Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3846	2025-10-28 07:01:36 +02:00
Jussi Saurio	9c87b20cb2	Merge 'Where clause subquery support' from Jussi Saurio Closes #1282 # Support for WHERE clause subqueries This PR implements support for subqueries that appear in the WHERE clause of SELECT statements. ## What are those lol 1. EXISTS subqueries: `WHERE EXISTS (SELECT ...)` 2. Row value subqueries: `WHERE x = (SELECT ...)` or `WHERE (x, y) = (SELECT ...)`. The latter are not yet supported - only the single-column ("scalar subquery") case is. 3. IN subqueries: `WHERE x IN (SELECT ...)` or `WHERE (x, y) IN (SELECT ...)` ## Correlated vs Uncorrelated Subqueries - Uncorrelated subqueries reference only their own tables and can be evaluated once. - Correlated subqueries reference columns from the outer query (e.g., `WHERE EXISTS (SELECT * FROM t2 WHERE t2.id = t1.id)`) and must be re-evaluated for each row of the outer query ## Implementation ### Planning During query planning, the WHERE clause is walked to find subquery expressions (`Expr::Exists`, `Expr::Subquery`, `Expr::InSelect`). Each subquery is: 1. Assigned a unique internal ID 2. Compiled into its own `SelectPlan` with outer query tables provided as available references 3. Replaced in the AST with an `Expr::SubqueryResult` node that references the subquery with its internal ID 4. Stored in a `Vec<NonFromClauseSubquery>` on the `SelectPlan` For IN subqueries, an ephemeral index is created to store the subquery results; for other kinds, the results are stored in register(s). ### Translation Before emitting bytecode, we need to determine when each subquery should be evaluated: - Uncorrelated: Evaluated once before opening any table cursors - Correlated: Evaluated at the appropriate nested loop depth after all referenced outer tables are in scope This is calculated by examining which outer query tables the subquery references and finding the right-most (innermost) loop that opens those tables - using similar mechanisms that we use for figuring out when to evaluate other `WhereTerm`s too. 
### Code Generation - EXISTS: Sets a register to 1 if any row is produced, 0 otherwise. Has new `QueryDestination::ExistsSubqueryResult` variant. - IN: Results stored in an ephemeral index and the index is probed. - RowValue: Results stored in a range of registers. Has new `QueryDestination::RowValueSubqueryResult` variant. ## Annoying details ### Which cursor to read from in a subquery? Sometimes a query will use a covering index, i.e. skip opening the table cursor at all if the index contains All The Needed Stuff. Correlated subqueries reading columns from outer tables is a bit problematic in this regard: with our current translation code, the subquery doesn't know whether the outer query opened a table cursor, index cursor, or both. So, for now, we try to find a table cursor first, then fall back to finding any index cursor for that table. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3847	2025-10-28 06:36:55 +02:00
Nikita Sivukhin	bdbfac20fb	resolve index method parameters	2025-10-27 16:39:22 +04:00
Nikita Sivukhin	97dcc0869e	register index_methods as db builtin extensions	2025-10-27 16:31:31 +04:00
Jussi Saurio	de81af29e5	find_table_by_internal_id() returns whether table is an outer query reference Unfortunately, our current translation machinery is unable to know for sure whether a subquery reference to an outer table 't1' has opened a table cursor, an index cursor, or both. For this reason, return a flag from `TableReferences::find_table_by_internal_id()` that tells the caller whether the table is an outer query reference, and further commits will have some additional logic to decide which cursor a subquery will read from when referencing a table from the outer query.	2025-10-27 13:47:49 +02:00
Nikita Sivukhin	8a80e8b743	rename custom modules to index_method like in postgresql	2025-10-27 13:18:18 +04:00
Nikita Sivukhin	299533b7b6	hide custom modules syntax behind --experimental-custom-modules flag	2025-10-27 12:29:05 +04:00
Nikita Sivukhin	f178daa373	update comment	2025-10-27 11:47:25 +04:00
Nikita Sivukhin	906bbdd1c4	support deep nestedness	2025-10-27 11:37:42 +04:00
Pekka Enberg	c3fb867173	core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager> We don't actually need the RwLock locking capabilities, just the ability to swap the instance.	2025-10-24 14:10:08 +03:00
PThorpe92	a8b257c664	Replace several RwLock<Enum> values with new AtomicEnums	2025-10-22 09:35:26 -04:00
Pekka Enberg	bf5de920f2	core: Unsafe Send and Sync pushdown This patch pushes unsafe Send and Sync to individual components instead of doing it at Database level. This makes it easier for us to incrementally fix thread-safety while keeping developers from adding more thread-unsafe code.	2025-10-16 11:26:50 +03:00
pedrocarlo	23380a58d7	make next truly async and non blocking	2025-10-14 12:33:36 -03:00
pedrocarlo	0d95a2924a	pass optional waker to step	2025-10-14 12:33:36 -03:00
Bob Peterson	cd56f52bd6	Add cfg attributes for running under Miri	2025-10-13 14:54:16 -05:00
Jussi Saurio	acb3c97fea	Merge 'When pwritev fails, clear the dirty pages' from Pedro Muniz If we don't clear the dirty pages, we will initiate a rollback. In the rollback, we will attempt to clear the whole page cache, but it will then panic because there will still be dirty pages from the failed writev Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3189	2025-10-09 10:38:47 +03:00
Pekka Enberg	3c525219a2	Merge 'mvcc: Disable automatic checkpointing by default' from Pekka Enberg MVCC checkpointing currently prevents concurrent writes so disable it by default while we work on it. Closes #3631	2025-10-08 17:09:37 +03:00
Pekka Enberg	94c343770d	mvcc: Disable automatic checkpointing by default MVCC checkpointing currently prevents concurrent writes so disable it by default while we work on it.	2025-10-08 09:14:55 +03:00
PThorpe92	7e9277958b	Fix deferred FK in vdbe	2025-10-07 16:45:23 -04:00
PThorpe92	a232e3cc7a	Implement proper handling of deferred foreign keys	2025-10-07 16:45:23 -04:00
PThorpe92	fa23cedbbe	Add helper to pragma to parse enabled opts and fix schema parsing for foreign key constraints	2025-10-07 16:45:22 -04:00
PThorpe92	346e6fedfa	Create ForeignKey, ResolvedFkRef types and FK resolution	2025-10-07 16:27:49 -04:00
Levy A.	77a412f6af	refactor: remove unsafe reference semantics from `RefValue` also renames `RefValue` to `ValueRef`, to align with rusqlite and other crates	2025-10-07 10:43:44 -03:00
Nikita Sivukhin	bd1013d62f	emit proper column information for explain prepared statements	2025-10-07 12:28:55 +04:00
Pekka Enberg	a72b07e949	Merge 'Fix VDBE program abort' from Nikita Sivukhin This PR adds proper program abort in case of unfinished statement reset and interruption. Also, this PR makes rollback methods non-failing because otherwise the behavior of their callers is usually unclear (if rollback failed - what is the state of statement/connection/transaction?) Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3591	2025-10-07 09:07:07 +03:00
Pekka Enberg	dacb8e3350	Merge 'Fix attach I/O error with in-memory databases' from Preston Thorpe closes #3540 Closes #3602	2025-10-07 09:00:02 +03:00
bit-aloo	fb5f5d9a90	Add MVCC checkpoint threshold APIs to Connection	2025-10-07 10:17:04 +05:30
PThorpe92	20d2ca55fe	fix clippy warning	2025-10-06 21:43:48 -04:00
PThorpe92	17da71ee3c	Open db with proper IO when attaching database to fix #3540	2025-10-06 21:33:20 -04:00
Nikita Sivukhin	e2f7310617	add explicit tracker for Txn cleanup necessary for statement	2025-10-06 17:51:43 +04:00
Nikita Sivukhin	0ace1f9d90	fix code in order to not reset internal prepared statements created during DDL execution	2025-10-06 15:11:23 +04:00
Nikita Sivukhin	4877180784	fix clippy	2025-10-06 13:34:16 +04:00
Nikita Sivukhin	a3ca5f6bf2	implement Drop for Statement	2025-10-06 13:27:42 +04:00
Nikita Sivukhin	48ca3864b8	properly abort statement in case of reset (when statement wasn't executed till completion) and interrupt	2025-10-06 13:22:26 +04:00
Nikita Sivukhin	8dae601fac	make rollback non-failing method	2025-10-06 13:21:45 +04:00
Nikita Sivukhin	38d2630969	remove unnecessary SchemaLocked error - lock() returns an error when another thread panicked while holding the same lock - we had better just panic too in any such case	2025-10-06 12:15:15 +04:00
Pekka Enberg	c27b167c6d	core/io: Add completion group API for managing multiple I/O operations Introduces a completion group abstraction that allows grouping multiple I/O completions together for coordinated tracking and error handling. This enables: - Tracking completion status of multiple I/O operations as a group - Detecting when all operations in a group have finished - Aborting all operations in a group atomically - Retrieving errors from any completion in the group The implementation uses intrusive linked lists for efficient membership tracking and atomic counters for outstanding operation counts. Each completion can be linked to a group using the new .link() method. This lays the groundwork for batch I/O operations and coordinated transaction handling in the storage layer.	2025-10-06 07:33:31 +03:00
pedrocarlo	911b6791b9	when pwritev fails, clear the dirty pages add flag to `clear_page_cache`	2025-10-05 20:02:21 -03:00
pedrocarlo	e93add6c80	remove `dyn DatabaseStorage` and replace it with `DatabaseFile`	2025-10-03 14:14:15 -03:00
Jussi Saurio	ec6731de0a	Disallow unexpected interop between WAL mode and MVCC mode 1. DB cannot be opened with MVCC if non-zero WAL file exists 2. DB cannot be opened without MVCC if non-zero logical log file exists	2025-10-03 12:00:35 +03:00
Pekka Enberg	297aaf4887	core/mvcc: Rename "-lg" to "-log" The "lg" name is just weird.	2025-10-03 10:08:02 +03:00
Pekka Enberg	78e3311c3b	Merge 'Sync engine defered sync' from Nikita Sivukhin This PR makes the sync client completely autonomous, as it can now defer the initial sync. This opens the possibility of asynchronously creating the DB in the Turso Cloud while letting the user interact with the local DB straight away. Closes #3531	2025-10-02 17:25:11 +03:00
Nikita Sivukhin	c0b6210756	add missed method in the core	2025-10-02 16:19:52 +04:00
Pekka Enberg	7bfb4dc203	Merge 'Fix MVCC startup infinite loop when using existing DB' from Jussi Saurio MVCC bootstrap connection got stuck in an infinite statement reparsing loop because the bootstrap procedure happened before the on-disk schema was deserialized. closes #3518 Closes #3522	2025-10-02 14:20:42 +03:00
Jussi Saurio	3a1851ec06	Fix MVCC startup infinite loop when using existing DB MVCC bootstrap connection got stuck in an infinite statement reparsing loop because the bootstrap procedure happened before the on-disk schema was deserialized.	2025-10-02 13:21:44 +03:00
Pekka Enberg	4c5a7cda08	core/vdbe: Avoid cloning Arc<MvStore> on every VDBE step The VDBE step() function was taking Arc<MvStore> by value, causing it to be cloned on every single step of query execution. This resulted in thousands of atomic reference count increments/decrements per query, showing up as a major hotspot in profiling. Changed step() and related functions to take Option<&Arc<MvStore>> instead, passing a reference rather than cloning the Arc. This eliminates the unnecessary atomic operations while maintaining the same semantics.	2025-10-02 12:28:11 +03:00
Jussi Saurio	7360edc169	Merge 'mvcc: dont try to end pager tx on connection close' from Jussi Saurio closes #3487 Closes #3491	2025-10-02 10:06:23 +03:00
Jussi Saurio	c395e051cb	mvcc: dont try to end pager tx on connection close	2025-10-01 10:17:41 +03:00
Jussi Saurio	28c1ebc128	Add Database::indexes_enabled()	2025-10-01 10:14:05 +03:00
Jussi Saurio	8a08f085e8	Merge 'Fix SQLite database file pending byte page' from Pedro Muniz SQLite has a crazy easter egg where, at the 1 GiB file offset, it creates a `PENDING_BYTE_PAGE` that is used only by the VFS layer and is never read or written into. To properly test this, I took inspiration from SQLite's testing framework and defined a helper method that is conditionally compiled with the `test_helper` feature enabled. https://github.com/sqlite/sqlite/blob/7e38287da43ea3b661da3d8c1f431aa907d648c9/src/main.c#L4327 As the `PENDING_BYTE` is normally at the 1 GiB mark, I created a function that modifies the static `PENDING_BYTE` atomic to whatever value we want. This means we can test this unusual behaviour at any DB file size we want. `fuzz_pending_byte_database` is the test that fuzzes different pending byte offsets and does an integrity check at the end to confirm we are compatible with SQLite Closes #2749 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3431	2025-10-01 08:55:44 +03:00
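
The "Translation" part of commit 9c87b20cb2 describes choosing an evaluation point for each WHERE-clause subquery: once before any cursor is opened if it is uncorrelated, otherwise at the innermost loop that opens an outer table it references. A minimal sketch of that rule follows. The names `EvalPoint` and `eval_point`, and the use of plain `u32` table IDs, are assumptions made for illustration and are not taken from the turso code base.

```rust
/// Hypothetical result type: where a WHERE-clause subquery gets evaluated.
#[derive(Debug, PartialEq)]
enum EvalPoint {
    /// Uncorrelated: evaluate once, before any outer table cursor is opened.
    BeforeLoops,
    /// Correlated: evaluate inside the outer loop at this depth (0 = outermost).
    AtLoopDepth(usize),
}

/// `join_order` lists outer table IDs from outermost to innermost loop.
/// `referenced_outer` lists the outer table IDs the subquery refers to.
/// IDs that do not appear in `join_order` are ignored in this sketch.
fn eval_point(join_order: &[u32], referenced_outer: &[u32]) -> EvalPoint {
    referenced_outer
        .iter()
        // Map each referenced table to the depth of the loop that opens it.
        .filter_map(|id| join_order.iter().position(|t| t == id))
        // The right-most (innermost) such loop is where all references are in scope.
        .max()
        .map_or(EvalPoint::BeforeLoops, EvalPoint::AtLoopDepth)
}

fn main() {
    let join_order = [10, 20, 30];
    println!("{:?}", eval_point(&join_order, &[]));
    println!("{:?}", eval_point(&join_order, &[20]));
    println!("{:?}", eval_point(&join_order, &[10, 30]));
}
```

Taking the maximum depth means the subquery runs only once every outer table it reads is in scope, which matches the rule stated in the commit message.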


714 Commits
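
Commit 4c5a7cda08 above changes `step()` to take `Option<&Arc<MvStore>>` instead of an `Arc<MvStore>` by value, so that the reference count is no longer touched on every step. The sketch below illustrates the difference using `Arc::strong_count`. The `MvStore` struct here is a hypothetical stand-in, and `step_by_value` / `step_by_ref` are illustrative names rather than the real function signatures.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real `MvStore`; it only gives the Arc something to hold.
struct MvStore {
    name: &'static str,
}

/// Old shape: taking the Arc by value makes the caller clone it for each call,
/// which costs an atomic increment on entry and an atomic decrement on drop.
fn step_by_value(store: Option<Arc<MvStore>>) -> usize {
    store.map(|s| Arc::strong_count(&s)).unwrap_or(0)
}

/// New shape: borrowing the Arc leaves the reference count untouched.
fn step_by_ref(store: Option<&Arc<MvStore>>) -> usize {
    store.map(|s| Arc::strong_count(s)).unwrap_or(0)
}

fn main() {
    let store = Arc::new(MvStore { name: "demo" });
    // The count observed inside each call shows whether a clone was needed.
    let by_value = step_by_value(Some(Arc::clone(&store)));
    let by_ref = step_by_ref(Some(&store));
    println!("{}: by value saw {}, by ref saw {}", store.name, by_value, by_ref);
}
```

Both functions can still read the store; only the by-value variant pays for the atomic reference count update on each call.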