Previously, while resetting accumulator registers, we would also
reset subsequent registers. This happened because the number of registers
to reset was computed as the sum of arguments rather than the number of
aggregate functions.
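A minimal sketch of the off-by-N, with illustrative names rather than the actual turso code:
```rust
// Each aggregate function owns one accumulator register, regardless of
// how many arguments it takes.
struct AggFunc {
    num_args: usize,
}

// Correct count: one register per aggregate function.
fn registers_to_reset(aggs: &[AggFunc]) -> usize {
    aggs.len()
}

fn main() {
    let aggs = vec![AggFunc { num_args: 2 }, AggFunc { num_args: 1 }];
    let buggy: usize = aggs.iter().map(|a| a.num_args).sum(); // 3: clears one register too many
    let fixed = registers_to_reset(&aggs); // 2: one per aggregate
    assert_eq!((buggy, fixed), (3, 2));
}
```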
Adds the AggValue instruction, which computes the current aggregate
result and writes it to a dedicated destination register.
Unlike AggFinal, it does not overwrite or clear the accumulator
register. This makes it possible to retrieve aggregate results multiple
times—needed when processing window functions—while preserving the
accumulator state.
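Conceptually (with hypothetical types, not the actual turso VDBE code), the difference looks like this:
```rust
// A toy SUM accumulator illustrating the AggValue/AggFinal contrast.
struct SumAcc {
    total: i64,
}

impl SumAcc {
    fn step(&mut self, v: i64) {
        self.total += v;
    }
    // AggValue-like: read the current result without touching the
    // accumulator, so it can be called repeatedly (e.g. per row for a
    // window function).
    fn value(&self) -> i64 {
        self.total
    }
    // AggFinal-like: produce the result once; the accumulator is consumed
    // and must not be read again.
    fn finalize(self) -> i64 {
        self.total
    }
}

fn main() {
    let mut acc = SumAcc { total: 0 };
    acc.step(2);
    assert_eq!(acc.value(), 2); // can read mid-stream
    acc.step(3);
    assert_eq!(acc.value(), 5); // and again
    assert_eq!(acc.finalize(), 5); // consumes the accumulator
}
```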
Previously, only the External and Avg aggregates mutated state during
AggFinal. This is unnecessary because AggFinal runs only once per group,
so caching the result provides no performance benefit.
By avoiding state mutation, we can also reuse op_agg_final for the
AggValue instruction that will be added soon.
## Problem
When a delete replaces an index interior cell, the replacement key is
less than the deleted key. Currently on the main branch, after the
deletion happens, the following call to BTreeCursor::next() stops at the
replaced interior cell.
This is incorrect - imagine the following sequence:
- We are executing a query that deletes all keys WHERE key > 5
- We delete <key=6> from an interior node, and take a replacement
<key=5> from the left subtree of that interior page
- next() is called, and we land on the interior node again, which now
has <key=5>, and we incorrectly delete it even though our WHERE
condition is key > 5.
## Solution
This PR:
- Tracks `interior_node_was_replaced` in CheckNeedsBalancing
- If no balancing is needed and a replacement occurred, advances once so
the next invocation of next() will skip the replaced cell properly
i.e., we prevent next() from landing on the replaced content and ensure
iteration continues with the next logical record.
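Roughly, the control flow after a delete looks like this (identifiers are illustrative, not the actual BTreeCursor code):
```rust
struct Cursor;

impl Cursor {
    fn balance(&mut self) { /* rebalance the tree; repositions the cursor */ }
    fn advance(&mut self) { /* move to the next cell */ }
}

fn after_delete(cursor: &mut Cursor, needs_balancing: bool, interior_node_was_replaced: bool) {
    if needs_balancing {
        cursor.balance();
    } else if interior_node_was_replaced {
        // The replacement key is less than the deleted key, so without this
        // the next call to next() would land on the replaced cell again.
        cursor.advance();
    }
}

fn main() {
    let mut cursor = Cursor;
    after_delete(&mut cursor, false, true);
}
```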
## Details
This problem only became apparent once we started using indexes as valid
iteration cursors for DELETE operations in #2981.
Closes#3045
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3049
The transaction upgrade logic in the Transaction opcode is total
nonsense for concurrent transactions, so just drop it.
Fixes#3061
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#3070
Fixes#1817, #2068, #1326, #1397.
The solution is very much not ideal, but it fixes all math-function-related
incompatibilities.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3033
After this PR:
```
turso> EXPLAIN QUERY PLAN SELECT 1;
QUERY PLAN
`--SCAN CONSTANT ROW
turso> EXPLAIN QUERY PLAN SELECT 1 UNION SELECT 1;
QUERY PLAN
`--COMPOUND QUERY
   |--LEFT-MOST SUBQUERY
   |  `--SCAN CONSTANT ROW
   `--UNION USING TEMP B-TREE
      `--SCAN CONSTANT ROW
turso> CREATE TABLE x(y);
turso> CREATE TABLE z(y);
turso> EXPLAIN QUERY PLAN SELECT * from x,z;
QUERY PLAN
|--SCAN x
`--SCAN z
turso> EXPLAIN QUERY PLAN SELECT * from x,z ON x.y = z.y;
QUERY PLAN
|--SCAN x
`--SEARCH z USING INDEX ephemeral_z_t2
turso>
```
Closes#3057
This was causing checkpoint_seq to be 0 when we had already successfully
run a passive checkpoint, causing us to use improper pages from the
cache.
Fast balancing routine for the common special case where the rightmost
leaf page of a given subtree overflows such that the overflowing cell
would be the rightmost cell on the page -- i.e. an append. In this case
we just add a new leaf page as the right sibling of that page, put the
overflow cell there, and insert a new divider cell into the parent. The
high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number
of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf
page.
4. Continue balance from the parent page (inserting the new divider cell
may have overflowed the parent).
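A rough sketch of these steps, using hypothetical page structs rather than the real pager API:
```rust
struct Page {
    id: u32,
    cells: Vec<(i64, Vec<u8>)>, // (rowid, payload), sorted by rowid
    rightmost_child: Option<u32>,
}

fn balance_quick(parent: &mut Page, old_leaf: &Page, overflow: (i64, Vec<u8>), new_page_id: u32) -> Page {
    // 1. New leaf page holding only the overflow cell.
    let new_leaf = Page { id: new_page_id, cells: vec![overflow], rightmost_child: None };
    // 2. Divider in the parent: the largest rowid on the old rightmost
    //    leaf, pointing at that old leaf's page number.
    let max_rowid = old_leaf.cells.last().expect("leaf not empty").0;
    parent.cells.push((max_rowid, old_leaf.id.to_be_bytes().to_vec()));
    // 3. Parent's rightmost pointer now targets the new leaf.
    parent.rightmost_child = Some(new_leaf.id);
    // 4. The caller continues balancing from the parent, since the new
    //    divider may have overflowed it.
    new_leaf
}

fn main() {
    let mut parent = Page { id: 1, cells: vec![], rightmost_child: Some(2) };
    let old_leaf = Page { id: 2, cells: vec![(7, vec![0u8])], rightmost_child: None };
    let new_leaf = balance_quick(&mut parent, &old_leaf, (9, vec![1u8]), 3);
    assert_eq!(parent.rightmost_child, Some(new_leaf.id));
}
```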
Closes#3041
This is considerably simpler with one thread, as we just try to yield
control when I/O happens, and we only run io.run_once when all
connections have tried to do some work. This allows connections to
make progress cooperatively.
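A minimal sketch of this scheduling loop, assuming hypothetical Connection/Io types with step() and run_once():
```rust
enum StepResult {
    Finished,
    Io, // blocked on I/O; yield to the other connections
}

struct Connection {
    remaining: u32,
}

impl Connection {
    // Pretend each unit of work hits I/O until the work runs out.
    fn step(&mut self) -> StepResult {
        if self.remaining == 0 {
            StepResult::Finished
        } else {
            self.remaining -= 1;
            StepResult::Io
        }
    }
}

struct Io {
    runs: u32,
}

impl Io {
    fn run_once(&mut self) {
        self.runs += 1; // drive pending I/O completions
    }
}

fn main() {
    let mut conns = vec![Connection { remaining: 2 }, Connection { remaining: 1 }];
    let mut io = Io { runs: 0 };
    loop {
        let mut any_waiting = false;
        for c in conns.iter_mut() {
            if let StepResult::Io = c.step() {
                any_waiting = true;
            }
        }
        if !any_waiting {
            break; // everything finished
        }
        io.run_once(); // every connection has tried to work; now drive I/O
    }
    println!("io.run_once called {} times", io.runs);
}
```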
Closes#3060
Flushing MVCC changes to disk requires serialization. To do so we simply
introduce a lock for pager.end_tx, which takes ownership of flushing to
the WAL; once flushing is finished, we release the lock.
When multiple transactions write concurrently in MVCC, the max frame
will be updated. From the point of view of another transaction, this new
max_frame causes a busy return, because that transaction's current WAL
snapshot is outdated.
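Sketch of the serialization point, assuming a hypothetical Pager type (the real locking is more involved):
```rust
use std::sync::Mutex;

struct Pager {
    // Whoever holds this lock owns flushing to the WAL.
    end_tx_lock: Mutex<()>,
}

impl Pager {
    fn end_tx(&self, frames: &[u32]) {
        let _guard = self.end_tx_lock.lock().unwrap();
        for frame in frames {
            // ... write the frame to the WAL ...
            let _ = frame;
        }
        // Lock is released when _guard drops.
    }
}

fn main() {
    let pager = Pager { end_tx_lock: Mutex::new(()) };
    pager.end_tx(&[1, 2, 3]);
}
```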
Closes#3059
This is a collection of fixes for materialized views ahead of adding
support for JOINs.
Most of the issues stem from assuming there is a single table with a
single delta, when we have to send more than one.
These are things that are just objectively wrong, so I am sending them
separately to make the JOIN PR smaller.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3009
Closes#3024
Don't use pages from the cache unless we hold an exclusive write lock,
because a page could be updated in-memory by a writer at any point
before we backfill it.
Clear the WAL tag in other areas to prevent any stale tags. Also, we now
snapshot the page once we determine that it's eligible, paying a memcpy
instead of a read from disk; this further prevents any in-memory changes
to the page (TOCTOU issues), and we also assert that it's still eligible
after we copy it to a new buffer.
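The snapshot-then-recheck pattern, sketched with hypothetical types:
```rust
struct Page {
    data: Vec<u8>,
    wal_tag: Option<u64>, // set when a newer version lives in the WAL
}

fn snapshot_if_eligible(page: &Page) -> Option<Vec<u8>> {
    if page.wal_tag.is_some() {
        return None; // not eligible: a writer has newer content
    }
    let copy = page.data.clone(); // pay a memcpy instead of a disk read
    // In the real code the page can change under a concurrent writer, so
    // eligibility is asserted again after the copy; in this
    // single-threaded sketch the assert is trivially true.
    assert!(page.wal_tag.is_none());
    Some(copy)
}

fn main() {
    let page = Page { data: vec![1, 2, 3], wal_tag: None };
    assert!(snapshot_if_eligible(&page).is_some());
}
```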
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3036
Previously, we were not updating the number of registers and cursors,
which meant that on a schema change the Program could open an additional
cursor that we would not have space for in the ProgramState, which led
to the panic.
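A sketch of the fix, with hypothetical ProgramState fields:
```rust
struct ProgramState {
    registers: Vec<i64>,
    cursors: Vec<Option<u32>>,
}

impl ProgramState {
    // Grow (never shrink) to the counts of the re-prepared Program, so
    // later opcodes cannot index past the end.
    fn ensure_capacity(&mut self, num_registers: usize, num_cursors: usize) {
        if self.registers.len() < num_registers {
            self.registers.resize(num_registers, 0);
        }
        if self.cursors.len() < num_cursors {
            self.cursors.resize(num_cursors, None);
        }
    }
}

fn main() {
    let mut state = ProgramState { registers: vec![0; 4], cursors: vec![None; 1] };
    // Schema change: the re-prepared program now opens one more cursor.
    state.ensure_capacity(4, 2);
    assert_eq!(state.cursors.len(), 2);
}
```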
Closes#3002
Closes#3034
Eliminates get_dependent_materialized_views() overhead when there are no
views. Note that we need to optimize the case when there are views as
well because this ends up being pretty hot in write-intensive workloads.
Closes#3046
Currently, when MVCC is enabled, every transaction mode supports
concurrent reads and writes, which makes it hard to adopt for existing
applications that use `BEGIN DEFERRED` or `BEGIN IMMEDIATE`.
Therefore, add support for `BEGIN CONCURRENT` transactions when MVCC is
enabled. The transaction mode allows multiple concurrent read/write
transactions that don't block each other, with conflicts resolved at
commit time. Furthermore, implement the correct semantics for `BEGIN
DEFERRED` and `BEGIN IMMEDIATE` by taking advantage of the pager-level
write lock when a transaction upgrades to write. This means that
concurrent MVCC transactions are now serialized against the legacy ones
when needed.
The implementation includes:
- Parser support for CONCURRENT keyword in BEGIN statements
- New Concurrent variant in TransactionMode to distinguish from regular
read/write transactions
- MVCC store tracking of exclusive transactions to support IMMEDIATE and
EXCLUSIVE modes alongside CONCURRENT
- Proper transaction state management for all transaction types in MVCC
This enables better concurrency for applications that can handle
optimistic concurrency control, while still supporting traditional
SQLite transaction semantics via IMMEDIATE and EXCLUSIVE modes.
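A sketch of the mode distinction (hypothetical enum shape; the parser and MVCC wiring are omitted):
```rust
enum TransactionMode {
    Deferred,   // read tx; upgrades to write via the pager-level write lock
    Immediate,  // takes the write lock at BEGIN
    Exclusive,  // like Immediate, also tracked as exclusive by the MVCC store
    Concurrent, // optimistic: many writers, conflicts resolved at commit time
}

// IMMEDIATE and EXCLUSIVE acquire the write lock up front; CONCURRENT
// proceeds optimistically and validates at commit.
fn takes_write_lock_at_begin(mode: &TransactionMode) -> bool {
    matches!(mode, TransactionMode::Immediate | TransactionMode::Exclusive)
}

fn main() {
    assert!(takes_write_lock_at_begin(&TransactionMode::Immediate));
    assert!(!takes_write_lock_at_begin(&TransactionMode::Concurrent));
    let _ = TransactionMode::Deferred;
}
```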
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#3021
A very small fix from when I was reading the codebase with rust-analyzer
while trying to find a bug in the simulator.
Original error:
`non-primitive cast: <Range<i32> as Iterator>::Item as i32 rust-analyzer
E0605`
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3043
Working on https://github.com/tursodatabase/turso/issues/2964 I came
upon `walk_expr_mut`; I don't think it existed last time I spent much
time in the translator, so I quickly went back and cleaned this up.
Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows (= an append).
In this case we just add a new leaf page as the right sibling of that page, and insert a new divider cell into the parent.
The high level steps are:
1. Allocate a new leaf page and insert the overflow cell payload in it.
2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page.
3. Update the rightmost pointer of the parent to point to the new leaf page.
4. Continue balance from the parent page (inserting the new divider cell may have overflowed the parent).
Implement the balance_quick algorithm
Closes#2993
## Background
When a `CREATE TABLE` statement specifies constraints like `col UNIQUE`,
`col PRIMARY KEY`, `UNIQUE (col1, col2)`, `PRIMARY KEY(col3, col4)`,
SQLite creates indexes for these constraints automatically with the
naming scheme `sqlite_autoindex_<table_name>_<increasing_number>`.
## Problem
SQLite expects these indexes to be created in table definition order.
For example:
```sql
CREATE TABLE t(x UNIQUE, y PRIMARY KEY, c, d, UNIQUE(c,d));
```
Should result in:
```sql
sqlite_autoindex_t_1 -- x UNIQUE
sqlite_autoindex_t_2 -- y PRIMARY KEY
sqlite_autoindex_t_3 -- UNIQUE(c,d)
```
However, `tursodb` currently doesn't uphold this invariant -- for
example, the PRIMARY KEY index is always constructed first. SQLite flags
this as a corruption error (see #2993).
## Solution
- Process "unique sets" in table definition order. "Unique sets" are
groups of 1-n columns that are part of either a UNIQUE or a PRIMARY KEY
constraint.
- Deduplicate unique sets properly: a PRIMARY KEY of a rowid alias
(INTEGER PRIMARY KEY) is not a unique set. `UNIQUE (a desc, b)` and
`PRIMARY KEY(a, b)` are a single unique set, not two.
- Unify logic for creating automatic indexes and parsing them - remove
separate logic in `check_automatic_pk_index_required()` and use the
existing `create_table()` utility in both index creation and
deserialization.
- Deserialize a single automatic index per unique set, and assert that
`unique_sets.len() == autoindexes.len()`.
- Verify consistent behavior by adding a fuzz test that creates 1000
databases with 1 table each and runs `PRAGMA integrity_check` on all of
them with SQLite.
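A sketch of the ordering/deduplication rule above, using hypothetical types (the real logic goes through the `create_table()` utility):
```rust
use std::collections::BTreeSet;

#[derive(PartialEq)]
struct UniqueSet {
    // Column names only: sort order is ignored, so UNIQUE(a DESC, b) and
    // PRIMARY KEY(a, b) are the same unique set.
    cols: BTreeSet<String>,
}

// Keep the first occurrence of each column set, preserving table-definition
// order, so sqlite_autoindex_<table>_<n> numbering matches SQLite's.
// (A rowid-alias INTEGER PRIMARY KEY is assumed to be filtered out earlier.)
fn dedup_unique_sets(declared: Vec<UniqueSet>) -> Vec<UniqueSet> {
    let mut out: Vec<UniqueSet> = Vec::new();
    for set in declared {
        if !out.contains(&set) {
            out.push(set);
        }
    }
    out
}

fn main() {
    let set = |cols: &[&str]| UniqueSet {
        cols: cols.iter().map(|c| c.to_string()).collect(),
    };
    // CREATE TABLE t(x UNIQUE, y PRIMARY KEY, c, d, UNIQUE(c,d), PRIMARY KEY(d,c))
    let sets = vec![set(&["x"]), set(&["y"]), set(&["c", "d"]), set(&["d", "c"])];
    assert_eq!(dedup_unique_sets(sets).len(), 3); // sqlite_autoindex_t_1..3
}
```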
## Trivia
Apart from fixing the exact issue #2993, this PR also fixes other bugs
related to autoindex construction - namely cases where too many indexes
were created due to improper deduplication of unique sets.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3018