turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-28 21:44:21 +01:00

Author	SHA1	Message	Date
Nikita Sivukhin	c0d5c55d5c	fix tests and clippy	2025-08-06 01:03:49 +04:00
Nikita Sivukhin	c6a87d61c7	emit CDC entries if necessary for schema changes	2025-08-06 01:03:49 +04:00
Nikita Sivukhin	0b4c1ac802	refactor code a little bit	2025-08-06 01:03:48 +04:00
PThorpe92	914c10e095	Remove Clone impl for Buffer and PageContent	2025-08-05 14:26:53 -04:00
Pekka Enberg	9492a29d47	Merge 'Fix performance regression' from Jussi Saurio Closes #2440 ## Fix 1 Do not start a read transaction when a SELECT is not going to access the database, which means we can avoid checking whether the schema has changed. ## Fix 2 Add a field `accesses_db` to `Program` and `Statement` so we can avoid even checking for `SchemaUpdated` errors when it's not possible to get one. ## Fix 3 Avoid doing any work in `commit_txn` when not in a transaction. This optimization is only enabled when `mv_store.is_none()`, because MVCC has its own logic and this doesn't work with MVCC enabled, and honestly I'm too tired to find out why. Left an inline comment about it, though. ```sql Execute `SELECT 1`/limbo_execute_select_1 time: [21.440 ns 21.513 ns 21.586 ns] change: [-60.766% -60.616% -60.453%] (p = 0.00 < 0.05) Performance has improved. ``` Effect is even more dramatic in CI where the latency is down over 80% Closes #2441	2025-08-05 16:30:18 +03:00
Jussi Saurio	cde8567b1d	Merge 'More state machine + Return IO in places where completions are created' from Pedro Muniz In preparation for tracking IO Completions, we need to start to return IO in places where completions are created. Doing some more plumbing now to avoid bigger PRs for the future Closes #2438	2025-08-05 15:47:51 +03:00
Pekka Enberg	49123db6e8	Merge 'core/mvcc: implement exists' from Pere Diaz Bou Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2446	2025-08-05 15:34:23 +03:00
Jussi Saurio	1feb5ba2d3	perf/vdbe: avoid doing work in commit_txn if not in txn	2025-08-05 15:25:28 +03:00
Jussi Saurio	3f633247f7	perf/stmt: avoid checking for SchemaUpdated errors if it's impossible	2025-08-05 15:10:55 +03:00
Jussi Saurio	c498196c7b	fix/perf: fix regression in SELECT 1 benchmark Do not start a read transaction when a SELECT is not going to access the database, which means we can avoid checking whether the schema has changed.	2025-08-05 15:10:55 +03:00
Pere Diaz Bou	474f0d8bbc	core/mvcc: implement exists	2025-08-05 13:34:51 +02:00
Jussi Saurio	a28e64bfdd	cleanup: remove unused page uptodate flag	2025-08-05 14:25:42 +03:00
Pekka Enberg	d2fea25fef	Merge 'perf/btree: implement fast algorithm for defragment_page' from Jussi Saurio Implement SQLite's fast path defragment algorithm. This path is taken when: 1. There are 1-2 freeblocks 2. There are at most `max_frag_bytes` fragmented free bytes (-1..=4) Instead of reconstructing the entire page, it merges the two freeblocks and then moves the merged freeblock to the left, effectively turning it into free space in the unallocated region, instead of a freeblock. `max_frag_bytes` is particularly important when inserting a new cell, because if the page contains (in total) ~just enough space for the new cell, then there can be hardly any fragmented free space because otherwise, merging the 1-2 freeblocks won't produce enough contiguous free space to fit the cell. ## Benchmark ```sql Insert rows in batches/limbo_insert_1_rows time: [26.692 µs 27.153 µs 27.695 µs] change: [-9.9033% -2.9097% +1.6336%] (p = 0.55 > 0.05) No change in performance detected. Insert rows in batches/limbo_insert_10_rows time: [38.618 µs 40.022 µs 42.201 µs] change: [-8.9137% -6.6405% -4.2299%] (p = 0.00 < 0.05) Performance has improved. Insert rows in batches/limbo_insert_100_rows time: [168.94 µs 169.58 µs 170.31 µs] change: [-22.520% -17.669% -12.790%] (p = 0.00 < 0.05) Performance has improved. ``` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2411	2025-08-05 12:44:48 +03:00
Pekka Enberg	aa20c2f1ba	Merge 'Relax I/O configuration attribute to cover all Unixes' from Pedro Muniz hopefully fixes #2268. Closes #2435	2025-08-05 12:44:34 +03:00
Pekka Enberg	e355fc4c65	Merge 'core/mvcc: implement seeking operations with rowid' from Pere Diaz Bou Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2429	2025-08-05 12:40:48 +03:00
Jussi Saurio	ad35cf07eb	Add extra illustrative doodle for pere	2025-08-05 11:24:15 +03:00
Jussi Saurio	a5330aa6fb	perf/btree: implement fast algorithm for defragment_page	2025-08-05 11:24:14 +03:00
Jussi Saurio	5b84ad6b0f	Merge 'Update defragment page to defragment in-place' from João Severo Change original code from doing a full copy of the original buffer to modify the buffer in-place using a temporary vector with offsets. Closes #2258	2025-08-05 11:22:22 +03:00
Jussi Saurio	c9c5565867	Merge 'Integrate virtual tables with optimizer' from Piotr Rżysko This PR integrates virtual tables into the query optimizer. It is a follow-up to https://github.com/tursodatabase/turso/pull/1727. The most immediate improvement is better support for inner joins involving TVFs, particularly when TVF arguments are column references. ### Example The following two queries are semantically equivalent, but require different join orders to be valid: ```sql -- TVF depends on `t.id`, so `t` must be evaluated in outer loop SELECT t.id, series.value FROM target t, generate_series(t.id, 3) series; -- Equivalent query, but with reversed table order in the FROM clause SELECT t.id, series.value FROM generate_series(t.id, 3) series, target t; ``` Without optimizer integration, the second query would fail because the planner would attempt to evaluate `generate_series` before `t`. With this change, the optimizer detects column dependencies and produces the correct join order in both cases. ### TODO Support for outer joins with TVFs is still missing and will be addressed in a follow-up PR. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2439	2025-08-05 09:22:08 +03:00
pedrocarlo	aa8d17cbf1	state machine for `ptrmap_get`	2025-08-05 01:38:42 -03:00
Piotr Rzysko	59ec2d3949	Replace ConstraintInfo::plan_info with ConstraintInfo::index The side of the binary expression no longer needs to be stored in `ConstraintInfo`, since the optimizer now guarantees that it is always on the right. As a result, only the index of the corresponding constraint needs to be preserved.	2025-08-05 05:48:29 +02:00
Piotr Rzysko	8fb4fbf8af	Make WhereTerm::consumed a plain bool Now that virtual tables are integrated into the optimizer, this field no longer needs to be wrapped in Cell<bool>.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	99f87c07c1	Support column references in table-valued function arguments This change extends table-valued function support by allowing arguments to be column references, not only literals. Virtual tables can now reject a plan by returning an error from best_index (e.g., when a TVF argument references a table that appears later in the join order). The planner using this information excludes invalid plans during join order search.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	82491ceb6a	Integrate virtual tables with optimizer This change connects virtual tables with the query optimizer. The optimizer now considers virtual tables during join order search and invokes their best_index callbacks to determine feasible access paths. Currently, this is not a visible change, since none of the existing extensions return information indicating that a plan is invalid.	2025-08-05 05:48:28 +02:00
pedrocarlo	0ac040cc87	return IO in some other functions in Pager	2025-08-04 23:28:57 -03:00
pedrocarlo	a4a2425ffd	return IO in places where completions are created	2025-08-04 23:28:57 -03:00
Jussi Saurio	a66b56678d	Merge 'Reprepare Statements when Schema changes' from Pedro Muniz Closes #1967 To support this I had to change how we did `epilogue` similarly to how SQLite does it. SQLite first declares a `beginWriteOperation` when some statement is going to necessitate a Write Transaction. And as we now need to pass the current schema cookie to `epilogue` it was easier to call epilogue only in one location (like we do with prologue), and just have each statement declare their intentions separately. This allows us to not have to pass the Schema around just to do the epilogue. I believe this is something that @jussisaurio would be interested in. ~Also had to disable the MVCC test, as it was extremely buggy for me.~ Just disabled reprepare statements for MVCC Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2214	2025-08-05 00:01:14 +03:00
Jussi Saurio	1e59165ea6	Merge 'More State Machines in preparation for tracking IO Completions' from Pedro Muniz More changes. I want to avoid big PRs, so doing these changes in small increments. I think in like 2 PRs after this one, I will be able make the change effectively. Closes #2400	2025-08-05 00:00:09 +03:00
Jussi Saurio	4cf02dfd14	Merge 'coalesce any adjacent buffers from writev calls into fewer iovecs' from Preston Thorpe In `io_uring` and `unix` IO backends, we can check if our buffers are sequential in memory and reduce the number of iovecs per call. Although this is highly unlikely to actually happen at the moment due to our buffer pool implementation. Later on, when #2419 is merged, we will be able to specifically request runs of contiguous buffers, so that our `writev` calls will (in the ideal case) be coalesced into a single `pwrite` or preferably `WriteFixed` operation on the io_uring backend. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2436	2025-08-04 23:54:57 +03:00
Jussi Saurio	13219dbf87	Merge 'extend raw WAL API with few more methods' from Nikita Sivukhin This PR extends the raw WAL API with a few methods which will be helpful for offline-sync: 1. `try_wal_watermark_read_page` - try to read a page from the DB with a given WAL watermark value * Usually, WAL max_frame is set automatically to the latest value (`shared.max_frame`) when a transaction is started and then this "watermark" is preserved throughout the whole transaction * The new method allows simulating a "read from the past" by controlling the frame watermark explicitly * There is an alternative to implement some API like `start_read_session(frame_watermark: u64)` - but I decided to expose just a single method to simplify the logic and reduce the "surface" of actions which can be executed in this "controllable" manner * Also, for simplicity, `try_wal_watermark_read_page` currently always reads data from disk and bypasses any cached values (and also does not populate the cache) 2. `wal_changed_pages_after` - return the set of unique pages changed after the watermark WAL position in the current WAL session With these 2 methods we can implement `REVERT frame_watermark` logic which will just fetch all changed pages first, and then revert them to the previous value by using the `try_wal_watermark_read_page` and `wal_insert_frame` methods (see the `test_wal_api_revert_pages` test). Note that if there were schema changes, then the `REVERT` logic described above can bring the connection to an inconsistent state, as it will preserve schema information in memory and will still think that a table exists (while it may have been reverted). This should be considered by any consumer of these new methods. Closes #2433	2025-08-04 23:53:46 +03:00
PThorpe92	2a3fa0955f	Attempt to coalesce contiguous iovecs during pwritev operation for unix IO	2025-08-04 16:18:19 -04:00
PThorpe92	b76ef20f4c	Attempt to coalesce contiguous iovecs during pwritev operation for io_uring	2025-08-04 16:18:05 -04:00
PThorpe92	6cbc8ff868	Replace values with constants	2025-08-04 15:14:06 -04:00
PThorpe92	73d1fdef14	Fix and change bitmap, apply suggestions and add some optimizations	2025-08-04 14:58:58 -04:00
PThorpe92	f4197f1eb5	change debug assertions to turso asserts	2025-08-04 14:55:48 -04:00
PThorpe92	54696d2f0d	Add additional test for edge cases	2025-08-04 14:55:48 -04:00
PThorpe92	5378195ad6	Add page bitmap to storage mod.rs	2025-08-04 14:55:48 -04:00
PThorpe92	3e30335ea5	Add tests for PageBitmap	2025-08-04 14:55:48 -04:00
PThorpe92	7b1f908c00	Add PageBitmap for use with arena page allocator	2025-08-04 14:55:48 -04:00
pedrocarlo	ebe6aa0d28	adjust cfg for unix and linux IO	2025-08-04 15:49:52 -03:00
pedrocarlo	f2d84a534c	adjust `clear_overflow_pages`	2025-08-04 15:28:06 -03:00
Piotr Rzysko	521eb2368e	Return error when no valid plan exists Replace panics with proper errors when a valid plan does not exist. Currently, this never happens because a naive plan is always available. However, once virtual tables are integrated into the planner, it may occur—for example, when table-valued function arguments are column references, and the function cannot be placed in the join order so that its arguments can be evaluated. Although this change is effectively a no-op for now, it is extracted into a separate commit to avoid polluting the one that introduces virtual table integration with the planner.	2025-08-04 20:27:23 +02:00
Piotr Rzysko	c80cd370cb	Remove cost_upper_bound_ordered It was redundant, as it was always equal to cost_upper_bound.	2025-08-04 20:27:23 +02:00
Piotr Rzysko	718598eab8	Introduce scan type Different scan parameters are required for different table types. Currently, index and iteration direction are only used by B-tree tables, while the remaining table types don’t require any parameters. Planning access to virtual tables, however, will require passing additional information from the planner, such as the virtual table index (distinct from a B-tree index) and the constraints that must be forwarded to the `filter` method.	2025-08-04 20:27:22 +02:00
Piotr Rzysko	9167b30c7c	Introduce AccessMethodParams Previously, AccessMethod stored fields like `iter_dir`, `index`, and `constraint_refs` directly, but these only applied to BTree tables. Other table types (virtual tables, subqueries) either ignored these fields or required different parameters entirely. This change prepares the planner to handle virtual table access methods with their own specialized parameters.	2025-08-04 20:23:44 +02:00
Piotr Rzysko	4166735953	Return error when start argument is missing for generate_series This matches SQLite’s behavior and will help in the future to differentiate between an invalid function invocation (missing argument, not provided by the user) and an invalid combination of constraints proposed by the planner. No new integration tests are added, since this case was already covered by the `filter` method. With the ability to return result codes from `best_index`, we can now detect this error earlier.	2025-08-04 20:18:44 +02:00
Piotr Rzysko	61234eeb19	Add ResultCode to best_index result The `best_index` implementation now returns a ResultCode along with the IndexInfo. This allows it to signal specific outcomes, such as errors or constraint violations. This change aligns better with SQLite’s xBestIndex contract, where cases like missing constraints or invalid combinations of constraints must not result in a valid plan.	2025-08-04 20:18:44 +02:00
Piotr Rzysko	6a4cf02a90	Fix computation of argv_index in best_index The `filter` methods for extensions affected by this fix expect arguments to be passed in a specific order. For example, `generate_series` assumes that if the `start` argument exists, it is always passed to `filter` first. If `start` does not exist, then `stop` is passed first — but `stop` must never come before `start`. Previously, this was not guaranteed: `best_index` relied on constraints being passed in the order matching `filter`'s expectations.	2025-08-04 19:38:45 +02:00
Piotr Rzysko	c465ce6e7b	Clarify semantics of argv_index Extend the documentation of `argv_index` and add validations enforcing the requirements it must meet.	2025-08-04 19:31:18 +02:00
Piotr Rzysko	b0460a589f	Ensure argv_index is either None or >= 1 Previously, there were two ways to indicate that a constraint should not be passed to the filter function: setting `argv_index` to `None` or to a value less than 1. This was redundant, so now only `None` is used.	2025-08-04 19:27:53 +02:00
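
A minimal sketch of the idea behind the 'coalesce any adjacent buffers from writev calls into fewer iovecs' merge listed above (4cf02dfd14): if consecutive buffers happen to be sequential in memory, they can be described by a single entry. The `Region` type and the `coalesce` function are hypothetical names chosen for illustration; this is not Turso's buffer type or its actual implementation, and it works on plain addresses instead of real `iovec` structures.

```rust
/// One contiguous memory region: start address and length in bytes.
/// Hypothetical type for this sketch; not Turso's buffer type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    pub addr: usize,
    pub len: usize,
}

/// Merge regions that are adjacent in memory, preserving their order.
/// Two regions are merged only when the previous one ends exactly
/// where the next one starts; empty regions are skipped.
pub fn coalesce(regions: &[Region]) -> Vec<Region> {
    let mut out: Vec<Region> = Vec::with_capacity(regions.len());
    for &r in regions {
        if r.len == 0 {
            continue;
        }
        match out.last_mut() {
            // The previous region ends at r.addr: extend it instead of pushing.
            Some(prev) if prev.addr.checked_add(prev.len) == Some(r.addr) => {
                prev.len += r.len;
            }
            // Not adjacent (or nothing emitted yet): start a new region.
            _ => out.push(r),
        }
    }
    out
}
```

For example, three 4096-byte buffers at 0x1000, 0x2000 and 0x4000 would collapse into two regions, because only the first two are adjacent. If every buffer is contiguous, the result has a single entry, which corresponds to the ideal case the commit message describes.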

3962 Commits