turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-04 08:54:20 +01:00

Author	SHA1	Message	Date
Jussi Saurio	7fd63d8a5d	btree: cache usable_space in the btreecursor constructor	2025-08-08 10:32:18 +03:00
Jussi Saurio	15c429b673	btree: remove completely unused ParseRecordState	2025-08-08 10:08:59 +03:00
Pekka Enberg	0f9d0cf519	Merge branch 'main' into 2025-08-07-add-query-only-pragma	2025-08-08 07:41:38 +03:00
bit-aloo	c7f7ae32e3	review fixes	2025-08-08 08:43:15 +05:30
Preston Thorpe	850ee8fe62	Merge 'bench/insert: fix expected return value from pragma' from Jussi Saurio CI did not fail when this panicked :] Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2489	2025-08-07 21:34:13 -04:00
Preston Thorpe	7a793b818d	Merge 'perf: a few small insert optimizations' from Jussi Saurio 1. We spend a lot of time in `cell_get_raw_region` in the balancing routine, and especially calling `contents.page_type()` there a lot, so extract a version that can take some precomputed arguments so those don't have to be redundantly computed multiple times for successive calls where those values are going to be the same 2. Avoid calling `self.usable_space()` in a loop in `insert_into_page()`. 3. Avoid accessing `pages_in_frames` lock if we're not going to modify it main improvement is to the "insert 100 rows" bench which ends up doing balancing a lot: ``` Insert rows in batches/limbo_insert_1_rows time: [22.856 µs 24.342 µs 27.496 µs] change: [-3.3579% +15.495% +67.671%] (p = 0.62 > 0.05) No change in performance detected. Benchmarking Insert rows in batches/limbo_insert_10_rows: Collecting 100 samples in estim Insert rows in batches/limbo_insert_10_rows time: [32.196 µs 32.604 µs 32.981 µs] change: [+1.3253% +2.9177% +4.5863%] (p = 0.00 < 0.05) Performance has regressed. Insert rows in batches/limbo_insert_100_rows time: [89.425 µs 92.105 µs 96.304 µs] change: [-18.317% -13.605% -9.1022%] (p = 0.00 < 0.05) Performance has improved. ``` Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2483	2025-08-07 21:33:30 -04:00
Preston Thorpe	6b266e7e84	Merge 'Direct schema mutation – add instruction' from Levy A. <img width="960" height="205" alt="image" src="https://github.com/user-attachments/assets/f60a2133-dfe4-4411-9a7c-7283eb073875" /> <img width="944" height="504" alt="image" src="https://github.com/user-attachments/assets/9383c8e2-4d8d-40b9-8ace-825ca3cf8682" /> ``` `ALTER TABLE _ ADD COLUMN _`/limbo_add_column/ time: [2.1199 ms 2.1921 ms 2.2756 ms] change: [-85.983% -85.416% -84.716%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 6 (6.00%) high mild 7 (7.00%) high severe `ALTER TABLE _ ADD COLUMN _`/sqlite_add_column/ time: [10.358 ms 10.404 ms 10.469 ms] change: [-6.2566% -2.3515% +0.2046%] (p = 0.21 > 0.05) No change in performance detected. Found 14 outliers among 100 measurements (14.00%) 2 (2.00%) high mild 12 (12.00%) high severe ``` Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2482	2025-08-07 21:31:49 -04:00
PThorpe92	bcadcb2014	Remove RefCell from copy_to method in io trait	2025-08-07 17:07:53 -04:00
PThorpe92	98f4e5cd2d	Add comment/TODO about method we use to copy the db file	2025-08-07 16:27:08 -04:00
PThorpe92	e32d04ea97	Use ephemeral PlatformIO for clone method to support memory io	2025-08-07 16:27:08 -04:00
PThorpe92	039fe22405	Add copy_to to io::File trait to support copying DB files	2025-08-07 16:27:02 -04:00
Preston Thorpe	e23637b6ad	Merge 'only allow multiples of 64 for performance in arena bitmap' from Preston Thorpe Trying to support this is unnecessary and just adds branches and bit ops when we could just round the allocation up or down Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2497	2025-08-07 14:25:11 -04:00
bit-aloo	aaeec4d4f3	Implement PRAGMA query_only logic	2025-08-07 23:49:07 +05:30
bit-aloo	2cf7f66a02	Enforce query_only in write operations	2025-08-07 23:46:00 +05:30
bit-aloo	ef084af42f	Add getter and setter methods	2025-08-07 23:44:54 +05:30
bit-aloo	697eb35ca9	Add query_only field to Connection	2025-08-07 23:44:29 +05:30
PThorpe92	c03ca5701a	Dont accept values that are not multiples of 64 for performance in page bitmap	2025-08-07 13:45:07 -04:00
Jussi Saurio	aea6d942d6	bench/insert: fix expected return value from pragma	2025-08-07 17:44:46 +03:00
Levy A.	658405d6b3	feat: add `AddColumn` instruction	2025-08-07 11:43:16 -03:00
Jussi Saurio	1fe32dadf3	PageContent: make read_x/write_x methods private and add dedicated methods Problem: A very easy source of bugs is to mistakenly use e.g. PageContent::read_u16() instead of PageContent::read_u16_no_offset(). The difference between the two is that `read_u16()` adds 100 bytes to the requested byte offset if and only if the page in question is page 1, which contains a 100-byte database header. Case in point: see #2491. Observation: In all of the cases where we want to read from or write to a page "header-sensitively", those reads/writes are to so-called "well known offsets", e.g. specific bytes in a btree page header. In all other cases, the "no-offset" versions, i.e. the ones taking the absolute byte offset as parameter, should be used. Solution: 1. Make all the offset-sensitive versions (read_u16() and friends) private methods of `PageContent`. 2. Expose dedicated methods for things like updating rightmost pointer, updating fragmented bytes count and so on, and use them instead of the plain read/write methods universally.	2025-08-07 17:00:06 +03:00
Pekka Enberg	3f181c9145	Merge 'btree: Use correct byte offsets for page 1 in defragmentation ' from Jussi Saurio ## Beef `defragment_page_fast()` incorrectly didn't use the version of read/write methods on `PageContent` that does NOT add the 100 byte database header into the requested byte offset. this resulted in defragment of page 1 in reading 2nd/3rd freeblocks from the wrong offset and reading/writing freeblock sizes and cell offsets to the wrong location. ## Testing Adds fuzz test for CREATE TABLE / DROP TABLE / ALTER TABLE, which I was able to reproduce this with. Closes #2491	2025-08-07 16:52:11 +03:00
Jussi Saurio	6cd7334afc	btree/fix: use correct byte offsets for page1 in defragmentation `defragment_page_fast()` incorrectly didn't use the version of read/write methods on `PageContent` that does NOT add the 100 byte database header into the requested byte offset. this resulted in defragment of page 1 in reading 2nd/3rd freeblocks from the wrong offset and writing cell offsets to the wrong location.	2025-08-07 15:42:06 +03:00
Jussi Saurio	95c6c7581b	bench/insert: use PRAGMA synchronous=full synchronous=FULL means WAL is fsynced on every commit, which is what we also do. however we do not do the padding to sector alignment that sqlite does (see #2450), which makes sqlite do more work than we do.	2025-08-07 13:40:14 +03:00
Jussi Saurio	c5bdbe306d	perf/wal: avoid accessing pages_in_frames unless necessary	2025-08-07 10:27:11 +03:00
Jussi Saurio	2ed41bbb35	btree/insert: avoid calling self.usable_space() in a loop	2025-08-07 10:09:35 +03:00
Jussi Saurio	4b27cc0d46	btree: add fast path version of cell_get_raw_region	2025-08-07 09:57:56 +03:00
Jussi Saurio	c98136c8c4	btree: use new cell start helper method in cell_get_raw_region	2025-08-07 09:37:33 +03:00
Jussi Saurio	3db25cf84c	perf/btree: add method for getting raw offset of cell payload start	2025-08-07 09:34:05 +03:00
Jussi Saurio	edd7c45c22	Merge 'Fix segfault on schema update for virtual tables' from Preston Thorpe Closes #2478 ```console turso>create table t(a,b); turso>insert into t select 1,2 from generate_series(1,20); turso>create index idxa on t(a); turso>create index idxb on t(b); # segfault ``` The issue was the `turso_ext::Conn` pointer was stored on the `VirtualTable`, which lives longer than necessary and the underlying core `Connection` is not guaranteed pinned. Moving the `turso_ext::Conn` on to the cursor fixes this issue because it's independent of the Schema. This also fixes the issue that comes up when that issue is fixed, and some of the relevant code for this hadn't been updated when other surrounding changes had been made. Closes #2479	2025-08-07 09:02:11 +03:00
Jussi Saurio	eb7fa9693d	Merge 'Return error on attempting to drop index associated with PRIMARY KEY and UNIQUE constraints' from Closes issue #2455. Also includes tests. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2461	2025-08-07 09:00:02 +03:00
Pekka Enberg	be8e8ff7c0	Merge 'turso-sync: rewrite' from Nikita Sivukhin This PR rewrites the `turso-sync` package introduced in #2334 and renames it to `turso-sync-engine` (the diff will be unreadable anyway). The purpose of the rewrite is to get rid of Tokio, because it makes things harder when we want to export bindings to WASM. To achieve a "runtime"-agnostic sync core while still being able to use async/await machinery, this PR introduces the `genawaiter` crate, which allows turning async/await Rust state machines into generators. So, sync operations are just generators which can yield an `IO` command when there is a need for it. Also, this PR introduces a separate `ProtocolIo` in `turso-sync-engine` which defines extra IO methods: 1. HTTP interaction 2. Atomic read/writes to the file. This is not strictly necessary, and the `turso_core::IO` methods could be extended to support a few more things (like `delete`/`rename`), but I decided that it would be simpler to just expose 2 more methods for the sync protocol for the sake of atomic metadata updates (which are very small: dozens of bytes). * As a bonus, we can store metadata for the browser in `LocalStorage`, which may be a more natural thing to do(?) (the user can reset everything by just clearing local storage) The `ProtocolIo` works similarly to `IO` in the sense that it gives the caller a `Completion` which it can check periodically for new data. Closes #2457	2025-08-07 07:58:02 +03:00
Preston Thorpe	f55d34a3db	Merge 'Fix panic on loading extension on brand new connection' from Preston Thorpe Closes #2476 Closes #2477	2025-08-06 23:44:38 -04:00
Preston Thorpe	0777fd9082	Merge 'implement the MaxPgCount opcode' from Glauber Costa It is used by the pragma max_page_count, which is also implemented. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2472	2025-08-06 23:44:05 -04:00
PThorpe92	df2c39b98e	Use load_insn macro for op_journal_mode	2025-08-06 23:42:47 -04:00
Glauber Costa	071330a739	implement the JournalMode vdbe instruction We do this already, but not through any opcode. Move it to an opcode for compatibility reasons.	2025-08-06 19:30:19 -05:00
PThorpe92	273c12b2b3	Remove extension Conn from VirtualTable to survive schema changes	2025-08-06 16:27:26 -04:00
PThorpe92	657f3f3095	Fix panic on loading extension on brand new connection	2025-08-06 15:51:49 -04:00
Jussi Saurio	ff128e2f20	bench/insert: use locking_mode EXCLUSIVE and journal_mode=WAL for sqlite	2025-08-06 21:40:43 +03:00
Jussi Saurio	64c8587f27	Merge 'IO More State Machine' from Pedro Muniz I swear we just need one more state machine. Just more state machine until we achieve IO tracking. (PS: this is a meme) Closes #2462	2025-08-06 21:26:19 +03:00
Glauber Costa	f36974f086	implement the MaxPgCount opcode It is used by the pragma max_page_count, which is also implemented.	2025-08-06 13:20:15 -05:00
Jussi Saurio	cc98f9f88b	Merge 'Direct schema mutation – add instruction' from Levy A. 86% performance improvement. We are 25x faster than SQLite. <img width="953" height="511" alt="image" src="https://github.com/user-attachments/assets/fd717d1e-bbbe-4959-ae48-41afc73e5e9f" /> ``` ALTER TABLE _ DROP COLUMN _`/limbo_drop_column/ time: [1.8821 ms 1.8929 ms 1.9047 ms] change: [-86.850% -86.733% -86.614%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe Benchmarking `ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/: Warming up for 3.0000 s `ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/ time: [46.227 ms 46.258 ms 46.291 ms] change: [-1.3202% -1.0505% -0.8109%] (p = 0.00 < 0.05) Change within noise threshold. Found 15 outliers among 100 measurements (15.00%) 10 (10.00%) high mild 5 (5.00%) high severe ``` Closes #2452	2025-08-06 20:32:22 +03:00
Jussi Saurio	f8f2ad1e7a	Merge 'refactor/btree: cleanup write/delete/balancing states' from Jussi Saurio ## Problem: Currently `WriteState`, usually triggered by an insert operation, "owns" the balancing state machine, even though a delete operation (tracked by a separate `DeleteState`) can also trigger balancing, which results in awkward back-and-forth switching between `CursorState::Write` and `CursorState::Delete` during balancing. ## Fix: 1. Extract `balance_state` as a separate state machine, since its state transitions are exactly the same regardless of whether an insert or a delete triggered the balancing. 2. This allows to remove the different 'Balance-xxx' variants from `WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as the delete&insert states become just simple enums now. Each of them now has a substate called `Balancing` which just delegates work to the balancing state machine. 3. This further allows us to remove the awkward switching between `CursorState::Delete` and `CursorState::Write` during a balance that happens as a result of a deletion. Reviewed-by: Nikita Sivukhin (@sivukhin) Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2468	2025-08-06 20:19:17 +03:00
Levy A.	c9e1eca8dc	feat: add `DropColumn` instruction	2025-08-06 13:39:30 -03:00
Levy A.	3bc1001a93	feat(bench): complete `ALTER TABLE` benchmarks	2025-08-06 13:38:26 -03:00
Nikita Sivukhin	b612259a3a	more friendly completely runtime agnostic turso-sync-engine crate	2025-08-06 19:26:55 +04:00
pedrocarlo	7b746ccc65	adjust state machine for `ptrmap_get`	2025-08-06 11:32:21 -03:00
pedrocarlo	b529305b82	state machine for `ptrmap_put`	2025-08-06 11:32:21 -03:00
pedrocarlo	931384afb6	state machine fix for btree create for AutoVacuum::Full	2025-08-06 11:32:21 -03:00
pedrocarlo	f656d0bc20	header ref state machine	2025-08-06 11:32:21 -03:00
Pekka Enberg	0c9216d1cc	Merge 'cdc: emit entries for schema changes' from Nikita Sivukhin This PR emit CDC entries as changes in `sqlite_schema` table for DDL statements: `CREATE TABLE` / `CREATE INDEX` / etc. The logic is a bit tricky as under the hood `turso` can do some implicit DDL operations like: 1. Creating auto-indexes in case of `CREATE TABLE` 2. Deletion of all attached indices in case of `DROP TABLE` ``` turso> PRAGMA unstable_capture_data_changes_conn('full'); turso> CREATE TABLE t(x, y, z UNIQUE, q, PRIMARY KEY (x, y)); turso> CREATE INDEX t_xy ON t(x, y); turso> CREATE TABLE q(a, b, c); turso> ALTER TABLE q DROP COLUMN b; turso> SELECT change_id, id, change_type, table_name, bin_record_json_object(table_columns_json_array(table_name), before) AS before, bin_record_json_object(table_columns_json_array(table_name), after) AS after FROM turso_cdc; ┌───────────┬────┬─────────────┬───────────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐ │ change_id │ id │ change_type │ table_name │ before │ after │ ├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤ │ 1 │ 2 │ 1 │ sqlite_schema │ │ {"type":"table","name":"t","tbl_name":"t","rootpage":3,"sql":"CREA… │ ├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤ │ 2 │ 5 │ 1 │ sqlite_schema │ │ {"type":"index","name":"t_xy","tbl_name":"t","rootpage":6,"sql":"C… │ ├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤ │ 3 │ 6 │ 1 │ sqlite_schema │ │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ 
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤ │ 4 │ 6 │ 0 │ sqlite_schema │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ └───────────┴────┴─────────────┴───────────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘ ``` For now, CDC capture only all explicit operations and ignore all implicit operations. The reasoning for that is that one use case for CDC is to apply logical changes as is with simple SQL statements - but if implicit operations will be logged to the CDC table too - we can have hard times using simple SQL statement (for example, creation of `autoindices` will always work; implicit deletion of indices for `DROP TABLE` also can lead to some troubles and force us to is `DROP INDEX IF EXISTS ...` statements + we will need to filter out autoindices in this case too). Also, to simplify PR, for now `DatabaseTape` from `turso-sync` package just ignore all schema changes from CDC table. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2426	2025-08-06 14:48:27 +03:00

4033 Commits