This PR extends the existing encryption support to include the database
header page (page 1).
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes #3040
This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented (see the example below):
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.
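For illustration (not code from this repo; the table and column names are made up), the kind of query that is now supported is an existing aggregate used as a window function with the default frame:
```rust
// Illustrative sketch only. SUM is an existing aggregate used as a window
// function; with ORDER BY and no explicit frame, the default frame applies:
//   RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS
fn main() {
    let running_total = "
        SELECT id,
               SUM(amount) OVER (ORDER BY id) AS running_total
        FROM payments;
    ";
    println!("{running_total}");
}
```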
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3079
We had not implemented min/max before because they require the raw
elements to be kept. It is easy to see why in the following example:
```
current_min = 3;
insert(2) => current_min = 2 // can be done without state
delete(2) => needs to look at the state to determine new min!
```
The aggregator state was a very simple key-value structure. To
accommodate min/max, we will turn it into a richer table where we can
encode more structure.
The key insight is that we can use a primary key composed of:
```
1) storage_id
2) zset_id
3) element
```
The storage_id and zset_id are our previous key, except they are now
exploded to support a larger range of storage_id. With more bits
available in the storage_id, we can encode information about which
column we are storing. For aggregations over multiple columns, we need
to keep a separate list of values for each column's min/max.
The element is just the values of the columns.
Because this is a primary key, the data will be sorted in the btree. We
can then just do a prefix search in the first two components of the key
and easily find the min/max when needed.
This new format is also adequate for joins. Joins will just have a new
storage_id which encodes two "columns" (left side, right side).
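To make the idea concrete, here is a standalone sketch (not the actual storage code) that uses a `BTreeMap` with a composite key as a stand-in for the btree; the integer types and the z-set weight handling are simplified:
```rust
use std::collections::BTreeMap;
use std::ops::Bound::Included;

// Stand-in for the btree: primary key = (storage_id, zset_id, element).
// Because tuples order lexicographically, every (storage_id, zset_id) group is
// stored contiguously and already sorted by element.
type Key = (u64 /* storage_id */, u64 /* zset_id */, i64 /* element */);
type Weight = i64;

fn min_for_group(state: &BTreeMap<Key, Weight>, storage_id: u64, zset_id: u64) -> Option<i64> {
    // Prefix search on the first two key components: the first entry with a
    // positive weight is the min (the last such entry would be the max).
    state
        .range((
            Included((storage_id, zset_id, i64::MIN)),
            Included((storage_id, zset_id, i64::MAX)),
        ))
        .find(|(_, weight)| **weight > 0)
        .map(|((_, _, element), _)| *element)
}

fn main() {
    let mut state = BTreeMap::new();
    state.insert((1, 7, 3), 1);
    state.insert((1, 7, 2), 1);
    assert_eq!(min_for_group(&state, 1, 7), Some(2));

    // delete(2): because the raw elements are kept, the new min is found with
    // the same prefix search instead of rescanning the input.
    state.insert((1, 7, 2), 0);
    assert_eq!(min_for_group(&state, 1, 7), Some(3));
    println!("ok");
}
```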
Closes #3143
Fixes panics with `must have a read transaction to start a write
transaction`. Previously we were simply ignoring these Busy errors and
assuming we had a read tx when we actually didn't.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3148
We start a pager read transaction at the beginning of the MV transaction, because
any reads we do from the database file and WAL must uphold snapshot isolation.
However, we must end and immediately restart the read transaction before committing.
This is because other transactions may have committed writes to the DB file or WAL,
and our pager must read in those changes when applying our writes; otherwise we would overwrite
the changes from previously committed transactions.
Note that this would be incredibly unsafe in the regular transaction model, but in MVCC we trust
the MV-store to uphold the guarantee that no write-write conflicts happened.
We must iterate the row versions in reverse order because the versions
are ordered from oldest to newest, and we must commit the newest version
applied by the active transaction.
In insert_version_raw(), we correctly iterate the versions backwards
because we want to find the newest version that is still older than
the one we are inserting.
However, the order of `.enumerate()` and `.rev()` was wrong, so the
insertion position was calculated based on the position in the
_reversed_ iterator, not the original iterator.
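A standalone illustration of the difference (not the actual `insert_version_raw` code):
```rust
fn main() {
    let versions = ["v0", "v1", "v2", "v3"]; // ordered oldest to newest

    // Buggy order: the index is the position in the *reversed* iterator.
    let buggy: Vec<(usize, &&str)> = versions.iter().rev().enumerate().collect();
    assert_eq!(buggy[0], (0, &"v3")); // newest version reported at index 0

    // Fixed order: enumerate first, then reverse, so the index still refers to
    // the position in the original (oldest-to-newest) list.
    let fixed: Vec<(usize, &&str)> = versions.iter().enumerate().rev().collect();
    assert_eq!(fixed[0], (3, &"v3")); // newest version keeps its original index 3
    println!("ok");
}
```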
Blacksmith runners have a lot of variance in performance, making it hard
for Nyrkiö to do its job. Discussed on [Discord](https://discord.com/channels/1258658826257961020/1402269486752469085).
Reviewed-by: Henrik Ingo <henrik@nyrk.io>
Closes #2448
This removes 4 crates from the `cargo build` and tries to ensure that
in the future we avoid pulling in the same crates at different versions.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3141
This also changes the schema of the main table. I have come to see the
current key-value schema as inadequate for non-aggregate operators.
Calculating Min/Max, for example, doesn't fit this schema because we
have to be able to track existing values and index them.
Another alternative is to keep one table per operator type, but this
quickly leads to an explosion of tables.
The current logic can lead to a situation where:
- we call read_page(trunk_page_id)
- we assign trunk_page in the FreePageState state machine
- the page read fails and the cache marks it as !locked && !loaded
- the next call to Pager::free_page() asserts that the page is loaded
and panics
Whopper takes so long to run that I wasn't patient enough, but I'm
pretty sure this closes #3101.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3139
We now panic on fsync errors by default to be safe against fsyncgate.
However, there is no reason to do that in the stress tester, especially
since we test out-of-disk-space errors under Antithesis.
Closes #3131
Fixes `write-throughput` benchmark deadlocking on 2 threads or more. The
gist of the PR is in the big code comment:
```rust
// important not to hold shared lock beyond this point to avoid deadlock scenario where:
// thread 1: takes readlock here, passes reference to shared.file to begin_read_wal_frame
// thread 2: tries to acquire write lock elsewhere
// thread 1: tries to re-acquire read lock in the completion (see 'complete' above)
//
// this causes a deadlock due to the locking policy in parking_lot:
// from https://docs.rs/parking_lot/latest/parking_lot/type.RwLock.html:
// "This lock uses a task-fair locking policy which avoids both reader and writer starvation.
// This means that readers trying to acquire the lock will block even if the lock is unlocked
// when there are writers waiting to acquire the lock.
// Because of this, attempts to recursively acquire a read lock within a single thread may result in a deadlock."
```
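For illustration, here is a standalone repro of that parking_lot behavior (not code from this PR; the sleep is only there to make sure the writer is already queued):
```rust
use parking_lot::RwLock;
use std::{sync::Arc, thread, time::Duration};

fn main() {
    let lock = Arc::new(RwLock::new(()));

    let outer = lock.read(); // thread 1: holds the shared (read) lock

    let writer = {
        let lock = Arc::clone(&lock);
        thread::spawn(move || {
            let _w = lock.write(); // thread 2: queues up for the write lock
        })
    };
    thread::sleep(Duration::from_millis(100)); // let the writer start waiting

    // A recursive lock.read() here would block behind the queued writer and
    // deadlock; try_read() shows the lock is not granted even though it is
    // currently only held by a reader.
    assert!(lock.try_read().is_none());

    drop(outer); // release the shared lock so the writer can finish
    writer.join().unwrap();
    println!("ok");
}
```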
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #3132
Based on #3126
Closes #3029
Closes #3030
Closes #3065
Closes #3083
Closes #3084
Closes #3085
The simple reason why MVCC update didn't work: it didn't try to update.
Closes #3127
This PR fixes incorrect path registration for sync in the browser, adds
tests, and also exposes the revision string in the `stats()` method of
the synced database.
Closes #3124
I searched with DeepWiki for how SQLite implements its busy handler. It
uses a callback system with exponential backoff, storing the callback in
the pager and in the database. I confess I found this slightly
confusing, so I just implemented a simple exponential backoff directly
in the `Statement` struct. I imagine SQLite does this in a more
convoluted manner, as it does not have a concept of yielding as we do.
https://deepwiki.com/search/where-is-the-code-for-the-busy_4a5ed006-4eed-479f-80c3-dd038832831b
I also fixed the Rust bindings so that they yield when we return
`StepResult::IO`, instead of just blocking the async function. To
achieve this I implemented the `Stream` trait for the `Rows` struct,
which unfortunately came with a slight change to the function signature:
`rows.next()` becomes `rows.try_next()`.
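For illustration, a sketch of what that looks like for callers, assuming the fallible stream is consumed via `futures::TryStreamExt` (the actual `Rows` type and error type in the bindings may differ):
```rust
use futures::{executor::block_on, stream, TryStreamExt};

fn main() -> Result<(), String> {
    block_on(async {
        // Stand-in for `Rows`: a Stream whose items are Result<Row, Error>.
        let mut rows = stream::iter(vec![Ok::<u32, String>(1), Ok(2), Ok(3)]);
        // try_next() yields Result<Option<Row>, Error>, so the caller awaits
        // instead of blocking the async function, and errors still propagate
        // with `?`.
        while let Some(row) = rows.try_next().await? {
            println!("row: {row}");
        }
        Ok(())
    })
}
```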
EDIT:
~The `test_multiple_connections_fuzz` test times out because the busy
handler now "slows" things down (this test generates a lot of busy
transactions), so the test takes much longer to run. Not sure if it is
acceptable for us to reduce the number of operations so the test is
shorter.~
EDIT:
Adjusted the API to be more in line with
https://www.sqlite.org/c3ref/busy_timeout.html.
It sets the maximum total accumulated timeout. If the duration is None
or zero, we unset the busy handler for this Connection.
This API differs slightly from SQLite's: instead of sleeping for a
linear amount of time specified by the user, we sleep in phases until
the total amount of time requested is reached. We first sleep for 1 ms;
then, if we still get Busy, we sleep for 2 ms, and so on, up to a
maximum of 100 ms per phase, until we reach the total timeout.
Example:
1. Set duration to 5ms
2. Step through query -> returns Busy -> sleep/yield for 1 ms
3. Step through query -> returns Busy -> sleep/yield for 2 ms
4. Step through query -> returns Busy -> sleep/yield for 2 ms (totaling
5 ms of sleep)
5. Step through query -> returns Busy -> return Busy to user
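For illustration, a standalone sketch of this phased backoff (not the actual `Statement` implementation; it assumes the phases double, as in classic exponential backoff, capped at 100 ms per phase and at the total budget):
```rust
use std::time::Duration;

// Compute the next sleep phase, or None once the total budget is used up.
fn next_sleep(total: Duration, slept: Duration, last: Duration) -> Option<Duration> {
    let remaining = total.checked_sub(slept)?;
    if remaining.is_zero() {
        return None; // budget exhausted: surface Busy to the caller
    }
    let doubled = last.saturating_mul(2).max(Duration::from_millis(1));
    Some(doubled.min(Duration::from_millis(100)).min(remaining))
}

fn main() {
    // Replay the 5 ms example above: phases come out as 1 ms, 2 ms, 2 ms, Busy.
    let total = Duration::from_millis(5);
    let (mut slept, mut last) = (Duration::ZERO, Duration::ZERO);
    while let Some(phase) = next_sleep(total, slept, last) {
        println!("sleep/yield for {phase:?}");
        slept += phase;
        last = phase;
    }
    println!("still Busy after {slept:?}: return Busy to the user");
}
```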
This slight API change demonstrated better throughput in the
`perf/throughput/turso` benchmark:
```sh
cargo run -p write-throughput --release -- -t 2
Running write throughput benchmark with 2 threads, 100 batch size, 10 iterations, mode: Legacy
Database created at: write_throughput_test.db
Thread 1: 1000 inserts in 0.04s (23438.42 inserts/sec)
Thread 0: 1000 inserts in 0.08s (12385.64 inserts/sec)
=== BENCHMARK RESULTS ===
Total inserts: 2000
Total time: 0.08s
Overall throughput: 24762.60 inserts/sec
Threads: 2
Batch size: 100
Iterations per thread: 10
Database file exists: true
Database file size: 4096 bytes
```
Depends on #3102
Closes #3067
Closes #3074