turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-17 22:14:37 +01:00

Author	SHA1	Message	Date
Pekka Enberg	95660535da	core/storage: Demote info logging to debug	2025-09-14 13:10:46 +03:00
PThorpe92	f6dd0bc4d6	Don't grab page cache write lock in a loop	2025-09-13 12:21:13 -04:00
Pekka Enberg	6a2f0d6061	Merge 'Add per page checksums' from Avinash Sajjanshetty This patch adds checksums to Turso DB. You may check the design here in the [RFC](https://github.com/tursodatabase/turso/issues/2178). 1. We use reserved bytes (8 bytes) to store the checksums. On every IO read, we verify that the checksum matches. 2. We use twox hash for checksums. 3. Checksums work only on 4K pages for now. It's a small change to enable them for all other sizes; I will send another PR. 4. Right now, it's not possible to switch to a different algorithm or turn checksums off altogether. That will be added in future PRs. 5. Checksums can be enabled only for new DBs. For existing DBs, we will disable them. 6. To add checksums to existing DBs, we need vacuum, since it would require a rewrite of the whole DB. Closes #2840	2025-09-13 18:46:53 +03:00
Pekka Enberg	d8f07fe3da	core: Panic on fsync() error by default Retrying fsync() on error was historically not safe ("fsyncgate") and Postgres still defaults to panicking on fsync(). Therefore, add a "data_sync_retry" pragma (disabled by default) and use it to determine whether to panic on fsync() error or not.	2025-09-13 10:21:12 +03:00
Pekka Enberg	a7e34f1551	Merge 'Handle partial writes in unix IO for pwrite and pwritev' from Preston Thorpe Currently, `io_uring` is set up to handle partial writes for `pwritev` (will add `pwrite` in a subsequent PR), but unix and other IO back-ends were not correctly set up for this. Closes #3073	2025-09-13 09:08:43 +03:00
Avinash Sajjanshetty	5256f29a9c	Add checksums behind a feature flag	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	11030056c7	rename method to `verify_checksum`	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	e010c46552	use checksums when reading/writing from db file	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	4b59cf19e5	use checksums when reading/writing from wal	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	14a1307720	Set reserved space as required when allocating page1	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	3b410e4f79	set required reserved bytes while initialising the pager	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	2e6943bfdf	Add helper to read reserved bytes value from disk	2025-09-13 11:00:39 +05:30
Avinash Sajjanshetty	c2c1ec2dba	Use `usable_space()` instead of hardcoding the value	2025-09-13 11:00:38 +05:30
Avinash Sajjanshetty	15266105f7	Update IOContext to carry checksum ctx	2025-09-13 11:00:38 +05:30
Avinash Sajjanshetty	3f72de3623	Add checksum module	2025-09-13 11:00:37 +05:30
TcMits	48522c1cc0	remove Stmt clone	2025-09-13 12:08:29 +07:00
PThorpe92	6098bca211	Handle partial writes in unix IO for pwrite and pwritev	2025-09-12 18:13:02 -04:00
Preston Thorpe	b1420904bb	Merge 'fix(btree): advance cursor after interior node replacement in delete' from Jussi Saurio ## Problem When a delete replaces an index interior cell, the replacement key is LT the deleted key. Currently on the main branch, after the deletion happens, the following call to BTreeCursor::next() stops at the replaced interior cell. This is incorrect - imagine the following sequence: - We are executing a query that deletes all keys WHERE key > 5 - We delete <key=6> from an interior node, and take a replacement <key=5> from the left subtree of that interior page - next() is called, and we land on the interior node again, which now has <key=5>, and we incorrectly delete it even though our WHERE condition is key > 5. ## Solution This PR: - Tracks `interior_node_was_replaced` in CheckNeedsBalancing - If no balancing is needed and a replacement occurred, advances once so the next invocation of next() will skip the replaced cell properly i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record. ## Details This problem only became apparent once we started using indexes as valid iteration cursors for DELETE operations in #2981 Closes #3045 Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3049	2025-09-12 17:37:01 -04:00
Pekka Enberg	ad6157028e	Merge 'core/vdbe: Fix BEGIN CONCURRENT transactions' from Pekka Enberg The transaction upgrade logic in Transaction opcode is total nonsense for concurrent transactions so just drop it. Fixes #3061 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #3070	2025-09-12 23:11:12 +03:00
Pekka Enberg	a0921c4221	Merge 'core/storage: Remove unused import warning' from Pekka Enberg Closes #3069	2025-09-12 23:11:05 +03:00
Pekka Enberg	5e2b1bc0d3	Merge 'Fix incompatible math functions' from Levy A. Fixes #1817, #2068, #1326, #1397. The solution is very much not ideal, but fixes all math function related incompatibilities. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3033	2025-09-12 21:28:08 +03:00
Pekka Enberg	86dcdad3d0	core/vdbe: Fix BEGIN CONCURRENT transactions The transaction upgrade logic in Transaction opcode is total nonsense for concurrent transactions so just drop it. Fixes #3061	2025-09-12 21:19:34 +03:00
Pekka Enberg	2bc8c0c850	core/storage: Remove unused import warning	2025-09-12 21:09:38 +03:00
Pekka Enberg	dcd43ab8fc	Merge 'Handle `EXPLAIN QUERY PLAN` like SQLite' from Lâm Hoàng Phúc After this PR: ``` turso> EXPLAIN QUERY PLAN SELECT 1; QUERY PLAN `--SCAN CONSTANT ROW turso> EXPLAIN QUERY PLAN SELECT 1 UNION SELECT 1; QUERY PLAN `--COMPOUND QUERY \|--LEFT-MOST SUBQUERY \| `--SCAN CONSTANT ROW `--UNION USING TEMP B-TREE `--SCAN CONSTANT ROW turso> CREATE TABLE x(y); turso> CREATE TABLE z(y); turso> EXPLAIN QUERY PLAN SELECT * from x,z; QUERY PLAN \|--SCAN x `--SCAN z turso> EXPLAIN QUERY PLAN SELECT * from x,z ON x.y = z.y; QUERY PLAN \|--SCAN x `--SEARCH z USING INDEX ephemeral_z_t2 turso> ``` Closes #3057	2025-09-12 20:41:23 +03:00
PThorpe92	b04c364981	Fix clippy error	2025-09-12 11:43:38 -04:00
PThorpe92	7a14c7394f	Remove the header copy stored on the WalFile, fix fast_path	2025-09-12 11:29:43 -04:00
PThorpe92	25e7c719f1	Update checkpoint_seq on each checkpoint, not just when log restarts This was causing checkpoint_seq to be 0 when we had already successfully run a passive checkpoint, and causing us to use improper pages from the cache.	2025-09-12 11:29:42 -04:00
Pekka Enberg	14da283e36	Merge 'MVCC: remove reliance on BTreeCursor::has_record()' from Jussi Saurio Closes #3051 Closes #3032 Closes #3056	2025-09-12 17:31:15 +03:00
Pekka Enberg	54b4c9f30b	Merge 'Implement the balance_quick algorithm' from Jussi Saurio Fast balancing routine for the common special case where the rightmost leaf page of a given subtree overflows such that the overflowing cell would be the rightmost cell on the page -- i.e. an append. In this case we just add a new leaf page as the right sibling of that page, put the overflow cell there, and insert a new divider cell into the parent. The high level steps are: 1. Allocate a new leaf page and insert the overflow cell payload in it. 2. Create a new divider cell in the parent - it contains the page number of the old rightmost leaf, plus the largest rowid on that page. 3. Update the rightmost pointer of the parent to point to the new leaf page. 4. Continue balance from the parent page (inserting the new divider cell may have overflowed the parent). Closes #3041	2025-09-12 17:30:52 +03:00
Pekka Enberg	443720c74a	Merge 'benchmark: introduce simple 1 thread concurrent benchmark for mvcc/sq…' from Pere Diaz Bou …lite/wal This is considerably simpler with 1 thread as we just try to yield control when I/O happens and we only run io.run_once when all connections tried to do some work. This allows connections to cooperatively progress. Closes #3060	2025-09-12 17:27:41 +03:00
Pekka Enberg	7fdb116d41	Merge 'core/mvcc: queue mvcc txns on pager's end_tx' from Pere Diaz Bou Flushing mvcc changes to disk requires serialization. To do so we simply introduce a lock for pager.end_tx, which will take ownership of flushing to WAL. Once this is finished we can simply release the lock. When multiple tx writes happen concurrently in mvcc, max frame will be updated. This new max_frame makes the other transaction return busy, because from its point of view its current WAL snapshot is outdated. Closes #3059	2025-09-12 17:27:17 +03:00
Pere Diaz Bou	ec2cff2026	benchmark: introduce simple 1 thread concurrent benchmark for mvcc/sqlite/wal This is considerably simpler with 1 thread as we just try to yield control when I/O happens and we only run io.run_once when all connections tried to do some work. This allows connections to cooperatively progress.	2025-09-12 14:02:57 +00:00
Pere Diaz Bou	39fb5913e0	core/mvcc: queue write txn commits in mvcc on pager end_tx Flushing mvcc changes to disk requires serialization. To do so we simply introduce a lock for pager.end_tx, which will take ownership of flushing to WAL. Once this is finished we can simply release lock.	2025-09-12 14:00:02 +00:00
Pere Diaz Bou	e87226548c	core/mvcc: fix concurrent tests mvcc	2025-09-12 13:49:40 +00:00
Pere Diaz Bou	9b6d181be4	wal: add hacky update max frame for mvcc use When multiple tx writes happen concurrently in mvcc, max frame will be updated. This new max_frame makes the other transaction return busy, because from its point of view its current WAL snapshot is outdated.	2025-09-12 13:49:14 +00:00
Pere Diaz Bou	66b5630870	vdbe/mvcc: rollback mvcc txn on vdbe error	2025-09-12 13:47:45 +00:00
Jussi Saurio	305b2f55ae	MVCC: remove reliance on BTreeCursor::has_record()	2025-09-12 16:03:55 +03:00
TcMits	9dac467b40	support EXPLAIN QUERY PLAN	2025-09-12 19:58:45 +07:00
PThorpe92	5849819a59	Fix tests for views	2025-09-12 08:20:40 -04:00
Preston Thorpe	b09dcceeef	Merge 'Fixes views' from Glauber Costa This is a collection of fixes for materialized views ahead of adding support for JOINs. It is mostly issues with how we assume there is a single table, with a single delta, but we have to send more than one. Those are things that are just objectively wrong, so I am sending it separately to make the JOIN PR smaller. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3009	2025-09-12 07:43:32 -04:00
Preston Thorpe	16a3410934	Merge 'Fix checkpoint fast-path, don't use cached pages w/o write lock' from Preston Thorpe closes #3024 Don't use pages from the cache unless we hold an exclusive write lock, because a page could be updated by a writer in-memory at any point before we backfill it. Clear the WAL tag in other areas to prevent any stale tags. Also, we will just snapshot the page when we determine that it's eligible, and pay a memcpy instead of the read from disk, but this further prevents any in-memory changes to the page/TOCTOU issues, and we also assert that it's still eligible after we copy it to a new buffer. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3036	2025-09-12 07:39:32 -04:00
Preston Thorpe	f55023acc8	Merge 'Refactor UPSERT to use wal_expr_mut to walk AST.' from Preston Thorpe Working on https://github.com/tursodatabase/turso/issues/2964 I came upon `walk_expr_mut`, I don't think it existed last time I really spent much time in the translator. So quickly went back and cleaned this up. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3044	2025-09-12 06:45:13 -04:00
PThorpe92	f60ca3970f	Remove old comment from wal	2025-09-12 06:39:59 -04:00
PThorpe92	faf3531a4e	Fix checkpoint fast-path, don't use cached pages w/o write lock closes #3024 Also we snapshot the page when we determine that it's eligible, and pay a memcpy instead of the read from disk, but this further prevents any in-memory changes to the page/TOCTOU issues.	2025-09-12 06:38:02 -04:00
TcMits	29d8d04d58	Merge branch 'main' into explain-query-plan	2025-09-12 17:34:11 +07:00
TcMits	5dddc5e00b	introduce OP_Explain	2025-09-12 17:31:50 +07:00
Pekka Enberg	6a992e551c	Merge 'core: Fix reprepare to properly reset statement cursors and registers' from Pedro Muniz Previously, we were not updating the number of registers and cursors, which meant that on a schema change the Program could now open an additional cursor and we would not have space for it in the ProgramState, which led to the panic. Closes #3002 Closes #3034	2025-09-12 12:29:53 +03:00
Jussi Saurio	9f6e1a2e7c	fix(btree): advance cursor after interior node replacement in delete When a delete replaces an interior cell, the replacement key is LT the deleted key. Currently on the main branch, after the deletion happens, the following call to BTreeCursor::next() stops at the replaced interior cell. This is incorrect - imagine the following sequence: - We are executing a query that deletes all keys WHERE key > 5 - We delete <key=6> from an interior node, and take a replacement <key=5> from the left subtree of that interior page - next() is called, and we land on the interior node again, which now has <key=5>, and we incorrectly delete it even though our WHERE condition is key > 5. This PR: - Tracks `interior_node_was_replaced` in CheckNeedsBalancing - If no balancing is needed and a replacement occurred, advances once so the next invocation of next() will skip the replaced cell properly i.e. we prevent next() from landing on the replaced content and ensures iteration continues with the next logical record. Closes #3045	2025-09-12 10:49:44 +03:00
Pekka Enberg	aa32574554	core/mvcc: Fix begin_exclusive_tx() The RwLock elimination patches conflicted with the BEGIN CONCURRENT changes.	2025-09-12 08:42:14 +03:00
Pekka Enberg	a9a48f6272	Merge 'core/schema: Optimize get_dependent_materialized_views() when no views' from Pekka Enberg Eliminates get_dependent_materialized_views() overhead when there are no views. Note that we need to optimize the case when there are views as well because this ends up being pretty hot in write-intensive workloads. Closes #3046	2025-09-12 08:29:24 +03:00

4849 Commits