7624 Commits

Author SHA1 Message Date
Jussi Saurio
f9bd047e4d Merge 'Fix non-4096 page sizes' from Jussi Saurio
Closes #2555
## Problem
The main problem we had with the current implementation of
`init_pager()` was that the WAL header was eagerly allocated and written
to disk with a page size, and a potential already-set page size on an
initialized database was not checked. Given this scenario:
- Initialized database with e.g. page size 512 but no WAL
- Tursodb connects to DB
It would not check the database file for the page size and instead would
initialize the WAL header with the default 4096 page size, plus
initialize the `BufferPool` similarly with the wrong size, and then
panic when reading pages from the DB, expecting to read `4096` instead
of `512`, as demonstrated in the reproduction of #2555.
## Fix
1. Add `Database::read_page_size_from_db_header()` method that can be
used in the above cases
2. Initialize the WAL header lazily during the first frame append, using
the existing `WalFile::ensure_header_if_needed()` method, removing the
need to eagerly pass `page_size` when constructing the in-memory
`WalFileShared` structure.
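A minimal sketch of the header check described above, assuming the standard SQLite file format, where the page size is stored as a big-endian 16-bit value at byte offset 16 and the literal value 1 stands for 65536. The function name `page_size_from_header` is hypothetical and is not the body of `Database::read_page_size_from_db_header()`:

```rust
/// Largest page size in the SQLite file format; stored in the header as the literal 1.
const MAX_PAGE_SIZE: u32 = 65536;

/// Decode the page size from the leading bytes of a database file.
/// Returns None if the buffer is too short or the stored value is not a valid page size.
/// (Hypothetical helper for illustration only.)
fn page_size_from_header(header: &[u8]) -> Option<u32> {
    // Bytes 16 and 17 hold the page size as a big-endian u16.
    let raw = u16::from_be_bytes([*header.get(16)?, *header.get(17)?]);
    match raw {
        // 65536 does not fit in a u16, so the format encodes it as 1.
        1 => Some(MAX_PAGE_SIZE),
        // Otherwise the value must be a power of two in 512..=32768.
        n if n >= 512 && n.is_power_of_two() => Some(u32::from(n)),
        _ => None,
    }
}
```

A caller could run a check like this before sizing the WAL header or the `BufferPool`, and fall back to the 4096 default only when the database file is still empty.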
## Reader notes
This PR follows a fairly logical commit-by-commit structure so you'll
preferably want to read it that way.

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2569
2025-08-14 13:02:33 +03:00
Jussi Saurio
bd8c6f3c7c make PageSize more robust: only accept literal '1' value if it comes directly from db header 2025-08-14 12:40:58 +03:00
Jussi Saurio
c2e89f94f8 Change more page size panics to corrupt errors 2025-08-14 12:40:58 +03:00
Jussi Saurio
0c6d548402 integration test tweak 2025-08-14 12:40:58 +03:00
Jussi Saurio
f94fa2bbbe salt tweak 2025-08-14 12:40:58 +03:00
Jussi Saurio
38bb0719cc read from disk tweak 2025-08-14 12:40:58 +03:00
Jussi Saurio
8a1c3390e6 Add integration test for page_size=512 2025-08-14 12:40:58 +03:00
Jussi Saurio
a2a88e2c69 Make exception for page size literal value 1 2025-08-14 12:40:58 +03:00
Jussi Saurio
c75e4c1092 Fix non-4096 page sizes by making WAL header lazy 2025-08-14 12:40:58 +03:00
Jussi Saurio
f8620a9869 Use non-hardcoded size for BTreeCursor immutablerecord 2025-08-14 12:40:58 +03:00
Jussi Saurio
f5e27f23ad Use type-safe PageSize newtype for connection.page_size 2025-08-14 12:40:58 +03:00
Jussi Saurio
bb21bd93da Use type-safe PageSize newtype for pager.page_size 2025-08-14 12:40:58 +03:00
Jussi Saurio
ee58b7bd86 Add fn read_header() to DatabaseStorage trait 2025-08-14 12:40:58 +03:00
Jussi Saurio
a2a6feb193 Merge 'Use BufferPool owned by Database instead of a static global' from Jussi Saurio
## Problem
There are several problems with our current statically allocated
`BufferPool`.
1. You cannot open two databases in the same process with different page
sizes, because the `BufferPool`'s `Arena`s will be locked forever into
the page size of the first database. This is the case regardless of
whether the two `Database`s are open at the same time, or if the first
is closed before the second is opened.
2. It is impossible to even write Rust tests for different page sizes
because of this, assuming the test uses a single process.
## Solution
Make `Database` own `BufferPool` instead of it being statically
allocated, so this problem goes away.
Note that I didn't touch the still statically-allocated
`TEMP_BUFFER_CACHE`, because it should continue to work regardless of
this change. It should only be a problem if the user has two or more
databases with different page sizes open simultaneously, because
`TEMP_BUFFER_CACHE` will only support one pool of a given page size at a
time, so the rest of the allocations will go through the global
allocator instead.
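The ownership change can be sketched roughly as follows. The types here are simplified, hypothetical stand-ins that only share names with the real `Database` and `BufferPool`; the point is that each database carries a pool sized for its own pages instead of sharing one process-wide pool:

```rust
/// Hypothetical pool whose buffers all share one fixed size.
struct BufferPool {
    page_size: usize,
    free: Vec<Vec<u8>>,
}

impl BufferPool {
    fn new(page_size: usize) -> Self {
        Self { page_size, free: Vec::new() }
    }

    /// Reuse a free buffer if one exists, otherwise allocate a zeroed one.
    fn get(&mut self) -> Vec<u8> {
        self.free.pop().unwrap_or_else(|| vec![0u8; self.page_size])
    }

    /// Return a buffer to the pool; buffers of the wrong size are dropped.
    fn put(&mut self, buf: Vec<u8>) {
        if buf.len() == self.page_size {
            self.free.push(buf);
        }
    }
}

/// Hypothetical database handle that owns its pool rather than using a static.
struct Database {
    pool: BufferPool,
}

impl Database {
    fn open(page_size: usize) -> Self {
        Self { pool: BufferPool::new(page_size) }
    }
}
```

With this shape, two `Database` values with page sizes 512 and 4096 can coexist in one process, which is also what makes single-process tests for different page sizes possible.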
## Notes
I extracted this change out from #2569, because I didn't want it to be
smuggled in without being reviewed as an individual piece.

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #2596
2025-08-14 12:40:32 +03:00
Jussi Saurio
6a33d4e792 Merge 'sync-engine: avoid unnecessary WAL push' from Nikita Sivukhin
This PR ignores changes from `TURSO_SYNC_TABLE_NAME` meta table in order
to not generate unnecessary push commands when nothing actually changed
on the client side.

Closes #2597
2025-08-14 12:26:05 +03:00
Nikita Sivukhin
34a7b2ffd4 ignore changes in the turso_sync_last_change_id 2025-08-14 12:39:44 +04:00
Nikita Sivukhin
8c9d648852 add test which checks that we don't push without need 2025-08-14 12:38:15 +04:00
Nikita Sivukhin
f603a0dfc8 change log level to INFO in order to simplify debugging (DEBUG logs in the db are pretty spammy) 2025-08-14 12:37:49 +04:00
Jussi Saurio
69d8a73028 Merge 'use virtual root page for sqlite_schema' from Mikaël Francoeur
This PR fixes a problem where `sqlite_schema` could be read before page
1 was allocated.
The fix is similar to that in SQLite. In SQLite, if `btreeCursor()` sees
that the root page is 1 and that the b-tree is empty, it sets the page
to 0 ([here](https://github.com/sqlite/sqlite/blob/master/src/btree.c#L4691-L4696)).
SQLite's `moveToRoot()` then uses this special value to
return `CURSOR_INVALID` with no rows ([here](https://github.com/sqlite/sqlite/blob/master/src/btree.c#L5538-L5540)).
Fixes https://github.com/tursodatabase/turso/issues/2449

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2551
2025-08-14 11:08:11 +03:00
Jussi Saurio
d7186c7d7b Merge 'Add support for unlikely(X)' from bit-aloo
Implements the unlikely(X) function. Removes runtime implementations of
likely(), unlikely() and likelihood(), replacing them with panics if
they reach the VDBE.
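In SQLite, `likely(X)`, `unlikely(X)` and `likelihood(X,Y)` return `X` unchanged and exist only as planner hints, which is why they can be removed at translation time. A rough sketch of that stripping step, using a hypothetical `Expr` type rather than the project's actual AST:

```rust
/// Hypothetical, heavily simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Real(f64),
    Call { name: String, args: Vec<Expr> },
}

/// Replace planner-hint calls with their first argument, recursively.
fn strip_hints(expr: Expr) -> Expr {
    match expr {
        Expr::Call { name, mut args } => {
            let lower = name.to_ascii_lowercase();
            let is_hint = matches!(
                (lower.as_str(), args.len()),
                ("likely", 1) | ("unlikely", 1) | ("likelihood", 2)
            );
            if is_hint {
                // The first argument is the value; any probability argument is dropped.
                strip_hints(args.swap_remove(0))
            } else {
                // Ordinary function call: keep it and strip hints inside its arguments.
                Expr::Call {
                    name,
                    args: args.into_iter().map(strip_hints).collect(),
                }
            }
        }
        other => other,
    }
}
```

Once every hint is removed this way, a hint function reaching the bytecode interpreter indicates a translation bug, which matches the decision to panic there.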

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2559
2025-08-14 10:56:27 +03:00
Jussi Saurio
78f1ed979e Merge 'io_uring: Gracefully handle submission queue overflow' from Preston Thorpe
The current handling is not ideal. This adds proper submission queue
(squeue) overflow handling by ensuring everything is still submitted in order.
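One way to keep submissions ordered when a fixed-size queue fills up is to park the excess in a FIFO and refill from it before accepting anything new. The sketch below illustrates only that idea with plain collections; `Submitter` and its methods are hypothetical, and it does not use the io_uring API or reflect the actual backend code:

```rust
use std::collections::VecDeque;

/// Hypothetical submitter: `ring` stands in for the fixed-capacity submission queue.
struct Submitter {
    ring: VecDeque<u64>,
    capacity: usize,
    overflow: VecDeque<u64>,
}

impl Submitter {
    fn new(capacity: usize) -> Self {
        Self { ring: VecDeque::new(), capacity, overflow: VecDeque::new() }
    }

    /// Queue an operation. If anything is already waiting in `overflow`,
    /// the new operation must wait behind it to preserve ordering.
    fn push(&mut self, op: u64) {
        if !self.overflow.is_empty() || self.ring.len() == self.capacity {
            self.overflow.push_back(op);
        } else {
            self.ring.push_back(op);
        }
    }

    /// Simulate handing the ring to the kernel, then refill it from the overflow.
    fn submit(&mut self) -> Vec<u64> {
        let submitted: Vec<u64> = self.ring.drain(..).collect();
        while self.ring.len() < self.capacity {
            match self.overflow.pop_front() {
                Some(op) => self.ring.push_back(op),
                None => break,
            }
        }
        submitted
    }
}
```

The check on `overflow` in `push` is the part that matters: without it, a new operation could slip into a freshly drained ring ahead of older operations still waiting in the overflow queue.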

Closes #2586
2025-08-14 10:55:17 +03:00
Jussi Saurio
0b17957f4e Merge 'Implement normal views' from Glauber Costa
Now that we actually implemented the statement parsing around views,
implementing normal SQLite views is relatively trivial, as they are just
an alias to a query.
We'll implement them now to get them out of the way, and then I'll go
back to DBSP

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2591
2025-08-14 10:54:46 +03:00
Jussi Saurio
fed70b49be Merge 'Disable unused variables in cargo clippy for CI' from Pedro Muniz
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2584
2025-08-14 10:54:18 +03:00
Jussi Saurio
359cba0474 Use BufferPool owned by Database instead of a static global
Problem

There are several problems with our current statically allocated
`BufferPool`.

1. You cannot open two databases in the same process with different
page sizes, because the `BufferPool`'s `Arena`s will be locked forever
into the page size of the first database. This is the case regardless
of whether the two `Database`s are open at the same time, or if the first
is closed before the second is opened.

2. It is impossible to even write Rust tests for different page sizes because
of this, assuming the test uses a single process.

Solution

Make `Database` own `BufferPool` instead of it being statically allocated, so this
problem goes away.

Note that I didn't touch the still statically-allocated `TEMP_BUFFER_CACHE`, because
it should continue to work regardless of this change. It should only be a problem if
the user has two or more databases with different page sizes open simultaneously, because
`TEMP_BUFFER_CACHE` will only support one pool of a given page size at a time, so the rest
of the allocations will go through the global allocator instead.

Notes

I extracted this change out from #2569, because I didn't want it to be smuggled in without
being reviewed as an individual piece.
2025-08-14 10:29:52 +03:00
bit-aloo
32e59614c7 remove unnecessary copy instr in likelihood, likely and unlikely 2025-08-14 09:08:32 +05:30
Preston Thorpe
be2c0ec6ab Merge 'Refactor: atomic ordering' from Preston Thorpe
Sequentially consistent ordering is very rarely needed. We can safely use
Acquire / Release for loads / stores, and some of these atomics aren't
guarding anything and can use Relaxed.
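As a generic illustration of the pattern (not code from this PR), a flag that publishes data needs only a Release store paired with an Acquire load, while the data it guards can be accessed with Relaxed. The function name below is made up for the example:

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Publish a value from one thread and read it from another.
fn publish_and_read() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let ready = Arc::new(AtomicBool::new(false));
    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));

    let writer = thread::spawn(move || {
        // The payload itself needs no ordering of its own.
        d.store(42, Ordering::Relaxed);
        // Release: every write above becomes visible to a thread that
        // observes this store with an Acquire load.
        r.store(true, Ordering::Release);
    });

    // Acquire pairs with the Release store in the writer.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let value = data.load(Ordering::Relaxed);
    writer.join().unwrap();
    value
}
```

Counters or statistics that do not guard other memory can typically use Relaxed on both sides.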

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2548
2025-08-13 22:39:50 -04:00
Preston Thorpe
78abe72762 Merge 'fix: Handle fresh INSERTs in materialized view incremental maintenance' from Glauber Costa
The op_insert function was incorrectly trying to capture an "old record"
for fresh INSERT operations when a table had dependent materialized
views. This caused a "Cannot delete: no current row" error because the
cursor wasn't positioned on any row for new inserts.
The issue was introduced in commit f38333b3 which refactored the state
machine for incremental view handling but didn't properly distinguish
between:
- Fresh INSERT operations (no old record exists)
- UPDATE operations without rowid change (old record should be captured)
- UPDATE operations with rowid change (already handled by DELETE)
This fix checks if cursor.rowid() returns a value before attempting to
capture the old record. If no row exists (fresh INSERT), we correctly
set old_record to None instead of erroring out.
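The check can be pictured with a simplified, hypothetical cursor type; the real `op_insert` state machine is considerably more involved, so this is only a sketch of the `Option`-based guard:

```rust
/// Hypothetical stand-in for a B-tree cursor. `current` is None when the
/// cursor is not positioned on any row, as with a fresh INSERT.
struct Cursor {
    current: Option<(i64, Vec<u8>)>,
}

impl Cursor {
    fn rowid(&self) -> Option<i64> {
        self.current.as_ref().map(|(rowid, _)| *rowid)
    }

    fn record(&self) -> Option<&[u8]> {
        self.current.as_ref().map(|(_, rec)| rec.as_slice())
    }
}

/// Capture the old record only when the cursor actually points at a row.
fn capture_old_record(cursor: &Cursor) -> Option<Vec<u8>> {
    // Fresh INSERT: no current row, so there is no old record to capture.
    cursor.rowid()?;
    cursor.record().map(|rec| rec.to_vec())
}
```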
I am also including tests to make sure this doesn't break. The reason I
didn't include tests earlier is that I didn't know it was possible to
run the tests under a flag. But in here, I am just adding the flag to
the execution script.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2579
2025-08-13 21:36:19 -04:00
Preston Thorpe
c3e29087a8 Merge 'Fix: do computations on usable_space as usize, not as u16' from Jussi Saurio
Otherwise page size 65536 will not work as casting to u16 will make it
wrap around to 0.
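The wrap-around is easy to reproduce in isolation. The two helper functions below are hypothetical and exist only to contrast the narrowing cast with the `usize` computation:

```rust
/// Buggy variant: narrowing to u16 keeps only the low 16 bits of the page size.
fn usable_space_narrow(page_size: u32, reserved: u8) -> u16 {
    (page_size as u16).wrapping_sub(u16::from(reserved))
}

/// Fixed variant: do the arithmetic in usize, which can represent 65536.
fn usable_space(page_size: u32, reserved: u8) -> usize {
    page_size as usize - usize::from(reserved)
}
```

For a page size of 4096 both variants agree, but for 65536 the narrow variant yields 0 while the `usize` variant yields 65536.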

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2583
2025-08-13 17:08:43 -04:00
Mikaël Francoeur
07ef47924c use virtual root page for sqlite_schema 2025-08-13 16:31:21 -04:00
Glauber Costa
5ab6f78f6b Implement views
Views (non materialized) are relatively simple, since they are just
query aliases.

We can expand them as if they were subqueries.
2025-08-13 14:14:03 -05:00
Glauber Costa
337f27a433 rename some structures to mention materialized views
A lot of the structures we have, like the ones under Schema, are
specific to materialized views. In preparation for adding normal views,
rename them, so things are less confusing.
2025-08-13 14:13:16 -05:00
bit-aloo
96be4eb40c remove the exec_* test 2025-08-13 22:51:36 +05:30
bit-aloo
8e6df064df remove likelihood, likely and unlikely exec methods, as we don't need them 2025-08-13 22:51:22 +05:30
bit-aloo
198ba6ca62 panic in vdbe if we hit likely, likelihood, and unlikely scalar method 2025-08-13 22:50:29 +05:30
bit-aloo
eda3a82306 strip unlikely and just translate the inner value 2025-08-13 22:46:31 +05:30
bit-aloo
e72097e2b7 strip likely and just translate the inner value 2025-08-13 22:46:22 +05:30
Nikita Sivukhin
887b25dd00 do not push wal unnecessarily when nothing was changed locally 2025-08-13 20:22:10 +04:00
PThorpe92
ec4bf19fc7 Gracefully handle submission queue overflow in io_uring backend 2025-08-13 12:07:41 -04:00
Pekka Enberg
aaf7b39d96 Update manual.md 2025-08-13 18:52:14 +03:00
Pekka Enberg
29aea48405 Merge 'Document the I/O model' from Pedro Muniz
I modified the `docs/manual.md` file to store the current I/O model that
Turso follows. I want some feedback on the text and on the design as
well!

Closes #2308
2025-08-13 18:50:58 +03:00
Pekka Enberg
c62a9558e2 Merge 'Sync engine fixes' from Nikita Sivukhin
This PR fixes several small bugs around the sync-engine:
1. WAL pull handled the end of the "frame" iterator incorrectly. This is
fixed in https://github.com/tursodatabase/turso/commit/eff8d8540d1e83214459822ac6eeb0d3409ecc24
and a test with concurrent DBs was added
2. Using `:memory:` in the sync engine led to weird behavior because the
engine creates different `MemoryIO` instances, while turso-core uses a
global registry of databases under the hood. I **changed** the criterion
for determining in-memory databases to check whether the path starts
with `:memory:` (https://github.com/tursodatabase/turso/commit/80476b3069f2cd460f368e10b4b2ef51f7608077)
3. Switched from `Buffer` to `Vec<u8>` for now, as the browser does not
seem to support `Buffer` natively: https://github.com/tursodatabase/turso/commit/2ca8a15dcc8ec79eddfaf7068fab4e1aa3241506
4. Added tracing to the sync engine: https://github.com/tursodatabase/turso/commit/33ef1aa0da7dfe9416025e26e98f9cf9d48c9119

Closes #2582
2025-08-13 18:49:36 +03:00
Pekka Enberg
fbab40e5a6 Merge 'disable checkpoint: adjust semantic' from Nikita Sivukhin
This PR changes the semantics of the `wal_disable_checkpoint` flag so
that it disables only "automatic" checkpoint behaviour, which at the
moment includes:
1. Checkpoint on shutdown
2. Checkpoint when the WAL reaches a certain size
The sync-engine needs to be in full control of checkpoints, but it will
also issue `TRUNCATE` checkpoints at certain moments, so disabling
checkpoints completely will not suffice.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2580
2025-08-13 18:49:11 +03:00
Pekka Enberg
753e6689da Merge 'SDK: enable indices everywhere' from Nikita Sivukhin
Indices are enabled by default in the shell, so let's also enable them
in the SDK

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2581
2025-08-13 18:44:41 +03:00
pedrocarlo
e416d91df2 Proposed I/O model 2025-08-13 12:16:14 -03:00
pedrocarlo
eeaa3c788a disable unused variables in cargo clippy 2025-08-13 12:00:57 -03:00
Jussi Saurio
fd72a2ff20 Fix: do computations on usable_space as usize, not as u16
Otherwise page size 65536 will not work as casting to u16 will make
it wrap around to 0.
2025-08-13 17:20:29 +03:00
PThorpe92
f1475bd5ac Remove bool return value from page set_locked 2025-08-13 10:17:33 -04:00
PThorpe92
614a0a45a6 Relax and fix memory ordering 2025-08-13 10:09:37 -04:00
Nikita Sivukhin
56b86cd5f5 add comment about :memory: in sync-engine 2025-08-13 17:16:46 +04:00
Nikita Sivukhin
eff8d8540d fix bug and add test with concurrent dbs in sync 2025-08-13 17:08:07 +04:00