Commit Graph

276 Commits

Author SHA1 Message Date
pedrocarlo
96a6bc5125 end_tx does not need schema_did_change variable 2025-08-11 18:59:11 -03:00
bit-aloo
cf12c90428 expose freepage_list in pager 2025-08-11 09:57:46 +05:30
PThorpe92
84ffed709a Round up allocation for wal frame arena to next page multiple of 64 2025-08-08 10:55:29 -04:00
PThorpe92
faf248df03 Add more docs and comments for TempBufferCache 2025-08-08 10:55:28 -04:00
PThorpe92
39d230a899 Add bitmap for tracking pages in arena 2025-08-08 10:55:27 -04:00
PThorpe92
9d1ca1c8ca Add ReadFixed/WriteFixed opcodes for buffers from registered arena 2025-08-08 10:55:27 -04:00
PThorpe92
7ea52a3f89 Fix changing page size and initialization for buffer pool 2025-08-08 10:55:26 -04:00
PThorpe92
4ffb273b53 Adjust IO to use new buffer pool and buffer API 2025-08-08 10:55:26 -04:00
Jussi Saurio
1fe32dadf3 PageContent: make read_x/write_x methods private and add dedicated methods
Problem:

A very easy source of bugs is to mistakenly use e.g. PageContent::read_u16()
instead of PageContent::read_u16_no_offset(). The difference between the two
is that `read_u16()` adds 100 bytes to the requested byte offset if and only if
the page in question is page 1, which contains a 100-byte database header.

Case in point: see #2491.

Observation:

In all of the cases where we want to read from or write to a page "header-sensitively",
those reads/writes are to so-called "well known offsets", e.g. specific bytes in a btree
page header.

In all other cases, the "no-offset" versions, i.e. the ones taking the absolute byte offset
as parameter, should be used.

Solution:

1. Make all the offset-sensitive versions (read_u16() and friends) private methods of
`PageContent`.
2. Expose dedicated methods for things like updating rightmost pointer, updating fragmented
bytes count and so on, and use them instead of the plain read/write methods universally.
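
The split described above could look roughly like the following. This is a minimal sketch, not the actual `PageContent` implementation: the struct fields, `header_offset()` and `cell_count()` are hypothetical names chosen for illustration, and only the 100-byte header on page 1 and the `read_u16()` / `read_u16_no_offset()` distinction come from the commit message.

```rust
/// Size of the database header stored at the start of page 1.
const DATABASE_HEADER_SIZE: usize = 100;

/// Hypothetical stand-in for `PageContent`; the fields are assumptions.
pub struct PageContent {
    /// 1-based page number.
    page_number: usize,
    /// Raw bytes of the page.
    buf: Vec<u8>,
}

impl PageContent {
    pub fn new(page_number: usize, buf: Vec<u8>) -> Self {
        Self { page_number, buf }
    }

    /// Where the b-tree page header starts: 100 on page 1, 0 on every other page.
    fn header_offset(&self) -> usize {
        if self.page_number == 1 {
            DATABASE_HEADER_SIZE
        } else {
            0
        }
    }

    /// Private, header-sensitive read: `pos` is relative to the b-tree page header.
    fn read_u16(&self, pos: usize) -> u16 {
        self.read_u16_no_offset(self.header_offset() + pos)
    }

    /// Public read that takes the absolute byte offset within the page.
    pub fn read_u16_no_offset(&self, pos: usize) -> u16 {
        u16::from_be_bytes([self.buf[pos], self.buf[pos + 1]])
    }

    /// Dedicated accessor for a well-known offset: the cell count is a
    /// big-endian u16 at byte 3 of the b-tree page header.
    pub fn cell_count(&self) -> u16 {
        self.read_u16(3)
    }
}
```

With this shape, callers outside the type can only reach the header-relative reads through named accessors such as `cell_count()`, so picking the wrong variant by accident is no longer possible.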
2025-08-07 17:00:06 +03:00
Pekka Enberg
be8e8ff7c0 Merge 'turso-sync: rewrite' from Nikita Sivukhin
This PR rewrites the `turso-sync` package introduced in #2334 and
renames it to `turso-sync-engine` (the diff will be unreadable
anyway).
The purpose of the rewrite is to get rid of Tokio, because it makes
things harder when we want to export bindings to WASM.
To get a runtime-agnostic sync core while still being able to use
async/await machinery, this PR introduces the `genawaiter` crate,
which allows turning Rust async/await state machines into
generators. Sync operations are thus just generators that can yield an
`IO` command whenever one is needed.
This PR also introduces a separate `ProtocolIo` in
`turso-sync-engine`, which defines extra IO methods:
1. HTTP interaction
2. Atomic reads/writes to a file. This is not strictly necessary, and
the `turso_core::IO` methods could be extended to support a few more
things (like `delete`/`rename`), but I decided that it would be simpler
to just expose 2 more methods for the sync protocol for the sake of
atomic metadata updates (the metadata is very small: dozens of bytes).
    * As a bonus, we can store metadata for the browser in
`LocalStorage`, which may be a more natural thing to do(?) (the user can
reset everything by just clearing local storage)
`ProtocolIo` works similarly to `IO` in the sense that it gives the
caller a `Completion`, which the caller can check periodically for new data.
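
To illustrate the "operation yields when it needs IO, caller polls a completion" idea, here is a minimal hand-written sketch. It does not use `genawaiter` and is not the `turso-sync-engine` API: `Completion`, `Step` and `FetchLen` below are hypothetical types, and the resumable operation is written out as an explicit state machine rather than generated from async/await.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical completion handle: the host fills it in, the caller polls it.
#[derive(Clone, Default)]
pub struct Completion {
    data: Rc<RefCell<Option<Vec<u8>>>>,
}

impl Completion {
    /// Returns the data if the IO has finished, `None` otherwise.
    pub fn poll(&self) -> Option<Vec<u8>> {
        self.data.borrow().clone()
    }

    /// Called by the host (runtime, browser, test harness) when the IO is done.
    pub fn complete(&self, bytes: Vec<u8>) {
        *self.data.borrow_mut() = Some(bytes);
    }
}

/// What a sync operation reports each time it is resumed.
pub enum Step {
    /// The operation cannot make progress until the host performs some IO.
    Io,
    /// The operation finished with this result.
    Done(usize),
}

/// A tiny resumable operation: wait for the pending bytes, return their length.
pub struct FetchLen {
    pending: Completion,
}

impl FetchLen {
    pub fn new(pending: Completion) -> Self {
        Self { pending }
    }

    /// Resume the operation; it either yields `Step::Io` or finishes.
    pub fn resume(&mut self) -> Step {
        match self.pending.poll() {
            None => Step::Io,
            Some(bytes) => Step::Done(bytes.len()),
        }
    }
}
```

Because the operation never blocks or spawns tasks itself, any host loop can drive it: resume, run IO when `Step::Io` comes back, and resume again.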

Closes #2457
2025-08-07 07:58:02 +03:00
Preston Thorpe
0777fd9082 Merge 'implement the MaxPgCount opcode' from Glauber Costa
It is used by the pragma max_page_count, which is also implemented.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2472
2025-08-06 23:44:05 -04:00
Glauber Costa
f36974f086 implement the MaxPgCount opcode
It is used by the pragma max_page_count, which is also implemented.
2025-08-06 13:20:15 -05:00
Nikita Sivukhin
b612259a3a more friendly completely runtime agnostic turso-sync-engine crate 2025-08-06 19:26:55 +04:00
pedrocarlo
7b746ccc65 adjust state machine for ptrmap_get 2025-08-06 11:32:21 -03:00
pedrocarlo
b529305b82 state machine for ptrmap_put 2025-08-06 11:32:21 -03:00
pedrocarlo
931384afb6 state machine fix for btree create for AutoVacuum::Full 2025-08-06 11:32:21 -03:00
pedrocarlo
f656d0bc20 header ref state machine 2025-08-06 11:32:21 -03:00
PThorpe92
f6a68cffc2 Remove RefCell from IO and Page apis 2025-08-05 16:24:49 -04:00
Jussi Saurio
cde8567b1d Merge 'More state machine + Return IO in places where completions are created' from Pedro Muniz
In preparation for tracking IO Completions, we need to start returning
IO in places where completions are created. Doing some more plumbing now
to avoid bigger PRs in the future.

Closes #2438
2025-08-05 15:47:51 +03:00
Jussi Saurio
a28e64bfdd cleanup: remove unused page uptodate flag 2025-08-05 14:25:42 +03:00
pedrocarlo
aa8d17cbf1 state machine for ptrmap_get 2025-08-05 01:38:42 -03:00
pedrocarlo
0ac040cc87 return IO in some other functions in Pager 2025-08-04 23:28:57 -03:00
Jussi Saurio
a66b56678d Merge 'Reprepare Statements when Schema changes' from Pedro Muniz
Closes #1967
To support this, I had to change how we do `epilogue` to resemble how
SQLite does it. SQLite first declares a `beginWriteOperation` when a
statement is going to require a write transaction. Since we now
need to pass the current schema cookie to `epilogue`, it was easier to
call epilogue in only one location (like we do with prologue) and
have each statement declare its intentions separately. This means we
do not have to pass the Schema around just to do the epilogue. I believe
this is something that @jussisaurio would be interested in.
~Also had to disable the MVCC test, as it was extremely buggy for me.~
Just disabled reprepare statements for MVCC

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2214
2025-08-05 00:01:14 +03:00
pedrocarlo
266a7e1c66 do not error in op_transaction if page 1 was not allocated 2025-08-04 12:32:34 -03:00
Nikita Sivukhin
76bdf0c1ab small fixes 2025-08-04 17:02:53 +04:00
Nikita Sivukhin
2e23230e79 extend raw WAL API with few more methods
- try_wal_watermark_read_page - try to read page from the DB with given WAL watermark value
- wal_changed_pages_after - return set of unique pages changed after watermark WAL position
2025-08-04 16:55:50 +04:00
Nikita Sivukhin
0adb40534c hide dangerous methods behind conn_raw_api feature 2025-08-04 12:40:28 +04:00
Jussi Saurio
3b0c8b08fe Merge 'perf/pager: dont clear page cache on commit' from Jussi Saurio
This should be safe to do as:
1. page cache is private per connection
2. since this connection wrote the flushed pages/frames, they are up to
date from its perspective
3. multiple concurrent statements inside one connection are not
snapshot-transactional even in sqlite

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #2407
2025-08-02 13:35:57 +03:00
Jussi Saurio
4497d22d3f perf/pager: dont clear page cache on commit 2025-08-02 13:09:36 +03:00
Pekka Enberg
598fdade3e core: Fold HeaderRef to pager module 2025-08-02 09:50:25 +03:00
Jussi Saurio
e147494642 pager: make WAL optional again and remove DummyWAL 2025-08-01 10:14:35 +03:00
pedrocarlo
543cdb3e2c underscoring completions and IOResult to avoid warning messages 2025-07-31 11:51:17 -03:00
Jussi Saurio
e88707c6fd fix/wal: only rollback WAL if txn was write 2025-07-31 14:18:43 +03:00
Jussi Saurio
f619556344 Merge 'Direct DatabaseHeader reads and writes – with_header and with_header_mut' from Levy A.
This PR introduces two methods on the pager, very much inspired by
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` pass the closure a shared and a unique
reference, respectively; both are references transmuted from the
`PageRef` buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
    let buf = self.as_ptr();
    buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but can also be used on their own when there are multiple reads and writes
to the header, where putting everything in a closure would add too much
nesting.
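
The closure-based accessors might be used along these lines. This is a sketch under stated assumptions, not the PR's code: the header fields shown are a made-up subset, and the header lives in a `RefCell` here purely for illustration, whereas the PR hands out references transmuted from the page buffer.

```rust
use std::cell::RefCell;

/// Hypothetical, heavily reduced header; the real `DatabaseHeader` has more fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DatabaseHeader {
    pub page_size: u16,
    pub database_size: u32,
}

/// Hypothetical pager that owns the header directly (an assumption of this sketch).
pub struct Pager {
    header: RefCell<DatabaseHeader>,
}

impl Pager {
    pub fn new(header: DatabaseHeader) -> Self {
        Self { header: RefCell::new(header) }
    }

    /// Runs `f` with a shared reference to the header and returns its result.
    pub fn with_header<T>(&self, f: impl FnOnce(&DatabaseHeader) -> T) -> T {
        let header = self.header.borrow();
        f(&*header)
    }

    /// Runs `f` with a unique reference to the header and returns its result.
    pub fn with_header_mut<T>(&self, f: impl FnOnce(&mut DatabaseHeader) -> T) -> T {
        let mut header = self.header.borrow_mut();
        f(&mut *header)
    }
}
```

A single read then looks like `pager.with_header(|h| h.page_size)` and a single update like `pager.with_header_mut(|h| h.database_size += 1)`.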

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2234
2025-07-31 10:02:47 +03:00
PThorpe92
ef69df7258 Apply review suggestions 2025-07-30 19:42:53 -04:00
PThorpe92
62f004c898 Fix write counter for writev batching in checkpoint 2025-07-30 19:42:49 -04:00
PThorpe92
7b2163208b batch backfilling pages when checkpointing 2025-07-30 19:42:48 -04:00
Levy A.
fe66c61ff5 add usable_space to DatabaseHeader
we already have the `DatabaseHeader`, we don't need the cached result
2025-07-30 17:33:59 -03:00
Levy A.
e35fdb8263 feat: zero-copy DatabaseHeader 2025-07-30 17:33:59 -03:00
Jussi Saurio
7240d7903c fmt 2025-07-30 18:22:17 +03:00
Jussi Saurio
c00d1fcfc0 fmt 2025-07-30 17:21:29 +03:00
Jussi Saurio
66c4b44c55 pager: call rollback() after ending txn so that read lock info is not lost when ending txn 2025-07-30 17:21:19 +03:00
Jussi Saurio
7b1f04dc5e pager: only ROLLBACK your own transaction, not if someone else is writing 2025-07-30 17:00:38 +03:00
Jussi Saurio
b1aa13375d call pager.end_tx() everywhere instead of pager.rollback() 2025-07-30 16:39:38 +03:00
PThorpe92
4dc15492d8 Integrate changes from tx isolation commits from @jussisaurio 2025-07-30 14:10:12 +03:00
PThorpe92
8806b77d26 Clear snapshot and readmark/lock index flags on failure 2025-07-30 14:09:18 +03:00
PThorpe92
3e75444388 Remove panic in cacheflush io.block in pager now that checkpoints can return busy 2025-07-30 14:08:33 +03:00
PThorpe92
eaa6f99fa8 Hold and ensure release of proper locks if we trunc the db file post-checkpoint 2025-07-30 14:08:33 +03:00
PThorpe92
9b7e5ed292 Trunc db file after backfilling everything in calling method 2025-07-30 14:08:33 +03:00
PThorpe92
1a9b7ef76e Add support for truncate, restart and full checkpointing methods 2025-07-30 14:08:31 +03:00