Commit Graph

3896 Commits

Author SHA1 Message Date
Jussi Saurio
4cf02dfd14 Merge 'coalesce any adjacent buffers from writev calls into fewer iovecs' from Preston Thorpe
In the `io_uring` and `unix` IO backends, we can check if our buffers are
sequential in memory and reduce the number of iovecs per call, although
this is highly unlikely to actually happen at the moment due to our
buffer pool implementation.
Later on, when #2419 is merged, we will be able to specifically request
runs of contiguous buffers, so that our `writev` calls will (in the
ideal case) be coalesced into a single `pwrite` or preferably
`WriteFixed` operation on the io_uring backend.
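
A minimal sketch of the coalescing idea, not the code from this PR: `Span` and `coalesce` are hypothetical names, and a plain `(addr, len)` pair stands in for a real iovec. Neighbouring spans are merged whenever one begins exactly where the previous one ends.

```rust
/// Hypothetical stand-in for one iovec entry: start address and length in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub addr: usize,
    pub len: usize,
}

/// Merge runs of spans that are contiguous in memory, preserving order.
pub fn coalesce(spans: &[Span]) -> Vec<Span> {
    let mut out: Vec<Span> = Vec::with_capacity(spans.len());
    for &s in spans {
        if let Some(prev) = out.last_mut() {
            // Contiguous only if the previous span ends exactly at this span's start.
            if prev.addr.checked_add(prev.len) == Some(s.addr) {
                prev.len += s.len;
                continue;
            }
        }
        out.push(s);
    }
    out
}
```

If every buffer in a call is contiguous, the result has a single entry, which is the case where one plain write could replace the vectored one.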

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2436
2025-08-04 23:54:57 +03:00
Jussi Saurio
13219dbf87 Merge 'extend raw WAL API with few more methods' from Nikita Sivukhin
This PR extends raw WAL API with few methods which will be helpful for
offline-sync:
1. `try_wal_watermark_read_page` - try to read page from the DB with
given WAL watermark value
    * Usually, WAL max_frame is set automatically to the latest value
(`shared.max_frame`) when transaction is started and then this
"watermark" is preserved throughout whole transaction
    * New method allows to simulate "read from the past" by controlling
frame watermark explicitly
    * There is an alternative to implement some API like
`start_read_session(frame_watermark: u64)` - but I decided to expose
just single method to simplify the logic and reduce "surface" of actions
which can be executed in this "controllable" manner
    * Also, for simplicity, `try_wal_watermark_read_page` currently always
reads data from disk, bypassing any cached values (and it does not
populate the cache either)
2. `wal_changed_pages_after` - return set of unique pages changed after
watermark WAL position in the current WAL session
With these 2 methods we can implement `REVERT frame_watermark` logic
which will just fetch all changed pages first, and then revert them to
the previous value by using `try_wal_watermark_read_page` and
`wal_insert_frame` methods (see `test_wal_api_revert_pages` test).
Note that if there were schema changes, then the `REVERT` logic described
above can bring the connection to an inconsistent state, as it will
preserve schema information in memory and will still think that a table
exists (while it may have been reverted). This should be considered by any
consumer of these new methods.
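
A toy model of the two operations and the revert loop described above, as a sketch under stated assumptions: `ToyWal`, `Frame`, `read_page_at`, `changed_pages_after` and `revert_to` are hypothetical names, frames are numbered from 1 in append order, and the database file is left out, so a page with no frame at or below the watermark is simply skipped.

```rust
use std::collections::BTreeSet;

/// Hypothetical WAL frame: the page it rewrites and the new page contents.
pub struct Frame {
    pub page_no: u64,
    pub data: Vec<u8>,
}

/// Toy WAL: frame N is `frames[N - 1]`.
pub struct ToyWal {
    pub frames: Vec<Frame>,
}

impl ToyWal {
    /// Latest version of `page_no` among frames 1..=watermark ("read from the past").
    pub fn read_page_at(&self, page_no: u64, watermark: usize) -> Option<&[u8]> {
        let upto = watermark.min(self.frames.len());
        self.frames[..upto]
            .iter()
            .rev()
            .find(|f| f.page_no == page_no)
            .map(|f| f.data.as_slice())
    }

    /// Unique pages written by frames after the watermark.
    pub fn changed_pages_after(&self, watermark: usize) -> BTreeSet<u64> {
        let from = watermark.min(self.frames.len());
        self.frames[from..].iter().map(|f| f.page_no).collect()
    }

    /// Append frames that restore every changed page to its value at the watermark.
    pub fn revert_to(&mut self, watermark: usize) {
        for page_no in self.changed_pages_after(watermark) {
            let old = self.read_page_at(page_no, watermark).map(|d| d.to_vec());
            // A real implementation would fall back to the database file here.
            if let Some(data) = old {
                self.frames.push(Frame { page_no, data });
            }
        }
    }
}
```

The model keeps no schema state at all, so the schema caveat above does not show up in it.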

Closes #2433
2025-08-04 23:53:46 +03:00
PThorpe92
2a3fa0955f Attempt to coalesce contiguous iovecs during pwritev operation for unix IO 2025-08-04 16:18:19 -04:00
PThorpe92
b76ef20f4c Attempt to coalesce contiguous iovecs during pwritev operation for io_uring 2025-08-04 16:18:05 -04:00
PThorpe92
6cbc8ff868 Replace values with constants 2025-08-04 15:14:06 -04:00
PThorpe92
73d1fdef14 Fix and change bitmap, apply suggestions and add some optimizations 2025-08-04 14:58:58 -04:00
PThorpe92
f4197f1eb5 change debug assertions to turso asserts 2025-08-04 14:55:48 -04:00
PThorpe92
54696d2f0d Add additional test for edge cases 2025-08-04 14:55:48 -04:00
PThorpe92
5378195ad6 Add page bitmap to storage mod.rs 2025-08-04 14:55:48 -04:00
PThorpe92
3e30335ea5 Add tests for PageBitmap 2025-08-04 14:55:48 -04:00
PThorpe92
7b1f908c00 Add PageBitmap for use with arena page allocator 2025-08-04 14:55:48 -04:00
Nikita Sivukhin
443c177d13 fix test for windows 2025-08-04 21:00:29 +04:00
Jussi Saurio
7045d44fdc Merge 'fix/wal: remove start_pages_in_frames_hack to prevent checkpoint data loss' from Jussi Saurio
Closes #2421
## Background
We have some kind of transaction-local hack (`start_pages_in_frames`)
for bookkeeping how many pages are currently in the in-memory WAL frame
cache, I assume for performance reasons or whatever.
`wal.rollback()` clears all the frames from `shared.frame_cache` that
the rolling-back tx is allowed to clear, and then truncates
`shared.pages_in_frames` to however much its local
`start_pages_in_frames` value was.
## Problem
In `complete_append_frame`, we check if `frame_cache` has that key
(page) already, and if not, we add it to `pages_in_frames`.
However, `wal.rollback()` never _removes_ the key (page) if its value is
empty, so we can end up in a scenario where the `frame_cache` key for
`page P` exists but has no frames, and so `page P` does not get added to
`pages_in_frames` in `complete_append_frame`.
This leads to a checkpoint data loss scenario:
- A transaction rolls back with `start_pages_in_frames=0`, so it truncates
shared `pages_in_frames` to an empty vec. Let's say the `page P` key in
`frame_cache` still remains but has no frames.
- The next time someone commits a frame for `page P`, it does NOT get
added to `pages_in_frames` because `frame_cache` has that key (although
the value vector is empty)
- At some point, a checkpoint checkpoints `n` frames, but since
`pages_in_frames` does not have `page P`, it doesn't actually checkpoint
it and all the "checkpointed" frames are simply thrown away
- very similar to the scenario in #2366
## Fix
Remove the `start_pages_in_frames` hack entirely and just make
`pages_in_frames` effectively the same as `frame_cache.keys`. I think we
could also just get rid of `pages_in_frames` and just use
`frame_cache.contains_key(p)` but maybe Pere can chime in here
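
The scenario can be reproduced in a toy model. This is a sketch, not the actual WAL code: `SharedWal`, `append_frame`, `rollback_buggy` and `rollback_fixed` are hypothetical names, and the "fixed" rollback is only one way of keeping `pages_in_frames` equal to the keys of `frame_cache`.

```rust
use std::collections::HashMap;

/// Hypothetical shared WAL state: page -> frame ids, plus the list of pages with frames.
#[derive(Default)]
pub struct SharedWal {
    pub frame_cache: HashMap<u64, Vec<u64>>,
    pub pages_in_frames: Vec<u64>,
}

impl SharedWal {
    /// Record a frame; the page is listed only if its key was absent from the cache.
    pub fn append_frame(&mut self, page: u64, frame: u64) {
        if !self.frame_cache.contains_key(&page) {
            self.pages_in_frames.push(page);
        }
        self.frame_cache.entry(page).or_default().push(frame);
    }

    /// Rollback as described under "Problem": empty keys stay in the cache.
    pub fn rollback_buggy(&mut self, max_frame: u64, start_pages_in_frames: usize) {
        for frames in self.frame_cache.values_mut() {
            frames.retain(|&f| f <= max_frame);
        }
        self.pages_in_frames.truncate(start_pages_in_frames);
    }

    /// Assumed fix: drop empty keys and keep only pages that still have a key.
    pub fn rollback_fixed(&mut self, max_frame: u64) {
        for frames in self.frame_cache.values_mut() {
            frames.retain(|&f| f <= max_frame);
        }
        self.frame_cache.retain(|_, frames| !frames.is_empty());
        let cache = &self.frame_cache;
        self.pages_in_frames.retain(|p| cache.contains_key(p));
    }
}
```

With the buggy rollback, appending a frame for page 7, rolling back to frame 0 and appending again leaves `pages_in_frames` empty even though `frame_cache` holds a frame for page 7. With the fixed rollback the page is listed again.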

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2422
2025-08-04 19:49:55 +03:00
Jussi Saurio
506bb5f67f Merge 'Direct schema mutation – add instruction' from Levy A.
Resolves #2378.
```
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
                        time:   [15.645 ms 15.741 ms 15.850 ms]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
`ALTER TABLE _ RENAME TO _`/sqlite_rename_table/
                        time:   [34.728 ms 35.260 ms 35.955 ms]
Found 15 outliers among 100 measurements (15.00%)
  8 (8.00%) high mild
  7 (7.00%) high severe
  ```
<img width="1000" height="199" alt="image" src="https://github.com/user-attachments/assets/ad943355-b57d-43d9-8a84-850461b8af41" />

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2399
2025-08-04 16:55:38 +03:00
Jussi Saurio
1813171b91 Merge 'Use pwrite for single buffer pwritev call in unix IO' from Preston Thorpe
Closes #2416
2025-08-04 16:52:14 +03:00
Jussi Saurio
5a06411ce6 Merge 'fix/core/translate: ALTER TABLE DROP COLUMN: ensure schema cookie is updated even when target table is empty' from Jussi Saurio
Closes #2431
Discovered while fuzzing #2086
## What
We update `schema_version` whenever the schema changes
## Problem
Probably unintentionally, we were calling `SetCookie` in a loop for each
row in the target table, instead of only once at the end. This means 2
things:
- For large `n`, this is a lot of unnecessary instructions
- For `n==0`, `SetCookie` doesn't get called at all -> the schema won't
get marked as having been updated -> conns can operate on a stale schema
## Fix
Lift `SetCookie` out of the loop
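
A small illustration of the difference, assuming a simplified execution trace rather than the real bytecode: `Insn::RewriteRow` is a hypothetical placeholder for the per-row work, and the function names are made up for this sketch.

```rust
/// Hypothetical, simplified instruction set for the illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Insn {
    RewriteRow,
    SetCookie,
}

/// Trace when `SetCookie` sits inside the per-row loop (the old behaviour).
pub fn trace_cookie_in_loop(rows: usize) -> Vec<Insn> {
    let mut trace = Vec::new();
    for _ in 0..rows {
        trace.push(Insn::RewriteRow);
        trace.push(Insn::SetCookie);
    }
    trace
}

/// Trace when `SetCookie` runs once after the loop (the fix).
pub fn trace_cookie_after_loop(rows: usize) -> Vec<Insn> {
    let mut trace = vec![Insn::RewriteRow; rows];
    trace.push(Insn::SetCookie);
    trace
}

/// Number of schema cookie updates in a trace.
pub fn cookie_updates(trace: &[Insn]) -> usize {
    trace.iter().filter(|i| **i == Insn::SetCookie).count()
}
```

For an empty table the first variant performs zero cookie updates and the second performs exactly one, which is the `n==0` case from the list above.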

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2432
2025-08-04 16:51:24 +03:00
Nikita Sivukhin
33b814054b fix tests for windows 2025-08-04 17:37:22 +04:00
Nikita Sivukhin
9694366645 add one more assert 2025-08-04 17:23:34 +04:00
Nikita Sivukhin
76bdf0c1ab small fixes 2025-08-04 17:02:53 +04:00
Nikita Sivukhin
2e23230e79 extend raw WAL API with few more methods
- try_wal_watermark_read_page - try to read page from the DB with given WAL watermark value
- wal_changed_pages_after - return set of unique pages changed after watermark WAL position
2025-08-04 16:55:50 +04:00
Jussi Saurio
8a1723b3c8 fix/core/translate: ALTER TABLE DROP COLUMN: ensure schema cookie is updated even when target table is empty 2025-08-04 15:05:00 +03:00
Pekka Enberg
e4accdc29d Merge 'hide dangerous methods behind conn_raw_api feature' from Nikita Sivukhin
The WAL API shouldn't be exposed by default because it is a relatively
dangerous API which we use internally, and ordinary users shouldn't
be interested in it.

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #2424
2025-08-04 14:52:40 +03:00
Pekka Enberg
1572285ee6 Merge 'preserve files in IO memory backend' from Nikita Sivukhin
Simple PR to preserve and reuse files in memory IO

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2428
2025-08-04 14:52:24 +03:00
Nikita Sivukhin
129895f0b2 preserve files in IO memory backend 2025-08-04 15:22:04 +04:00
Pere Diaz Bou
56240ddac9 core/mvcc: add restart tests 2025-08-04 12:31:17 +02:00
Pere Diaz Bou
f26e442597 core/mvcc: fix new rowid
next rowid was being tracked globally for all tables and restarted to 0
every time database was opened
2025-08-04 12:31:17 +02:00
Pere Diaz Bou
83a658d3d6 core/mvcc: add option to test with a random file and restart it 2025-08-04 12:31:17 +02:00
Nikita Sivukhin
0adb40534c hide dangerous methods behind conn_raw_api feature 2025-08-04 12:40:28 +04:00
Jussi Saurio
4f3f66d55e fix/wal: remove start_pages_in_frames_hack to prevent checkpoint data loss
We have some kind of transaction-local hack (`start_pages_in_frames`) for bookkeeping
how many pages are currently in the in-memory WAL frame cache,
I assume for performance reasons or whatever.

`wal.rollback()` clears all the frames from `shared.frame_cache` that the rolling-back tx is
allowed to clear, and then truncates `shared.pages_in_frames` to however much its local
`start_pages_in_frames` value was.

In `complete_append_frame`, we check if `frame_cache` has that key (page) already, and if not,
we add it to `pages_in_frames`.

However, `wal.rollback()` never _removes_ the key (page) if its value is empty, so we can end
up in a scenario where the `frame_cache` key for `page P` exists but has no frames, and so `page P`
does not get added to `pages_in_frames` in `complete_append_frame`.

This leads to a checkpoint data loss scenario:

- A transaction rolls back with `start_pages_in_frames=0`, so it truncates
  shared `pages_in_frames` to an empty vec. Let's say the `page P` key in `frame_cache` still remains
  but has no frames.
- The next time someone commits a frame for `page P`, it does NOT get added to `pages_in_frames`
  because `frame_cache` has that key
- At some point, a PASSIVE checkpoint checkpoints `n` frames, but since `pages_in_frames` does not have
  `page P`, it doesn't actually checkpoint it and all the "checkpointed" frames are simply thrown away
- very similar to the scenario in #2366

Remove the `start_pages_in_frames` hack entirely and just make `pages_in_frames` effectively
the same as `frame_cache.keys`. I think we could also just get rid of `pages_in_frames` and just use
`frame_cache.contains_key(p)` but maybe Pere can chime in here
2025-08-04 10:35:12 +03:00
Pekka Enberg
ca14799da5 Merge 'Make completions idempotent' from Preston Thorpe
Closes #2417
2025-08-04 08:42:42 +03:00
PThorpe92
79629daff4 Make completions idempotent 2025-08-02 21:48:39 -04:00
Levy A.
b9a3a93ef0 fix: clippy 2025-08-02 20:06:05 -03:00
PThorpe92
b5117ac5c7 Use pwrite for single buffer in unix IO 2025-08-02 18:34:16 -04:00
Levy A.
b14a11a2fd fix: change name for schema btree + fix benchmark 2025-08-02 17:17:36 -03:00
Jussi Saurio
130e1f80ea fix/vdbe: call seek_to_last() only once in op_new_rowid 2025-08-02 14:18:58 +03:00
Jussi Saurio
63a5ef596b perf/btree: skip seek in move_to_rightmost() if we are already on rightmost page 2025-08-02 13:56:59 +03:00
Jussi Saurio
3b0c8b08fe Merge 'perf/pager: dont clear page cache on commit' from Jussi Saurio
This should be safe to do as:
1. page cache is private per connection
2. since this connection wrote the flushed pages/frames, they are up to
date from its perspective
3. multiple concurrent statements inside one connection are not
snapshot-transactional even in sqlite

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #2407
2025-08-02 13:35:57 +03:00
Jussi Saurio
4497d22d3f perf/pager: dont clear page cache on commit 2025-08-02 13:09:36 +03:00
Pekka Enberg
12455c6531 Merge 'core: Fold HeaderRef to pager module' from Pekka Enberg
Closes #2401
2025-08-02 12:34:13 +03:00
Pekka Enberg
2c05a3e787 Merge 'perf/vdbe: remove eager cloning in op_comparison' from Jussi Saurio
Shaves off about 100-200ms of runtime from TPC-H `19.sql`

Closes #2385
2025-08-02 10:01:47 +03:00
Pekka Enberg
598fdade3e core: Fold HeaderRef to pager module 2025-08-02 09:50:25 +03:00
Jussi Saurio
43c1afe4b6 Merge 'bindings/rust: Enhance API by removing verbosity' from Diego Reis
While working on #2151 I found myself forced to do things like:
```rust
assert_eq!(
    6,
    *result
        .next()
        .await?
        .unwrap()
        .get_value(0)?
        .as_integer()
        .unwrap()
);
just to get a simple value from a row. With this PR, users can simply
write:
```rust
assert_eq!(6, result.get::<i32>(0)?);
```
(Thanks libsql devs, this is so much better!)
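
A rough sketch of how a typed `get::<T>` accessor can sit on top of a conversion trait. Everything here is an assumption for illustration (the `Value` enum, the `FromValue` signature, the `Row` wrapper and the `String` error type) and is not the API added by this PR.

```rust
use std::convert::TryFrom;

/// Hypothetical dynamically typed column value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    Text(String),
}

/// Hypothetical conversion trait from a column value to a Rust type.
pub trait FromValue: Sized {
    fn from_value(v: &Value) -> Result<Self, String>;
}

impl FromValue for i64 {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Integer(i) => Ok(*i),
            other => Err(format!("expected integer, got {:?}", other)),
        }
    }
}

impl FromValue for i32 {
    fn from_value(v: &Value) -> Result<Self, String> {
        // Reuse the i64 conversion, then reject values that do not fit in i32.
        let wide = i64::from_value(v)?;
        i32::try_from(wide).map_err(|e| e.to_string())
    }
}

impl FromValue for String {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Text(s) => Ok(s.clone()),
            other => Err(format!("expected text, got {:?}", other)),
        }
    }
}

/// Hypothetical row: a list of column values.
pub struct Row(pub Vec<Value>);

impl Row {
    /// Fetch column `idx` and convert it to the requested type.
    pub fn get<T: FromValue>(&self, idx: usize) -> Result<T, String> {
        let v = self.0.get(idx).ok_or_else(|| format!("no column {}", idx))?;
        T::from_value(v)
    }
}
```

With this shape, `row.get::<i32>(0)?` replaces the chain of `unwrap` calls, and a type mismatch surfaces as an error instead of a panic.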

Closes #2377
2025-08-02 09:39:27 +03:00
Jussi Saurio
c6b178483b Merge 'io_uring: setup plumbing for Fixed opcodes' from Preston Thorpe
This PR by itself is uninteresting and doesn't do anything, but I am
trying hard to avoid massive PRs, and this is very mergeable 😄

Closes #2396
2025-08-02 09:37:48 +03:00
Jussi Saurio
be1456f7cb Merge 'use state machine for NoConflict opcode' from Mikaël Francoeur
This will save some work when yielding to IO. Previously, on every
invocation, if the record was a packed record, we parsed it and iterated
through the values to check for nulls. Now, the pre-seeking work is done
only once.
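
The pattern can be sketched as a two-state machine. This is an illustration under assumptions, not the VDBE code: `NoConflict`, `State`, `Step`, the simulated `io_remaining` counter and the `key_exists` flag are all hypothetical.

```rust
/// Hypothetical result of one invocation of the opcode.
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    /// The opcode yielded to IO and must be invoked again.
    Io,
    /// The opcode finished with its verdict.
    Done { no_conflict: bool },
}

/// Hypothetical opcode state kept between invocations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Start,
    Seek { has_null: bool },
}

pub struct NoConflict {
    pub state: State,
    pub parse_count: usize,  // how often the record was parsed
    pub io_remaining: usize, // simulated number of pending IO yields
    pub key_exists: bool,    // simulated seek outcome
}

impl NoConflict {
    pub fn step(&mut self, record: &[Option<i64>]) -> Step {
        loop {
            match self.state {
                State::Start => {
                    // Pre-seek work: parse the record and scan it for NULLs, once.
                    self.parse_count += 1;
                    let has_null = record.iter().any(|v| v.is_none());
                    self.state = State::Seek { has_null };
                }
                State::Seek { has_null } => {
                    if !has_null && self.io_remaining > 0 {
                        self.io_remaining -= 1;
                        return Step::Io; // re-entry resumes in Seek, skipping the parse
                    }
                    self.state = State::Start;
                    return Step::Done { no_conflict: has_null || !self.key_exists };
                }
            }
        }
    }
}
```

However many times the opcode yields to IO, `parse_count` stays at 1 for a single execution.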

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2394
2025-08-02 09:37:00 +03:00
Jussi Saurio
37a565021e Merge 'state_machine: remove State associated type' from Pere Diaz Bou
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2395
2025-08-02 09:36:43 +03:00
Levy A.
1e177053cb feat: add RenameTable instruction
direct schema mutation, no reparsing
2025-08-01 21:11:25 -03:00
Mikaël Francoeur
81412b4a17 use state machine for NoConflict opcode 2025-08-01 17:29:57 -04:00
rajajisai
f6d43df46f Merge branch 'tursodatabase:main' into issue/2077 2025-08-01 15:20:36 -04:00
Diego Reis
8a47b9d5a4 Address PR's comments 2025-08-01 16:00:32 -03:00
Diego Reis
d8af28ddf0 Implement FromValue for common Rust types
One more step toward simplifying the API for users.

This is in core and not in the Rust bindings because, in core,
it could benefit a broader set of users/developers
2025-08-01 16:00:30 -03:00