Commit Graph

3880 Commits

Author SHA1 Message Date
Jussi Saurio
7045d44fdc Merge 'fix/wal: remove start_pages_in_frames_hack to prevent checkpoint data loss' from Jussi Saurio
Closes #2421
## Background
We have some kind of transaction-local hack (`start_pages_in_frames`)
for bookkeeping how many pages are currently in the in-memory WAL frame
cache, I assume for performance reasons or whatever.
`wal.rollback()` clears all the frames from `shared.frame_cache` that
the tx being rolled back is allowed to clear, and then truncates
`shared.pages_in_frames` to the length recorded in its local
`start_pages_in_frames` value.
## Problem
In `complete_append_frame`, we check if `frame_cache` has that key
(page) already, and if not, we add it to `pages_in_frames`.
However, `wal.rollback()` never _removes_ the key (page) if its value is
empty, so we can end up in a scenario where the `frame_cache` key for
`page P` exists but has no frames, and so `page P` does not get added to
`pages_in_frames` in `complete_append_frame`.
This leads to a checkpoint data loss scenario:
- transaction rolls back, has start_pages_in_frames=0, so truncates
shared pages_in_frames to an empty vec. let's say `page P` key in
`frame_cache` still remains but it has no frames.
- The next time someone commits a frame for `page P`, it does NOT get
added to `pages_in_frames` because `frame_cache` has that key (although
the value vector is empty)
- At some point, a checkpoint checkpoints `n` frames, but since
`pages_in_frames` does not have `page P`, it doesn't actually checkpoint
it and all the "checkpointed" frames are simply thrown away
- very similar to the scenario in #2366
## Fix
Remove the `start_pages_in_frames` hack entirely and just make
`pages_in_frames` effectively the same as `frame_cache.keys`. I think we
could also just get rid of `pages_in_frames` and just use
`frame_cache.contains_key(p)` but maybe Pere can chime in here
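A toy model of the scenario above, as a sketch only: `ToyWal`, `append_frame_buggy`, `rollback_buggy` and the `*_fixed` variants are hypothetical names, and the real `wal.rollback()` and `complete_append_frame` do more than this. The fixed variant assumes one way of keeping `pages_in_frames` in line with the `frame_cache` keys (dropping empty entries on rollback); the actual patch may do it differently.
```rust
use std::collections::HashMap;

/// Toy stand-in for the shared WAL state (hypothetical, not the real struct).
#[derive(Default)]
struct ToyWal {
    /// page id -> ids of the frames that contain that page
    frame_cache: HashMap<u64, Vec<u64>>,
    /// pages a checkpoint will consider
    pages_in_frames: Vec<u64>,
    next_frame: u64,
}

impl ToyWal {
    /// Old bookkeeping: register the page only if its key is not in frame_cache yet.
    fn append_frame_buggy(&mut self, page: u64) {
        self.next_frame += 1;
        if !self.frame_cache.contains_key(&page) {
            self.pages_in_frames.push(page);
        }
        self.frame_cache.entry(page).or_default().push(self.next_frame);
    }

    /// Old rollback: drops frames newer than `max_frame` but leaves empty keys
    /// behind, then truncates pages_in_frames to the tx-local length.
    fn rollback_buggy(&mut self, max_frame: u64, start_pages_in_frames: usize) {
        for frames in self.frame_cache.values_mut() {
            frames.retain(|f| *f <= max_frame);
        }
        self.pages_in_frames.truncate(start_pages_in_frames);
        self.next_frame = max_frame;
    }

    /// Assumed fix: pages_in_frames always mirrors the frame_cache keys.
    fn append_frame_fixed(&mut self, page: u64) {
        self.next_frame += 1;
        self.frame_cache.entry(page).or_default().push(self.next_frame);
        if !self.pages_in_frames.contains(&page) {
            self.pages_in_frames.push(page);
        }
    }

    /// Assumed fix: remove keys whose frame list became empty, then resync.
    fn rollback_fixed(&mut self, max_frame: u64) {
        for frames in self.frame_cache.values_mut() {
            frames.retain(|f| *f <= max_frame);
        }
        self.frame_cache.retain(|_, frames| !frames.is_empty());
        let cache = &self.frame_cache;
        self.pages_in_frames.retain(|p| cache.contains_key(p));
        self.next_frame = max_frame;
    }

    /// Would a checkpoint look at this page at all?
    fn checkpoint_sees(&self, page: u64) -> bool {
        self.pages_in_frames.contains(&page)
    }
}

/// Rolled-back write to page 7, followed by a committed write to page 7.
fn buggy_scenario() -> bool {
    let mut wal = ToyWal::default();
    wal.append_frame_buggy(7);
    wal.rollback_buggy(0, 0); // start_pages_in_frames = 0
    wal.append_frame_buggy(7); // key still exists, so page 7 is not re-registered
    wal.checkpoint_sees(7)
}

fn fixed_scenario() -> bool {
    let mut wal = ToyWal::default();
    wal.append_frame_fixed(7);
    wal.rollback_fixed(0);
    wal.append_frame_fixed(7);
    wal.checkpoint_sees(7)
}
```
In this model `buggy_scenario()` ends with page 7 invisible to the checkpoint, while `fixed_scenario()` keeps it visible.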

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2422
2025-08-04 19:49:55 +03:00
Jussi Saurio
506bb5f67f Merge 'Direct schema mutation – add instruction' from Levy A.
Resolves #2378.
```
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
                        time:   [15.645 ms 15.741 ms 15.850 ms]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
`ALTER TABLE _ RENAME TO _`/sqlite_rename_table/
                        time:   [34.728 ms 35.260 ms 35.955 ms]
Found 15 outliers among 100 measurements (15.00%)
  8 (8.00%) high mild
  7 (7.00%) high severe
```
<img width="1000" height="199" alt="image" src="https://github.com/user-attachments/assets/ad943355-b57d-43d9-8a84-850461b8af41" />

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2399
2025-08-04 16:55:38 +03:00
Jussi Saurio
1813171b91 Merge 'Use pwrite for single buffer pwritev call in unix IO' from Preston Thorpe
Closes #2416
2025-08-04 16:52:14 +03:00
Jussi Saurio
5a06411ce6 Merge 'fix/core/translate: ALTER TABLE DROP COLUMN: ensure schema cookie is updated even when target table is empty' from Jussi Saurio
Closes #2431
Discovered while fuzzing #2086
## What
We update `schema_version` whenever the schema changes
## Problem
Probably unintentionally, we were calling `SetCookie` in a loop for each
row in the target table, instead of only once at the end. This means 2
things:
- For large `n`, this is a lot of unnecessary instructions
- For `n==0`, `SetCookie` doesn't get called at all -> the schema won't
get marked as having been updated -> conns can operate on a stale schema
## Fix
Lift `SetCookie` out of the loop
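A minimal sketch of why the position of `SetCookie` matters, assuming a toy model where each executed `SetCookie` bumps a counter. `ToySchema`, `drop_column_cookie_in_loop` and `drop_column_cookie_hoisted` are hypothetical names, not the translator's real code:
```rust
/// Toy stand-in for the schema cookie (hypothetical).
#[derive(Default)]
struct ToySchema {
    /// Number of times SetCookie was executed.
    set_cookie_calls: usize,
}

/// Old shape: SetCookie sits inside the per-row rewrite loop.
fn drop_column_cookie_in_loop(schema: &mut ToySchema, rows: usize) {
    for _row in 0..rows {
        // ... rewrite the row without the dropped column ...
        schema.set_cookie_calls += 1; // runs once per row, never for an empty table
    }
}

/// New shape: SetCookie runs exactly once, after the loop.
fn drop_column_cookie_hoisted(schema: &mut ToySchema, rows: usize) {
    for _row in 0..rows {
        // ... rewrite the row without the dropped column ...
    }
    schema.set_cookie_calls += 1;
}

/// Helper for comparing both shapes on a table with `rows` rows.
fn cookie_calls(rows: usize) -> (usize, usize) {
    let mut in_loop = ToySchema::default();
    let mut hoisted = ToySchema::default();
    drop_column_cookie_in_loop(&mut in_loop, rows);
    drop_column_cookie_hoisted(&mut hoisted, rows);
    (in_loop.set_cookie_calls, hoisted.set_cookie_calls)
}
```
With `rows == 0` the in-loop variant never touches the cookie, which is the stale-schema case described above.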

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2432
2025-08-04 16:51:24 +03:00
Jussi Saurio
8a1723b3c8 fix/core/translate: ALTER TABLE DROP COLUMN: ensure schema cookie is updated even when target table is empty 2025-08-04 15:05:00 +03:00
Pekka Enberg
e4accdc29d Merge 'hide dangerous methods behind conn_raw_api feature' from Nikita Sivukhin
The WAL API shouldn't be exposed by default because it is a relatively
dangerous API that we use internally, and ordinary users shouldn't
need it.

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #2424
2025-08-04 14:52:40 +03:00
Pekka Enberg
1572285ee6 Merge 'preserve files in IO memory backend' from Nikita Sivukhin
Simple PR to preserve and reuse files in memory IO

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2428
2025-08-04 14:52:24 +03:00
Nikita Sivukhin
129895f0b2 preserve files in IO memory backend 2025-08-04 15:22:04 +04:00
Pere Diaz Bou
56240ddac9 core/mvcc: add restart tests 2025-08-04 12:31:17 +02:00
Pere Diaz Bou
f26e442597 core/mvcc: fix new rowid
The next rowid was being tracked globally for all tables and was reset to 0
every time the database was opened
2025-08-04 12:31:17 +02:00
Pere Diaz Bou
83a658d3d6 core/mvcc: add option to test with a random file and restart it 2025-08-04 12:31:17 +02:00
Nikita Sivukhin
0adb40534c hide dangerous methods behind conn_raw_api feature 2025-08-04 12:40:28 +04:00
Jussi Saurio
4f3f66d55e fix/wal: remove start_pages_in_frames_hack to prevent checkpoint data loss
We have some kind of transaction-local hack (`start_pages_in_frames`) for bookkeeping
how many pages are currently in the in-memory WAL frame cache,
I assume for performance reasons or whatever.

`wal.rollback()` clears all the frames from `shared.frame_cache` that the tx being rolled back is
allowed to clear, and then truncates `shared.pages_in_frames` to the length recorded in its local
`start_pages_in_frames` value.

In `complete_append_frame`, we check if `frame_cache` has that key (page) already, and if not,
we add it to `pages_in_frames`.

However, `wal.rollback()` never _removes_ the key (page) if its value is empty, so we can end
up in a scenario where the `frame_cache` key for `page P` exists but has no frames, and so `page P`
does not get added to `pages_in_frames` in `complete_append_frame`.

This leads to a checkpoint data loss scenario:

- transaction rolls back, has start_pages_in_frames=0, so truncates
  shared pages_in_frames to an empty vec. let's say `page P` key in `frame_cache` still remains
  but it has no frames.
- The next time someone commits a frame for `page P`, it does NOT get added to `pages_in_frames`
  because `frame_cache` has that key
- At some point, a PASSIVE checkpoint checkpoints `n` frames, but since `pages_in_frames` does not have
  `page P`, it doesn't actually checkpoint it and all the "checkpointed" frames are simply thrown away
- very similar to the scenario in #2366

Remove the `start_pages_in_frames` hack entirely and just make `pages_in_frames` effectively
the same as `frame_cache.keys`. I think we could also just get rid of `pages_in_frames` and just use
`frame_cache.contains_key(p)` but maybe Pere can chime in here
2025-08-04 10:35:12 +03:00
Pekka Enberg
ca14799da5 Merge 'Make completions idempotent' from Preston Thorpe
Closes #2417
2025-08-04 08:42:42 +03:00
PThorpe92
79629daff4 Make completions idempotent 2025-08-02 21:48:39 -04:00
Levy A.
b9a3a93ef0 fix: clippy 2025-08-02 20:06:05 -03:00
PThorpe92
b5117ac5c7 Use pwrite for single buffer in unix IO 2025-08-02 18:34:16 -04:00
Levy A.
b14a11a2fd fix: change name for schema btree + fix benchmark 2025-08-02 17:17:36 -03:00
Jussi Saurio
130e1f80ea fix/vdbe: call seek_to_last() only once in op_new_rowid 2025-08-02 14:18:58 +03:00
Jussi Saurio
63a5ef596b perf/btree: skip seek in move_to_rightmost() if we are already on rightmost page 2025-08-02 13:56:59 +03:00
Jussi Saurio
3b0c8b08fe Merge 'perf/pager: dont clear page cache on commit' from Jussi Saurio
This should be safe to do as:
1. page cache is private per connection
2. since this connection wrote the flushed pages/frames, they are up to
date from its perspective
3. multiple concurrent statements inside one connection are not
snapshot-transactional even in sqlite

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #2407
2025-08-02 13:35:57 +03:00
Jussi Saurio
4497d22d3f perf/pager: dont clear page cache on commit 2025-08-02 13:09:36 +03:00
Pekka Enberg
12455c6531 Merge 'core: Fold HeaderRef to pager module' from Pekka Enberg
Closes #2401
2025-08-02 12:34:13 +03:00
Pekka Enberg
2c05a3e787 Merge 'perf/vdbe: remove eager cloning in op_comparison' from Jussi Saurio
Shaves off about 100-200ms of runtime from TPC-H `19.sql`

Closes #2385
2025-08-02 10:01:47 +03:00
Pekka Enberg
598fdade3e core: Fold HeaderRef to pager module 2025-08-02 09:50:25 +03:00
Jussi Saurio
43c1afe4b6 Merge 'bindings/rust: Enhance API by removing verbosity' from Diego Reis
While working on #2151 I saw myself forced to do things like:
```rust
assert_eq!(
    6,
    *result
        .next()
        .await?
        .unwrap()
        .get_value(0)?
        .as_integer()
        .unwrap()
);
```
just to get a simple value from a row. With this PR, users can simply
do:
```rust
assert_eq!(6, result.get::<i32>(0)?);
```
(Thanks libsql devs, this is so much better!)
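A rough sketch of how a typed getter like this can be built on a conversion trait. `FromValue` is the trait name used in this PR's commits, but its definition here, along with the `Value` and `Row` types and the `String` error type, are assumptions for illustration and not the actual bindings API:
```rust
/// Hypothetical stand-in for a database value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Real(f64),
    Text(String),
}

/// Conversion from a `Value` into a plain Rust type (assumed shape).
trait FromValue: Sized {
    fn from_value(v: &Value) -> Result<Self, String>;
}

impl FromValue for i64 {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Integer(i) => Ok(*i),
            other => Err(format!("expected integer, got {other:?}")),
        }
    }
}

impl FromValue for i32 {
    fn from_value(v: &Value) -> Result<Self, String> {
        // Reuse the i64 conversion, then reject values that do not fit in i32.
        let wide = i64::from_value(v)?;
        i32::try_from(wide).map_err(|e| e.to_string())
    }
}

impl FromValue for f64 {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Real(f) => Ok(*f),
            other => Err(format!("expected real, got {other:?}")),
        }
    }
}

impl FromValue for String {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Text(s) => Ok(s.clone()),
            other => Err(format!("expected text, got {other:?}")),
        }
    }
}

/// Hypothetical row type with a typed getter.
struct Row {
    values: Vec<Value>,
}

impl Row {
    fn get<T: FromValue>(&self, idx: usize) -> Result<T, String> {
        let v = self
            .values
            .get(idx)
            .ok_or_else(|| format!("column {idx} out of range"))?;
        T::from_value(v)
    }
}
```
The caller picks the target type with a turbofish, e.g. `row.get::<i32>(0)?`, and a type mismatch surfaces as an error instead of a panic from `unwrap()`.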

Closes #2377
2025-08-02 09:39:27 +03:00
Jussi Saurio
c6b178483b Merge 'io_uring: setup plumbing for Fixed opcodes' from Preston Thorpe
This PR by itself is uninteresting and doesn't do anything. But I am
heavily trying to avoid massive PRs, and this is very merge-able 😄

Closes #2396
2025-08-02 09:37:48 +03:00
Jussi Saurio
be1456f7cb Merge 'use state machine for NoConflict opcode' from Mikaël Francoeur
This will save some work when yielding to IO. Previously, on every
invocation, if the record was a packed record, we parsed it and iterated
through the values to check for nulls. Now, the pre-seeking work is done
only once.
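The idea can be sketched as a small state machine that caches the result of the null check across re-entries. This is a toy model: `ToyNoConflict`, `NoConflictState`, `StepResult` and the simulated `io_pending` counter are hypothetical, and the real opcode works on index cursors and records rather than on a slice of options:
```rust
/// Where the toy opcode is in its work (hypothetical states).
enum NoConflictState {
    /// Nothing done yet: the record still has to be inspected.
    Start,
    /// Pre-seek work is done; its result is cached here.
    Seeking { has_null: bool },
}

enum StepResult {
    /// Yield to IO; the caller will invoke `step` again later.
    Io,
    /// Finished; the bool says whether "no conflict" was found.
    Done(bool),
}

struct ToyNoConflict {
    state: NoConflictState,
    /// How many times the record was inspected.
    parse_count: usize,
    /// Simulated number of times the seek has to yield to IO.
    io_pending: u32,
    /// Simulated seek outcome: does the index already contain the key?
    key_exists: bool,
}

impl ToyNoConflict {
    fn step(&mut self, record: &[Option<i64>]) -> StepResult {
        loop {
            match self.state {
                NoConflictState::Start => {
                    // Pre-seek work: done once, then cached in the state.
                    self.parse_count += 1;
                    let has_null = record.iter().any(|v| v.is_none());
                    self.state = NoConflictState::Seeking { has_null };
                }
                NoConflictState::Seeking { has_null } => {
                    if has_null {
                        // Toy rule: a key containing NULL never conflicts.
                        self.state = NoConflictState::Start;
                        return StepResult::Done(true);
                    }
                    if self.io_pending > 0 {
                        // Re-entry resumes here and skips the Start arm.
                        self.io_pending -= 1;
                        return StepResult::Io;
                    }
                    self.state = NoConflictState::Start;
                    return StepResult::Done(!self.key_exists);
                }
            }
        }
    }
}
```
However many times `step` returns `StepResult::Io`, the record is inspected a single time per execution of the opcode.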

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2394
2025-08-02 09:37:00 +03:00
Jussi Saurio
37a565021e Merge 'state_machine: remove State associated type' from Pere Diaz Bou
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2395
2025-08-02 09:36:43 +03:00
Levy A.
1e177053cb feat: add RenameTable instruction
direct schema mutation, no reparsing
2025-08-01 21:11:25 -03:00
Mikaël Francoeur
81412b4a17 use state machine for NoConflict opcode 2025-08-01 17:29:57 -04:00
rajajisai
f6d43df46f Merge branch 'tursodatabase:main' into issue/2077 2025-08-01 15:20:36 -04:00
Diego Reis
8a47b9d5a4 Address PR's comments 2025-08-01 16:00:32 -03:00
Diego Reis
d8af28ddf0 Implement FromValue for common Rust types
One step further toward simplifying the API for users.

This is in core and not in the Rust bindings because, in core,
this could benefit a broader set of users/developers
2025-08-01 16:00:30 -03:00
rajajisai
d09dd4170b Format code 2025-08-01 11:59:57 -07:00
Preston Thorpe
15e43185bb Merge 'Single quotes inside a string literal have to be doubled in ' from Diego Reis
Close #2390
Single quotes inside a string literal have to be doubled
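For illustration, a minimal sketch of the quoting rule. `quote_sql_string` is a hypothetical helper, not necessarily the function this PR touches:
```rust
/// Wraps `s` in single quotes and doubles every embedded single quote,
/// which is how SQL escapes a quote inside a string literal.
fn quote_sql_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('\'');
    for ch in s.chars() {
        if ch == '\'' {
            out.push('\''); // emit the extra quote first
        }
        out.push(ch);
    }
    out.push('\'');
    out
}
```
For example, `quote_sql_string("it's")` yields the literal `'it''s'`.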

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2392
2025-08-01 14:57:46 -04:00
PThorpe92
b8ed4358f1 register buffers sparse on ring initiate to support fixed operations 2025-08-01 14:56:43 -04:00
PThorpe92
9289dd7e9a Implement register_fixed_buffer for io_uring IO backend 2025-08-01 14:55:35 -04:00
PThorpe92
3048e4fa97 Add optional register_fixed_buffer method to IO trait 2025-08-01 14:54:26 -04:00
Mikaël Francoeur
444a7bb5ac wrap doc 2025-08-01 14:53:46 -04:00
Pere Diaz Bou
f9e1d9bb40 state_machine: remove State associated type 2025-08-01 20:04:27 +02:00
rajajisai
30c059483e Parse value as float if it cannot be parsed as integer (when the value cannot fit in i64) 2025-08-01 10:49:40 -07:00
rajajisai
7e84148883 Fix integer overflow check in number parser 2025-08-01 10:10:02 -07:00
Pekka Enberg
d161c2652c Merge 'core/mvcc: Move commit_txn() to generic state machinery ' from Pere Diaz Bou
Unfortunately it seems we will never reach the point where we can remove
state machines, so we might as well make them easier to write.
There are two points that must be highlighted:
1. There is a `StateTransition` trait implemented like:
```rust
pub trait StateTransition {
    type State;
    type Context;

    fn transition<'a>(&mut self, context: &Self::Context) -> Result<TransitionResult>;
    fn finalize<'a>(&mut self, context: &Self::Context) -> Result<()>;
    fn is_finalized(&self) -> bool;
}
```
where `transition` tries to move the state forward, and
`finalize` marks the state machine as "finalized" so that **no**
other call to finalize will forward the state and it will panic instead.
2. Before, we would store the state of a state machine inside the
callee's struct, but I'm proposing we do something different where the
callee will return the state machine and the caller will be responsible
for advancing it. This way we don't need to track many reset operations
in case of failures or rollbacks, and instead we could simply drop a
state machine and all other nested state machines will drop in a
cascade.
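A small sketch of point 2, with the caller driving a machine that the callee hands back. The trait follows the shape quoted above, but the `TransitionResult` variants, the `String` error type, and the `CountTo`, `start_count` and `run_to_completion` names are assumptions made for this example:
```rust
/// Assumed result variants; the real enum may differ.
enum TransitionResult {
    Io,
    Continue,
    Done,
}

trait StateTransition {
    type State;
    type Context;

    fn transition(&mut self, context: &Self::Context) -> Result<TransitionResult, String>;
    fn finalize(&mut self, context: &Self::Context) -> Result<(), String>;
    fn is_finalized(&self) -> bool;
}

enum CountState {
    Counting(u32),
    Finished,
}

/// Toy machine that counts up to the target given as context.
struct CountTo {
    state: CountState,
    finalized: bool,
}

impl StateTransition for CountTo {
    type State = CountState;
    type Context = u32;

    fn transition(&mut self, target: &u32) -> Result<TransitionResult, String> {
        assert!(!self.finalized, "transition called after finalize");
        match self.state {
            CountState::Counting(n) if n + 1 >= *target => {
                self.state = CountState::Finished;
                Ok(TransitionResult::Done)
            }
            CountState::Counting(n) => {
                self.state = CountState::Counting(n + 1);
                Ok(TransitionResult::Continue)
            }
            CountState::Finished => Ok(TransitionResult::Done),
        }
    }

    fn finalize(&mut self, _target: &u32) -> Result<(), String> {
        self.finalized = true;
        Ok(())
    }

    fn is_finalized(&self) -> bool {
        self.finalized
    }
}

/// The callee only builds the machine; it keeps no state of its own.
fn start_count() -> CountTo {
    CountTo { state: CountState::Counting(0), finalized: false }
}

/// The caller owns the machine and advances it; returns the number of steps taken.
fn run_to_completion<M: StateTransition>(mut machine: M, ctx: &M::Context) -> Result<u32, String> {
    let mut steps = 0;
    loop {
        steps += 1;
        match machine.transition(ctx)? {
            TransitionResult::Done => {
                machine.finalize(ctx)?;
                return Ok(steps);
            }
            // A real caller would hand control back to the IO loop on `Io`.
            TransitionResult::Io | TransitionResult::Continue => continue,
        }
    }
}
```
On a failure or rollback the caller can simply drop the value returned by `start_count()`; there is no per-callee state left to reset.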

Closes #2384
2025-08-01 19:28:16 +03:00
Diego Reis
7c70ac2c4a Fix #2390
Single quotes inside a string literal have to be doubled
2025-08-01 11:37:13 -03:00
Pere Diaz Bou
764523a8bb core/mvcc: fix tests with state machines 2025-08-01 15:48:09 +02:00
Jussi Saurio
86b1232268 chore: enable indexes by default 2025-08-01 15:44:56 +03:00
Pere Diaz Bou
69b20d9d43 state_machine: add result to StateTransition 2025-08-01 14:07:07 +02:00
Pere Diaz Bou
c3f00475eb state_machine: rename transition -> step 2025-08-01 13:56:57 +02:00
Jussi Saurio
d58d71ad1b perf/vdbe: remove eager cloning in op_comparison 2025-08-01 14:04:56 +03:00