This should be safe to do because:
1. the page cache is private per connection
2. since this connection wrote the flushed pages/frames, they are up to
date from its perspective
3. multiple concurrent statements inside one connection are not
snapshot-transactional, even in SQLite
Reviewed-by: Pekka Enberg <penberg@iki.fi>
Closes #2407
While working on #2151 I found myself forced to write things like:
```rust
assert_eq!(
    6,
    *result
        .next()
        .await?
        .unwrap()
        .get_value(0)?
        .as_integer()
        .unwrap()
);
```
All of that just to get a simple value from a row. With this PR, users can
simply do:
```rust
assert_eq!(6, result.get::<i32>(0)?);
```
(Thanks libsql devs, this is so much better!)
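For comparison, here is a hedged sketch of how a typed accessor like this can be built on top of a value API. `Value`, `FromValue`, and `Row` below are illustrative stand-ins, not the actual types used by the PR:
```rust
// Hedged sketch: a generic `get` built on a conversion trait (toy types).
#[derive(Debug)]
enum Value {
    Integer(i64),
    Text(String),
    Null,
}

trait FromValue: Sized {
    fn from_value(v: &Value) -> Result<Self, String>;
}

impl FromValue for i32 {
    fn from_value(v: &Value) -> Result<Self, String> {
        match v {
            Value::Integer(i) => Ok(*i as i32),
            other => Err(format!("expected integer, got {other:?}")),
        }
    }
}

struct Row {
    values: Vec<Value>,
}

impl Row {
    // The typed accessor: look up the column, then convert it.
    fn get<T: FromValue>(&self, idx: usize) -> Result<T, String> {
        let v = self.values.get(idx).ok_or("column index out of range")?;
        T::from_value(v)
    }
}

fn main() -> Result<(), String> {
    let row = Row { values: vec![Value::Integer(6)] };
    assert_eq!(6, row.get::<i32>(0)?);
    Ok(())
}
```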
Closes #2377
This will save some work when yielding to IO. Previously, on every
invocation, if the record was a packed record, we parsed it and iterated
through the values to check for nulls. Now, the pre-seeking work is done
only once.
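The shape of the change, as a hedged sketch with toy types (not the real cursor code): cache the result of the packed-record null scan so that re-entering the seek after an IO yield does not redo it. In SQLite's record format, serial type 0 encodes NULL.
```rust
// Hedged sketch: do the pre-seek null scan once and cache it, instead of
// repeating it every time the operation is re-entered after yielding to IO.
struct PreSeek {
    null_scan: Option<bool>, // None = not computed yet
}

impl PreSeek {
    /// In SQLite's record format, serial type 0 encodes NULL.
    fn record_has_null(&mut self, serial_types: &[u64]) -> bool {
        *self
            .null_scan
            .get_or_insert_with(|| serial_types.iter().any(|&t| t == 0))
    }
}

fn main() {
    let mut pre_seek = PreSeek { null_scan: None };
    let serial_types = [1u64, 0, 6]; // second column is NULL
    // First call computes and caches; re-entries reuse the cached result.
    assert!(pre_seek.record_has_null(&serial_types));
    assert!(pre_seek.record_has_null(&serial_types));
}
```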
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #2394
One step further to help simplify the API for users.
This is in core and not in the Rust bindings because, in core,
it can benefit a broader set of users/developers.
Unfortunately it seems we are never reaching the point where we can remove
state machines, so we might as well make them easier to write.
There are two points that must be highlighted:
1. There is a `StateTransition` trait, defined like:
```rust
pub trait StateTransition {
    type State;
    type Context;

    fn transition<'a>(&mut self, context: &Self::Context) -> Result<TransitionResult>;
    fn finalize<'a>(&mut self, context: &Self::Context) -> Result<()>;
    fn is_finalized(&self) -> bool;
}
```
where `transition` tries to move the state forward, and `finalize` marks the
state machine as "finalized" so that no further call to `finalize` will
advance the state; it will panic instead.
2. Before, we would store the state of a state machine inside the callee's
struct. I'm proposing something different: the callee returns the state
machine and the caller is responsible for advancing it. This way we don't
need to track many reset operations in case of failures or rollbacks;
instead we can simply drop a state machine and any nested state machines
will be dropped in a cascade (see the sketch below).
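A hedged toy sketch of the caller-driven pattern (not the real core types; the `Result` wrapping and the `State` associated type are omitted for brevity): a counter state machine with the `StateTransition` shape above, owned and advanced by the caller.
```rust
pub enum TransitionResult {
    Continue, // keep calling `transition`
    Io,       // the caller must run IO, then call `transition` again
    Done,     // the caller should now call `finalize`
}

pub trait StateTransition {
    type Context;
    fn transition(&mut self, context: &Self::Context) -> TransitionResult;
    fn finalize(&mut self, context: &Self::Context);
    fn is_finalized(&self) -> bool;
}

struct CountTo {
    current: u32,
    finalized: bool,
}

impl StateTransition for CountTo {
    type Context = u32; // the target to count up to

    fn transition(&mut self, target: &u32) -> TransitionResult {
        assert!(!self.finalized, "transition called after finalize");
        if self.current < *target {
            self.current += 1;
            if self.current % 2 == 0 {
                TransitionResult::Continue
            } else {
                TransitionResult::Io // pretend this step had to yield for IO
            }
        } else {
            TransitionResult::Done
        }
    }

    fn finalize(&mut self, _target: &u32) {
        assert!(!self.finalized, "finalize must only be called once");
        self.finalized = true;
    }

    fn is_finalized(&self) -> bool {
        self.finalized
    }
}

fn main() {
    // The callee returns the state machine; the caller owns and advances it.
    let mut sm = CountTo { current: 0, finalized: false };
    let target = 3;
    loop {
        match sm.transition(&target) {
            TransitionResult::Continue | TransitionResult::Io => continue,
            TransitionResult::Done => break,
        }
    }
    sm.finalize(&target);
    assert!(sm.is_finalized());
    // On failure or rollback the caller can simply drop `sm`; any nested
    // state machines it owns are dropped with it.
}
```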
Closes #2384
If the table is an intkey table, we can read the rowid directly without
deserializing the full cell, and we also don't need to start deserializing
the record if only the rowid is requested.
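For context, in SQLite's file format a table b-tree leaf cell stores the rowid as a varint right after the payload-length varint, so it can be read without touching the record body. A hedged sketch of the idea with hypothetical helper names (`read_rowid` and `read_varint` are illustrative, not the PR's actual functions):
```rust
// Hedged sketch: read the rowid of an intkey leaf cell directly from the
// cell header, without materializing the record payload.
fn read_rowid(cell: &[u8], is_intkey: bool) -> Option<i64> {
    if !is_intkey {
        return None;
    }
    // Skip the payload-length varint, then read the rowid varint.
    let (_, n) = read_varint(cell)?;
    let (rowid, _) = read_varint(&cell[n..])?;
    Some(rowid as i64)
}

// Minimal SQLite-style varint reader (big-endian base-128, up to 9 bytes).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &b) in buf.iter().take(8).enumerate() {
        value = (value << 7) | (b & 0x7f) as u64;
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    // The 9th byte, if present, contributes all 8 bits.
    let b = *buf.get(8)?;
    Some(((value << 8) | b as u64, 9))
}

fn main() {
    // A leaf table cell: payload length = 3, rowid = 690, then payload bytes.
    // 690 encodes as the varint bytes [0x85, 0x32].
    let cell = [0x03, 0x85, 0x32, 0xAA, 0xBB, 0xCC];
    assert_eq!(Some(690), read_rowid(&cell, true));
    assert_eq!(None, read_rowid(&cell, false));
}
```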
```
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1: Collecting 100 samples in estimated 5.0007 s (11M i
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
time: [469.38 ns 470.77 ns 472.40 ns]
change: [-5.8959% -5.5232% -5.1840%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10: Collecting 100 samples in estimated 5.0088 s (1.9M
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
time: [2.6523 µs 2.6596 µs 2.6685 µs]
change: [-8.7117% -8.4083% -8.0949%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50: Collecting 100 samples in estimated 5.0197 s (399k
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
time: [12.514 µs 12.545 µs 12.578 µs]
change: [-9.5243% -9.0562% -8.6227%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100: Collecting 100 samples in estimated 5.0600 s (202
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
time: [25.135 µs 25.291 µs 25.470 µs]
change: [-8.8822% -8.3943% -7.8854%] (p = 0.00 < 0.05)
Performance has improved.
```
"only" 4x slower than sqlite on `SELECT * FROM users LIMIT 100` after
this!
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #2382
* Previously, deserializing an empty vector used `Vec::new()`, resulting
in zero capacity, which is not guaranteed to be aligned for `f32`/`f64`.
This could lead to undefined behavior when interpreting the data.
* We also inconsistently treated empty input: `"[]"` (text) was accepted
as a zero-length vector, but empty blobs (`&[]`) were rejected.
* Now:
  * We initialize empty vectors with at least one element's capacity to
  preserve alignment (see the sketch below).
  * We allow zero-sized blobs and treat them the same as `"[]"` text input,
  i.e., as empty vectors.
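A minimal sketch of the allocation side of the fix, echoing the PR's rationale (the helper name is hypothetical, and the real code works on the vector's raw storage rather than a bare `Vec<f32>`):
```rust
// Hedged sketch: allocate capacity for at least one element so an "empty"
// vector still owns a real, f32-aligned heap buffer, unlike the
// zero-capacity `Vec::new()` case.
fn alloc_f32_buffer(dims: usize) -> Vec<f32> {
    Vec::with_capacity(dims.max(1))
}

fn main() {
    let empty = alloc_f32_buffer(0);
    assert_eq!(0, empty.len()); // still a zero-length vector...
    assert!(empty.capacity() >= 1); // ...but backed by a real allocation
    assert_eq!(0, empty.as_ptr() as usize % std::mem::align_of::<f32>());
}
```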
Closes #2371
We need to load rowids into MVCC's store, in order, before doing any read,
in case there are rows.
This has a performance penalty for now, as expected, because we should
ideally scan for rowids lazily instead.
On MVCC `commit_txn` we need to persist changes to the database; for this we re-use the pager's transaction semantics (a sketch follows the list):
1. If there are no conflicts, we start `pager.begin_write_txn`.
2. `pager.end_txn`: we flush changes to the WAL.
3. We finish the MVCC transaction by marking rows with the new timestamp.
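A rough sketch of that commit path with stub types (hypothetical names and signatures, not the real core API):
```rust
// Hedged sketch: MVCC commit re-uses the pager's write-transaction semantics.
struct Pager;
impl Pager {
    fn begin_write_txn(&mut self) -> Result<(), String> { Ok(()) }
    fn end_txn(&mut self) -> Result<(), String> { Ok(()) } // flush to the WAL
}

struct MvccStore { clock: u64 }
impl MvccStore {
    fn has_conflicts(&self, _txn_id: u64) -> bool { false }
    fn next_timestamp(&mut self) -> u64 { self.clock += 1; self.clock }
    fn mark_committed(&mut self, _txn_id: u64, _commit_ts: u64) {}
}

fn commit_txn(mvcc: &mut MvccStore, pager: &mut Pager, txn_id: u64) -> Result<(), String> {
    // 1. Only begin a pager write transaction if there are no conflicts.
    if mvcc.has_conflicts(txn_id) {
        return Err("write-write conflict".into());
    }
    pager.begin_write_txn()?;
    // 2. End the pager transaction, flushing the changes to the WAL.
    pager.end_txn()?;
    // 3. Finish the MVCC transaction by marking its rows with the new timestamp.
    let commit_ts = mvcc.next_timestamp();
    mvcc.mark_committed(txn_id, commit_ts);
    Ok(())
}

fn main() {
    let mut pager = Pager;
    let mut mvcc = MvccStore { clock: 0 };
    commit_txn(&mut mvcc, &mut pager, 1).unwrap();
}
```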
## What
The following sequence of actions is possible:
```sql
-- TRUNCATE checkpoint fails during WAL restart,
-- but OngoingCheckpoint.state is still left at Done for conn 0
Connection 0(op=23): PRAGMA wal_checkpoint(TRUNCATE)
Connection 0(op=23) Checkpoint TRUNCATE: OK: false, wal_page_count: NULL, checkpointed_count: NULL
-- TRUNCATE checkpoint succeeds for conn 1
Connection 1(op=26): PRAGMA wal_checkpoint(TRUNCATE)
Connection 1(op=26) Checkpoint TRUNCATE: OK: true, wal_page_count: 0, checkpointed_count: 0
-- Conn 0 now does a PASSIVE checkpoint, and immediately thinks
-- it's in the Done state, and thinks it checkpointed 17 frames.
-- since mode is PASSIVE, it now thinks both the WAL and the DB have those 17 frames
-- so the first 17 frames of the WAL can be ignored from now on.
Connection 0(op=27): PRAGMA wal_checkpoint(PASSIVE)
Connection 0(op=27) Checkpoint PASSIVE: OK: true, wal_page_count: 0, checkpointed_count: 17
-- Connection 0 starts a txn with min=18 (ignore first 17 frames in WAL),
-- and deletes rowid=690, which becomes WAL frame number 1
Connection 0(op=28): DELETE FROM test_table WHERE id = 690
begin_read_tx(min=18, max=0, slot=1, max_frame_in_wal=0)
-- Connection 1 starts a txn with min=18 (ignore first 17 frames in WAL),
-- and inserts rowid=1128, which becomes WAL frame number 2
Connection 1(op=28): INSERT INTO test_table (id, text) VALUES (1128, text_560)
begin_read_tx(min=18, max=1, slot=1, max_frame_in_wal=1)
-- Connection 0 again starts tx with min=18, and performs a read, and two wrong things happen:
-- 1. it doesn't see row 690 as deleted, because it's in WAL frame 1, which it ignores
-- 2. it doesn't see the new row 1128, because it's in WAL frame 2, which it ignores
Connection 0(op=29): SELECT * FROM test_table
begin_read_tx(min=18, max=2, slot=1, max_frame_in_wal=2)
```
## Fix
Reset `ongoing_checkpoint.state` to `Start` when a checkpoint fails.
Issue found in #2364.
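A hedged sketch of the fix with hypothetical names (not the real checkpoint code): on a failed checkpoint attempt, reset the per-connection state machine instead of leaving it at `Done`, so the next checkpoint cannot reuse stale results.
```rust
enum CheckpointState {
    Start,
    Done { wal_page_count: usize, checkpointed_count: usize },
}

struct OngoingCheckpoint {
    state: CheckpointState,
}

impl OngoingCheckpoint {
    fn run(
        &mut self,
        attempt: impl FnOnce() -> Result<(usize, usize), String>,
    ) -> Result<(usize, usize), String> {
        match attempt() {
            Ok((wal_page_count, checkpointed_count)) => {
                self.state = CheckpointState::Done { wal_page_count, checkpointed_count };
                Ok((wal_page_count, checkpointed_count))
            }
            Err(e) => {
                // The fix: go back to `Start` instead of keeping the old `Done`.
                self.state = CheckpointState::Start;
                Err(e)
            }
        }
    }
}

fn main() {
    let mut cp = OngoingCheckpoint { state: CheckpointState::Start };
    // A checkpoint that fails (e.g. during the WAL restart) must not leave
    // a previous `Done` state behind.
    let _ = cp.run(|| Err("wal restart failed".into()));
    assert!(matches!(cp.state, CheckpointState::Start));
}
```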
Reviewed-by: bit-aloo (@Shourya742)
Closes #2380