in #2521, I messed up and introduced improper calculation of the current
checkpoint's max safe frame (mostly due to incorrect comments that I had
left on the method).
The confusion partially stems from our lack of Busy handling at the
moment, but essentially when determining the max safe frame for all
readers, for passive mode we cannot simply `break` out of the loop when
we find a reader with a lower read mark than we have, because _another_
reader might have an even _lower_ read mark, and we could proceed with
the first mark < shared_max.
And for !passive modes, we still attempt to backfill with the same lower
frame, we just return `Busy` at the end, after backfilling what we can
(we just don't reset the log for restart/truncate).
Most of the changes in this PR is just the renaming the fields of
Checkpoint Result, because the names were confusing
Closes#2560
We have to update the Transaction State before checking for the Schema
Cookie so that we can rollback the transaction later on correctly.
Closes#2535Closes#2549
Small contribution to my current work on making checkpointing efficient.
We hold a write lock, and especially here on main there is no reason to
mark the pages as dirty in the cache, so we can do away with that
Vec<u64> and just track whether it's `Done`
Closes#2545
Mainly the performance impact here comes from removing some unnecessary
checks and inlining `read_integer_fast()` directly into `op_column()`,
but I also added some fiddly nano-optimizations for fun
On main, we are roughly 3.4x slower than sqlite on `SELECT * FROM users
LIMIT 100`, and here we are roughly 3.2x slower, which ain't much, but
it's honest work.
A more impactful optimization, but a much more annoying refactor, would
be #2304Closes#2516
This PR adds new `updates` column to the CDC table. This column holds
updated fields of the row in the following format:
```
[C boolean values where true set for changed columns]
[C values with updates where NULL is set for not-changed columns]
```
For example:
```
turso> UPDATE t SET y = 'turso', q = 'db' WHERE rowid = 1;
turso> SELECT bin_record_json_object('["x","y","z","q","x","y","z","q"]', updates) as updates FROM turso_cdc;
┌──────────────────────────────────────────────────────────────────┐
│ updates │
├──────────────────────────────────────────────────────────────────┤
│ {"x":0,"y":1,"z":0,"q":1,"x":null,"y":"turso","z":null,"q":"db"} │
└──────────────────────────────────────────────────────────────────┘
```
Also, this column works differently for `ALTER TABLE` statements where
update value for `sql` will be equal to the original `ALTER TABLE`:
```
turso> ALTER TABLE t ADD COLUMN t;
turso> SELECT bin_record_json_object('["type","name","tbl_name","rootpage","sql","type","name","tbl_name","rootpage","sql"]', updates) as updates FROM turso_cdc WHERE rowid = 2;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ updates │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"type":0,"name":0,"tbl_name":0,"rootpage":0,"sql":1,"type":null,"name":null,"tbl_name":null,"rootpage":null,"sql":"ALTER TABLE t ADD COLUMN t;"} │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
This will help turso-db to implement logical replication which supports
both column-level updates and schema changes
Closes#2538
1. Introduce state machines to insert and delete to make sure IO is
handled properly
2. When `rowid` is changed in `UPDATE`, it is handled as a combination
of delete and insert, so `op_insert` doesn't need to update the
incremental view with the deleted old column since the preceding
`op_delete` instruction should already do it.
Closes#2542
populate now has its own code path to apply changes to the view. It was
okay until now because all we do is filter. But now that we are also
applying aggregations, we'll end up with two disjoint code paths.
A better approach is to just apply the results of our select to the
delta set, and apply it.
I'm not sure how much this will clash with @TcMits's parser rewrite,
hopefully not too much. If it does and we eventually have to remove it,
at least we'll have two new regression tests.
Closes https://github.com/tursodatabase/turso/issues/2484Closes#2499
Use the same rusqlite version in every crate and use a bundled up-to-
date sqlite version
(the impetus for this PR is still me trying to figure out why sqlite in
the insert benchmark doesn't seem to be fsyncing, even when instructed)
Closes#2507
- When the rowid is changed in UPDATE, it is handled as a combination of DELETE + INSERT,
so we dont need to delete the old values in that case
- We should only update the views after the operation on the btree is done
- A proper state machine is needed to handle IO yielding points
Convert Sorter code to use state machines and not ignore completions.
Also simplifies some logic that seemed redundant to me. Also, I was
getting some IO errors because we were opening one file per Chunk, so I
fixed this by using only one file per sorter, and just using offsets in
the file for each chunk.
Builds on top of #2520Closes#2473
Implements `PRAGMA freelist_count` to return the current number of free
pages in the database.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2531
Currently, we have a borrow problem because parse_schema_rows() already
borrows `schema`, but then `apply_view_deltas` does the same:
```
thread 'main' panicked at core/vdbe/mod.rs:450:49:
already mutably borrowed: BorrowError
stack backtrace:
0: __rustc::rust_begin_unwind
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:697:5
1: core::panicking::panic_fmt
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:75:14
2: core::cell::panic_already_mutably_borrowed
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/cell.rs:799:5
3: core::cell::RefCell<T>::borrow
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/cell.rs:987:25
4: turso_core::vdbe::Program::apply_view_deltas
at ./core/vdbe/mod.rs:450:26
5: turso_core::vdbe::Program::commit_txn
at ./core/vdbe/mod.rs:468:9
6: turso_core::vdbe::execute::op_halt
at ./core/vdbe/execute.rs:1954:15
7: turso_core::vdbe::Program::step
at ./core/vdbe/mod.rs:430:19
8: turso_core::Statement::step
at ./core/lib.rs:1914:23
9: turso_core::util::parse_schema_rows
at ./core/util.rs:91:15
10: turso_core::Connection::parse_schema_rows::{{closure}}
at ./core/lib.rs:1518:17
11: turso_core::Connection::with_schema_mut
at ./core/lib.rs:1625:9
12: turso_core::Connection::parse_schema_rows
at ./core/lib.rs:1515:9
```
However, this is a read transaction and views are not even enabled,
let's just make `apply_view_deltas()` return early if there's no
processing needed, to skip the schema borrow altogether.
Currently, the simulator complains of the following error:
```
Error: failed with error: 'attempt to multiply with overflow'
```
However, we don't enable views in the simulator so -- despite being an
issue -- we should never see this. Let's fix `op_delete()` some more not
to not even call rowid() unless view processing is enabled.