Problem:
Currently `WriteState` "owns" the balancing state machine, even
though a separate `DeleteState` can also trigger balancing, which
results in awkward back-and-forth switching between `CursorState::Write`
and `CursorState::Delete` during balancing.
Fix:
1. Extract `balance_state` as a separate state machine, since its
state transitions are exactly the same regardless of whether an
insert or a delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as
those states become just simple enums now. Each of them now has a state
called `Balancing` which just delegates work to the balancing state
machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.
The built-in `unreachable!` macro, believe it or not is just an alias
for `panic!` and does not actually provide the compiler with a hint that
the path is not reachable.
This provides a wrapper around the actual
`std::hint::unreachable_unchecked()`, to be used only in the very hot
path of `execute` where it is not possible to be the incorrect variant.
Closes#2459
## Beef
Removes cloning of `WriteState` in `balance_non_root()` and removes the
`Clone` implementation from `WriteState`. Not only is it unnecessary to
clone the state, but even having the `Clone` implementation available is
a real regression risk in `insert_into_page()` as it will deep clone
vectors that we require stable raw pointers to in `fill_cell_payload()`
## Details
In removing the cloning of `WriteState` in `balance_non_root()`, this
replaces the `next_write_state` logic with a plain loop, which ends up
looking like a lot of changes, but it's really not. Main things that
were changed:
- Loop and mutate state instead of create new state and return
- Make `WriteInfo` an `Option` in `WriteState` for ergonomics so it can
be `.take()`:n instead of `.clone()`:d
- Clone a `Rc<Pager>` in `balance_non_root()` so that we can call
`pager.do_allocate_page()` directly instead of `self.allocate_page()`
which is just a facade for the pager method, and would violate borrowing
rules.
- assign `let usable_space = self.usable_space()` at the beginning of
the balance loop for similar reasons.
- Make the balance debug validation methods static so they don't require
a reference to self, for similar reasons
Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes#2466
## Problem
We currently clone `WriteState` in every loop iteration of
`insert_into_page()`, which was probably done for borrow checker
reasons, but since `WriteState` has expanded to include buffers that
must not be moved in memory or dropped, it has necessitated a really
annoying workaround of wrapping the buffers in `Arc<Mutex>>` which is
just completely wasteful.
## Fix
Do not clone `WriteState` in `insert_into_page()`, and instead work with
the borrow checker a bit more. Note that `WriteState` still _implements_
`Clone` because it's also cloned in `balance_non_root()` - that can be a
separate refactor.
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#2464
## Background
We currently clone `WriteState` on every iteration of
`insert_into_page()`,
presumably for _Borrow Checker Reasons (tm)_.
## Problem
There was a bug in `WriteState::Insert` handling where if
`fill_cell_payload()`
returned IO, the `fill_cell_payload_state` was not updated in
`write_info.state`, leading to an infinite loop of allocating new pages.
Further, the `new_payload` vector was also always deep-cloned.
## Fix
Update the state if `fill_cell_payload()` returns IO. Because of
`WriteState` cloning we also need to `Arc<Mutex>` wrap the vector so
that the underlying data buffer doesn't get cloned the next time
`WriteState` gets cloned, since `fill_cell_payload` relies on raw
pointers to work. Left a couple of prominent FIXMEs about this.
## Notes
This bug was surfaced by, but not caused by,
https://github.com/tursodatabase/turso/pull/2400.
Closes#2463
We currently clone WriteState on every iteration of `insert_into_page()`,
presumably for Borrow Checker Reasons (tm).
There was a bug in `WriteState::Insert` handling where if `fill_cell_payload()`
returned IO, the `fill_cell_payload_state` was not updated in
`write_info.state`, leading to an infinite loop of allocating new pages.
This bug was surfaced by, but not caused by, #2400.
We are doing unsafe mut borrowing in `PageContent` currently, and when
new BufferPool is merged, it will be all pointers into an arena anyway.
We just need to be sure to clone the Arc and reference the buffers in
the completion callbacks so they are ensured to live for async IO.
naturally in true spirit of all my previous PR's, I needed to introduce
one while another is open that causes an absurd amount of conflicts.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2456
After #2258, we no longer need to clone/deep copy `PageContent` or
`Buffer`.
We can remove the implementation to make any copying of page data
explicit.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2453
SQLite generates those in aggregations like min / max with collation
information either in the table definition or in the column expression.
We currently generate the wrong result here, and properly generating the
bytecode instruction fixes it.
Closes#2440
## Fix 1
Do not start a read transaction when a SELECT is not going to access the
database, which means we can avoid checking whether the schema has
changed.
## Fix 2
Add a field `accesses_db` to `Program` and `Statement` so we can avoid
even checking for `SchemaUpdated` errors when it's not possible to get
one.
## Fix 3
Avoid doing any work in `commit_txn` when not in a transaction. This
optimization is only enabled when `mv_store.is_none()`, because MVCC has
its own logic and this doesn't work with MVCC enabled, and honestly I'm
too tired to find out why. Left an inline comment about it, though.
```sql
Execute `SELECT 1`/limbo_execute_select_1
time: [21.440 ns 21.513 ns 21.586 ns]
change: [-60.766% -60.616% -60.453%] (p = 0.00 < 0.05)
Performance has improved.
```
Effect is even more dramatic in CI where the latency is down over 80%
Closes#2441
In preparation for tracking IO Completions, we need to start to return
IO in places where completions are created. Doing some more plumbing now
to avoid bigger PRs for the future
Closes#2438
## Beef
Adds `AddColumn`, `DropColumn`, `RenameColumn`
## Details
- Previously the test was hardcoded to assume there's always 2 named
columns, so changed a bunch of things for this reason
- Still assumes the primary key column is always `id` and is never
renamed or dropped etc.
Closes#2434
Implement sqlite's fast path defragment algorithm. This path is taken
when:
1. There are 1-2 freeblocks
2. There are at most `max_frag_bytes` fragmented free bytes (-1..=4)
Instead of reconstructing the entire page, it merges the two freeblocks
and then moves the merged freeblock to the left, effectively turning it
into free space in the unallocated region, instead of a freeblock.
`max_frag_bytes` is particularly important when jnserting a new cell,
because if the page contains (in total) ~just enough space for the new
cell, then there can be hardly any fragmented free space because
otherwise, merging the 1-2 freeblocks won't produce enough contiguous
free space to fit the cell.
## Benchmark
```sql
Insert rows in batches/limbo_insert_1_rows
time: [26.692 µs 27.153 µs 27.695 µs]
change: [-9.9033% -2.9097% +1.6336%] (p = 0.55 > 0.05)
No change in performance detected.
Insert rows in batches/limbo_insert_10_rows
time: [38.618 µs 40.022 µs 42.201 µs]
change: [-8.9137% -6.6405% -4.2299%] (p = 0.00 < 0.05)
Performance has improved.
Insert rows in batches/limbo_insert_100_rows
time: [168.94 µs 169.58 µs 170.31 µs]
change: [-22.520% -17.669% -12.790%] (p = 0.00 < 0.05)
Performance has improved.
```
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2411