Commit Graph

3991 Commits

Author SHA1 Message Date
Jussi Saurio
64c8587f27 Merge 'IO More State Machine' from Pedro Muniz
I swear we just need one more state machine. Just more state machine
until we achieve IO tracking. (PS: this is a meme)

Closes #2462
2025-08-06 21:26:19 +03:00
Jussi Saurio
cc98f9f88b Merge 'Direct schema mutation – add instruction' from Levy A.
**86%** performance improvement. We are **25x** faster than SQLite.
<img width="953" height="511" alt="image" src="https://github.com/user-
attachments/assets/fd717d1e-bbbe-4959-ae48-41afc73e5e9f" />
```
ALTER TABLE _ DROP COLUMN _`/limbo_drop_column/
                        time:   [1.8821 ms 1.8929 ms 1.9047 ms]
                        change: [-86.850% -86.733% -86.614%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
Benchmarking `ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/: Warming up for 3.0000 s

`ALTER TABLE _ DROP COLUMN _`/sqlite_drop_column/
                        time:   [46.227 ms 46.258 ms 46.291 ms]
                        change: [-1.3202% -1.0505% -0.8109%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  10 (10.00%) high mild
  5 (5.00%) high severe
  ```

Closes #2452
2025-08-06 20:32:22 +03:00
Jussi Saurio
f8f2ad1e7a Merge 'refactor/btree: cleanup write/delete/balancing states' from Jussi Saurio
## Problem:
Currently `WriteState`, usually triggered by an insert operation, "owns"
the balancing state machine, even though a delete operation (tracked by
a separate `DeleteState`) can also trigger balancing, which results in
awkward back-and-forth switching between `CursorState::Write` and
`CursorState::Delete` during balancing.
## Fix:
1. Extract `balance_state` as a separate state machine, since its state
transitions are exactly the same regardless of whether an insert or a
delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as the
delete&insert states become just simple enums now. Each of them now has
a substate called `Balancing` which just delegates work to the balancing
state machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.

Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #2468
2025-08-06 20:19:17 +03:00
Levy A.
c9e1eca8dc feat: add DropColumn instruction 2025-08-06 13:39:30 -03:00
Levy A.
3bc1001a93 feat(bench): complete ALTER TABLE benchmarks 2025-08-06 13:38:26 -03:00
pedrocarlo
7b746ccc65 adjust state machine for ptrmap_get 2025-08-06 11:32:21 -03:00
pedrocarlo
b529305b82 state machine for ptrmap_put 2025-08-06 11:32:21 -03:00
pedrocarlo
931384afb6 state machine fix for btree create for AutoVacuum::Full 2025-08-06 11:32:21 -03:00
pedrocarlo
f656d0bc20 header ref state machine 2025-08-06 11:32:21 -03:00
Pekka Enberg
0c9216d1cc Merge 'cdc: emit entries for schema changes' from Nikita Sivukhin
This PR emit CDC entries as changes in `sqlite_schema` table for DDL
statements: `CREATE TABLE` / `CREATE INDEX` / etc.
The logic is a bit tricky as under the hood `turso` can do some implicit
DDL operations like:
1. Creating auto-indexes in case of `CREATE TABLE`
2. Deletion of all attached indices in case of `DROP TABLE`
```
turso> PRAGMA unstable_capture_data_changes_conn('full');
turso> CREATE TABLE t(x, y, z UNIQUE, q, PRIMARY KEY (x, y));
turso> CREATE INDEX t_xy ON t(x, y);
turso> CREATE TABLE q(a, b, c);
turso> ALTER TABLE q DROP COLUMN b;

turso> SELECT
change_id, 
id,
change_type, 
table_name,
bin_record_json_object(table_columns_json_array(table_name), before) AS before,
bin_record_json_object(table_columns_json_array(table_name), after) AS after
FROM turso_cdc;
┌───────────┬────┬─────────────┬───────────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ change_id │ id │ change_type │ table_name    │ before                                                              │ after                                                               │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         1 │  2 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"t","tbl_name":"t","rootpage":3,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         2 │  5 │           1 │ sqlite_schema │                                                                     │ {"type":"index","name":"t_xy","tbl_name":"t","rootpage":6,"sql":"C… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         3 │  6 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         4 │  6 │           0 │ sqlite_schema │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
└───────────┴────┴─────────────┴───────────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘
```
For now, CDC capture only all explicit operations and ignore all
implicit operations. The reasoning for that is that one use case for CDC
is to apply logical changes as is with simple SQL statements - but if
implicit operations will be logged to the CDC table too - we can have
hard times using simple SQL statement (for example, creation of
`autoindices` will always work; implicit deletion of indices for `DROP
TABLE` also can lead to some troubles and force us to is `DROP INDEX IF
EXISTS ...` statements + we will need to filter out autoindices in this
case too).
Also, to simplify PR, for now `DatabaseTape` from `turso-sync` package
just ignore all schema changes from CDC table.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2426
2025-08-06 14:48:27 +03:00
Jussi Saurio
c8d2a1a480 btree: add a few more assertions about balance state 2025-08-06 13:39:20 +03:00
Jussi Saurio
a86a0e194d refactor/btree: cleanup write/delete/balancing states
Problem:

Currently `WriteState` "owns" the balancing state machine, even
though a separate `DeleteState` can also trigger balancing, which
results in awkward back-and-forth switching between `CursorState::Write`
and `CursorState::Delete` during balancing.

Fix:

1. Extract `balance_state` as a separate state machine, since its
state transitions are exactly the same regardless of whether an
insert or a delete triggered the balancing.
2. This allows to remove the different 'Balance-xxx' variants from
`WriteState`, as well as removing `WriteInfo` and `DeleteInfo`, as
those states become just simple enums now. Each of them now has a state
called `Balancing` which just delegates work to the balancing state
machine.
3. This further allows us to remove the awkward switching between
`CursorState::Delete` and `CursorState::Write` during a balance that
happens as a result of a deletion.
2025-08-06 13:37:35 +03:00
Jussi Saurio
8e4597d11b Merge 'Add load_insn macro for compiler hint in vdbe::execute hot path' from Preston Thorpe
The built-in `unreachable!` macro, believe it or not is just an alias
for `panic!` and does not actually provide the compiler with a hint that
the path is not reachable.
This provides a wrapper around the actual
`std::hint::unreachable_unchecked()`, to be used only in the very hot
path of `execute` where it is not possible to be the incorrect variant.

Closes #2459
2025-08-06 12:05:33 +03:00
Jussi Saurio
5f3cfaac60 refactor/btree: don't clone WriteState in balance_non_root() 2025-08-06 11:30:09 +03:00
Jussi Saurio
a15d7dd2e7 refactor/btree: don't clone WriteState in balance() 2025-08-06 11:30:09 +03:00
Jussi Saurio
3d635ecd67 Merge 'refactor/btree: don't clone WriteState in insert_into_page()' from Jussi Saurio
## Problem
We currently clone `WriteState` in every loop iteration of
`insert_into_page()`, which was probably done for borrow checker
reasons, but since `WriteState` has expanded to include buffers that
must not be moved in memory or dropped, it has necessitated a really
annoying workaround of wrapping the buffers in `Arc<Mutex>>` which is
just completely wasteful.
## Fix
Do not clone `WriteState` in `insert_into_page()`, and instead work with
the borrow checker a bit more. Note that `WriteState` still _implements_
`Clone` because it's also cloned in `balance_non_root()` - that can be a
separate refactor.

Reviewed-by: Avinash Sajjanshetty (@avinassh)
Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2464
2025-08-06 11:29:55 +03:00
Jussi Saurio
406cbb9e78 Merge 'Coll seq' from Glauber Costa
Implement the CollSeq vdbe opcode.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2454
2025-08-06 11:28:27 +03:00
Jussi Saurio
1c1f55fdfb refactor/btree: remove cloning of WriteState in insert_into_page() 2025-08-06 08:50:56 +03:00
Jussi Saurio
c3a32b63bf refactor/btree: remove unnecessary ref of self in overwrite_content() 2025-08-06 08:45:34 +03:00
Jussi Saurio
6dd08c21e4 refactor/btree: remove unnecessary mut ref of self in rowid() 2025-08-06 08:44:52 +03:00
Jussi Saurio
839d428e36 core/btree: fix re-entrancy bug in insert_into_page()
We currently clone WriteState on every iteration of `insert_into_page()`,
presumably for Borrow Checker Reasons (tm).

There was a bug in `WriteState::Insert` handling where if `fill_cell_payload()`
returned IO, the `fill_cell_payload_state` was not updated in
`write_info.state`, leading to an infinite loop of allocating new pages.

This bug was surfaced by, but not caused by, #2400.
2025-08-06 08:01:49 +03:00
Jussi Saurio
cd3fe523a3 core/types: add IOResult::is_io() helper 2025-08-06 07:46:51 +03:00
PThorpe92
f6fb786cc9 Fix borrow method on WindowsIO 2025-08-05 22:26:19 -04:00
PThorpe92
00a3c7eb52 Apply PR comments, fix syntax 2025-08-05 21:17:23 -04:00
PThorpe92
d5f9e60dfc Add assert_insn for compiler hint in execute hot path 2025-08-05 18:32:27 -04:00
Nikita Sivukhin
c0d5c55d5c fix tests and clippy 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
c6a87d61c7 emit CDC entries if necessary for schema changes 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
0b4c1ac802 refactor code a little bit 2025-08-06 01:03:48 +04:00
PThorpe92
53a0524050 Fix clippy warning 2025-08-05 16:24:50 -04:00
PThorpe92
f6a68cffc2 Remove RefCell from IO and Page apis 2025-08-05 16:24:49 -04:00
Glauber Costa
d1be7ad0bb implement the collseq bytecode instruction
SQLite generates those in aggregations like min / max with collation
information either in the table definition or in the column expression.

We currently generate the wrong result here, and properly generating the
bytecode instruction fixes it.
2025-08-05 13:49:04 -05:00
Glauber Costa
6a66053ca8 make sure value comparisons for min and max are collation aware
They currently aren't, which isn't right.
2025-08-05 13:39:38 -05:00
PThorpe92
914c10e095 Remove Clone impl for Buffer and PageContent 2025-08-05 14:26:53 -04:00
Pekka Enberg
9492a29d47 Merge 'Fix performance regression' from Jussi Saurio
Closes #2440
## Fix 1
Do not start a read transaction when a SELECT is not going to access the
database, which means we can avoid checking whether the schema has
changed.
## Fix 2
Add a field `accesses_db` to `Program` and `Statement` so we can avoid
even checking for `SchemaUpdated` errors when it's not possible to get
one.
## Fix 3
Avoid doing any work in `commit_txn` when not in a transaction. This
optimization is only enabled when `mv_store.is_none()`, because MVCC has
its own logic and this doesn't work with MVCC enabled, and honestly I'm
too tired to find out why. Left an inline comment about it, though.
```sql
Execute `SELECT 1`/limbo_execute_select_1
                        time:   [21.440 ns 21.513 ns 21.586 ns]
                        change: [-60.766% -60.616% -60.453%] (p = 0.00 < 0.05)
                        Performance has improved.
```
Effect is even more dramatic in CI where the latency is down over 80%

Closes #2441
2025-08-05 16:30:18 +03:00
Jussi Saurio
cde8567b1d Merge 'More state machine + Return IO in places where completions are created' from Pedro Muniz
In preparation for tracking IO Completions, we need to start to return
IO in places where completions are created. Doing some more plumbing now
to avoid bigger PRs for the future

Closes #2438
2025-08-05 15:47:51 +03:00
Pekka Enberg
49123db6e8 Merge 'core/mvcc: implement exists' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2446
2025-08-05 15:34:23 +03:00
Jussi Saurio
1feb5ba2d3 perf/vdbe: avoid doing work in commit_txn if not in txn 2025-08-05 15:25:28 +03:00
Jussi Saurio
3f633247f7 perf/stmt: avoid checking for SchemaUpdated errors if it's impossible 2025-08-05 15:10:55 +03:00
Jussi Saurio
c498196c7b fix/perf: fix regression in SELECT 1 benchmark
Do not start a read transaction when a SELECT is not going to access
the database, which means we can avoid checking whether the schema has
changed.
2025-08-05 15:10:55 +03:00
Pere Diaz Bou
474f0d8bbc core/mvcc: implement exists 2025-08-05 13:34:51 +02:00
Jussi Saurio
a28e64bfdd cleanup: remove unused page uptodate flag 2025-08-05 14:25:42 +03:00
Pekka Enberg
d2fea25fef Merge 'perf/btree: implement fast algorithm for defragment_page' from Jussi Saurio
Implement sqlite's fast path defragment algorithm. This path is taken
when:
1. There are 1-2 freeblocks
2. There are at most `max_frag_bytes` fragmented free bytes (-1..=4)
Instead of reconstructing the entire page, it merges the two freeblocks
and then moves the merged freeblock to the left, effectively turning it
into free space in the unallocated region, instead of a freeblock.
`max_frag_bytes` is particularly important when jnserting a new cell,
because if the page contains (in total) ~just enough space for the new
cell, then there can be hardly any fragmented free space because
otherwise, merging the 1-2 freeblocks won't produce enough contiguous
free space to fit the cell.
## Benchmark
```sql
Insert rows in batches/limbo_insert_1_rows
                        time:   [26.692 µs 27.153 µs 27.695 µs]
                        change: [-9.9033% -2.9097% +1.6336%] (p = 0.55 > 0.05)
                        No change in performance detected.
Insert rows in batches/limbo_insert_10_rows
                        time:   [38.618 µs 40.022 µs 42.201 µs]
                        change: [-8.9137% -6.6405% -4.2299%] (p = 0.00 < 0.05)
                        Performance has improved.
Insert rows in batches/limbo_insert_100_rows
                        time:   [168.94 µs 169.58 µs 170.31 µs]
                        change: [-22.520% -17.669% -12.790%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2411
2025-08-05 12:44:48 +03:00
Pekka Enberg
aa20c2f1ba Merge 'Relax I/O configuration attribute to cover all Unixes' from Pedro Muniz
hopefully fixes #2268.

Closes #2435
2025-08-05 12:44:34 +03:00
Pekka Enberg
e355fc4c65 Merge 'core/mvcc: implement seeking operations with rowid' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2429
2025-08-05 12:40:48 +03:00
Jussi Saurio
ad35cf07eb Add extra illustrative doodle for pere 2025-08-05 11:24:15 +03:00
Jussi Saurio
a5330aa6fb perf/btree: implement fast algorithm for defragment_page 2025-08-05 11:24:14 +03:00
Jussi Saurio
5b84ad6b0f Merge 'Update defragment page to defragment in-place' from João Severo
Change original code from doing a full copy of the original buffer to
modify the buffer in-place using a temporary vector with offsets.

Closes #2258
2025-08-05 11:22:22 +03:00
Jussi Saurio
c9c5565867 Merge 'Integrate virtual tables with optimizer' from Piotr Rżysko
This PR integrates virtual tables into the query optimizer. It is a
follow-up to https://github.com/tursodatabase/turso/pull/1727.
The most immediate improvement is better support for inner joins
involving TVFs, particularly when TVF arguments are column references.
### Example
The following two queries are semantically equivalent, but require
different join orders to be valid:
```sql
-- TVF depends on `t.id`, so `t` must be evaluated in outer loop
SELECT t.id, series.value
FROM target t, generate_series(t.id, 3) series;

-- Equivalent query, but with reversed table order in the FROM clause
SELECT t.id, series.value
FROM generate_series(t.id, 3) series, target t;
```
Without optimizer integration, the second query would fail because the
planner would attempt to evaluate `generate_series` before `t`. With
this change, the optimizer detects column dependencies and produces the
correct join order in both cases.
### TODO
Support for outer joins with TVFs is still missing and will be addressed
in a follow-up PR.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2439
2025-08-05 09:22:08 +03:00
pedrocarlo
aa8d17cbf1 state machine for ptrmap_get 2025-08-05 01:38:42 -03:00
Piotr Rzysko
59ec2d3949 Replace ConstraintInfo::plan_info with ConstraintInfo::index
The side of the binary expression no longer needs to be stored in
`ConstraintInfo`, since the optimizer now guarantees that it is always
on the right. As a result, only the index of the corresponding constraint
needs to be preserved.
2025-08-05 05:48:29 +02:00