Commit Graph

3216 Commits

Author SHA1 Message Date
Jussi Saurio
d88bbd488f btree/balance: rename leaf_data to is_table_leaf 2025-07-10 13:15:29 +03:00
Jussi Saurio
b306550a69 format 2025-07-10 13:14:57 +03:00
Jussi Saurio
5ef0127409 btree/balance: rename count_cells_in_old_pages to old_cell_count_per_page_cumulative 2025-07-10 13:14:18 +03:00
Jussi Saurio
c31ee0e628 btree/balance: rename number_of_cells_per_page to cell_count_per_page_cumulative 2025-07-10 13:12:17 +03:00
Jussi Saurio
824065a91d btree/balance: rename cells to cell_data 2025-07-10 13:10:31 +03:00
Jussi Saurio
37f2317e49 btree/balance: add comment about divider cell 2025-07-10 13:09:29 +03:00
Jussi Saurio
4d691af3ee btree/balance: clearer variable name 2025-07-10 13:08:58 +03:00
Jussi Saurio
e51f0f5466 btree/balance: improve comment 2025-07-10 13:08:35 +03:00
Jussi Saurio
201edf3668 btree/balance: add comment 2025-07-10 13:05:54 +03:00
Jussi Saurio
fd0a47dc6b btree: simplify pattern match 2025-07-10 13:05:15 +03:00
Jussi Saurio
4dc3e2100f btree: rename balance_non_root related enum variants and add docs 2025-07-10 13:02:50 +03:00
Jussi Saurio
0eeabbb748 Merge 'btree/chore: remove unnecessary parameters to .cell_get()' from Jussi Saurio
we were providing the same damn arguments to `.cell_get()` and
`.cell_get_raw_region()` over and OVER and **OVER** and `O V E R`

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2021
2025-07-10 12:22:37 +03:00
Jussi Saurio
c2b699c356 btree: make cell field names consistent 2025-07-09 23:43:03 +03:00
Jussi Saurio
641df7d7e9 improve my mental health by finally refactoring .cell_get() 2025-07-09 19:15:05 +03:00
Jussi Saurio
24219d2eb2 Merge 'Minor refactoring of btree' from meteorgan
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1917
2025-07-09 19:13:39 +03:00
meteorgan
0001348158 Minor refactoring of btree 2025-07-09 22:01:54 +08:00
Pekka Enberg
3f10427f52 core: Fix resolve_function() error messages
We need to return the original function name, not normalized one to be
compatible with SQLite.

Spotted by SQLite TCL tests.
2025-07-09 15:30:57 +03:00
Jussi Saurio
c9a6c289e0 clippy 2025-07-09 14:26:43 +03:00
Jussi Saurio
38650eee0e VDBE: fix op_insert re-entrancy
when updating last_insert_rowid we call return_if_io!(cursor.rowid())
which yields IO on large records. this causes op_insert to insert and
overwrite the same row many times. we need a state machine to ensure
that the insertion only happens once and the reading of rowid can
independently yield IO without causing a re-insert.
2025-07-09 14:26:40 +03:00
Jussi Saurio
11d4489740 Merge 'sqlite3_ondisk: generalize left-child-pointer reading function to both index/table btrees' from Jussi Saurio
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2015
2025-07-09 14:24:08 +03:00
Jussi Saurio
c752058a97 VDBE: introduce state machine for op_idx_insert for more granular IO control
Separates cursor.key_exists_in_index() into a state machine. The problem with
the main branch implementation is this:

`return_if_io!(seek)`
`return_if_io!(cursor.record())`

The latter may yield on IO and cause the seek to start over, causing an infinite
loop. With an explicit state machine we can control and prevent this.
2025-07-09 11:43:18 +03:00
Jussi Saurio
c13b2d5d90 sqlite3_ondisk: generalize left-child-pointer reading function to both index/table btrees 2025-07-09 11:07:42 +03:00
Pere Diaz Bou
f7465f665d add checkpoint lock to wal 2025-07-08 17:53:04 +02:00
meteorgan
99e0cf0603 add a constant MINIMUM_CELL_SIZE 2025-07-08 22:57:20 +08:00
meteorgan
04575456a9 fix Minimum cell size must not be less than 4 2025-07-08 22:57:20 +08:00
meteorgan
3065416bb2 cargo fmt 2025-07-08 22:57:20 +08:00
meteorgan
08be906bb1 return early if index is not found in op_idx_delete 2025-07-08 22:57:20 +08:00
meteorgan
f44d818400 cargo fmt 2025-07-08 22:57:20 +08:00
meteorgan
c6ef4898b0 fix: IdxDelete shouldn't raise error if P5 == 0 2025-07-08 22:57:20 +08:00
meteorgan
4a516ab414 Support except operator for compound select 2025-07-08 22:57:20 +08:00
Pere Diaz Bou
511b80a062 do not assert connection is closed and return error on api 2025-07-08 16:47:03 +02:00
Pere Diaz Bou
232beddf62 vdbe: fix compilation 2025-07-08 16:15:29 +02:00
Pere Diaz Bou
ef41c19542 assert is not closed already 2025-07-08 15:58:11 +02:00
Pere Diaz Bou
5319af8fd8 set closed to cell 2025-07-08 15:55:50 +02:00
Pere Diaz Bou
8909e198ae set closed flag for connection to detect force zombies
Let's make sure we don't keep using a connection after it was dropped.
In case of executing a query that was closed we will try to rollback and
return early.
2025-07-08 15:19:20 +02:00
Jussi Saurio
cb8a576501 op_idx_insert: introduce flag for ignoring duplicates 2025-07-08 12:14:20 +03:00
Jussi Saurio
3ab5f07389 btree: fix incorrect comparison implementation in key_exists_in_index()
1. current implementation did not use the custom PartialOrd implementation
   for RefValue
2. current implementation did not take collation into account
2025-07-08 11:58:57 +03:00
Pekka Enberg
1907df825c Merge 'Use binary search in find_cell()' from Ihor Andrianov
Find cell using  bin search

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1875
2025-07-08 10:22:26 +03:00
Pekka Enberg
97c5bdf408 Merge 'Use str_to_f64 on float conversion' from Levy A.
Closes #1870
2025-07-08 10:21:29 +03:00
Pekka Enberg
84771f17f7 Merge 'core/translate: Fix aggregate star error handling in prepare_one_sele…' from Pekka Enberg
…ct_plan()
For example, if we attempt to do `max(*)`, let's return the error
message from `resolve_function()` to be compatible with SQLite:
```
sqlite> CREATE TABLE test1(f1, f2);
sqlite> SELECT max(*) FROM test1;
Parse error: wrong number of arguments to function max()
  SELECT max(*) FROM test1;
         ^--- error here
```
Spotted by SQLite TCL tests.

Closes #1990
2025-07-08 10:19:59 +03:00
Nikita Sivukhin
29422542cd fix clippy 2025-07-08 10:31:40 +04:00
Nikita Sivukhin
d8fb321b16 treat ImmutableRecord as Value::Blob 2025-07-08 10:28:11 +04:00
Pekka Enberg
341f963a8e Merge 'Fix infinite loops, rollback problems, and other bugs found by I/O fault injection' from Pedro Muniz
Was running the sim with I/O faults enabled and fixed some nasty bugs.
Now, there are some more nasty bugs to fix as well. This is the command
that I use to run the simulator `cargo run -p limbo_sim -- --minimum-
tests 10 --maximum-tests 1000`
This PR mainly fixes the following bugs:
- Not decrementing in flight write counter when `pwrite` fails
- not rolling back the transaction on `step` error
- not rolling back the transaction on `run_once` error
- some functions were just being unwrapped when they could suffer io
errors
- Only change max_frame after wal sync's

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1946
2025-07-07 21:31:26 +03:00
Pekka Enberg
d4c03d426c core/translate: Fix aggregate star error handling in prepare_one_select_plan()
For example, if we attempt to do `max(*)`, let's return the error
message from `resolve_function()` to be compatible with SQLite:

```
sqlite> CREATE TABLE test1(f1, f2);
sqlite> SELECT max(*) FROM test1;
Parse error: wrong number of arguments to function max()
  SELECT max(*) FROM test1;
         ^--- error here
```

Spotted by SQLite TCL tests.
2025-07-07 19:56:09 +03:00
pedrocarlo
6b60dd06c6 only rollback on write transaction 2025-07-07 12:10:54 -03:00
Pekka Enberg
0d17d35ef4 Merge 'Change data capture' from Nikita Sivukhin
This PR add basic CDC functionality to the `turso-db`.
### Feature components
1. `unstable_capture_data_changes_conn` pragma which allow user to turn
on/off CDC logging for **specific connection**
    * CDC will have multiple modes, but for now only `off` / `rowid-
only` are supported
    * Default CDC table is `turso_cdc` but user can override this with
`PRAGMA` update syntax and use arbitrary table for the CDC needs
      * This can be helpful in future if turso will need to break table
format compatibility and custom tables can be a way to migrate between
different schemas
    * Update syntax for the pragma accepts one string argument in
format, where only mode is set or custom cdc table name is provided as
second part of the string, separated with comma from the mode
```sql
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only');
turso> PRAGMA unstable_capture_data_changes_conn('off');
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only,custom_cdc_table');
turso> PRAGMA unstable_capture_data_changes_conn;
┌────────────┬──────────────────┐
│ mode       │ table            │
├────────────┼──────────────────┤
│ rowid-only │ custom_cdc_table │
└────────────┴──────────────────┘
```
2. CDC table schema right now is simple but it will be evolved soon to
support logging of row values before/after the change:
```sql
CREATE TABLE custom_cdc_table (
  operation_id INTEGER PRIMARY KEY AUTOINCREMENT,
  operation_time INTEGER, -- unixepoch() at the moment of insert, can drift if machine clocks is not monotonic
  operation_type INTEGER, -- -1 = delete, 0 = update, 1 = insert
  table_name TEXT,
  id
)
```
  * Note, that `operation_id` is marked as `AUTOINCREMENT` but `turso-
db` needs to implement
https://github.com/tursodatabase/turso/issues/1976 in order to properly
support that keyword
3. Query planner changes are made in `INSERT`/`UPDATE`/`DELETE` plans in
order to emit updates to the CDC table for changes in the table
  * Note, that row `UPDATE` which change primary key generate `DELETE` +
`INSERT` statement instead of single `UPDATE`
### Implementation details
- `PRAGMA` to enable CDC is **unstable** which means that publicly
visible side-effects/public API can change in future (and it will change
soon in order to support more rich CDC modes)
- CDC table is just a regular table with its benefits and downsides:
  * benefits: user can perform maintenance operations with that table
just with regular SQL like `DELETE FROM turso_cdc WHERE operation_id <
?` to cleanup old not needed CDC entries
  * downsides: user can accidentally make unwanted change to CDC table
- Changes to CDC table is not logged to itself
  * Note, that different connections (e.g. `C1`, `C2`) can have
different CDC tables set (e.g. `A` and `B`) - in which case changes made
to CDC table `B` through connection `C1` will be reflected in CDC table
`A`

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1926
2025-07-07 18:03:07 +03:00
pedrocarlo
367002fb72 rename change_schema to schema_did_change 2025-07-07 11:58:16 -03:00
pedrocarlo
d8ad4a27f8 only finish appending frames when we are done in cacheflush 2025-07-07 11:53:45 -03:00
pedrocarlo
b85687658d change instrumentation level to INFO 2025-07-07 11:53:45 -03:00
pedrocarlo
4639a4565f change max_frame count only after wal sync in cacheflush 2025-07-07 11:53:45 -03:00