Commit Graph

3155 Commits

Author SHA1 Message Date
Pekka Enberg
0d17d35ef4 Merge 'Change data capture' from Nikita Sivukhin
This PR add basic CDC functionality to the `turso-db`.
### Feature components
1. `unstable_capture_data_changes_conn` pragma which allow user to turn
on/off CDC logging for **specific connection**
    * CDC will have multiple modes, but for now only `off` / `rowid-
only` are supported
    * Default CDC table is `turso_cdc` but user can override this with
`PRAGMA` update syntax and use arbitrary table for the CDC needs
      * This can be helpful in future if turso will need to break table
format compatibility and custom tables can be a way to migrate between
different schemas
    * Update syntax for the pragma accepts one string argument in
format, where only mode is set or custom cdc table name is provided as
second part of the string, separated with comma from the mode
```sql
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only');
turso> PRAGMA unstable_capture_data_changes_conn('off');
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only,custom_cdc_table');
turso> PRAGMA unstable_capture_data_changes_conn;
┌────────────┬──────────────────┐
│ mode       │ table            │
├────────────┼──────────────────┤
│ rowid-only │ custom_cdc_table │
└────────────┴──────────────────┘
```
2. CDC table schema right now is simple but it will be evolved soon to
support logging of row values before/after the change:
```sql
CREATE TABLE custom_cdc_table (
  operation_id INTEGER PRIMARY KEY AUTOINCREMENT,
  operation_time INTEGER, -- unixepoch() at the moment of insert, can drift if machine clocks is not monotonic
  operation_type INTEGER, -- -1 = delete, 0 = update, 1 = insert
  table_name TEXT,
  id
)
```
  * Note, that `operation_id` is marked as `AUTOINCREMENT` but `turso-
db` needs to implement
https://github.com/tursodatabase/turso/issues/1976 in order to properly
support that keyword
3. Query planner changes are made in `INSERT`/`UPDATE`/`DELETE` plans in
order to emit updates to the CDC table for changes in the table
  * Note, that row `UPDATE` which change primary key generate `DELETE` +
`INSERT` statement instead of single `UPDATE`
### Implementation details
- `PRAGMA` to enable CDC is **unstable** which means that publicly
visible side-effects/public API can change in future (and it will change
soon in order to support more rich CDC modes)
- CDC table is just a regular table with its benefits and downsides:
  * benefits: user can perform maintenance operations with that table
just with regular SQL like `DELETE FROM turso_cdc WHERE operation_id <
?` to cleanup old not needed CDC entries
  * downsides: user can accidentally make unwanted change to CDC table
- Changes to CDC table is not logged to itself
  * Note, that different connections (e.g. `C1`, `C2`) can have
different CDC tables set (e.g. `A` and `B`) - in which case changes made
to CDC table `B` through connection `C1` will be reflected in CDC table
`A`

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1926
2025-07-07 18:03:07 +03:00
Nikita Sivukhin
1655c0b84f small fixes 2025-07-07 12:50:10 +04:00
Pekka Enberg
7f91768ff6 core/translate: Unify no such table error messages
We're now mixing different error messages, which makes compatibility
testing pretty hard. Unify on a single, SQLite compatible error message
"no such table".
2025-07-07 11:10:46 +03:00
Nikita Sivukhin
1ee475f04a rename pragma to unsable_capture_data_changes_conn 2025-07-06 22:32:42 +04:00
Nikita Sivukhin
a10d423aac adjust schema 2025-07-06 22:30:57 +04:00
Nikita Sivukhin
62c1e38805 small fixes 2025-07-06 22:26:34 +04:00
Nikita Sivukhin
32fa2ac3ee avoid capturing changes in cdc table 2025-07-06 22:24:35 +04:00
Nikita Sivukhin
a988bbaffe allow to specify table in the capture_data_changes PRAGMA 2025-07-06 22:19:32 +04:00
Nikita Sivukhin
a3732939bd fix clippy again 2025-07-06 21:16:58 +04:00
Nikita Sivukhin
271b8e5bcd fix clippy 2025-07-06 21:16:58 +04:00
Nikita Sivukhin
40769618c1 small refactoring 2025-07-06 21:16:58 +04:00
Nikita Sivukhin
04f2efeaa4 small renames 2025-07-06 21:16:57 +04:00
Nikita Sivukhin
a82529f55a emit cdc changes for UPDATE / DELETE statements 2025-07-06 21:16:25 +04:00
Nikita Sivukhin
d72ba9877a emit turso_cdc table changes in Insert query plan 2025-07-06 21:16:25 +04:00
Nikita Sivukhin
cf7ae031c7 add ProgramBuilderFlags to the builder 2025-07-06 21:16:25 +04:00
Nikita Sivukhin
234dda322f handle change_capture pragma 2025-07-06 21:16:25 +04:00
Nikita Sivukhin
b0fc67a314 pass ownership or program to the pragma translators - just as with other statements 2025-07-06 21:16:25 +04:00
Nikita Sivukhin
3f0716b2a4 add capture_changes per-connection flag 2025-07-06 21:16:24 +04:00
Nikita Sivukhin
7ba8ab6efc add simple method for parsing pragma boolean value 2025-07-06 21:15:41 +04:00
Nikita Sivukhin
3e5bfb0083 copy comments about pragma flags from SQLite source code 2025-07-06 21:15:41 +04:00
Krishna Vishal
fc8403991b Fix Glob ScalarFunc to handle NULL and other Value types.
Fixes: https://github.com/tursodatabase/turso/issues/1953
2025-07-06 13:18:21 +05:30
Pekka Enberg
6e79a11dc7 Merge 'bindings/dart initial implementation' from Andika Tanuwijaya
re-upload

Closes #1911
2025-07-04 10:43:19 +03:00
Pekka Enberg
97ea4f1a80 Merge 'add libsql_disable_wal_checkpoint' from Pedro Muniz
Closes #1894

Closes #1920
2025-07-04 10:04:00 +03:00
Krishna Vishal
19d949521a Add a threshold to clip large page cache values to 0. This prevents
panic at runtime.
2025-07-04 10:24:10 +05:30
pedrocarlo
56d87cb916 move disable behavior to connection instead of checkpoint 2025-07-03 12:05:53 -03:00
pedrocarlo
db005c81a0 add option to disable wal checkpoint 2025-07-03 12:04:17 -03:00
Pekka Enberg
90e035b6b0 Merge 'Rollback schema support' from Pere Diaz Bou
Fixes #1890
Once rollback was implement we quickly saw that it lacked support for
schema changes so we had to re-estructure things a bit.
## Example of failure:
```bash
turso> begin;
turso> create table t(x);
turso> rollback;
turso> pragma integrity_check;
thread 'main' panicked at core/storage/sqlite3_ondisk.rs:386:36:
called `Result::unwrap()` on an `Err` value: Corrupt("Invalid page type: 83")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
This happened because it thought table `t` existed because we didn't
rollback that schema.
## Changes:
* The most important change: now every connection has a private copy of
schema. On write txn commit we update a global schema shared between
connections in order for new connections to get updated version from
there. In case of rollback, we simply change connection's schema to
previous version. This change allowed us to remove locks for schema
private copy and keeping schema changes locally in case of concurrency.
 Sqlite does things differently, they lazily parse schema in case of
outdated schema, this many schema changes to trigger reading schema from
db file which is slow. If we are able to keep local copy in memory, even
when if we add multiprocessing, it will speed up schema reloading by a
bunch.
* `schema_cookie` is now update for every schema change
* `Insn::ParseSchema` had a nasty bug where it would commit all the
changes made in a query that changed a schema, we fixed that by setting
`auto_commit` to `false` before parsing schema, and setting it back to
previous value once schema is parsed.

Closes #1928
2025-07-03 14:18:00 +03:00
Pere Diaz Bou
5eca507867 fix type null spacing 2025-07-03 12:53:15 +02:00
Pere Diaz Bou
2cfd209e56 clippy 2025-07-03 12:40:08 +02:00
Pere Diaz Bou
151debcb63 fix to_sql btreetable 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
6b16950488 fmt 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
470fb8d23b rollabck translate remove querymode 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
cde7202981 Revert "Merge 'core: Disable ROLLBACK statement' from Pekka Enberg"
This reverts commit 8a13e4b02f, reversing
changes made to cc935f97cc.
2025-07-03 12:36:48 +02:00
Pere Diaz Bou
f37893eb8f set cookie for index operations 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
ba988685cf set cookie create virtual table 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
5d856499c4 move update schema global on commit and not on rollback txn 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
2414502268 parse schema set auto_commit false in nested query 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
d8658264d9 alter set cookie 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
c799396c3d rollback schema in connection 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
5b733663ab update schema in case it's outdated 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
65a7fe13cf remove lock from private schema copy 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
abf1699dd2 set scheam version and update shared schema in txn 2025-07-03 12:36:48 +02:00
Pekka Enberg
fa442ecd6e core/storage: Switch to turso_assert in btree.rs
Let's help out Antithesis to find interesting bugs.
2025-07-03 13:25:13 +03:00
Pekka Enberg
c76625eb64 Merge 'fix: buffer pool is not thread safe problem' from KaguraMilet
This PR will close #1446. Buffer pool implementation should be thread
safe after this PR.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1910
2025-07-03 13:12:21 +03:00
Pekka Enberg
471d26a632 Merge 'Fix index update when INTEGER PRIMARY KEY (rowid alias)' from Adrian-Ryan Acala
When an `UPDATE` statement modifies a table's `INTEGER PRIMARY KEY`
(which acts as a `rowid` alias) alongside other indexed columns, the
index entries were incorrectly retaining the old `rowid`. This led to
stale index references, causing subsequent queries to return incorrect
results.
This change ensures that when the `rowid` alias is part of the `SET`
clause in an `UPDATE` statement, the new `rowid` value is used for
generating and updating index records. This guarantees that all index
entries correctly point to the updated row, resolving the data
inconsistency.
Fixes #1897

Closes #1916
2025-07-03 13:10:53 +03:00
KaguraMilet
f339e9c1ad fix integrity check error 2025-07-03 13:47:30 +08:00
KaguraMilet
562dd389db Merge branch 'tursodatabase:main' into buffer 2025-07-03 13:46:37 +08:00
Pekka Enberg
3bd5d4c732 core: Drop debugging code 2025-07-02 19:55:38 +03:00
Pekka Enberg
e8af5f1022 Merge 'from_uri was not passing mvcc and indexes flag to database creation for memory path' from Pedro Muniz
Closes #1932
2025-07-02 19:55:27 +03:00
pedrocarlo
191f732088 from_uri was not passing mvcc and indexes flag to database creation for memory path 2025-07-02 13:46:49 -03:00