Commit Graph

218 Commits

Author SHA1 Message Date
Pekka Enberg
0c9216d1cc Merge 'cdc: emit entries for schema changes' from Nikita Sivukhin
This PR emits CDC entries as changes in the `sqlite_schema` table for DDL
statements: `CREATE TABLE` / `CREATE INDEX` / etc.
The logic is a bit tricky because, under the hood, `turso` can perform
implicit DDL operations such as:
1. Creating auto-indexes in case of `CREATE TABLE`
2. Deleting all attached indices in case of `DROP TABLE`
```
turso> PRAGMA unstable_capture_data_changes_conn('full');
turso> CREATE TABLE t(x, y, z UNIQUE, q, PRIMARY KEY (x, y));
turso> CREATE INDEX t_xy ON t(x, y);
turso> CREATE TABLE q(a, b, c);
turso> ALTER TABLE q DROP COLUMN b;

turso> SELECT
change_id, 
id,
change_type, 
table_name,
bin_record_json_object(table_columns_json_array(table_name), before) AS before,
bin_record_json_object(table_columns_json_array(table_name), after) AS after
FROM turso_cdc;
┌───────────┬────┬─────────────┬───────────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ change_id │ id │ change_type │ table_name    │ before                                                              │ after                                                               │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         1 │  2 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"t","tbl_name":"t","rootpage":3,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         2 │  5 │           1 │ sqlite_schema │                                                                     │ {"type":"index","name":"t_xy","tbl_name":"t","rootpage":6,"sql":"C… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         3 │  6 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         4 │  6 │           0 │ sqlite_schema │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
└───────────┴────┴─────────────┴───────────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘
```
For now, CDC captures only explicit operations and ignores all
implicit ones. The reasoning is that one use case for CDC
is to apply logical changes as is with simple SQL statements. If
implicit operations were logged to the CDC table too, replaying them
with simple SQL statements would be hard (for example, creation of
`autoindices` will always work; implicit deletion of indices for `DROP
TABLE` can also cause trouble and force us to use `DROP INDEX IF
EXISTS ...` statements, and we would need to filter out autoindices in
this case too).
Also, to simplify the PR, for now `DatabaseTape` from the `turso-sync`
package just ignores all schema changes from the CDC table.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2426
2025-08-06 14:48:27 +03:00
PThorpe92
4010c7bf0b Make clippy happy 2025-08-05 21:24:54 -04:00
Nikita Sivukhin
c0d5c55d5c fix tests and clippy 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
9c4147e8d6 add simple integration tests 2025-08-06 01:03:49 +04:00
PThorpe92
f6a68cffc2 Remove RefCell from IO and Page apis 2025-08-05 16:24:49 -04:00
Jussi Saurio
d13a1a0eec Merge 'test/fuzz: add ALTER TABLE column ops to tx isolation fuzz test' from Jussi Saurio
## Beef
Adds `AddColumn`, `DropColumn`, `RenameColumn`
## Details
- Previously the test was hardcoded to assume there's always 2 named
columns, so changed a bunch of things for this reason
- Still assumes the primary key column is always `id` and is never
renamed or dropped etc.

Closes #2434
2025-08-05 15:42:31 +03:00
Pere Diaz Bou
2a3e2349ca tests/fuzz_transactions: add tests for fuzzing transactions with MVCC 2025-08-05 11:43:26 +02:00
Jussi Saurio
cd79d2dce5 test/fuzz: add ALTER TABLE column ops to tx isolation fuzz test 2025-08-05 10:05:56 +03:00
Jussi Saurio
685615dc98 test/fuzz/txn: remove assumption about hardcoded column count 2025-08-05 10:04:25 +03:00
Jussi Saurio
a66b56678d Merge 'Reprepare Statements when Schema changes' from Pedro Muniz
Closes #1967
To support this I had to change how we did `epilogue` similarly to how
SQLite does it. SQLite first declares a `beginWriteOperation` when some
statement is going to necessitate a Write Transaction. And as we now
need to pass the current schema cookie to `epilogue` it was easier to
call epilogue only in one location (like we do with prologue), and just
have each statement declare their intentions separately. This allows us
to not have to pass the Schema around just to do the epilogue. I believe
this is something that @jussisaurio would be interested in.
~Also had to disable the MVCC test, as it was extremely buggy for me.~
Just disabled reprepare statements for MVCC

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2214
2025-08-05 00:01:14 +03:00
Jussi Saurio
1e59165ea6 Merge 'More State Machines in preparation for tracking IO Completions' from Pedro Muniz
More changes. I want to avoid big PRs, so doing these changes in small
increments. I think in like 2 PRs after this one, I will be able to make
the change effectively.

Closes #2400
2025-08-05 00:00:09 +03:00
pedrocarlo
aa05616845 fix tests 2025-08-04 13:08:30 -03:00
pedrocarlo
0e3e64878c workaround the fact that to reparse schema we have to avoid falling into a reprepared statement loop 2025-08-04 12:32:34 -03:00
pedrocarlo
f0ff85a43c add test 2025-08-04 12:32:34 -03:00
Nikita Sivukhin
76bdf0c1ab small fixes 2025-08-04 17:02:53 +04:00
Nikita Sivukhin
2e23230e79 extend raw WAL API with few more methods
- try_wal_watermark_read_page - try to read page from the DB with given WAL watermark value
- wal_changed_pages_after - return set of unique pages changed after watermark WAL position
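
A minimal sketch of the "unique pages changed after a watermark" idea, assuming the WAL is modeled as a plain list of page numbers, one per frame. The function name and signature below are hypothetical and are not the actual raw WAL API:

```rust
use std::collections::BTreeSet;

/// Hypothetical model: `frame_pages[i]` is the page number written by WAL
/// frame `i + 1` (frames are numbered from 1). `watermark` is the number of
/// frames the caller has already seen.
///
/// Returns the set of distinct pages touched by frames after the watermark.
fn changed_pages_after(frame_pages: &[u32], watermark: usize) -> BTreeSet<u32> {
    frame_pages
        .iter()
        .skip(watermark) // ignore frames at or before the watermark
        .copied()
        .collect() // the set deduplicates pages written more than once
}
```

A set is used because the same page can appear in many frames, while a caller syncing pages only needs each page number once.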
2025-08-04 16:55:50 +04:00
Nikita Sivukhin
83b1e99a61 fix compilation 2025-08-04 12:53:07 +04:00
Jussi Saurio
addb067416 chore: move tx isolation fuzz test to 'tests' 2025-08-01 13:02:05 +03:00
meteorgan
6262ff4267 support offset for values 2025-08-01 00:46:46 +08:00
Jussi Saurio
e128bd477e Merge 'Support VALUES clauses for compound select' from meteorgan
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2293
2025-07-30 21:34:40 +03:00
Pekka Enberg
895f2acbfb Merge 'Fix concat_ws to match sqlite behavior' from bit-aloo
closes: #2101
Refactors exec_concat_ws to skip null and blob arguments instead of
inserting separators for them. Also adds a fuzz test.
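
The skipping behavior described above can be sketched roughly as follows. This is an illustration only: the `Value` type and this `concat_ws` signature are hypothetical stand-ins, not the real `exec_concat_ws` code.

```rust
/// Simplified SQL value model (hypothetical, not the engine's real type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Text(String),
    Blob(Vec<u8>),
}

/// Joins the printable arguments with `sep`. NULL and blob arguments are
/// skipped entirely, so no separator is emitted on their behalf.
/// A NULL separator (modeled as `None`) yields NULL (`None`).
fn concat_ws(sep: Option<&str>, args: &[Value]) -> Option<String> {
    let sep = sep?;
    let parts: Vec<String> = args
        .iter()
        .filter_map(|v| match v {
            Value::Text(s) => Some(s.clone()),
            Value::Integer(i) => Some(i.to_string()),
            // Skipped: contributes neither text nor a separator.
            Value::Null | Value::Blob(_) => None,
        })
        .collect();
    Some(parts.join(sep))
}
```

Filtering before joining is what prevents doubled separators: with arguments `'a', NULL, 2` and separator `','`, the sketch produces `a,2` rather than `a,,2`.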

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2338
2025-07-30 21:31:58 +03:00
Jussi Saurio
438cbf2872 test/wal api: add comment about purpose of rollback test 2025-07-30 18:17:07 +03:00
bit-aloo
be36fe12c6 add fuzz test for concat_ws 2025-07-30 17:54:51 +05:30
PThorpe92
8ec99a9143 Remove assert for !NO_LOCK_HELD, properly handle writing header if reset 2025-07-30 14:08:51 +03:00
PThorpe92
5c1dbd1a9f Remove unused import 2025-07-30 14:08:33 +03:00
PThorpe92
3db72cf111 Just forget Full checkpoint mode for now, comment out compat test 2025-07-30 14:08:33 +03:00
PThorpe92
436747536c Add integration test for truncate checkpointing wal 2025-07-30 14:08:33 +03:00
pedrocarlo
3831e0db39 convert must_use compile warnings to unused_variables to track locations where we need to refactor in the future 2025-07-28 16:09:26 -03:00
Nikita Sivukhin
d8be1cbef1 fix after rebase 2025-07-28 17:20:57 +04:00
Nikita Sivukhin
4d25cda1e2 slightly adjust one test 2025-07-28 17:20:10 +04:00
Nikita Sivukhin
eb32ea49e6 fix tests 2025-07-28 17:20:10 +04:00
meteorgan
f0c2c377c4 fix typo 2025-07-28 01:01:03 +08:00
meteorgan
ea660b947d support VALUES clauses for compound select 2025-07-27 19:13:23 +08:00
FHaggs
ef88b9914a Fix clippy warnings 2025-07-25 15:41:49 -03:00
FHaggs
ab8040aa89 Add fuzz test for float sums 2025-07-25 15:26:43 -03:00
Pere Diaz Bou
805bcfe633 Merge 'Ignore WAL frames after bad checksum' from Pere Diaz Bou
SQLite basically ignores bad frames instead of panicking, let's try to
do the same.
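
As a rough sketch of the idea, assuming a simplified frame layout and a toy chained checksum (this is not SQLite's actual WAL checksum algorithm, and the `Frame` type is hypothetical), scanning stops at the first frame that fails verification instead of panicking:

```rust
/// Hypothetical, simplified WAL frame.
struct Frame {
    page_no: u32,
    payload: Vec<u8>,
    checksum: u32,
}

/// Toy running checksum: each frame's checksum depends on the previous
/// frame's checksum, so the frames form a chain.
fn chain_checksum(prev: u32, page_no: u32, payload: &[u8]) -> u32 {
    let mut sum = prev.wrapping_mul(31).wrapping_add(page_no);
    for &b in payload {
        sum = sum.wrapping_mul(31).wrapping_add(u32::from(b));
    }
    sum
}

/// Returns how many leading frames are valid. The first frame with a bad
/// checksum, and everything after it, is ignored rather than treated as a
/// fatal error.
fn valid_frame_count(frames: &[Frame]) -> usize {
    let mut prev = 0u32;
    for (i, frame) in frames.iter().enumerate() {
        let expected = chain_checksum(prev, frame.page_no, &frame.payload);
        if expected != frame.checksum {
            return i;
        }
        prev = expected;
    }
    frames.len()
}
```

Because the checksum is chained, frames after a corrupted one cannot be trusted even if their own bytes are intact, which is why the scan returns early.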

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1956
2025-07-25 15:31:12 +02:00
Nikita Sivukhin
27fcb81f48 add more complex schema changes test for raw WAL API 2025-07-24 22:43:31 +04:00
Pere Diaz Bou
8150a72550 check frame number is not 0
clippy

fmt

fix after rebase

clippy
2025-07-24 17:30:17 +02:00
Jussi Saurio
37955e9a04 Pager/WAL: fix not clearing stale page cache
SQLite behavior is: if another connection has modified the DB when a
read tx starts, the connection must clear its page cache because it
may contain stale versions of pages.

In the future, we may want to do either:
1. a more granular invalidation logic for per-conn cache, or
2. a shared versioned page cache

But right now we must follow SQLite so that our current behavior does
not corrupt data
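
A minimal sketch of that invalidation rule, assuming a per-connection cache and a single change marker that other connections bump when they write (both `ConnCache` and `current_version` are hypothetical names, not the pager's real types):

```rust
use std::collections::HashMap;

/// Hypothetical per-connection page cache keyed by page number.
struct ConnCache {
    pages: HashMap<u32, Vec<u8>>,
    /// Change marker observed the last time the cache was known to be fresh.
    seen_version: u64,
}

impl ConnCache {
    fn new() -> Self {
        ConnCache { pages: HashMap::new(), seen_version: 0 }
    }

    /// Called when a read transaction starts. If the database changed since
    /// this connection last looked, every cached page is dropped, because
    /// any of them might be stale. Returns `true` if the cache was cleared.
    fn begin_read_tx(&mut self, current_version: u64) -> bool {
        if current_version != self.seen_version {
            self.pages.clear();
            self.seen_version = current_version;
            true
        } else {
            false
        }
    }
}
```

Clearing everything is the coarse option; the more granular or versioned designs listed above would replace the `clear()` call with per-page checks.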
2025-07-24 16:23:12 +03:00
Pere Diaz Bou
2ae3b3004e ignore wal frames after bad checksum
SQLite basically ignores bad frames instead of panicking, let's try to
do the same.
2025-07-24 15:11:35 +02:00
Pekka Enberg
62f5a42008 Merge 'WAL insert API: force schema re-parse if necessary after WAL sync session end' from Nikita Sivukhin
This PR partially fixes an issue where schema changes were invisible
after WAL sync calls. Now, `wal_insert_end` always reads a fresh schema
cookie and re-parses the schema from scratch if the cookie changed.
Generally, the problem of "silent" schema updates can be more generic
if (when?) `turso-db` supports a multi-process setup. But for now only
a single process can work with `turso-db`, so I decided to inject the
re-parse logic explicitly in the WAL raw API in order to not introduce
any unnecessary overhead in the ordinary execution path.
This fix is not complete: if there are already prepared statements,
they should be re-prepared too in case of schema changes. But this
problem is already tracked in the PR
https://github.com/tursodatabase/turso/pull/2214
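
The cookie check can be sketched as below. This is only an illustration under assumed names: `Connection`, `end_wal_session`, and the `load_schema` callback are hypothetical and do not mirror the real `wal_insert_end` signature.

```rust
/// Hypothetical connection state holding the last schema cookie it parsed.
struct Connection {
    schema_cookie: u32,
    schema: Vec<String>,
}

impl Connection {
    /// Models the end of a WAL sync session: compare the freshly read cookie
    /// with the cached one and re-parse the schema only when they differ.
    /// `load_schema` stands in for reading `sqlite_schema` from scratch.
    /// Returns `true` if a re-parse happened.
    fn end_wal_session(
        &mut self,
        fresh_cookie: u32,
        load_schema: impl FnOnce() -> Vec<String>,
    ) -> bool {
        if fresh_cookie == self.schema_cookie {
            return false; // nothing changed, keep the ordinary path cheap
        }
        self.schema = load_schema();
        self.schema_cookie = fresh_cookie;
        true
    }
}
```

Keeping the check at the session boundary means regular statement execution pays no extra cost, which matches the stated goal of avoiding overhead in the ordinary path.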

Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #2246
2025-07-24 14:39:46 +03:00
Jussi Saurio
025ea8808a Merge 'WAL insert: mark pages as dirty' from Nikita Sivukhin
The WAL insert API introduced in #2231 works incorrectly as it never
marks inserted pages as dirty.
This PR fixes this issue and also adds a simple fuzz test which fails
without the fix.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2245
2025-07-24 12:58:01 +03:00
Nikita Sivukhin
10836510df remove tracing_subscriber 2025-07-24 11:52:07 +04:00
Nikita Sivukhin
6daa6d07f1 re-parse schema if necessary after WAL sync end 2025-07-24 11:52:07 +04:00
Nikita Sivukhin
fb83862013 fix clippy 2025-07-24 11:49:39 +04:00
Nikita Sivukhin
435ca7fe7a add fuzz tests for raw WAL API 2025-07-24 11:49:39 +04:00
Jussi Saurio
52b4c22be9 Merge 'fix: SUM returns correct float for mixed numeric/non-numeric types & return value on empty set' from Axel Tobieson Rova
# Fix SUM aggregate function for mixed types
Fixes #2133
The SUM aggregate function was returning incorrect results when
processing tables with mixed numeric and non-numeric values. According
to SQLite documentation:
> "If any input to sum() is neither an integer nor a NULL, then sum()
returns a floating point value"
[*](https://sqlite.org/lang_aggfunc.html)
Now both SQLite and Turso yield the same output of 44.0.
--
I modified `Sum` to increment only for numeric values, skipping non-
numeric values. However, if we have mixed numeric values or non-numeric
values, we return a float output. Added a flag to keep track of it.
As pointed out by @FHaggs, if there are no non-NULL input rows then
sum() returns NULL but total() returns 0.0. I decided to include it in
this PR as well. The empty set was such a natural test case.
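
The rules described above can be sketched as a small accumulator. This is a simplified model, not the actual `Sum` implementation: `SqlValue` and `SumState` are hypothetical, all text is treated as non-numeric, and integer overflow handling is omitted.

```rust
/// Simplified SQL value model (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum SqlValue {
    Null,
    Integer(i64),
    Float(f64),
    Text(String),
}

#[derive(Default)]
struct SumState {
    int_sum: i64,
    float_sum: f64,
    /// Set once any input is neither an integer nor NULL.
    saw_non_integer: bool,
    /// Set once any non-NULL input is seen.
    saw_non_null: bool,
}

impl SumState {
    fn step(&mut self, v: &SqlValue) {
        match v {
            SqlValue::Null => {}
            SqlValue::Integer(i) => {
                self.saw_non_null = true;
                self.int_sum += *i;
                self.float_sum += *i as f64;
            }
            SqlValue::Float(f) => {
                self.saw_non_null = true;
                self.saw_non_integer = true;
                self.float_sum += *f;
            }
            // Assumption: text adds nothing but forces a float result.
            SqlValue::Text(_) => {
                self.saw_non_null = true;
                self.saw_non_integer = true;
            }
        }
    }

    /// sum(): NULL when there were no non-NULL inputs, float if any input
    /// was neither integer nor NULL, integer otherwise.
    fn sum(&self) -> SqlValue {
        if !self.saw_non_null {
            SqlValue::Null
        } else if self.saw_non_integer {
            SqlValue::Float(self.float_sum)
        } else {
            SqlValue::Integer(self.int_sum)
        }
    }

    /// total(): always a float, 0.0 when there were no non-NULL inputs.
    fn total(&self) -> f64 {
        self.float_sum
    }
}
```

The two flags are kept separate because they answer different questions: one decides between NULL and a number, the other between an integer and a float result.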

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2182
2025-07-24 10:08:01 +03:00
Nikita Sivukhin
2283a04aab add more tests 2025-07-23 11:31:00 +04:00
Nikita Sivukhin
16763e1500 implement raw WAL write api 2025-07-23 11:30:59 +04:00
Axel
9d05344258 Fix Sum() return value if there are no non-NULL input rows
Add simple fuzz test for total and sum.
2025-07-22 17:38:09 +02:00