Commit Graph

10494 Commits

Author SHA1 Message Date
Henrik Ingo
54a9821bcf Tighten Nyrkio p-value to 0.00001
This will produce even fewer alerts than before, but still catch
actual changes in performance.
2025-10-27 07:09:02 +02:00
Preston Thorpe
a024265d23 Merge 'Return null terminated strings from sqlite3_column_text' from Preston Thorpe
closes #3811
adds `text_cache` which owns the null terminated bytes, which get cached
if a subsequent call to `sqlite3_column_text` is made.
#3809 depends on this fix
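
The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Column` type, its fields other than `text_cache`, and the choice to truncate at the first interior NUL byte are all assumptions made for the example.

```rust
use std::ffi::{c_char, CString};

/// Hypothetical column value; only `text_cache` is named in the PR text.
pub struct Column {
    text: String,
    /// Owns the NUL-terminated copy so the returned pointer stays valid.
    text_cache: Option<CString>,
}

impl Column {
    pub fn new(text: &str) -> Self {
        Self { text: text.to_owned(), text_cache: None }
    }

    /// Returns a pointer to NUL-terminated bytes. The first call builds the
    /// cached copy; later calls reuse it.
    pub fn column_text(&mut self) -> *const c_char {
        let text = &self.text;
        let cached = self.text_cache.get_or_insert_with(|| {
            // Assumption: cut at the first interior NUL, since a C string
            // cannot represent bytes after it.
            let bytes = text.as_bytes();
            let end = bytes.iter().position(|&b| b == 0).unwrap_or(bytes.len());
            CString::new(&bytes[..end]).expect("interior NUL bytes were cut off")
        });
        cached.as_ptr()
    }
}
```

Because the `CString` is owned by the column, the pointer remains valid for as long as the column itself is alive and unchanged.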

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3817
2025-10-23 13:21:12 -04:00
PThorpe92
8ed4e7cac1 Add test for null terminated string from sqlite3_column_text 2025-10-23 10:54:19 -04:00
PThorpe92
23cddbcad9 Return null terminated strings from sqlite3_column_text 2025-10-23 10:13:42 -04:00
Jussi Saurio
ae22468d8b Merge 'Order by heap sort' from Nikita Sivukhin
This PR implements a simple heap-sort approach for query plans like
`SELECT ... FROM t WHERE ... ORDER BY ... LIMIT N` in order to maintain
a small set of the top N elements in the ephemeral B-tree and avoid
sorting and materializing the whole dataset.
I removed all optimizations not related to this particular change in
order to make the branch lightweight.
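
For readers unfamiliar with the technique, here is a minimal sketch of bounded top-N selection. It uses `std::collections::BinaryHeap` instead of the ephemeral B-tree the PR describes, and `top_n_ascending` is a hypothetical helper, so treat it as an illustration of the idea only.

```rust
use std::collections::BinaryHeap;

/// Returns the `limit` smallest values in ascending order, as
/// `ORDER BY x ASC LIMIT limit` would, without sorting the whole input.
/// Hypothetical helper for illustration; not part of the codebase.
pub fn top_n_ascending(rows: impl IntoIterator<Item = i64>, limit: usize) -> Vec<i64> {
    if limit == 0 {
        return Vec::new();
    }
    // Max-heap: the root is the largest of the rows kept so far,
    // i.e. the first candidate for eviction.
    let mut heap: BinaryHeap<i64> = BinaryHeap::with_capacity(limit);
    for row in rows {
        if heap.len() < limit {
            heap.push(row);
        } else if let Some(mut worst) = heap.peek_mut() {
            if row < *worst {
                // Overwrite the current worst row; the heap restores its
                // ordering when `worst` goes out of scope.
                *worst = row;
            }
        }
    }
    heap.into_sorted_vec()
}
```

Memory stays proportional to `limit` rather than to the number of input rows, which is the point of the optimization.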

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3726
2025-10-23 15:00:42 +03:00
Jussi Saurio
64560a61c3 Merge 'Support statement-level rollback via anonymous savepoints' from Jussi Saurio
## Gist
This PR implements _statement subtransactions_, which means that a
single statement within an interactive transaction can individually be
rolled back.
## Background
The default constraint violation resolution strategy in SQLite is
`ABORT`, which rolls back only the statement that caused the conflict.
For example:
```sql
CREATE TABLE t(x UNIQUE);
INSERT INTO t VALUES (1);

BEGIN;
  INSERT INTO t VALUES (2),(3); -- ok
  INSERT INTO t VALUES (4),(1); -- conflict on 1, this statement should rollback
  INSERT INTO t VALUES (5); -- ok
COMMIT; -- ok

SELECT * FROM t;
1
2
3
5
```
So far we haven't been able to support this due to lack of support for
subtransactions, and have used the `ROLLBACK` strategy, which rolls
back the entire transaction on any constraint error.
## Problem
Although PRIMARY KEY and UNIQUE constraints allow defining the conflict
resolution strategy (e.g. `id INTEGER PRIMARY KEY ON CONFLICT
ROLLBACK`), FOREIGN KEY violations do not support this: they always use
`ABORT` i.e. statement subtransaction rollback. For this reason alone it
is important to implement this mechanism now rather than later, since we
already have FOREIGN KEY support implemented.
## Details
This PR implements statement subtransactions with _anonymous
savepoints_. This means that whenever a statement begins, it will open a
new savepoint which will write "page undo images" into a temporary file
called a _subjournal_. Whenever the statement marks a page as dirty, it
will write the before-image of the page into the subjournal so that its
modifications can be undone in the event of an ABORT (statement
rollback).
- Right now, only anonymous savepoints are supported, so the explicit
`SAVEPOINT` syntax is not.
- Due to the above, there can be only one savepoint open per pager, and
this is enforced with assertions.
- The subjournal file is currently entirely in memory. If it were not,
we would either have to block on IO or refactor many usages of code to
account for potentially pending completions.
- Constraint errors no longer abort the transaction or clear the page
cache. Instead, subjournaled pages are brought back into the page cache,
which achieves the same effect at a finer granularity.
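
The before-image mechanism described above can be sketched with a toy pager. Everything here is assumed for illustration: `ToyPager`, its fields, and the method signatures are invented, pages are plain byte vectors, and the subjournal is a `Vec` rather than a file.

```rust
use std::collections::{HashMap, HashSet};

type PageId = u32;

/// Toy pager: pages live in a map and one anonymous savepoint may be open.
#[derive(Default)]
pub struct ToyPager {
    pages: HashMap<PageId, Vec<u8>>,
    /// In-memory "subjournal": (page id, before-image) pairs.
    /// `None` means the page did not exist before the statement.
    subjournal: Vec<(PageId, Option<Vec<u8>>)>,
    /// Pages already journaled by the current statement.
    journaled: HashSet<PageId>,
    savepoint_open: bool,
}

impl ToyPager {
    pub fn begin_statement(&mut self) {
        assert!(!self.savepoint_open, "only one savepoint per pager");
        self.savepoint_open = true;
    }

    /// Writes a page, journaling its before-image the first time the
    /// current statement touches it.
    pub fn write_page(&mut self, id: PageId, data: Vec<u8>) {
        if self.savepoint_open && self.journaled.insert(id) {
            self.subjournal.push((id, self.pages.get(&id).cloned()));
        }
        self.pages.insert(id, data);
    }

    /// Statement succeeded: discard the undo images.
    pub fn release_savepoint(&mut self) {
        self.clear();
    }

    /// Statement aborted: restore every before-image.
    pub fn rollback_statement(&mut self) {
        let images = std::mem::take(&mut self.subjournal);
        for (id, image) in images {
            match image {
                Some(bytes) => {
                    self.pages.insert(id, bytes);
                }
                None => {
                    self.pages.remove(&id);
                }
            }
        }
        self.clear();
    }

    pub fn read_page(&self, id: PageId) -> Option<&[u8]> {
        self.pages.get(&id).map(|p| p.as_slice())
    }

    fn clear(&mut self) {
        self.subjournal.clear();
        self.journaled.clear();
        self.savepoint_open = false;
    }
}
```

Each page is journaled at most once per statement, so the stored image is always the state from before the statement began.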

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3792
2025-10-23 15:00:11 +03:00
Pekka Enberg
418fc90f8a Merge 'core/storage: Cache schema cookie in Pager' from Pekka Enberg
Every transaction was reading page 1 from the WAL to check the schema
cookie in op_transaction, causing unnecessary WAL lookups.
This commit caches the schema_cookie in Pager as AtomicU64, similar to
how page_size and reserved_space are already cached. The cache is
updated when the header is read/modified and invalidated in
begin_read_tx() when WAL changes are detected from other connections.
This matches SQLite's approach of caching frequently accessed header
fields to avoid repeated page 1 reads. Improves write throughput by 5%
in our benchmarks.
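
A rough sketch of such a cache follows. The `CookieCache` type, the `NOT_CACHED` sentinel, and the memory orderings are assumptions for illustration; the text above only states that the cookie is stored in an `AtomicU64` and invalidated when WAL changes are detected.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Sentinel meaning "no cached value". The cookie itself is a u32, so it
/// can never collide with this. (Assumed encoding, for illustration.)
const NOT_CACHED: u64 = u64::MAX;

pub struct CookieCache {
    schema_cookie: AtomicU64,
}

impl CookieCache {
    pub fn new() -> Self {
        Self { schema_cookie: AtomicU64::new(NOT_CACHED) }
    }

    /// Returns the cached cookie, or calls `read_header` (standing in for
    /// the slow page 1 read) and remembers the result.
    pub fn get_or_load(&self, read_header: impl FnOnce() -> u32) -> u32 {
        match self.schema_cookie.load(Ordering::Acquire) {
            NOT_CACHED => {
                let cookie = read_header();
                self.schema_cookie.store(u64::from(cookie), Ordering::Release);
                cookie
            }
            cached => cached as u32,
        }
    }

    /// Called when the header is modified.
    pub fn set(&self, cookie: u32) {
        self.schema_cookie.store(u64::from(cookie), Ordering::Release);
    }

    /// Called when changes from another connection are detected.
    pub fn invalidate(&self) {
        self.schema_cookie.store(NOT_CACHED, Ordering::Release);
    }
}
```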

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3727
2025-10-23 14:00:27 +03:00
Jussi Saurio
c2b84f7484 Randomly inject txn control statements into index_mutation_upsert_fuzz 2025-10-22 23:40:45 +03:00
Jussi Saurio
2b73260dd9 Handle cases where DB grows or shrinks due to savepoint rollback 2025-10-22 23:40:45 +03:00
Jussi Saurio
fe51804e6b Implement crude way of making opening subtransaction conditional
We don't want something like `BEGIN IMMEDIATE` to start a subtransaction,
so instead we will open it if:

- Statement is write, AND

a) Statement has >0 table_references, or
b) The statement is an INSERT (INSERT doesn't track table_references in
   the same way as other program types)
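
Expressed as a predicate, the rule above might look like the sketch below. `StatementKind` and `should_open_subtransaction` are hypothetical names, not identifiers from the codebase.

```rust
/// Hypothetical classification of a compiled statement.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StatementKind {
    Insert,
    Other,
}

/// Opens a subtransaction only for write statements that touch a table.
/// INSERT is special-cased because it does not track table references
/// in the same way as other program types.
pub fn should_open_subtransaction(
    is_write: bool,
    kind: StatementKind,
    table_references: usize,
) -> bool {
    is_write && (table_references > 0 || kind == StatementKind::Insert)
}
```

Under this rule a statement such as `BEGIN IMMEDIATE`, which references no tables and is not an INSERT, does not start a subtransaction.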
2025-10-22 23:40:45 +03:00
Jussi Saurio
ea98d8086f Change default ON CONFLICT mode back to ABORT now that we support it 2025-10-22 23:40:45 +03:00
Jussi Saurio
e04c6c9b46 Mark pages_to_balance as dirty only after loading 2025-10-22 23:40:45 +03:00
Jussi Saurio
a14bbdecf2 Add assertion that page is loaded when pager.add_dirty() is called 2025-10-22 23:40:45 +03:00
Jussi Saurio
7376475cb3 Do not start statement subtransactions when MVCC is enabled
MVCC does not support statement-level rollback.
2025-10-22 23:40:45 +03:00
Jussi Saurio
e9bfb57065 Fix incorrectly implemented test
Test started executing another statement when previous statement
returned IO the last time and didn't run to completion
2025-10-22 23:40:45 +03:00
Jussi Saurio
1dcfd3d068 fix stale test: constraint errors do not roll back tx anymore 2025-10-22 23:40:45 +03:00
Jussi Saurio
d8cc57cf14 clippy: Remove unnecessary referencing 2025-10-22 23:40:45 +03:00
Jussi Saurio
2d3ac79fe9 Modify fk_deferred_constraints_fuzz
- Add more statements per iteration
- Allow interactive transaction to contain multiple statements
- add VERBOSE flag to print all statements executed in a successful
  iteration
2025-10-22 23:40:45 +03:00
Jussi Saurio
1fdc0258cd Unignore fk_deferred_constraints_fuzz because it doesn't fail anymore 2025-10-22 23:40:45 +03:00
Jussi Saurio
97aad78b3f Allow dead code - SQLITE_CONSTRAINT_FOREIGNKEY is currently unused 2025-10-22 23:40:45 +03:00
Jussi Saurio
086ba8c946 VDBE: begin statement subtransaction in op_transaction 2025-10-22 23:40:45 +03:00
Jussi Saurio
904cbe535d VDBE: handle subtransaction commits/aborts in op_halt 2025-10-22 23:40:45 +03:00
Jussi Saurio
f0548c280f ProgramState: add begin_statement() and end_statement() 2025-10-22 23:40:45 +03:00
Jussi Saurio
734eeb5bab VDBE: constraint errors do not cause a tx rollback by default 2025-10-22 23:40:45 +03:00
Jussi Saurio
25f8ba0025 Pager: clear savepoints when tx rolls back 2025-10-22 23:40:45 +03:00
Jussi Saurio
a8cf8e4594 Pager: subjournal page if required when it's marked as dirty 2025-10-22 23:40:45 +03:00
Jussi Saurio
97177dae02 add missing imports 2025-10-22 23:40:44 +03:00
Jussi Saurio
f4af7c2242 Pager: add begin_statement() method 2025-10-22 23:40:44 +03:00
Jussi Saurio
a19c5c22ac Pager: add rollback_to_newest_savepoint() method 2025-10-22 23:40:44 +03:00
Jussi Saurio
86d5ad6815 pager: allow upserted cached page not to be dirty 2025-10-22 23:40:44 +03:00
Jussi Saurio
5b01605fae Pager: add subjournal_page_if_required() method 2025-10-22 23:40:44 +03:00
Jussi Saurio
e8226c0e4b Pager: add clear_savepoint() method 2025-10-22 23:40:44 +03:00
Jussi Saurio
aa1eebbfcb Pager: add open_savepoint() and release_savepoint() methods 2025-10-22 23:40:44 +03:00
Jussi Saurio
77be1f08ae Pager: add open_subjournal method 2025-10-22 23:40:44 +03:00
Jussi Saurio
2a03c1a617 Add subjournal and savepoints to Pager struct 2025-10-22 23:40:44 +03:00
Jussi Saurio
8b15a06a85 Add Savepoint struct 2025-10-22 23:40:44 +03:00
Jussi Saurio
459c01f93c Add subjournal module
The subjournal is a temporary file where stmt subtransactions write an
'undo log' of pages before modifying them. If a stmt subtransaction
rolls back, the pages are restored from the subjournal.
2025-10-22 23:40:44 +03:00
Jussi Saurio
ad80285437 Rename is_scope to deferred and invert respective boolean logic
Much clearer name for what it is/does
2025-10-22 23:40:44 +03:00
Jussi Saurio
d4a9797f79 Store two foreign key counters in ProgramState
1. The number of deferred FK violations when the statement started.
   When a statement subtransaction rolls back, the connection's
   deferred violation counter will be reset to this value.
2. The number of immediate FK violations that occurred during the
   statement. In practice we just need to know whether this number
   is nonzero, and if it is, the statement subtransaction will roll
   back.

Statement subtransactions will be implemented in future commits.
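
How the two counters interact can be sketched as follows. `ToyConnection` and `StatementFkState` are invented names and the real `ProgramState` layout is not shown in this message, so this is only an illustration of the bookkeeping described above.

```rust
/// Hypothetical connection holding the deferred FK violation counter.
pub struct ToyConnection {
    pub deferred_fk_violations: i64,
}

/// Hypothetical per-statement state holding the two counters.
pub struct StatementFkState {
    /// Deferred violations on the connection when the statement started.
    deferred_at_start: i64,
    /// Immediate violations seen during this statement.
    immediate_violations: i64,
}

impl StatementFkState {
    pub fn begin(conn: &ToyConnection) -> Self {
        Self {
            deferred_at_start: conn.deferred_fk_violations,
            immediate_violations: 0,
        }
    }

    pub fn record_immediate_violation(&mut self) {
        self.immediate_violations += 1;
    }

    /// Ends the statement. Returns `true` if it must roll back, in which
    /// case the connection's deferred counter is reset to its value from
    /// the start of the statement.
    pub fn end(self, conn: &mut ToyConnection) -> bool {
        if self.immediate_violations > 0 {
            conn.deferred_fk_violations = self.deferred_at_start;
            true
        } else {
            false
        }
    }
}
```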
2025-10-22 23:40:44 +03:00
Jussi Saurio
6557a41503 Refactor emit_fk_violation() to always issue a FkCounter instruction 2025-10-22 23:40:44 +03:00
Nikita Sivukhin
6aa67c6ea0 Revert "slight reorder of operations"
This reverts commit 8e107ab18e.
2025-10-22 20:21:52 +04:00
Nikita Sivukhin
a071d40d5f Revert "faster extend_from_slice"
This reverts commit ae8adc0449.
2025-10-22 20:21:47 +04:00
Nikita Sivukhin
91ffb4e249 Revert "avoid allocations"
This reverts commit dba195bdfa.
2025-10-22 20:21:39 +04:00
Nikita Sivukhin
53957b6d22 Revert "simplify serial_type size calculation"
This reverts commit f19c73822e.
2025-10-22 20:21:00 +04:00
Nikita Sivukhin
b32d22a2fd Revert "move more possible option higher"
This reverts commit c0fdaeb475.
2025-10-22 20:20:54 +04:00
Nikita Sivukhin
8e1cec5104 Revert "alternative read_variant implementation"
This reverts commit 68650cf594.
2025-10-22 19:30:43 +04:00
Pekka Enberg
7b6667d079 Merge 'Add AtomicEnum proc macro to generate atomic wrappers to replace RwLocks' from Preston Thorpe
This PR adds the `AtomicEnum` derive macro for cases like the
following:
```rust
pub enum SyncMode {
    Off = 0,
    Full = 2,
}
// or
pub enum CipherMode {
    Aes128Gcm,
    Aes256Gcm,
    Aegis256,
    Aegis128L,
    Aegis128X2,
    Aegis128X4,
    Aegis256X2,
    Aegis256X4,
}
```
These are very basic enums, but they currently require either a
`RwLock` (the current solution for both of the above) or a hand-rolled
atomic wrapper to keep the state without the lock.
```rust
use std::sync::atomic::AtomicUsize;

pub struct AtomicDbState(AtomicUsize);

impl AtomicDbState {
    #[inline]
    pub const fn new(state: DbState) -> Self {
        Self(AtomicUsize::new(state as usize))
    }
    // ... remaining methods omitted
}
```
This PR adds the `AtomicEnum` derive macro, which generates wrappers
such as `AtomicDbState` or `AtomicCipherMode` and derives `get`, `set`
and `swap` methods on them.
Each enum can have up to 1 named or unnamed field, and it supports i8/u8
and boolean types, which it encodes into half of a u16, with the
discriminant in the other half. Otherwise, it will just use a u8 and
encode the boolean into the 7th bit.
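
To make the shape of such a wrapper concrete, here is a hand-written equivalent for the field-less `SyncMode` enum shown earlier. It is a sketch of what a generated wrapper could look like, not the macro's actual output: the `AtomicSyncMode` name, the `u8` storage, and the `SeqCst` orderings are assumptions.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum SyncMode {
    Off = 0,
    Full = 2,
}

/// Lock-free holder for a `SyncMode`, stored as its discriminant.
pub struct AtomicSyncMode(AtomicU8);

impl AtomicSyncMode {
    pub const fn new(mode: SyncMode) -> Self {
        Self(AtomicU8::new(mode as u8))
    }

    pub fn get(&self) -> SyncMode {
        Self::decode(self.0.load(Ordering::SeqCst))
    }

    pub fn set(&self, mode: SyncMode) {
        self.0.store(mode as u8, Ordering::SeqCst);
    }

    /// Stores `mode` and returns the previous value.
    pub fn swap(&self, mode: SyncMode) -> SyncMode {
        Self::decode(self.0.swap(mode as u8, Ordering::SeqCst))
    }

    fn decode(raw: u8) -> SyncMode {
        match raw {
            0 => SyncMode::Off,
            2 => SyncMode::Full,
            // Only valid discriminants are ever stored.
            _ => unreachable!("invalid SyncMode discriminant"),
        }
    }
}
```

A derive macro removes the need to write and maintain the `decode` match by hand for every enum.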

Closes #3766
2025-10-22 17:07:58 +03:00
Pekka Enberg
5dd503b7b9 core/storage: Cache schema cookie in Pager
Every transaction was reading page 1 from the WAL to check the schema cookie
in op_transaction, causing unnecessary WAL lookups.

This commit caches the schema_cookie in Pager as AtomicU64, similar to how
page_size and reserved_space are already cached. The cache is updated when the
header is read/modified and invalidated in begin_read_tx() when WAL changes
are detected from other connections.

This matches SQLite's approach of caching frequently accessed header fields to
avoid repeated page 1 reads. Improves write throughput by 5% in our
benchmarks.
2025-10-22 16:51:15 +03:00
Nikita Sivukhin
689c11a21a cargo fmt 2025-10-22 17:45:49 +04:00
Nikita Sivukhin
0fb149c4c9 fix bug 2025-10-22 17:44:02 +04:00