The MVCC bootstrap connection got stuck in an infinite statement
reparsing loop because the bootstrap procedure ran before the
on-disk schema was deserialized.
SQLite has a crazy easter egg: at the 1 GiB file offset, it creates a
`PENDING_BYTE_PAGE` that is used only by the VFS layer and is never
read or written.
To test this properly, I took inspiration from SQLite's testing framework
and defined a helper method that is conditionally compiled when the
`test_helper` feature is enabled.
https://github.com/sqlite/sqlite/blob/7e38287da43ea3b661da3d8c1f431aa907d648c9/src/main.c#L4327
As the `PENDING_BYTE` is normally at the 1 GiB mark, I created a
function that sets the static `PENDING_BYTE` atomic to whatever
value we want. This means we can test this unusual behaviour at any DB
file size we want.
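A minimal sketch of that idea, not the actual helper from this PR: the names `set_pending_byte` and `pending_byte_page` are hypothetical, and the page formula follows SQLite's `PENDING_BYTE_PAGE` macro (`PENDING_BYTE / page_size + 1`).

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// SQLite's default pending byte offset: 1 GiB.
const DEFAULT_PENDING_BYTE: u32 = 0x4000_0000;

/// Process-wide pending byte offset, kept in an atomic so tests can move it.
static PENDING_BYTE: AtomicU32 = AtomicU32::new(DEFAULT_PENDING_BYTE);

/// Test-only override (hypothetical name). In a real crate this would sit
/// behind `#[cfg(feature = "test_helper")]`; it is left ungated here so the
/// sketch compiles on its own.
pub fn set_pending_byte(offset: u32) {
    PENDING_BYTE.store(offset, Ordering::SeqCst);
}

/// 1-based number of the page that contains the pending byte, mirroring
/// SQLite's `PENDING_BYTE_PAGE` macro: `PENDING_BYTE / page_size + 1`.
pub fn pending_byte_page(page_size: u32) -> u32 {
    PENDING_BYTE.load(Ordering::SeqCst) / page_size + 1
}
```

With a 4096-byte page size the default offset lands on page 262145. Moving the offset to 64 KiB puts it on page 17, so a small test database reaches the reserved page almost immediately.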
`fuzz_pending_byte_database` is the test that fuzzes different pending
byte offsets and runs an integrity check at the end to confirm that we
are compatible with SQLite.
Closes #2749
<img width="1100" height="740" alt="image" src="https://github.com/user-attachments/assets/06eb258f-b4b4-47bf-85f9-df1cf411e1df" />
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #3431
**Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables**
Table ID is an opaque identifier that is only meaningful to the MV
store.
Each checkpointed MVCC table corresponds to a single B-tree on the
pager,
which naturally has a root page.
**We cannot use root page as the MVCC table ID directly because:**
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
**Hence:**
- MVCC table IDs are always negative
- sqlite_schema rows will have a negative rootpage column if the
table has not been checkpointed yet.
- on checkpoint, when the table is allocated a real root page, we update
the row in sqlite_schema and in the MV store's internal mapping
**On recovery:**
- All sqlite_schema tables are read directly from disk and assigned
`table_id = -1 * root_page` -- root_page on disk must be positive
- The logical log is deserialized and inserted into the MV store
- Schema changes from the logical log are captured into the DB's global
schema
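The ID rules above can be sketched roughly as follows. This is an illustration only: `TableIds` and its methods are hypothetical names, not the MV store's API. The sketch also assumes that the negative rootpage written for a non-checkpointed table is the table ID itself, and that new IDs are chosen below every ID already in use; the description above does not spell out either detail.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the MV store's table-ID bookkeeping.
#[derive(Default)]
struct TableIds {
    /// table_id (always negative) -> root page, once one has been allocated.
    root_pages: BTreeMap<i64, Option<i64>>,
}

impl TableIds {
    /// Recovery: a table found in the on-disk sqlite_schema gets `-root_page`.
    fn recover(&mut self, root_page: i64) -> i64 {
        assert!(root_page > 0, "root_page on disk must be positive");
        let table_id = -root_page;
        self.root_pages.insert(table_id, Some(root_page));
        table_id
    }

    /// MVCC commit: a new table gets a negative ID before any root page exists.
    fn create(&mut self) -> i64 {
        // Assumption: pick an ID below every ID already in use to avoid clashes.
        let table_id = self.root_pages.keys().next().map_or(-1, |min| min - 1);
        self.root_pages.insert(table_id, None);
        table_id
    }

    /// Checkpoint: the pager has allocated a real root page for the table.
    fn checkpoint(&mut self, table_id: i64, root_page: i64) {
        assert!(root_page > 0, "a real root page is positive");
        self.root_pages.insert(table_id, Some(root_page));
    }

    /// Value for the sqlite_schema rootpage column: the real root page if the
    /// table is checkpointed, otherwise the negative table ID (assumption).
    fn schema_rootpage(&self, table_id: i64) -> Option<i64> {
        self.root_pages
            .get(&table_id)
            .map(|root_page| root_page.unwrap_or(table_id))
    }
}
```

Keeping the mapping inside the MV store lets the table ID stay stable across a checkpoint while only the rootpage value stored in sqlite_schema changes sign.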
**Note about recovery:**
I changed MVCC recovery to happen on DB initialization, which should
prevent any races, so there is no need for `recover_lock`, right @pereman2?
Closes #3419
This PR bundles 2 fixes:
1. Index search must skip NULL values
2. UPDATE must avoid using an index whose column is used in the SET clause
* Using such an index was an optimization to avoid a full scan for
`UPDATE t SET ... WHERE col = ?`, but instead of this hack we must
properly load the updated row set into an ephemeral index and flush it
after the update finishes, rather than modifying the BTree in place
* So, for now we completely remove this optimization and quietly
wait for the proper optimization to land
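The rule in fix 2 can be sketched as a simple planner check. This is a hedged illustration with a hypothetical function name and plain column-name slices, not the planner's real data structures:

```rust
/// Returns true if an index may drive the scan of an UPDATE statement.
/// An index is rejected when any of its columns is assigned in the SET
/// clause, because the update would modify the BTree being iterated.
/// Column names are compared case-insensitively, as in SQL.
fn index_usable_for_update(index_columns: &[&str], set_columns: &[&str]) -> bool {
    !index_columns.iter().any(|index_col| {
        set_columns
            .iter()
            .any(|set_col| set_col.eq_ignore_ascii_case(index_col))
    })
}
```

For `UPDATE t SET col = ? WHERE col = ?`, an index on `col` fails this check, so the statement falls back to a full scan until the ephemeral-index approach lands.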
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #3459
The encryption feature was behind a compiler flag that was not
enabled by default. This patch:
- enables the compiler flag by default
- adds an opt-in runtime flag, `experimental-encryption`
- keeps the runtime flag disabled by default
Closes #3457
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.
We cannot use root page as the MVCC table ID directly because:
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
Hence:
- we store the mapping between table ID and B-tree root page
- sqlite_schema rows will have a negative rootpage column if the
table has not been checkpointed yet.
We caught a pretty bad bug quite late because this fuzz test only
ran on btree changes - let's run it on every CI run, but with fewer
iterations than before.
Before, we validated that condition during program emit, which works
for fixed parameter values but not for variables supplied externally
to the prepared statement.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3421
Fixes #1976 and #1605
```zsh
turso> DROP TABLE IF EXISTS t;
CREATE TABLE t (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t │ 1 │
└──────┴─────┘
turso> DROP TABLE IF EXISTS t;
CREATE TABLE t (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t │ 1 │
└──────┴─────┘
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t │ 2 │
└──────┴─────┘
turso>
```
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #2983