we were providing the same damn arguments to `.cell_get()` and
`.cell_get_raw_region()` over and OVER and **OVER** and `O V E R`
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2021
Separates cursor.key_exists_in_index() into a state machine. The problem with
the main branch implementation is this:
`return_if_io!(seek)`
`return_if_io!(cursor.record())`
The latter may yield on IO and cause the seek to start over, causing an infinite
loop. With an explicit state machine we can control and prevent this.
Fixes#1890
Once rollback was implement we quickly saw that it lacked support for
schema changes so we had to re-estructure things a bit.
## Example of failure:
```bash
turso> begin;
turso> create table t(x);
turso> rollback;
turso> pragma integrity_check;
thread 'main' panicked at core/storage/sqlite3_ondisk.rs:386:36:
called `Result::unwrap()` on an `Err` value: Corrupt("Invalid page type: 83")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
This happened because it thought table `t` existed because we didn't
rollback that schema.
## Changes:
* The most important change: now every connection has a private copy of
schema. On write txn commit we update a global schema shared between
connections in order for new connections to get updated version from
there. In case of rollback, we simply change connection's schema to
previous version. This change allowed us to remove locks for schema
private copy and keeping schema changes locally in case of concurrency.
Sqlite does things differently, they lazily parse schema in case of
outdated schema, this many schema changes to trigger reading schema from
db file which is slow. If we are able to keep local copy in memory, even
when if we add multiprocessing, it will speed up schema reloading by a
bunch.
* `schema_cookie` is now update for every schema change
* `Insn::ParseSchema` had a nasty bug where it would commit all the
changes made in a query that changed a schema, we fixed that by setting
`auto_commit` to `false` before parsing schema, and setting it back to
previous value once schema is parsed.
Closes#1928
Page 1 must be initialized and written as soon as possible without
marking page as dirty.
OpenEphemeral now requires a state machine to accomodate new
begin_write_tx semantics.
Closes#1839
Makes it easier to test the feature:
```
$ cargo run -- --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```
Support for simple interactive rollback like:
```sql
create table t (x);
insert into t values (1);
begin;
insert into t values (2);
rollback;
select * from t;
```
This PR also fixes some other issues I found while debugging:
* Checkpoint would never `clear_dirty` on pages in page cache.
* Auto commit for interactive transactions was not respected so any
`insert` after `begin` would flush frames regardless of `auto_commit`
state.
* `max_frame` on wal shared state was being updated after every
`append_frame` which was incorrect, as another transaction would be able
to use that new `max_frame` even tho the transaction could've rolled
back. Instead we update the private copy of `max_frame` and only update
it at the end.
Follow up for later are savepoints which require implementing a
subjournal to track savepoints and their modified pages.
Closes#1825
When `struct Database` is constructed, store `is_empty` as an
`Arc<AtomicBool>` - the value is true if:
1. DB size is zero
2. WAL has no frames
When `struct Pager` is constructed, this `Arc` is simply cloned.
When any connection runs a transaction it will first check `is_empty`,
and if the DB is empty, it will lock `init_lock` and then check `is_empty`
again, and if it's still true, it allocates page1 and stores `false` in
the `is_empty` `AtomicBool` and drops the lock.
---
Note that Limbo can currently have a zero DB and a WAL with frames,
as we have no special logic for folding page1 to the main DB file
during initialization.
Page 1 allocation currently happens on the first transaction (read or
write, due to having to support `select * from sqlite_schema` on an
empty DB; we should really check how SQLite actually does this.).
This PR adds support for the instruction `IntegrityCk` which performs an
integrity check on the contents of a single table. Next PR I will try to
implement the rest of the integrity check where we would check indexes
containt correct amount of data and some more.
<img width="1151" alt="image" src="https://github.com/user-
attachments/assets/29d54148-55ba-480f-b972-e38587f0a483" />
Closes#1719