Fixes#1890
Once rollback was implement we quickly saw that it lacked support for
schema changes so we had to re-estructure things a bit.
## Example of failure:
```bash
turso> begin;
turso> create table t(x);
turso> rollback;
turso> pragma integrity_check;
thread 'main' panicked at core/storage/sqlite3_ondisk.rs:386:36:
called `Result::unwrap()` on an `Err` value: Corrupt("Invalid page type: 83")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
This happened because it thought table `t` existed because we didn't
rollback that schema.
## Changes:
* The most important change: now every connection has a private copy of
schema. On write txn commit we update a global schema shared between
connections in order for new connections to get updated version from
there. In case of rollback, we simply change connection's schema to
previous version. This change allowed us to remove locks for schema
private copy and keeping schema changes locally in case of concurrency.
Sqlite does things differently, they lazily parse schema in case of
outdated schema, this many schema changes to trigger reading schema from
db file which is slow. If we are able to keep local copy in memory, even
when if we add multiprocessing, it will speed up schema reloading by a
bunch.
* `schema_cookie` is now update for every schema change
* `Insn::ParseSchema` had a nasty bug where it would commit all the
changes made in a query that changed a schema, we fixed that by setting
`auto_commit` to `false` before parsing schema, and setting it back to
previous value once schema is parsed.
Closes#1928
We should recreate original box to drop it properly
Also made a fast path for hashing. When key div by 2. It should decrease
cpu cycles on hot path by x10 approximately
This thing is tricky, made a long running test that verify bug, put
#[ignore] on it to not slow down CI
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1873
`db_size` is `>0` in case of last frame written of a transaction. This
is necessary as we need to know -- while recovering wal contents -- that
we have read a transaction fully instead of treating every frame as its
own transaction.
Closes#1866
`db_size` is `>0` in case of last frame written of a transaction. This
is necessary as we need to know -- while recovering wal contents -- that
we have read a transaction fully instead of treating every frame as its
own transaction.
### Problem
Profiling revealed that `usable_space()` calls were consuming 60% of
total execution time for simple SELECT queries, making Limbo
approximately `6x` slower than SQLite for SELECT operations.
The bottleneck was caused by `usable_space()` performing expensive I/O
operations on every call to read `page_size` and `reserved_space` from
the database header, despite `page_size` values being effectively
immutable after database initialization. Only `reserved_space` is
allowed to increase in SQLite.
Evidence: https://share.firefox.dev/44tCUIy
### Solution
Implemented OnceCell-based caching for both page_size and reserved_space
values in the Pager struct:
`page_size: OnceCell<u16>` - Page size is immutable after database
initialization per SQLite specification
`reserved_space: OnceCell<u8>` - Reserved space rarely changes and only
grows, safe to cache
### Performance Impact
Benchmark results: Simple SELECT query time reduced from ~2.89ms to
~1.29ms (~55% improvement)
Closes#1852
Page 1 must be initialized and written as soon as possible without
marking page as dirty.
OpenEphemeral now requires a state machine to accomodate new
begin_write_tx semantics.
Closes#1839
Makes it easier to test the feature:
```
$ cargo run -- --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```
Support for simple interactive rollback like:
```sql
create table t (x);
insert into t values (1);
begin;
insert into t values (2);
rollback;
select * from t;
```
This PR also fixes some other issues I found while debugging:
* Checkpoint would never `clear_dirty` on pages in page cache.
* Auto commit for interactive transactions was not respected so any
`insert` after `begin` would flush frames regardless of `auto_commit`
state.
* `max_frame` on wal shared state was being updated after every
`append_frame` which was incorrect, as another transaction would be able
to use that new `max_frame` even tho the transaction could've rolled
back. Instead we update the private copy of `max_frame` and only update
it at the end.
Follow up for later are savepoints which require implementing a
subjournal to track savepoints and their modified pages.
Closes#1825
When `struct Database` is constructed, store `is_empty` as an
`Arc<AtomicBool>` - the value is true if:
1. DB size is zero
2. WAL has no frames
When `struct Pager` is constructed, this `Arc` is simply cloned.
When any connection runs a transaction it will first check `is_empty`,
and if the DB is empty, it will lock `init_lock` and then check `is_empty`
again, and if it's still true, it allocates page1 and stores `false` in
the `is_empty` `AtomicBool` and drops the lock.
---
Note that Limbo can currently have a zero DB and a WAL with frames,
as we have no special logic for folding page1 to the main DB file
during initialization.
Page 1 allocation currently happens on the first transaction (read or
write, due to having to support `select * from sqlite_schema` on an
empty DB; we should really check how SQLite actually does this.).