Commit Graph

730 Commits

Author SHA1 Message Date
pedrocarlo
db005c81a0 add option to disable wal checkpoint 2025-07-03 12:04:17 -03:00
Pekka Enberg
90e035b6b0 Merge 'Rollback schema support' from Pere Diaz Bou
Fixes #1890
Once rollback was implement we quickly saw that it lacked support for
schema changes so we had to re-estructure things a bit.
## Example of failure:
```bash
turso> begin;
turso> create table t(x);
turso> rollback;
turso> pragma integrity_check;
thread 'main' panicked at core/storage/sqlite3_ondisk.rs:386:36:
called `Result::unwrap()` on an `Err` value: Corrupt("Invalid page type: 83")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
This happened because it thought table `t` existed because we didn't
rollback that schema.
## Changes:
* The most important change: now every connection has a private copy of
schema. On write txn commit we update a global schema shared between
connections in order for new connections to get updated version from
there. In case of rollback, we simply change connection's schema to
previous version. This change allowed us to remove locks for schema
private copy and keeping schema changes locally in case of concurrency.
 Sqlite does things differently, they lazily parse schema in case of
outdated schema, this many schema changes to trigger reading schema from
db file which is slow. If we are able to keep local copy in memory, even
when if we add multiprocessing, it will speed up schema reloading by a
bunch.
* `schema_cookie` is now update for every schema change
* `Insn::ParseSchema` had a nasty bug where it would commit all the
changes made in a query that changed a schema, we fixed that by setting
`auto_commit` to `false` before parsing schema, and setting it back to
previous value once schema is parsed.

Closes #1928
2025-07-03 14:18:00 +03:00
Pere Diaz Bou
5d856499c4 move update schema global on commit and not on rollback txn 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
c799396c3d rollback schema in connection 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
65a7fe13cf remove lock from private schema copy 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
abf1699dd2 set scheam version and update shared schema in txn 2025-07-03 12:36:48 +02:00
Pekka Enberg
fa442ecd6e core/storage: Switch to turso_assert in btree.rs
Let's help out Antithesis to find interesting bugs.
2025-07-03 13:25:13 +03:00
KaguraMilet
f339e9c1ad fix integrity check error 2025-07-03 13:47:30 +08:00
KaguraMilet
562dd389db Merge branch 'tursodatabase:main' into buffer 2025-07-03 13:46:37 +08:00
Pekka Enberg
36b550cca4 Merge 'Fix boxed memory leaks' from Ihor Andrianov
We should recreate original box to drop it properly
Also made a fast path for hashing. When key div by 2. It should decrease
cpu cycles on hot path by x10 approximately
This thing is tricky, made a long running test that verify bug, put
#[ignore] on it to not slow down CI

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1873
2025-07-02 19:42:54 +03:00
Ihor Andrianov
564bb28dea rewrite test to make fix verifiable 2025-07-01 17:27:58 +03:00
Ihor Andrianov
68e638e955 fix second occurance 2025-07-01 17:27:58 +03:00
Ihor Andrianov
647183938f fix sub with below 0 in tests 2025-07-01 17:27:58 +03:00
Ihor Andrianov
56b1fcf3b3 remove unused imports 2025-07-01 17:27:58 +03:00
Ihor Andrianov
41a11afe7c leaking box memory 2025-07-01 17:27:47 +03:00
KaguraMilet
aca08238d8 fix buffer pool is not thread safe problem 2025-07-01 16:06:55 +08:00
Pekka Enberg
51edab7032 Merge 'WAL record db_size frame on commit last frame' from Pere Diaz Bou
`db_size` is `>0` in case of last frame written of a transaction. This
is necessary as we need to know -- while recovering wal contents -- that
we have read a transaction fully instead of treating every frame as its
own transaction.

Closes #1866
2025-06-30 13:46:40 +03:00
Pekka Enberg
9c1b7897ac Fix URLs to point to github.com/tursodatabase/turso 2025-06-30 11:23:53 +03:00
Pere Diaz Bou
081f09ce20 fix read wal max frame 2025-06-30 10:16:44 +02:00
Pere Diaz Bou
486c4b69fb WAL record db_size frame on commit last frame
`db_size` is `>0` in case of last frame written of a transaction. This
is necessary as we need to know -- while recovering wal contents -- that
we have read a transaction fully instead of treating every frame as its
own transaction.
2025-06-27 16:21:48 +02:00
Pekka Enberg
5791ab9dff Merge 'Cache reserved_space and page_size values at Pager init to prevent doing redundant IO' from Krishna Vishal
### Problem
Profiling revealed that `usable_space()` calls were consuming 60% of
total execution time for simple SELECT queries, making Limbo
approximately `6x` slower than SQLite for SELECT operations.
The bottleneck was caused by `usable_space()` performing expensive I/O
operations on every call to read `page_size` and `reserved_space` from
the database header, despite `page_size` values being effectively
immutable after database initialization. Only `reserved_space` is
allowed to increase in SQLite.
Evidence: https://share.firefox.dev/44tCUIy
### Solution
Implemented OnceCell-based caching for both page_size and reserved_space
values in the Pager struct:
`page_size: OnceCell<u16>` - Page size is immutable after database
initialization per SQLite specification
`reserved_space: OnceCell<u8>` - Reserved space rarely changes and only
grows, safe to cache
### Performance Impact
Benchmark results: Simple SELECT query time reduced from ~2.89ms to
~1.29ms (~55% improvement)

Closes #1852
2025-06-27 16:40:14 +03:00
Pere Diaz Bou
8e0f8041ed properly set database header contents on initialization
After moving page1 write to be async I moved the contents update to
wrong place. This should fix it.
2025-06-27 11:44:11 +02:00
Krishna Vishal
cda1ab8d76 Use OnceCell instead of OnceLock. 2025-06-27 13:32:03 +05:30
Krishna Vishal
af2ab87810 Cache reserved_space and page_size values at Pager init.
We use `OnceLock` for this. TODO: Invalidate reserved_space when
we make functionality the to change it.
2025-06-27 12:51:11 +05:30
pedrocarlo
1dc28e32f0 fix io_uring completion + clippy 2025-06-26 22:17:28 -03:00
pedrocarlo
bac5e4b563 refactor File and Database Storage to remove Arc<Connection> and return Arc<Connection> for caller to wait for completion 2025-06-26 22:17:28 -03:00
pedrocarlo
64d9193e7b refactor Completion to have a type field and lift common is_complete property 2025-06-26 22:17:27 -03:00
Pekka Enberg
572c722390 Merge 'write page1 on database initialization' from Pere Diaz Bou
Page 1 must be initialized and written as soon as possible without
marking page as dirty.
OpenEphemeral now requires a state machine to accomodate new
begin_write_tx semantics.

Closes #1839
2025-06-26 20:43:40 +03:00
Pere Diaz Bou
aa93b70a96 empty -> unitialized 2025-06-26 17:59:23 +02:00
Pere Diaz Bou
e341b80051 clippy 2025-06-26 15:01:54 +02:00
Pere Diaz Bou
4d80b8237d write page1 on database initialization
Page 1 must be initialized and written as soon as possible without
marking page as dirty.
2025-06-26 14:44:23 +02:00
Pekka Enberg
2fc5c0ce5c Switch to runtime flag for enabling indexes
Makes it easier to test the feature:

```
$ cargo run --  --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```
2025-06-26 10:07:28 +03:00
Pekka Enberg
a48198ec60 Merge 'Rollback simple support' from Pere Diaz Bou
Support for simple interactive rollback like:
```sql
    create table t (x);
    insert into t values (1);
    begin;
    insert into t values (2);
    rollback;
    select * from t;
```
This PR also fixes some other issues I found while debugging:
* Checkpoint would never `clear_dirty` on pages in page cache.
* Auto commit for interactive transactions was not respected so any
`insert` after `begin` would flush frames regardless of `auto_commit`
state.
* `max_frame` on wal shared state was being updated after every
`append_frame` which was incorrect, as another transaction would be able
to use that new `max_frame` even tho the transaction could've rolled
back. Instead we update the private copy of `max_frame` and only update
it at the end.
Follow up for later are savepoints which require implementing a
subjournal to track savepoints and their modified pages.

Closes #1825
2025-06-25 20:02:09 +03:00
Pere Diaz Bou
c02337c8cc clear dirty pages on rollback 2025-06-25 14:01:53 +02:00
Pere Diaz Bou
34a6d236ab fix comp 2025-06-25 14:01:53 +02:00
Pere Diaz Bou
96c30be488 more clippy 2025-06-25 14:01:53 +02:00
Pere Diaz Bou
0119b0f99d clippy 2025-06-25 14:01:53 +02:00
Pere Diaz Bou
a3ad138df8 checkpoint clear dirty page if it was on cache 2025-06-25 14:01:53 +02:00
Pere Diaz Bou
22f9cd695d commit_txn track rollback case 2025-06-25 14:00:57 +02:00
Pere Diaz Bou
8517cea530 add finish append frames log 2025-06-25 14:00:57 +02:00
Pere Diaz Bou
bdd2010df3 autocommit rollback 2025-06-25 14:00:57 +02:00
Jussi Saurio
27b3ecf599 core/db&pager: fix locking for initializing empty database
When `struct Database` is constructed, store `is_empty` as an
`Arc<AtomicBool>` - the value is true if:

1. DB size is zero
2. WAL has no frames

When `struct Pager` is constructed, this `Arc` is simply cloned.
When any connection runs a transaction it will first check `is_empty`,
and if the DB is empty, it will lock `init_lock` and then check `is_empty`
again, and if it's still true, it allocates page1 and stores `false` in
the `is_empty` `AtomicBool` and drops the lock.

---

Note that Limbo can currently have a zero DB and a WAL with frames,
as we have no special logic for folding page1 to the main DB file
during initialization.

Page 1 allocation currently happens on the first transaction (read or
write, due to having to support `select * from sqlite_schema` on an
empty DB; we should really check how SQLite actually does this.).
2025-06-25 14:45:21 +03:00
Jussi Saurio
480f0a04b5 make clippy happy about mutating database_size immediately after default construction 2025-06-24 14:41:50 -03:00
Jussi Saurio
f21cde9501 post-rebase fixes 2025-06-24 14:41:50 -03:00
Jussi Saurio
920e88a6a9 clippy 2025-06-24 14:41:50 -03:00
Diego Reis
1921fcb943 Add comments to clarify current behaviour 2025-06-24 14:41:50 -03:00
Diego Reis
6ae196d7b3 Add mutex to allocating page1
This is to prevent race conditions where two threads could try to initialize database at the same time
2025-06-24 14:41:50 -03:00
Diego Reis
a1b7b3c6f6 Fix clippy complains 2025-06-24 14:41:50 -03:00
Jussi Saurio
a5d71a65be clippy doesnt get it 2025-06-24 14:41:50 -03:00
Jussi Saurio
133d498724 Implement a header_accessor module so that DatabaseHeader structs arent initialized on every access 2025-06-24 14:41:50 -03:00