The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
hell yeah
concurrency tests passing now woosh
finally write tests passed
Most of the cdc tests are passing yay
autoincremeent draft
remove shared schema code that broke transactions
sequnce table should reset if table is drop
fmt
fmt
fmt
This fairly long commit implements persistence for materialized view.
It is hard to split because of all the interdependencies between components,
so it is a one big thing. This commit message will at least try to go into
details about the basic architecture.
Materialized Views as tables
============================
Materialized views are now a normal table - whereas before they were a virtual
table. By making a materialized view a table, we can reuse all the
infrastructure for dealing with tables (cursors, etc).
One of the advantages of doing this is that we can create indexes on view
columns. Later, we should also be able to write those views to separate files
with ATTACH write.
Materialized Views as Zsets
===========================
The contents of the table are a ZSet: rowid, values, weight. Readers will
notice that because of this, the usage of the ZSet data structure dwindles
throughout the codebase. The main difference between our materialized ZSet and
the standard DBSP ZSet, is that obviously ours is backed by a BTree, not a Hash
(since SQLite tables are BTrees)
Aggregator State
================
In DBSP, the aggregator nodes also have state. To store that state, there is a
second table. The table holds all aggregators in the view, and there is one
table per view. That is __turso_internal_dbsp_state_{view_name}. The format of
that table is similar to a ZSet: rowid, serialized_values, weight. We serialize
the values because there will be many aggregators in the table. We can't rely
on a particular format for the values.
The Materialized View Cursor
============================
Reading from a Materialized View essentially means reading from the persisted
ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have
a carefully crafted implementation of seek that takes care of merging weights
and stitching the two sets together.
Now supported:
- AEGIS variants: 256, 256X2, 256X4, 128L, 128X2, 128X4
- AES-GCM variants: AES-128-GCM, AES-256-GCM
With minor changes in order to make it easy to add new ciphers later
regardless of their key size.
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes#2899
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
This patch brings a bunch of quality of life improvements to encryption:
1. Previously, we just let any string to be used as a key. I have
updated the `PRAGMA hexkey=''` to get the key in hex. I have also
renamed from `key`, because that will be used to get passphrase
2. Added `PRAGMA cipher` so that now users can select which cipher they
want to use (for now, either `aegis256` or `aes256gcm`)
3. We now set the encryption context when both cipher and key are set
I also updated tests to reflect this.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2779
Previously, the encryption module had hardcoded a lot of things. This
refactor makes it slightly nice and makes it configurable.
Right now cipher algorithm is assumed and hardcoded, I will make that
configurable in the upcoming PR
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2722
This PR make it possible to do 2 pretty crazy things with turso-db:
1. Now we can mix WAL frames inserts with SQL execution within same
transaction. This will allow sync engine to execute rebase of local
changes within atomically over main database file (the operation first
require us to push new frames to physically revert local changes and
then we need to replay local logical changes on top of the modified DB
state)
2. Under `conn_raw_api` Cargo feature turso-db now expose method which
allow caller to specify WAL file path. This dangerous capability exposed
for sync-engine which maintain 2 databases: main one and "revert"-DB
which shares same DB file but has it's own separate WAL. As sync-engine
has full control over checkpoint - it can guarantee that DB file will be
consistent with both main and "revert" DB WALs.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2716
This PR adds information about checkpoint sequence number to the WAL raw
API. Will be used in the sync engine.
Depends on the #2699
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2707
We were storing `txid` in `ProgramState`, this meant it was impossible
to track interactive transactions. This was extracted to `Connection`
instead.
Moreover, transaction state for mvcc now is reset on commit.
Closes#2689
This patch adds support for per page encryption. The code is of alpha
quality, was to test my hypothesis. All the encryption code is gated
behind a `encryption` flag. To play with it, you can do:
```sh
cargo run --features encryption -- database.db
turso> PRAGMA key='turso_test_encryption_key_123456';
turso> CREATE TABLE t(v);
```
Right now, most stuff is hard coded. We use AES GCM 256. This
information is not stored anywhere, but in future versions we will start
saving this info in the file. When writing to disk, we will generate a
cryptographically secure random salt, use that to encrypt the page. Then
we will store the authentication tag and the salt in the page itself. To
accommodate this encryption hardcodes reserved space of 28 bytes.
Once the key is set in the connection, we propagate that information to
pager and the WAL, to encrypt / decrypt when reading from disk.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2567