The `run_once()` name is just a historical accident. Furthermore, it now
started to appear elsewhere as well, so let's just call it IO::step() as we
should have from the beginning.
This fairly long commit implements persistence for materialized view.
It is hard to split because of all the interdependencies between components,
so it is a one big thing. This commit message will at least try to go into
details about the basic architecture.
Materialized Views as tables
============================
Materialized views are now a normal table - whereas before they were a virtual
table. By making a materialized view a table, we can reuse all the
infrastructure for dealing with tables (cursors, etc).
One of the advantages of doing this is that we can create indexes on view
columns. Later, we should also be able to write those views to separate files
with ATTACH write.
Materialized Views as Zsets
===========================
The contents of the table are a ZSet: rowid, values, weight. Readers will
notice that because of this, the usage of the ZSet data structure dwindles
throughout the codebase. The main difference between our materialized ZSet and
the standard DBSP ZSet, is that obviously ours is backed by a BTree, not a Hash
(since SQLite tables are BTrees)
Aggregator State
================
In DBSP, the aggregator nodes also have state. To store that state, there is a
second table. The table holds all aggregators in the view, and there is one
table per view. That is __turso_internal_dbsp_state_{view_name}. The format of
that table is similar to a ZSet: rowid, serialized_values, weight. We serialize
the values because there will be many aggregators in the table. We can't rely
on a particular format for the values.
The Materialized View Cursor
============================
Reading from a Materialized View essentially means reading from the persisted
ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have
a carefully crafted implementation of seek that takes care of merging weights
and stitching the two sets together.
Now supported:
- AEGIS variants: 256, 256X2, 256X4, 128L, 128X2, 128X4
- AES-GCM variants: AES-128-GCM, AES-256-GCM
With minor changes in order to make it easy to add new ciphers later
regardless of their key size.
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes#2899
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
This patch brings a bunch of quality of life improvements to encryption:
1. Previously, we just let any string to be used as a key. I have
updated the `PRAGMA hexkey=''` to get the key in hex. I have also
renamed from `key`, because that will be used to get passphrase
2. Added `PRAGMA cipher` so that now users can select which cipher they
want to use (for now, either `aegis256` or `aes256gcm`)
3. We now set the encryption context when both cipher and key are set
I also updated tests to reflect this.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2779
Previously, the encryption module had hardcoded a lot of things. This
refactor makes it slightly nice and makes it configurable.
Right now cipher algorithm is assumed and hardcoded, I will make that
configurable in the upcoming PR
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2722
This PR make it possible to do 2 pretty crazy things with turso-db:
1. Now we can mix WAL frames inserts with SQL execution within same
transaction. This will allow sync engine to execute rebase of local
changes within atomically over main database file (the operation first
require us to push new frames to physically revert local changes and
then we need to replay local logical changes on top of the modified DB
state)
2. Under `conn_raw_api` Cargo feature turso-db now expose method which
allow caller to specify WAL file path. This dangerous capability exposed
for sync-engine which maintain 2 databases: main one and "revert"-DB
which shares same DB file but has it's own separate WAL. As sync-engine
has full control over checkpoint - it can guarantee that DB file will be
consistent with both main and "revert" DB WALs.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2716
This PR adds information about checkpoint sequence number to the WAL raw
API. Will be used in the sync engine.
Depends on the #2699
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2707