mirror of
https://github.com/aljazceru/turso.git
synced 2025-12-30 14:34:22 +01:00
This fairly long commit implements persistence for materialized views.
It is hard to split because of all the interdependencies between components,
so it lands as one big change. This commit message at least tries to go into
detail about the basic architecture.
Materialized Views as tables
============================
Materialized views are now normal tables, whereas before they were virtual
tables. By making a materialized view a table, we can reuse all the
infrastructure for dealing with tables (cursors, etc).
One of the advantages of doing this is that we can create indexes on view
columns. Later, we should also be able to write those views to separate files
with ATTACH write.
Materialized Views as ZSets
===========================
The contents of the table are a ZSet: rowid, values, weight. Readers will
notice that because of this, the in-memory ZSet data structure is used less
throughout the codebase. The main difference between our materialized ZSet and
the standard DBSP ZSet is that ours is backed by a BTree, not a hash table
(since SQLite tables are BTrees).
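
To make the (rowid, values, weight) shape concrete, here is a minimal in-memory sketch of a ZSet kept in sorted key order. It is an illustration only: `BTreeZSet`, `apply`, and `visible_rows` are hypothetical names, `std::collections::BTreeMap` stands in for the on-disk BTree table, and plain `i64` columns stand in for the real `Value` type.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a row: (rowid, column values).
/// Plain i64 columns keep the sketch self-contained.
type Row = (i64, Vec<i64>);

/// A ZSet keyed by the whole row and kept in sorted order,
/// loosely mimicking a BTree-backed table (assumption: in-memory only).
#[derive(Debug, Default)]
pub struct BTreeZSet {
    weights: BTreeMap<Row, isize>,
}

impl BTreeZSet {
    /// Add `weight` to the entry for `row`; drop the entry once its weight is zero.
    pub fn apply(&mut self, row: Row, weight: isize) {
        let entry = self.weights.entry(row.clone()).or_insert(0);
        *entry += weight;
        if *entry == 0 {
            self.weights.remove(&row);
        }
    }

    /// Rows with positive weight, in key order.
    pub fn visible_rows(&self) -> Vec<Row> {
        self.weights
            .iter()
            .filter(|(_, w)| **w > 0)
            .map(|(row, _)| row.clone())
            .collect()
    }
}
```

Because the map is ordered, iterating it yields rows sorted by rowid and then by values, which is the property a cursor over a BTree table would rely on.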
Aggregator State
================
In DBSP, the aggregator nodes also have state. To store that state, there is a
second table. There is one such table per view, named
__turso_internal_dbsp_state_{view_name}, and it holds the state of all
aggregators in that view. The format of that table is similar to a ZSet:
rowid, serialized_values, weight. We serialize the values because there will
be many aggregators in the table, so we can't rely on a particular format for
the values.
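
As a rough sketch of the naming scheme and row shape described above: the table name pattern comes from this commit message, while the helper name `dbsp_state_table_name`, the `StateRow` struct, and the field types (bytes for the serialized values, `i64` for the weight) are assumptions made for illustration. The actual serialization format is not shown here.

```rust
/// Name of the per-view state table, following the pattern in the commit message.
/// (Hypothetical helper; the real code may build the name differently.)
pub fn dbsp_state_table_name(view_name: &str) -> String {
    format!("__turso_internal_dbsp_state_{view_name}")
}

/// One row of the state table: the same shape as a ZSet entry,
/// but with the values kept as an opaque serialized payload.
/// Field types are assumptions, not the actual schema.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StateRow {
    pub rowid: i64,
    pub serialized_values: Vec<u8>,
    pub weight: i64,
}
```

Keeping the payload opaque is what lets different aggregators share one table without agreeing on a column layout.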
The Materialized View Cursor
============================
Reading from a Materialized View essentially means reading from the persisted
ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have
a carefully crafted implementation of seek that takes care of merging weights
and stitching the two sets together.
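
The merge semantics can be illustrated with a small sketch. Note that this is not how the cursor works: the real implementation merges lazily inside seek and never materializes the result, whereas the hypothetical `merged_view` function below eagerly builds the combined set to show what a reader should observe. The `Row` alias and both map arguments are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a row: (rowid, column values).
type Row = (i64, Vec<i64>);

/// Combine persisted weights with uncommitted transaction weights and
/// return the rows a reader inside the transaction would see, in key order.
/// Eager merge for illustration only; the real cursor does this during seek.
pub fn merged_view(
    persisted: &BTreeMap<Row, isize>,
    txn_delta: &BTreeMap<Row, isize>,
) -> Vec<Row> {
    let mut merged = persisted.clone();
    for (row, weight) in txn_delta {
        // Weights for the same row add up; a +1 and a -1 cancel out.
        *merged.entry(row.clone()).or_insert(0) += *weight;
    }
    merged
        .into_iter()
        .filter(|(_, w)| *w > 0)
        .map(|(row, _)| row)
        .collect()
}
```

A row deleted in the transaction carries weight -1 in the delta and cancels its persisted +1, while a row inserted in the transaction appears only in the delta and still shows up in the result.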
101 lines
3.5 KiB
Rust
use crate::types::Value;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// The DBSP paper uses as a key the whole record, with both the row key and the values. This is a
// bit confusing for us in databases, because when you say "key", it is easy to understand that as
// being the row key.
//
// Empirically speaking, using row keys as the ZSet keys will waste a competent but not brilliant
// engineer between 82 and 88 hours, depending on how you count. Hours that are never coming back.
//
// One of the situations in which using row keys completely breaks is table updates. If the "key"
// is the row key, let's say "5", then an update is a delete + insert. Imagine a table that had k =
// 5, v = 5, and a view that filters v > 2.
//
// Now we will do an update that changes v => 1. If the "key" is 5, then inside the Delta set, we
// will have (5, weight = -1), (5, weight = +1), and the whole thing just disappears. The Delta
// set, therefore, has to contain ((5, 5), weight = -1), ((5, 1), weight = +1).
//
// It is theoretically possible to use the rowkey in the ZSet and then use a hash of key ->
// Vec(changes) in the Delta set. But deviating from the paper here is just asking for trouble, as
// I am sure it would break somewhere else.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HashableRow {
    pub rowid: i64,
    pub values: Vec<Value>,
    // Pre-computed hash: DBSP rows are immutable and frequently hashed during joins,
    // making caching worthwhile despite the memory overhead
    cached_hash: u64,
}

impl HashableRow {
    pub fn new(rowid: i64, values: Vec<Value>) -> Self {
        let cached_hash = Self::compute_hash(rowid, &values);
        Self {
            rowid,
            values,
            cached_hash,
        }
    }

    fn compute_hash(rowid: i64, values: &[Value]) -> u64 {
        let mut hasher = DefaultHasher::new();

        rowid.hash(&mut hasher);

        for value in values {
            match value {
                Value::Null => {
                    0u8.hash(&mut hasher);
                }
                Value::Integer(i) => {
                    1u8.hash(&mut hasher);
                    i.hash(&mut hasher);
                }
                Value::Float(f) => {
                    2u8.hash(&mut hasher);
                    f.to_bits().hash(&mut hasher);
                }
                Value::Text(s) => {
                    3u8.hash(&mut hasher);
                    s.value.hash(&mut hasher);
                    (s.subtype as u8).hash(&mut hasher);
                }
                Value::Blob(b) => {
                    4u8.hash(&mut hasher);
                    b.hash(&mut hasher);
                }
            }
        }

        hasher.finish()
    }
}

impl Hash for HashableRow {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.cached_hash.hash(state);
    }
}

impl PartialOrd for HashableRow {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for HashableRow {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        // First compare by rowid, then by values if rowids are equal
        // This ensures Ord is consistent with Eq (which compares all fields)
        match self.rowid.cmp(&other.rowid) {
            std::cmp::Ordering::Equal => {
                // If rowids are equal, compare values to maintain consistency with Eq
                self.values.cmp(&other.values)
            }
            other => other,
        }
    }
}
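
The header comment of this file argues that keying the Delta set by row key alone makes an update vanish. A minimal sketch of that effect follows, assuming a hypothetical `consolidate` helper that sums weights per key and drops zero-weight entries. It is not the project's actual Delta type, and integer tuples stand in for `HashableRow`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Sum the weights for each key and drop entries whose weights cancel to zero.
/// (Hypothetical helper, used only to illustrate the comment above.)
pub fn consolidate<K: Hash + Eq>(changes: Vec<(K, isize)>) -> HashMap<K, isize> {
    let mut out = HashMap::new();
    for (key, weight) in changes {
        *out.entry(key).or_insert(0) += weight;
    }
    out.retain(|_, w| *w != 0);
    out
}
```

Keyed by row key only, the update from the comment becomes `consolidate(vec![(5i64, -1), (5i64, 1)])`, which returns an empty map because the two entries cancel. Keyed by the whole record, `consolidate(vec![((5i64, 5i64), -1), ((5i64, 1i64), 1)])` keeps both entries, so the view can retract the old row (v = 5 passed the filter v > 2) and evaluate the new one (v = 1 does not).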