turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-18 17:14:20 +01:00

Author	SHA1	Message	Date
Glauber Costa	b419db489a	Implement the DBSP merge operator. The Merge operator is a stateless operator that merges two deltas. There are two modes: Distinct, where we merge together values that are the same, and All, where we preserve all values. We use the rowid of the hashable row to guarantee that: in Distinct mode, the rowid is set to 0 on both sides, so if the values are the same, they will hash to the same thing. In All mode, the rowids are different. The merge operator is used for the UNION statement, which is a cornerstone of Recursive CTEs.	2025-09-21 21:00:27 -03:00
Glauber Costa	e2f0e372a1	move the join operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:28 -05:00
Glauber Costa	aa8fcdbe54	move the aggregate operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:24 -05:00
Glauber Costa	7178d8d31c	move the project operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	ee914fc543	move the filter operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	9747d6c6b6	move the input operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	6541a43670	move hashable_row to dbsp.rs. There will be a new type for joins, so it makes less sense to have a separate file just for it. dbsp.rs is good.	2025-09-11 05:30:46 -07:00
Glauber Costa	1fd345f382	unify code used for persistence. We have code written for BTree (ZSet) persistence in both compiler.rs and operator.rs, because there are minor differences between them. With joins coming, it is time to unify this code.	2025-09-11 05:30:46 -07:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views. This fairly long commit implements persistence for materialized views. It is hard to split because of all the interdependencies between components, so it is one big thing. This commit message will at least try to go into details about the basic architecture. Materialized Views as tables: Materialized views are now a normal table, whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write. Materialized Views as ZSets: The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet is that ours is backed by a BTree, not a Hash (since SQLite tables are BTrees). Aggregator State: In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values. The Materialized View Cursor: Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction. Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
Glauber Costa	29b93e3e58	add DBSP circuit compiler. The next step is to adapt the view code to use circuits instead of listing the operators manually.	2025-08-27 14:21:32 -05:00
Glauber Costa	38def26704	Add expr_compiler. To be used in DBSP-based projections. This will compile an expression to VDBE bytecode and execute it. To do that we need to add a new type of Expression, which we call a Register. This is a way for us to pass parameters to a DBSP program that are not columns or literals, but inputs from the DBSP deltas.	2025-08-25 17:48:17 +03:00
Glauber Costa	145d6eede7	Implement very basic views using DBSP. This is just the bare minimum that I needed to convince myself that this approach will work. The only views that we support are slices of the main table: no aggregations, no joins, no projections. DROP VIEW is implemented. View population is implemented. Deletes, inserts and updates are implemented. Much like indexes before, a flag must be passed to enable views.	2025-08-10 23:34:04 -05:00
Glauber Costa	d5b7533ff8	Implement a DBSP module. We are not using the DBSP crate because it is very heavy on Tokio and other dependencies that won't make sense for us to consume.	2025-08-10 23:15:26 -05:00

13 Commits
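
The merge behaviour described in commit b419db489a can be illustrated with a small sketch. This is not the code from the repository: `Row`, `Delta`, `MergeMode`, and `merge` are hypothetical names, the delta is modelled as a plain `HashMap` from row to weight, and the way rowids are kept apart in All mode (tagging each rowid with the side it came from) is an assumption made only for illustration. In Distinct mode the sketch simply sums the weights of rows that collapse onto the same key; how the real operator turns that into UNION semantics is not stated in the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a hashable row: a rowid plus column values.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Row {
    rowid: i64,
    values: Vec<String>,
}

/// A delta maps each row to a weight (+1 for an insert, -1 for a delete).
type Delta = HashMap<Row, i64>;

#[derive(Clone, Copy)]
enum MergeMode {
    /// Rows with equal values collapse into one entry.
    Distinct,
    /// Rows from both sides are preserved, even when values are equal.
    All,
}

/// Stateless merge of two deltas: the output depends only on the inputs.
fn merge(left: &Delta, right: &Delta, mode: MergeMode) -> Delta {
    let mut out = Delta::new();
    for (side, delta) in [left, right].iter().enumerate() {
        for (row, weight) in delta.iter() {
            let mut row = row.clone();
            row.rowid = match mode {
                // Same rowid on both sides, so equal values hash identically.
                MergeMode::Distinct => 0,
                // Assumption: tag the rowid with the side (0 or 1) so that
                // rows from the left and right can never collide.
                MergeMode::All => row.rowid * 2 + side as i64,
            };
            *out.entry(row).or_insert(0) += weight;
        }
    }
    // Drop entries whose weights cancelled out.
    out.retain(|_, w| *w != 0);
    out
}
```

Merging a left delta containing the value "a" with a right delta containing the same value yields a single entry in Distinct mode and two entries in All mode, which mirrors the difference between UNION and UNION ALL.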