turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-18 09:04:19 +01:00

Author	SHA1	Message	Date
Pekka Enberg	d808db6af9	core: Switch to parking_lot::Mutex It's faster and we eliminate a bunch of unwrap() calls.	2025-11-20 10:42:02 +02:00
pedrocarlo	1db13889e3	Change `Value::Text` to use a `Cow<'static, str>` instead of `Vec<u8>`	2025-11-11 16:11:46 -03:00
Nikita Sivukhin	05f0ee6a72	add more integration in order to properly skip backing_btree index_method	2025-10-27 17:00:26 +04:00
Pekka Enberg	ca073b5ecd	Merge 'core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager>' from Pekka Enberg We don't actually need the RwLock locking capabilities, just the ability to swap the instance. Closes #3814	2025-10-26 12:29:11 +02:00
Pekka Enberg	c3fb867173	core: Switch RwLock<Arc<Pager>> to ArcSwap<Pager> We don't actually need the RwLock locking capabilities, just the ability to swap the instance.	2025-10-24 14:10:08 +03:00
Glauber Costa	92751e621b	Add DISTINCT support to aggregate operator Implements COUNT/SUM/AVG(DISTINCT) and SELECT DISTINCT for materialized views. To do this we have to keep a list of the actual distinct values (similarly to how we do for min/max). We then update the operator (and issue deltas) only when there is a state transition (for example, if we already count the value x = 1, and we see an insert for x = 1, we do nothing). SELECT DISTINCT (with no aggregator) is similar. We already have to keep a list of the values anyway to power the aggregates. So we just issue new deltas based on the transition, without updating the aggregator.	2025-10-22 16:32:18 -05:00
Pere Diaz Bou	ea04e9033a	core/mvcc: add btree_cursor under MVCC cursor	2025-10-21 18:22:37 +02:00
Pekka Enberg	bf5de920f2	core: Unsafe Send and Sync pushdown This patch pushes unsafe Send and Sync to individual components instead of doing it at Database level. This makes it easier for us to incrementally fix thread-safety, but avoid developers adding more thread unsafe code.	2025-10-16 11:26:50 +03:00
Pere Diaz Bou	160a84250e	core: add CursorTrait imports where needed	2025-10-10 15:04:15 +02:00
Pere Diaz Bou	0f631101df	core: change page idx type from usize to i64 MVCC is like the annoying younger cousin (I know because I was him) that needs to be treated differently. MVCC requires us to use root_pages that might not be allocated yet, and the plan is to use negative root_pages for that case. Therefore, we need i64 in order to fit this change.	2025-09-29 18:38:43 +02:00
Glauber Costa	3dc1dca5a8	use 128-bit hashes for the zset_id We have used i64 before because that is the size of an integer in SQLite. However, I believe that for large enough databases, the chances of collision here are just too high. The effect of a collision is the database silently returning incorrect data in the materialized view. So now that everything else is working, we should move to i128.	2025-09-25 22:52:08 -03:00
Glauber Costa	b9011dfa16	Replace custom serialization with a saner version The Materialized View code had custom serialization written so we could move this code forward. Now that we have many operators and the views work, replace it with something saner. The main insight is that if we transform the AggregateState into Values before the serialization, we are able to just use standard SQLite serialization for the values. We then just have to add sizes, codes for the functions, etc (which are also represented as Values).	2025-09-25 22:52:08 -03:00
Pekka Enberg	b857f94fe4	Merge 'core: Wrap Connection::pager in RwLock' from Pekka Enberg Closes #3247	2025-09-23 07:29:09 +03:00
Pekka Enberg	aa454a6637	core: Wrap Connection::pager in RwLock	2025-09-22 17:02:08 +03:00
Preston Thorpe	44dc4c9636	Merge 'translate/emitter: Implement partial indexes' from Preston Thorpe This PR adds support for partial indexes, e.g. `CREATE INDEX` with a provided predicate ```sql CREATE UNIQUE INDEX idx_expensive ON products(sku) where price > 100; ``` The PR does not yet implement support for using the partial indexes in the optimizer. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3228	2025-09-22 09:09:54 -04:00
Glauber Costa	b419db489a	Implement the DBSP merge operator The Merge operator is a stateless operator that merges two deltas. There are two modes: Distinct, where we merge together values that are the same, and All, where we preserve all values. We use the rowid of the hashable row to guarantee that: In Distinct mode, the rowid is set to 0 on both sides. If the values are the same, they will hash to the same thing. For All, the rowids are different. The merge operator is used for the UNION statement, which is a cornerstone of Recursive CTEs.	2025-09-21 21:00:27 -03:00
PThorpe92	a0f574d279	Add where_clause expr field to Index	2025-09-20 14:38:47 -04:00
Glauber Costa	e80dd8e5e1	move the filter operator to accept indexes instead of names We already did similarly for the AggregateOperator: for joins you can have the same column name in many tables. And passing schema information to the operator is a layering violation (the operator may be operating on the result of a previous node, and at that point there is no more "schema"). Therefore we pass indexes into the column set the operator has. The FilterOperator has a complication: we are using it to generate the SQL for the populate statement, and that needs column names. However, we should not be using the FilterOperator for that, and that is a relic from the time where we had operator information directly inside the IncrementalView. To enable moving the FilterOperator to index-based, we rework that code. For joins, we'll need to populate many tables anyway, so we take the time to do that work here.	2025-09-19 03:59:28 -05:00
Glauber Costa	f149b40e75	Implement JOINs in the DBSP circuit This PR improves the DBSP circuit so that it handles the JOIN operator. The JOIN operator exposes a weakness of our current model: we usually pass a list of columns between operators, and find the right column by name when needed. But with JOINs, many tables can have the same column names. The operators will then find the wrong column (same name, different table), and produce incorrect results. To fix this, we must do two things: 1) Change the Logical Plan. It needs to track table provenance. 2) Fix the aggregators: they need to operate on indexes, not names. For the aggregators, note that table provenance is the wrong abstraction. The aggregator is likely working with a logical table that is the result of previous nodes in the circuit. So we just need to be able to tell it which index in the column array it should use.	2025-09-19 03:59:28 -05:00
Glauber Costa	9f3d119a5a	move hashable row tests to dbsp.rs The operator.rs file was so huge, that we didn't even notice there was a test block in the middle of the file that was testing things that were long moved to dbsp.rs (the HashableRow). Move the tests there now.	2025-09-19 03:59:28 -05:00
Glauber Costa	e2f0e372a1	move the join operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:28 -05:00
Glauber Costa	aa8fcdbe54	move the aggregate operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:24 -05:00
Glauber Costa	7178d8d31c	move the project operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	ee914fc543	move the filter operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	9747d6c6b6	move the input operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	6be5eb74d9	Implement the Join Operator The join operator is also a stateful operator. It keeps the input deltas stored in the state, for both the left and right branches of the join. JOINs extract a join key, which is the values that were used in the join's equality statement. That key is now our zset_id, and it points to a collection of rows.	2025-09-19 03:57:11 -05:00
Samuel Marks	e333f151ba	[*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing")	2025-09-18 22:47:43 -05:00
Pekka Enberg	17e9f05ea4	core: Convert Rc<Pager> to Arc<Pager>	2025-09-17 09:32:49 +03:00
Glauber Costa	6bee6bb785	implement min/max We have not implemented them before because they require the raw elements to be kept. It is easy to see why in the following example: current_min = 3; insert(2) => current_min = 2 // can be done without state delete(2) => needs to look at the state to determine new min! The aggregator state was a very simple key-value structure. To accommodate min/max, we will make it into a more complex table, where we can encode a more complex structure. The key insight is that we can use a primary key composed of: 1) storage_id 2) zset_id, 3) element The storage_id and zset_id are our previous key, except they are now exploded to support a larger range of storage_id. With more bits available in the storage_id, we can encode information about which column we are storing. For aggregations in multiple columns, we will need to keep a different list of values for min/max! The element is just the values of the columns. Because this is a primary key, the data will be sorted in the btree. We can then just do a prefix search in the first two components of the key and easily find the min/max when needed. This new format is also adequate for joins. Joins will just have a new storage_id which encodes two "columns" (left side, right side).	2025-09-15 22:30:48 -05:00
Glauber Costa	3565e7978a	Add an index to the dbsp internal table And also change the schema of the main table. I have come to see the current key-value schema as inadequate for non-aggregate operators. Calculating Min/Max, for example, doesn't fit in this schema because we have to be able to track existing values and index them. Another alternative is to keep one table per operator type, but this quickly leads to an explosion of tables.	2025-09-15 22:30:48 -05:00
Glauber Costa	e6008e532a	Add a second delta to the EvalState, Commit We will assert that the second one is always empty for the existing operators - as they should be! But joins will need both.	2025-09-11 05:30:46 -07:00
Glauber Costa	6541a43670	move hashable_row to dbsp.rs There will be a new type for joins, so it makes less sense to have a separate file just for it. dbsp.rs is good.	2025-09-11 05:30:46 -07:00
Glauber Costa	1fd345f382	unify code used for persistence. We have code written for BTree (ZSet) persistence in both compiler.rs and operator.rs, because there are minor differences between them. With joins coming, it is time to unify this code.	2025-09-11 05:30:46 -07:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views This fairly long commit implements persistence for materialized views. It is hard to split because of all the interdependencies between components, so it is one big thing. This commit message will at least try to go into details about the basic architecture. Materialized Views as tables ============================ Materialized views are now a normal table - whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write. Materialized Views as Zsets =========================== The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet is that ours is backed by a BTree, not a Hash (since SQLite tables are BTrees). Aggregator State ================ In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values. The Materialized View Cursor ============================ Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
TcMits	37f33dc45f	add eq/contains/starts_with/ends_with_ignore_ascii_case	2025-08-31 16:18:42 +07:00
Glauber Costa	29b93e3e58	add DBSP circuit compiler The next step is to adapt the view code to use circuits instead of listing the operators manually.	2025-08-27 14:21:32 -05:00
Glauber Costa	898c0260f3	move operator to eval / commit pattern We need a read only phase and a commit phase. Otherwise we will never be able to rollback changes properly. We currently do that, but we do that in the view. Before we move to circuits, this needs to be internalized by the operator.	2025-08-27 14:21:32 -05:00
Glauber Costa	7e4bacca55	remove join operator I am 100% sure they are total bullshit by now, since we don't implement the join operator yet. The code evolved a lot, and in every turn there are issues with aggregators, projectors, filters... some subtle, some not so subtle. We keep having to patch join slightly as we make changes to the API, but we don't truly exercise whether or not they keep working because there is no support for them in the views. Therefore: let's remove it. We'll bring it back later.	2025-08-27 11:18:54 -05:00
Glauber Costa	05b275f865	remove min/max and add more tests for other aggregations min/max require O(N) storage because of deletions. It is easy to see why: if you add a new row, you can quickly and incrementally check if it is smaller / larger than the previous accumulator. But when you delete a row you can't do that and have to check the previous values. Feldera uses something called "traces" which to me look a lot like indexes. When we implement materialization, this is easy to do. But to avoid having something broken, we'll just disable min / max until then.	2025-08-27 11:18:54 -05:00
Glauber Costa	6e2bd364ee	fix issue with rowids and deletions The operator itself should handle deletions and updates that change the rowid by consolidating its state. Our current materialized views track state themselves, so we don't see this problem now. But it becomes apparent once we switch the views to use circuits.	2025-08-27 11:18:54 -05:00
Glauber Costa	dbe29e4bab	fix aggregator operator It needs to keep track of the old values to emit retractions (when the aggregation changes, remove old value, insert new)	2025-08-27 11:18:54 -05:00
Pekka Enberg	e3ffc82a1d	core/incremental: Fix expression compiler to use new parser	2025-08-25 17:48:20 +03:00
Glauber Costa	ffab4a89a2	addressed review comments from Jussi	2025-08-25 17:48:17 +03:00
Glauber Costa	097510216e	implement the projector operator for DBSP My goal with this patch is to be able to implement the ProjectOperator for DBSP circuits using VDBE for expression evaluation. Not doing so is dangerous for the following reason: we will end up with different, subtle, and incompatible behavior between SQLite expressions if they are used in views versus outside of views. In fact, even our prototype had them: our projection tests, which used to pass, were actually wrong =) (sqlite would return something different if those functions were executed outside the view context) For optimization reasons, we single out trivial expressions: they don't have to go through VDBE. Trivial expressions are expressions that only involve Columns, Literals, and simple operators on elements of the same type. Even type coercion takes this out of the realm of trivial. Everything that is not trivial is then translated with translate_expr - in the same way SQLite will, and then compiled with VDBE. We can, over time, make this process much better. There are essentially infinite opportunities for optimization here. But for now, the main warts are: * VDBE execution needs a connection * There is no good way in VDBE to pass parameters to a program. * It is almost trivial to pollute the original connection. For example, we need to issue HALT for the program to stop, but seeing that halt will usually cause the program to try and halt the original program. Subprograms, like the ones we use in triggers, are a possible solution, but they are much more expensive to execute, especially given that our execution would essentially have to have a program with no other role than to wrap the subprogram. Therefore, what I am doing is: * There is an in-memory database inside the projection operator (an obvious optimization is to share it with all projection operators).
* We obtain a connection to that database when the operator is created * We use that connection to execute our VDBE, which offers a clean, safe and isolated way to execute the expression. * We feed the values to the program manually by editing the registers directly.	2025-08-25 17:48:17 +03:00
Levy A.	4ba1304fb9	complete parser integration	2025-08-21 15:23:59 -03:00
Levy A.	186e2f5d8e	switch to new parser	2025-08-21 15:19:16 -03:00
Jussi Saurio	a50c799e05	stop silently ignoring unsupported features in incremental view WHERE clauses	2025-08-11 17:44:41 +03:00
Pekka Enberg	62f1fd2038	core/incremental: Make clippy happy	2025-08-11 08:36:53 +03:00
Pekka Enberg	87322ad1e4	core/incremental: Evaluate view expressions ...tests were failing because we are testing with expressions, but didn't support them.	2025-08-11 08:27:10 +03:00
Glauber Costa	d5b7533ff8	Implement a DBSP module We are not using the DBSP crate because it is very heavy on Tokio and other dependencies that won't make sense for us to consume.	2025-08-10 23:15:26 -05:00

50 Commits
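
Note on commits 05b275f865 and 6bee6bb785: both messages say min/max needs the raw values kept in state, because a delete of the current minimum cannot be resolved from the accumulator alone. The sketch below illustrates that point only. It is not the Turso implementation: `MinState`, `apply`, and `min` are hypothetical names, and an in-memory `BTreeMap` stands in for the sorted btree key (storage_id, zset_id, element) described in the commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical illustration, not Turso code.
/// Keeps every live value with its weight so MIN survives deletions.
#[derive(Default)]
struct MinState {
    // value -> weight; keys stay sorted, similar to a btree primary key.
    values: BTreeMap<i64, i64>,
}

impl MinState {
    /// Apply one delta: weight +1 for an insert, -1 for a delete.
    fn apply(&mut self, value: i64, weight: i64) {
        let is_zero = {
            let w = self.values.entry(value).or_insert(0);
            *w += weight;
            *w == 0
        };
        if is_zero {
            // Weight cancelled out: the value is no longer present.
            self.values.remove(&value);
        }
    }

    /// Smallest value whose weight is still positive, if any.
    fn min(&self) -> Option<i64> {
        self.values
            .iter()
            .find(|(_, w)| **w > 0)
            .map(|(v, _)| *v)
    }
}

fn main() {
    let mut s = MinState::default();
    s.apply(3, 1); // current_min = 3
    s.apply(2, 1); // insert(2): new min could be found without state
    assert_eq!(s.min(), Some(2));
    s.apply(2, -1); // delete(2): the stored values are needed to find 3
    assert_eq!(s.min(), Some(3));
}
```

Storage grows with the number of distinct live values, which matches the O(N) cost mentioned in 05b275f865.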