turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-11 11:14:21 +01:00

Author	SHA1	Message	Date
Glauber Costa	d28022b491	support mixed integer and float expressions in the expr_compiler Fixes #3373	2025-09-26 21:11:38 -03:00
Pekka Enberg	9461e22c06	Merge 'Improve DBSP view serialization' from Glauber Costa Improve serialization for DBSP views. The serialization code was written organically, without much forward thinking about stability as we evolved the table and operator format. Now that this is done, we are at a point where we can actually make it suck less and take a considerable step towards making this production ready. We also add a simple version check (in the table name, because that is much easier than reading contents in parse_schema_row) to prevent views from being used if we had to do anything to evolve the format of the circuit (including the operators) Closes #3351	2025-09-26 09:18:45 +03:00
Glauber Costa	1b5e74060a	make sure that we are able to prevent views from being corrupted as we make changes to the way materialized views are generated (think adding new operators, changing the id of existing operators, etc), we will need to persist the topology of the circuit itself. This is a change that I believe to be premature. For now, it is enough to reserve the first operator id for it, and add a version number to the table name. We can just detect that something changed, and ask the user to drop the view. We can get away with it due to the fact that the views are experimental.	2025-09-25 22:52:08 -03:00
Glauber Costa	3dc1dca5a8	use 128-bit hashes for the zset_id We have used i64 before because that is the size of an integer in SQLite. However, I believe that for large enough databases, the chances of collision here are just too high. The effect of a collision is the database silently returning incorrect data in the materialized view. So now that everything else is working, we should move to i128.	2025-09-25 22:52:08 -03:00
Glauber Costa	b9011dfa16	Replace custom serialization with a saner version The Materialized View code had custom serialization written so we could move this code forward. Now that we have many operators and the views work, replace it with something saner. The main insight is that if we transform the AggregateState into Values before the serialization, we are able to just use standard SQLite serialization for the values. We then just have to add sizes, codes for the functions, etc (which are also represented as Values).	2025-09-25 22:52:08 -03:00
Pere Diaz Bou	91cff65e44	Merge 'Autoincrement' from Pavan Nambi fixes #1976 and #1605 ```zsh turso> DROP TABLE IF EXISTS t; CREATE TABLE t ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT ); turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence; ┌──────┬─────┐ │ name │ seq │ ├──────┼─────┤ │ t │ 1 │ └──────┴─────┘ turso> DROP TABLE IF EXISTS t; CREATE TABLE t ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT ); turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence; ┌──────┬─────┐ │ name │ seq │ ├──────┼─────┤ │ t │ 1 │ └──────┴─────┘ turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence; ┌──────┬─────┐ │ name │ seq │ ├──────┼─────┤ │ t │ 2 │ └──────┴─────┘ turso> ``` Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2983	2025-09-25 18:57:24 +02:00
Pekka Enberg	a50771fe38	core: Wrap Connection::query_only with AtomicBool	2025-09-24 19:23:13 +03:00
Pavan-Nambi	49d5141f2d	Merge remote-tracking branch 'origin/main' into cdc_fail_autoincrement	2025-09-24 18:06:02 +05:30
Pekka Enberg	fa8065ca52	core: Wrap Connection::autocommit in AtomicBool	2025-09-23 13:18:49 +03:00
Pekka Enberg	b94aa22499	core: Wrap Connection::schema in RwLock	2025-09-23 10:31:20 +03:00
Pekka Enberg	b857f94fe4	Merge 'core: Wrap Connection::pager in RwLock' from Pekka Enberg Closes #3247	2025-09-23 07:29:09 +03:00
Pavan Nambi	f1ac855441	Merge branch 'main' into cdc_fail_autoincrement	2025-09-22 21:11:26 +05:30
PThorpe92	10662ee5c5	Fix error in test missing DatabaseOpts field	2025-09-22 11:28:20 -04:00
Pekka Enberg	aa454a6637	core: Wrap Connection::pager in RwLock	2025-09-22 17:02:08 +03:00
Preston Thorpe	44dc4c9636	Merge 'translate/emitter: Implement partial indexes' from Preston Thorpe This PR adds support for partial indexes, e.g. `CREATE INDEX` with a provided predicate ```sql CREATE UNIQUE INDEX idx_expensive ON products(sku) where price > 100; ``` The PR does not yet implement support for using the partial indexes in the optimizer. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #3228	2025-09-22 09:09:54 -04:00
Pekka Enberg	0144ea8059	Merge 'Support UNION queries in DBSP-based Materialized Views' from Glauber Costa UNION queries, while useful on their own, are a cornerstone of recursive CTEs. This PR implements: * the merge operator, required to merge both sides of a union query. * the circuitry necessary to issue the Merge operator. * extraction of tables mentioned in union and CTE expressions, so we can correctly populate tables that contain them. Closes #3234	2025-09-22 11:33:19 +03:00
Glauber Costa	2627ad44de	support union statements in the DBSP circuit compiler	2025-09-21 21:00:27 -03:00
Glauber Costa	b419db489a	Implement the DBSP merge operator The Merge operator is a stateless operator that merges two deltas. There are two modes: Distinct, where we merge together values that are the same, and All, where we preserve all values. We use the rowid of the hashable row to guarantee that: In Distinct mode, the rowid is set to 0 in both sides. If the values are the same, they will hash to the same thing. For All, the rowids are different. The merge operator is used for the UNION statement, which is a cornerstone of Recursive CTEs.	2025-09-21 21:00:27 -03:00
Glauber Costa	9f54f60d45	make sure that complex select statements are captured by MV populate The population code extracts table information from the select statement so it can populate the materialized view. But the code, as written today, is naive. It doesn't capture table information correctly if there is more than one select statement (such as in the case of a union query).	2025-09-21 21:00:27 -03:00
Pavan-Nambi	51cf410b56	add has_autoincrement to all test tables from main branch	2025-09-21 16:10:45 +05:30
Pavan Nambi	47194d7658	Merge branch 'tursodatabase:main' into cdc_fail_autoincrement	2025-09-21 16:03:38 +05:30
Glauber Costa	13260349b0	Return a parse error for a non-equality join We currently don't handle non equality, but end up just returning a bogus result. Let's parse error.	2025-09-20 20:35:10 -03:00
PThorpe92	a0f574d279	Add where_clause expr field to Index	2025-09-20 14:38:47 -04:00
Glauber Costa	f2f7f817e4	populate all tables in IncrementalView For joins to work, we have to populate all referenced tables when we create the view.	2025-09-19 03:59:28 -05:00
Glauber Costa	e5a106d8d6	enable joins in IncrementalView	2025-09-19 03:59:28 -05:00
Glauber Costa	832a4d7034	generate projection nodes inside filter clauses We are currently not able to properly compute things like WHERE a+b=2. Let's generate a projection node inside a filter when needed.	2025-09-19 03:59:28 -05:00
Glauber Costa	627f61aa81	support column comparisons in the filter operator We currently only support column / literal comparisons in the filter operator. But with JOINs, comparisons are usually against two columns. Do the work to support it.	2025-09-19 03:59:28 -05:00
Glauber Costa	47097fbec6	Add tests for project operator working with ambiguous columns Unlike the other operators, project works just fine with ambiguous columns, because it works with compiled expressions. We don't need to patch it, but let's make sure it keeps working by writing a test.	2025-09-19 03:59:28 -05:00
Glauber Costa	e80dd8e5e1	move the filter operator to accept indexes instead of names We already did similarly for the AggregateOperator: for joins you can have the same column name in many tables. And passing schema information to the operator is a layering violation (the operator may be operating on the result of a previous node, and at that point there is no more "schema"). Therefore we pass indexes into the column set the operator has. The FilterOperator has a complication: we are using it to generate the SQL for the populate statement, and that needs column names. However, we should not be using the FilterOperator for that, and that is a relic from the time where we had operator information directly inside the IncrementalView. To enable moving the FilterOperator to index-based, we rework that code. For joins, we'll need to populate many tables anyway, so we take the time to do that work here.	2025-09-19 03:59:28 -05:00
Glauber Costa	f149b40e75	Implement JOINs in the DBSP circuit This PR improves the DBSP circuit so that it handles the JOIN operator. The JOIN operator exposes a weakness of our current model: we usually pass a list of columns between operators, and find the right column by name when needed. But with JOINs, many tables can have the same columns. The operators will then find the wrong column (same name, different table), and produce incorrect results. To fix this, we must do two things: 1) Change the Logical Plan. It needs to track table provenance. 2) Fix the aggregators: it needs to operate on indexes, not names. For the aggregators, note that table provenance is the wrong abstraction. The aggregator is likely working with a logical table that is the result of previous nodes in the circuit. So we just need to be able to tell it which index in the column array it should use.	2025-09-19 03:59:28 -05:00
Glauber Costa	9f3d119a5a	move hashable row tests to dbsp.rs The operator.rs file was so huge, that we didn't even notice there was a test block in the middle of the file that was testing things that were long moved to dbsp.rs (the HashableRow). Move the tests there now.	2025-09-19 03:59:28 -05:00
Glauber Costa	e2f0e372a1	move the join operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:28 -05:00
Glauber Costa	aa8fcdbe54	move the aggregate operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:59:24 -05:00
Glauber Costa	7178d8d31c	move the project operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	ee914fc543	move the filter operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	9747d6c6b6	move the input operator to its own file. The code is becoming impossible to reason about with everything in operator.rs	2025-09-19 03:57:11 -05:00
Glauber Costa	6be5eb74d9	Implement the Join Operator The join operator is also a stateful operator. It keeps the input deltas stored in the state, for both the left and right branches of the join. JOINs extract a join key, which is the values that were used in the join's equality statement. That key is now our zset_id, and it points to a collection of rows.	2025-09-19 03:57:11 -05:00
Glauber Costa	5b4a6e5c2d	view: catch all tables mentioned, instead of just one. Ahead of the implementation of JOINs, we need to evolve the IncrementalView, which currently only accepts a single base table, to keep a list of tables mentioned in the statement.	2025-09-19 03:57:11 -05:00
Glauber Costa	0b3317d449	extract columns from all tables in case of joins. Our code for view needs to extract the list of columns used in the view. We currently extract only from "the base table", but once we have joins, we need a more complex structure, that keeps the mapping of (tables, columns). This actually affects both views and materialized views: for views, the queries with joins work just fine, because views are just aliases for a query. But the list of columns returned by pragma table_info on the view is incorrect. We add a test to make sure it is fixed. For materialized views, we add extensive tests to make sure that the columns are extracted correctly.	2025-09-19 03:57:11 -05:00
Pekka Enberg	3f35267b7c	core/mvcc: Kill noop storage We don't need it for anything.	2025-09-19 08:52:57 +03:00
Pekka Enberg	0ce6469a4b	Merge 'Fix some Rust compilation warnings' from Samuel Marks Nothing fancy yet, assuming you merge this I'll do this one next: ``` warning: function pointer comparisons do not produce meaningful results since their addresses are not guaranteed to be unique --> core/types.rs:403:5 | 398 | #[derive(Debug, Clone, PartialEq)] | --------- in this derive macro expansion ... 402 | pub step_fn: StepFunction, | ^^^^^^^^^^^^^^^^^^^^^^^^^ 403 | pub finalize_fn: FinalizeFunction, | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | = note: the address of the same function can vary between different codegen units = note: furthermore, different functions could have the same address after being merged together = note: for more information visit <https://doc.rust-lang.org/nightly/core/ptr/fn.fn_addr_eq.html> ``` And fix a test failure that I resolved in Python (specific to macOS hosts). Basically this PR is putting my toe in the water to see how open you are to contribs! Closes #3211	2025-09-19 08:28:53 +03:00
Samuel Marks	e333f151ba	[*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing")	2025-09-18 22:47:43 -05:00
Pere Diaz Bou	ff3c79d5d7	remove mvvmode and set logical log as default	2025-09-18 18:22:25 +02:00
Pere Diaz Bou	e2824835dc	fix all open_file use cases for mvcc mode	2025-09-18 18:22:05 +02:00
Pere Diaz Bou	de8a975a0b	core/mvcc: introduce MvccMode Logical Log	2025-09-18 18:21:04 +02:00
Pavan-Nambi	020921f803	Merge remote-tracking branch 'upstream/main' into cdc_fail_autoincrement	2025-09-18 19:27:19 +05:30
Jussi Saurio	91ef4e5e9d	Merge 'Introduce instruction VTABLE' from Lâm Hoàng Phúc this PR improves 3-6% for `prepare` benchmark without slowing down others. After this PR we don't have to store `InsnFunction` in `Program` and `ProgramBuilder` anymore, because `to_function` will return result without matching. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3098	2025-09-18 09:18:48 +03:00
Pekka Enberg	c2b8bb0a2f	core/incremental: Wrap ViewTransactionState in Arc Make it Send.	2025-09-17 12:23:29 +03:00
Jussi Saurio	9a2797963a	Merge 'Remove LimboResult enum and InsnFunctionStepResult::Busy variant' from Jussi Saurio We can just use `LimboError::Busy` for both of these. Reviewed-by: Pekka Enberg <penberg@iki.fi> Closes #3170	2025-09-17 12:06:54 +03:00
TcMits	668f1f721c	resolve conflict	2025-09-17 15:25:58 +07:00
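
The two modes described in commit b419db489a (the DBSP merge operator) can be illustrated with a minimal sketch. This is not the code from the repository: the `Row` struct, the `MergeMode` enum, the `merge` function, and the rule used to keep rowids different in `All` mode (`rowid * 2 + side`) are all hypothetical names and choices made for illustration. The only parts taken from the commit message are that `Distinct` sets the rowid to 0 on both sides, so equal values hash to the same key, and that `All` keeps the rowids different.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a hashable row: the rowid takes part in the hash.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Row {
    rowid: i64,
    values: Vec<i64>,
}

// Hypothetical names for the two modes mentioned in the commit message.
#[derive(Debug, Clone, Copy)]
enum MergeMode {
    Distinct,
    All,
}

// Merge two deltas (lists of (row, weight) pairs) without keeping any state
// between calls. Rows that end up with the same key have their weights summed.
fn merge(left: &[(Row, i64)], right: &[(Row, i64)], mode: MergeMode) -> HashMap<Row, i64> {
    let mut out: HashMap<Row, i64> = HashMap::new();
    for (side, delta) in [left, right].iter().enumerate() {
        for (row, weight) in delta.iter() {
            let rowid = match mode {
                // Same rowid on both sides: equal values collapse into one key.
                MergeMode::Distinct => 0,
                // Assumed scheme: interleave rowids so the two sides never clash.
                MergeMode::All => row.rowid * 2 + side as i64,
            };
            let key = Row { rowid, values: row.values.clone() };
            *out.entry(key).or_insert(0) += *weight;
        }
    }
    out
}

fn main() {
    let left = vec![(Row { rowid: 1, values: vec![10] }, 1)];
    let right = vec![(Row { rowid: 7, values: vec![10] }, 1)];
    println!("distinct: {:?}", merge(&left, &right, MergeMode::Distinct));
    println!("all: {:?}", merge(&left, &right, MergeMode::All));
}
```

In this sketch, two rows carrying the same values land on one key in `Distinct` mode and on two separate keys in `All` mode, which is the property the commit message relies on for UNION versus UNION ALL.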


106 Commits
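
Commit 3dc1dca5a8 argues that 64-bit `zset_id` hashes collide too easily on large databases. The commit gives no figures, so the sketch below is only a rough way to reason about that claim: it applies the standard birthday approximation, p ≈ 1 − exp(−n² / (2 · 2^bits)), under the assumption that hashes are uniformly distributed. The function name `collision_probability` and the row count of one billion are hypothetical.

```rust
// Approximate probability that at least two of `n` uniformly distributed
// hashes of width `bits` collide (birthday approximation).
fn collision_probability(n: f64, bits: i32) -> f64 {
    let space = 2f64.powi(bits);
    let x = -(n * n) / (2.0 * space);
    // 1 - exp(x), computed via exp_m1 to keep precision when x is tiny.
    -x.exp_m1()
}

fn main() {
    let rows = 1e9; // hypothetical number of distinct keys
    println!("64-bit:  {:e}", collision_probability(rows, 64));
    println!("128-bit: {:e}", collision_probability(rows, 128));
}
```

Under these assumptions, a billion keys give a collision chance of a few percent with 64-bit hashes and a negligible one with 128-bit hashes, which is consistent with the motivation stated in the commit.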