turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-18 17:14:20 +01:00

Author	SHA1	Message	Date
pedrocarlo	dbb7d6f532	reprepare optimization using reset()	2025-09-11 16:59:53 -03:00
pedrocarlo	6264d694d5	on reprepare create new state with updated number of cursors and registers, so that the Program insns are in sync with ProgramState	2025-09-11 12:50:22 -03:00
TcMits	4c17fa87c5	remove .explain()	2025-09-11 18:28:46 +07:00
TcMits	68e8d5a36b	clippy	2025-09-11 18:16:01 +07:00
TcMits	b574b4bcea	finish EXPLAIN	2025-09-11 18:04:59 +07:00
Nikita Sivukhin	87d49cd039	cargo fmt after rebase	2025-09-07 20:08:10 +04:00
Nikita Sivukhin	db7c6b3370	try to speed up count(*) where 1 = 1	2025-09-07 19:55:42 +04:00
pedrocarlo	e6344db5b1	remove Refcell from Cursor	2025-09-06 01:46:21 -03:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views
This fairly long commit implements persistence for materialized views. It is hard to split because of all the interdependencies between components, so it is one big thing. This commit message will at least try to go into details about the basic architecture.
Materialized Views as tables
============================
Materialized views are now a normal table, whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write.
Materialized Views as Zsets
===========================
The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet is that ours is backed by a BTree, not a Hash (since SQLite tables are BTrees).
Aggregator State
================
In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values.
The Materialized View Cursor
============================
Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction. Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
Pere Diaz Bou	8db5cead07	core/mvcc: only commit if there is a txn	2025-09-03 14:12:48 +02:00
Pere Diaz Bou	b8f83e1fc0	clippy and fmt stuff because if not pekka will tweet	2025-09-03 12:47:55 +02:00
Pere Diaz Bou	13c505109a	core/mvcc: make commit_txn return on I/O	2025-09-02 17:07:38 +02:00
pedrocarlo	53cfae1db4	return Error from step if IO failed	2025-09-01 11:10:39 -03:00
Pekka Enberg	9fc5947fa6	core/vdbe: Micro-optimize "zero_or_null" opcode It's a hot instruction for TPC-H, for example, so worth optimizing. Reduces op_zero_or_null() from 5.6% to 2.4% in CPU flamegraph for TPC-H Q1.	2025-08-29 14:38:50 +03:00
Glauber Costa	097510216e	implement the projector operator for DBSP
My goal with this patch is to be able to implement the ProjectOperator for DBSP circuits using VDBE for expression evaluation. Not doing so is dangerous for the following reason: we will end up with different, subtle, and incompatible behavior between SQLite expressions if they are used in views versus outside of views. In fact, even our prototype had them: our projection tests, which used to pass, were actually wrong =) (sqlite would return something different if those functions were executed outside the view context)
For optimization reasons, we single out trivial expressions: they don't have to go through VDBE. Trivial expressions are expressions that only involve Columns, Literals, and simple operators on elements of the same type. Even type coercion takes this out of the realm of trivial. Everything that is not trivial is then translated with translate_expr, in the same way SQLite will, and then compiled with VDBE.
We can, over time, make this process much better. There are essentially infinite opportunities for optimization here. But for now, the main warts are:
* VDBE execution needs a connection
* There is no good way in VDBE to pass parameters to a program.
* It is almost trivial to pollute the original connection. For example, we need to issue HALT for the program to stop, but seeing that halt will usually cause the program to try and halt the original program.
Subprograms, like the ones we use in triggers, are a possible solution, but they are much more expensive to execute, especially given that our execution would essentially have to have a program with no other role than to wrap the subprogram.
Therefore, what I am doing is:
* There is an in-memory database inside the projection operator (an obvious optimization is to share it with all projection operators).
* We obtain a connection to that database when the operator is created
* We use that connection to execute our VDBE, which offers a clean, safe and isolated way to execute the expression.
* We feed the values to the program manually by editing the registers directly.	2025-08-25 17:48:17 +03:00
Nikita Sivukhin	f7ad55b680	remove unnecessary argument	2025-08-25 12:24:39 +04:00
Jussi Saurio	54ff656c9d	Do not clear txn state inside nested statement
If a connection does e.g. CREATE TABLE, it will start a "child statement" to reparse the schema. That statement does not start its own transaction, and so should not try to end the existing one either.
We had a logic bug where these steps would happen:
- `CREATE TABLE` executed successfully
- pread fault happens inside `ParseSchema` child stmt
- `handle_program_error()` is called
- `pager.end_tx()` returns immediately because `is_nested_stmt` is true and we correctly no-op it.
- however, crucially: `handle_program_error()` then sets tx state to None
- parent statement now catches error from nested stmt and calls `handle_program_error()`, which calls `pager.end_tx()` again, and since txn state is None, when it calls `rollback()` we panic on the assertion `"dirty pages should be empty for read txn"`
Solution: Do not do _any_ error processing in `handle_program_error()` inside a nested stmt. This means that the parent write txn is still active when it processes the error from the child and we avoid this panic.	2025-08-25 08:49:22 +03:00
Pekka Enberg	9d2f26bb04	sqlite3: Implement sqlite3_clear_bindings()	2025-08-24 19:33:18 +03:00
Pekka Enberg	1dc6fb97c0	Merge 'core/mvcc: store txid in conn and reset transaction state on commit ' from Pere Diaz Bou We were storing `txid` in `ProgramState`, this meant it was impossible to track interactive transactions. This was extracted to `Connection` instead. Moreover, transaction state for mvcc now is reset on commit. Closes #2689	2025-08-20 16:51:41 +03:00
Pere Diaz Bou	9e3b7b0c98	core/mvcc: store txid in conn and reset transaction state on commit	2025-08-20 12:23:28 +02:00
Jussi Saurio	e5f04ae100	Merge 'refactor/vdbe: move insert-related seeking to VDBE from BTreeCursor' from Jussi Saurio This gets rid of `InsertState` in `BTreeCursor` plus the `moved_before` parameter to `BTreeCursor::insert` -- instead, seek logic is now in the existing state machines for `op_insert` and `op_idx_insert` Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #2639	2025-08-20 11:15:09 +03:00
pedrocarlo	7e98a464a7	check if completion finished instead of completed for step	2025-08-20 00:38:16 -03:00
Jussi Saurio	c2855cb0db	refactor/idx_insert: move seeking to VDBE instead of BTreeCursor Also removes `InsertState` and `moved_before` since neither are needed anymore.	2025-08-19 19:04:42 +03:00
Glauber Costa	36fc8e8fdb	add metrics and implement the .stats command
This adds basic statement and connection metrics like SQLite (and libSQL) have. This is particularly useful to show that materialized views are working:
turso> create table t(a);
turso> insert into t(a) values (1) , (2), (3), (4), (5), (6), (7), (8), (9), (10);
turso> create materialized view v as select count(*) from t;
turso> .stats on
Stats display enabled.
turso> select count(*) from t;
┌───────────┐
│ count (*) │
├───────────┤
│ 10        │
└───────────┘
Statement Metrics:
  Row Operations:
    Rows read: 10
    Rows written: 0
[ ... other metrics ... ]
turso> select * from v;
┌───────────┐
│ count (*) │
├───────────┤
│ 10        │
└───────────┘
Statement Metrics:
  Row Operations:
    Rows read: 1
    Rows written: 0
[ ... other metrics ... ]	2025-08-18 09:11:06 -05:00
Mikaël Francoeur	2ee0132afe	rename functions	2025-08-15 17:08:53 -04:00
Glauber Costa	337f27a433	rename some structures to mention materialized views A lot of the structures we have - like the ones under Schema, are specific for materialized views. In preparation to adding normal views, rename them, so things are less confusing.	2025-08-13 14:13:16 -05:00
Nikita Sivukhin	5838efe7dd	rename flag to wal_auto_checkpoint_disabled	2025-08-13 15:26:25 +04:00
pedrocarlo	fbe7e685ce	adjust mvcc code to return completions in state machines	2025-08-13 10:24:55 +03:00
pedrocarlo	78cb61c1fe	before stepping to next instruction check for io	2025-08-13 10:24:55 +03:00
pedrocarlo	10cadd4037	do not use `StepResult` for `commit_txn`	2025-08-12 12:28:35 -03:00
pedrocarlo	9ab07f59ad	adjust mvcc state transitions	2025-08-12 12:28:35 -03:00
pedrocarlo	fc5492bf2c	state machine for `op_row_id`	2025-08-12 12:28:35 -03:00
pedrocarlo	1221f65d10	state machine for `op_column`	2025-08-12 12:28:35 -03:00
Jussi Saurio	9b5e61eacd	Merge 'Reprepare fix on write statement' from Pedro Muniz We have to update the Transaction State before checking for the Schema Cookie so that we can rollback the transaction later on correctly. Closes #2535 Closes #2549	2025-08-12 10:18:12 +03:00
pedrocarlo	96a6bc5125	`end_tx` does not need schema_did_change variable	2025-08-11 18:59:11 -03:00
Jussi Saurio	44c91f6752	fix/vdbe: fix state handling for incremental views in op_delete	2025-08-11 19:07:29 +03:00
Jussi Saurio	f38333b373	fix/vdbe: fix state handling for incremental views
- When the rowid is changed in UPDATE, it is handled as a combination of DELETE + INSERT, so we don't need to delete the old values in that case
- We should only update the views after the operation on the btree is done
- A proper state machine is needed to handle IO yielding points	2025-08-11 19:02:15 +03:00
Pekka Enberg	cdaea7f274	core/vdbe: Make apply_view_deltas() return early if views are disabled
Currently, we have a borrow problem because parse_schema_rows() already borrows `schema`, but then `apply_view_deltas` does the same:
```
thread 'main' panicked at core/vdbe/mod.rs:450:49:
already mutably borrowed: BorrowError
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:697:5
   1: core::panicking::panic_fmt
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:75:14
   2: core::cell::panic_already_mutably_borrowed
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/cell.rs:799:5
   3: core::cell::RefCell<T>::borrow
             at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/cell.rs:987:25
   4: turso_core::vdbe::Program::apply_view_deltas
             at ./core/vdbe/mod.rs:450:26
   5: turso_core::vdbe::Program::commit_txn
             at ./core/vdbe/mod.rs:468:9
   6: turso_core::vdbe::execute::op_halt
             at ./core/vdbe/execute.rs:1954:15
   7: turso_core::vdbe::Program::step
             at ./core/vdbe/mod.rs:430:19
   8: turso_core::Statement::step
             at ./core/lib.rs:1914:23
   9: turso_core::util::parse_schema_rows
             at ./core/util.rs:91:15
  10: turso_core::Connection::parse_schema_rows::{{closure}}
             at ./core/lib.rs:1518:17
  11: turso_core::Connection::with_schema_mut
             at ./core/lib.rs:1625:9
  12: turso_core::Connection::parse_schema_rows
             at ./core/lib.rs:1515:9
```
However, this is a read transaction and views are not even enabled, let's just make `apply_view_deltas()` return early if there's no processing needed, to skip the schema borrow altogether.	2025-08-11 12:26:11 +03:00
Glauber Costa	145d6eede7	Implement very basic views using DBSP This is just the bare minimum that I needed to convince myself that this approach will work. The only views that we support are slices of the main table: no aggregations, no joins, no projections. drop view is implemented. view population is implemented. deletes, inserts and updates are implemented. much like indexes before, a flag must be passed to enable views.	2025-08-10 23:34:04 -05:00
Glauber Costa	d1be7ad0bb	implement the collseq bytecode instruction SQLite generates those in aggregations like min / max with collation information either in the table definition or in the column expression. We currently generate the wrong result here, and properly generating the bytecode instruction fixes it.	2025-08-05 13:49:04 -05:00
Jussi Saurio	1feb5ba2d3	perf/vdbe: avoid doing work in commit_txn if not in txn	2025-08-05 15:25:28 +03:00
Jussi Saurio	3f633247f7	perf/stmt: avoid checking for SchemaUpdated errors if it's impossible	2025-08-05 15:10:55 +03:00
pedrocarlo	f8eb4ba14d	implement reprepare for statements	2025-08-04 12:32:34 -03:00
pedrocarlo	54636241c2	store Sql String inside `Program` for reprepare	2025-08-04 12:32:34 -03:00
Mikaël Francoeur	81412b4a17	use state machine for NoConflict opcode	2025-08-01 17:29:57 -04:00
Pere Diaz Bou	764523a8bb	core/mvcc: fix tests with state machines	2025-08-01 15:48:09 +02:00
Pere Diaz Bou	c3f00475eb	state_machine: rename transition -> step	2025-08-01 13:56:57 +02:00
Pere Diaz Bou	0f70e7101f	core/state_machine: move state_machine to its own file	2025-08-01 12:49:32 +02:00
Pere Diaz Bou	27757ab4eb	core/mvcc commit_txn generic state machinery
Unfortunately it seems we are never reaching the point to remove state machines, so might as well make it easier to make. There are two points that must be highlighted:
1. There is a `StateTransition` trait implemented like:
```rust
pub trait StateTransition {
    type State;
    type Context;

    fn transition<'a>(&mut self, context: &Self::Context) -> Result<TransitionResult>;
    fn finalize<'a>(&mut self, context: &Self::Context) -> Result<()>;
    fn is_finalized(&self) -> bool;
}
```
where there exists `transition` which tries to move state forward, and `finalize` which marks the state machine as "finalized" so that no other call to finalize will forward the state and it will panic instead.
2. Before, we would store the state of a state machine inside the callee's struct, but I'm proposing we do something different where the callee will return the state machine and the caller will be responsible for advancing it. This way we don't need to track many reset operations in case of failures or rollbacks, and instead we could simply drop a state machine and all other nested state machines will drop in a cascade.	2025-08-01 12:36:02 +02:00
Pere Diaz Bou	c4318cac36	core/mvcc: fix tests	2025-08-01 10:38:41 +02:00
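
The caller-driven state machine described in commit 27757ab4eb can be illustrated with a small sketch. This is not the code in the repository: it only mirrors the shape of the `StateTransition` trait quoted in that commit message. The `TransitionResult` variants, the `String` error type, and the names `ToyCommit`, `CommitState`, and `run_to_completion` are all hypothetical, chosen for illustration.

```rust
// Hypothetical result type: the variant names are assumptions, not the
// repository's definition.
#[derive(Debug, PartialEq)]
pub enum TransitionResult {
    Io,       // waiting on I/O; the caller should call `transition` again later
    Continue, // progress was made; call `transition` again
    Done,     // the machine reached its final state
}

// Same shape as the trait in the commit message, with a plain String error.
pub trait StateTransition {
    type State;
    type Context;

    fn transition(&mut self, context: &Self::Context) -> Result<TransitionResult, String>;
    fn finalize(&mut self, context: &Self::Context) -> Result<(), String>;
    fn is_finalized(&self) -> bool;
}

// Toy commit machine: "flushes" a number of pages, yielding once per page.
#[derive(Debug, Clone, Copy)]
pub enum CommitState {
    Start,
    Flushing { remaining: u32 },
    Committed,
}

pub struct ToyCommit {
    state: CommitState,
    finalized: bool,
}

impl ToyCommit {
    pub fn new() -> Self {
        ToyCommit { state: CommitState::Start, finalized: false }
    }
}

impl StateTransition for ToyCommit {
    type State = CommitState;
    type Context = u32; // hypothetical context: number of pages to flush

    fn transition(&mut self, pages: &u32) -> Result<TransitionResult, String> {
        // A finalized machine must not be advanced again.
        assert!(!self.finalized, "transition called after finalize");
        match self.state {
            CommitState::Start => {
                self.state = CommitState::Flushing { remaining: *pages };
                Ok(TransitionResult::Continue)
            }
            CommitState::Flushing { remaining: 0 } => {
                self.state = CommitState::Committed;
                Ok(TransitionResult::Done)
            }
            CommitState::Flushing { remaining } => {
                self.state = CommitState::Flushing { remaining: remaining - 1 };
                Ok(TransitionResult::Io)
            }
            CommitState::Committed => Ok(TransitionResult::Done),
        }
    }

    fn finalize(&mut self, _pages: &u32) -> Result<(), String> {
        self.finalized = true;
        Ok(())
    }

    fn is_finalized(&self) -> bool {
        self.finalized
    }
}

// The caller owns the machine and advances it; dropping the machine on an
// error path discards all of its state. Returns the number of steps taken.
pub fn run_to_completion<M: StateTransition>(
    machine: &mut M,
    context: &M::Context,
) -> Result<u32, String> {
    let mut steps = 0;
    loop {
        steps += 1;
        if machine.transition(context)? == TransitionResult::Done {
            break;
        }
    }
    machine.finalize(context)?;
    Ok(steps)
}
```

Because the caller holds the machine, a failure or rollback needs no explicit reset: the caller drops the value and any nested machines it owns go with it, which is the motivation given in the commit message.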


622 Commits
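
The weight merging mentioned in commit 08b2e685d5 (a persisted ZSet combined with ephemeral transaction data) can be sketched as follows. This is a simplified illustration under stated assumptions, not the repository's cursor implementation: it uses an in-memory `BTreeMap` in place of the on-disk BTree, a single `String` column per row, and a hypothetical function name `merge_zsets`.

```rust
use std::collections::BTreeMap;

// Hypothetical row shape for illustration: (rowid, value).
type Row = (i64, String);

// Combine the persisted ZSet with the uncommitted transaction delta by
// summing weights per row, then keep only rows whose net weight is positive.
fn merge_zsets(persisted: &BTreeMap<Row, i64>, txn_delta: &[(Row, i64)]) -> Vec<Row> {
    let mut merged = persisted.clone();
    for (row, weight) in txn_delta {
        *merged.entry(row.clone()).or_insert(0) += *weight;
    }
    merged
        .into_iter()
        .filter(|(_, weight)| *weight > 0)
        .map(|(row, _)| row)
        .collect()
}
```

In this sketch an update inside the transaction appears as a weight of -1 on the old row plus a weight of +1 on the new row, so the old row cancels out and only the new one is visible to the reader.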