turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-08 02:34:20 +01:00

Author	SHA1	Message	Date
Piotr Rzysko	f5efcbe745	Add support for window functions Adds initial support for window functions. For now, only existing aggregate functions can be used as window functions—no specialized window-specific functions are supported yet. Currently, only the default frame definition is implemented: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS.	2025-09-13 11:12:44 +02:00
Jussi Saurio	e3bd00883b	Fix creation of automatic indexes indexes with the naming scheme "sqlite_autoindex_<tblname>_<number>" are automatically created when a table is created with UNIQUE or PRIMARY KEY definitions. these indexes must map to the table definition SQL in definition order, i.e. sqlite_autoindex_foo_1 must be the first instance of UNIQUE or PRIMARY KEY and so on. this commit fixes our autoindex creation / parsing so that this invariant is upheld.	2025-09-11 14:11:30 +03:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views This fairly long commit implements persistence for materialized view. It is hard to split because of all the interdependencies between components, so it is a one big thing. This commit message will at least try to go into details about the basic architecture. Materialized Views as tables ============================ Materialized views are now a normal table - whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write. Materialized Views as Zsets =========================== The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet, is that obviously ours is backed by a BTree, not a Hash (since SQLite tables are BTrees) Aggregator State ================ In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values. The Materialized View Cursor ============================ Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction. 
Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
Preston Thorpe	2ea2be6f85	Merge 'prevent modification to system tables.' from Glauber Costa SQLite does not allow us to modify system tables, but we do. Let's fix it. Reviewed-by: Preston Thorpe <preston@turso.tech> Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2855	2025-09-04 19:57:04 -04:00
Glauber Costa	032eabb3a4	prevent modification to system tables. SQLite does not allow us to modify system tables, but we do. Let's fix it.	2025-09-04 17:34:47 -05:00
bit-aloo	51d40092db	add empty table references, and error out if table references are present in limit/offset	2025-08-26 19:56:25 +05:30
bit-aloo	a3b87cd97f	add review comments	2025-08-26 19:56:25 +05:30
Pekka Enberg	26ba09c45f	Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz" This reverts commit `71c1b357e4`, reversing changes made to `6bc568ff69` because it actually makes things slower.	2025-08-26 14:58:21 +03:00
pedrocarlo	d3240844ec	refactor Core to remove the double indirection	2025-08-25 22:59:31 -03:00
themixednuts	80eca66be9	fix: normalize quotes in update fixes: #2744	2025-08-23 03:17:03 -05:00
Levy A.	4ba1304fb9	complete parser integration	2025-08-21 15:23:59 -03:00
Levy A.	186e2f5d8e	switch to new parser	2025-08-21 15:19:16 -03:00
Jussi Saurio	9d44e97a7a	Fix: all indexes need to be updated if the rowid changes	2025-08-21 15:48:46 +03:00
Nikita Sivukhin	5d0ada9fb9	add "updates" column for cdc table	2025-08-11 12:46:15 +04:00
Jussi Saurio	21dc2d0161	translate: return parse errors for unsupported features instead of silently ignoring	2025-08-08 11:39:30 +03:00
Piotr Rzysko	82491ceb6a	Integrate virtual tables with optimizer This change connects virtual tables with the query optimizer. The optimizer now considers virtual tables during join order search and invokes their best_index callbacks to determine feasible access paths. Currently, this is not a visible change, since none of the existing extensions return information indicating that a plan is invalid.	2025-08-05 05:48:28 +02:00
Piotr Rzysko	718598eab8	Introduce scan type Different scan parameters are required for different table types. Currently, index and iteration direction are only used by B-tree tables, while the remaining table types don’t require any parameters. Planning access to virtual tables, however, will require passing additional information from the planner, such as the virtual table index (distinct from a B-tree index) and the constraints that must be forwarded to the `filter` method.	2025-08-04 20:27:22 +02:00
Jussi Saurio	86b1232268	chore: enable indexes by default	2025-08-01 15:44:56 +03:00
Diego Reis	ab01b4e8ca	Refactor `UPDATE .. SET` row values logic and add some comments	2025-07-31 00:08:15 -03:00
Diego Reis	31c73f3c9a	Add basic support for row values in `UPDATE .. SET` statements e.g. `.. SET (a, b) = (1, 2)` is equivalent to `.. SET a = 1, b = 2`. In addition, for repeated lhs values such as `(a, a)`, the last rhs prevails; so `.. SET (a, a) = (1, 2)` is equivalent to `.. SET a = 2`	2025-07-31 00:08:15 -03:00
Pere Diaz Bou	752a876f9a	change every Rc to Arc in schema internals	2025-07-28 10:51:17 +02:00
Pekka Enberg	6bf6cc28e4	Merge 'Implement the Returning statement for inserts and updates' from Glauber Costa They are very similar. DELETE is very different, so that one we'll do it later. Closes #2276	2025-07-27 09:11:16 +03:00
Glauber Costa	5d8d08d1b6	Implement the Returning statement for inserts and updates They are very similar. DELETE is very different, so that one we'll do it later.	2025-07-26 09:01:09 -05:00
Iaroslav Zeigerman	f13b9105b9	Fix error handling when binding column references while translating the UPDATE statement	2025-07-26 04:51:42 -07:00
Glauber Costa	b5927dcfd5	support doubly qualified identifiers	2025-07-25 14:52:45 -05:00
Pekka Enberg	669b231714	Merge 'parser: Distinguish quoted identifiers and unify Id into Name enum' from bit-aloo Closes: #1947 This PR replaces the `Name(pub String)` struct with a `Name` enum that explicitly models how the name appeared in the source either as an unquoted identifier (`Ident`) or a quoted string (`Quoted`). In the process, the separate `Id` wrapper type has been coalesced into the `Name` enum, simplifying the AST and reducing duplication in identifier handling logic. While this increases the size of some AST nodes (notably `yyStackEntry`), it improves correctness and makes source structure more explicit for later phases. cc: @levydsa Reviewed-by: Levy A. (@levydsa) Reviewed-by: Preston Thorpe (@PThorpe92) Closes #2251	2025-07-25 12:08:54 +03:00
Glauber Costa	988b16f962	Support ATTACH (read only) Support for attaching databases. The main difference from SQLite is that we support an arbitrary number of attached databases, and we are not bound to just 100ish. We for now only support read-only databases. We open them as read-only, but also, to keep things simple, we don't patch any of the insert machinery to resolve foreign tables. So if an insert is tried on an attached database, it will just fail with a "no such table" error - this is perfect for now. The code in core/translate/attach.rs is written by Claude, who also played a key part in the boilerplate for stuff like the .databases command and extending the pragma database_list, and also aided me in the test cases.	2025-07-24 19:19:48 -05:00
bit-aloo	9a54ef214e	parser: Distinguish quoted identifiers and unify Id into Name enum This commit replaces the `Name(pub String)` struct with a `Name` enum that explicitly models how the name appeared in the source either as an unquoted identifier (`Ident`) or a quoted string (`Quoted`). In the process, the separate `Id` wrapper type has been coalesced into the `Name` enum, simplifying the AST and reducing duplication in identifier handling logic. While this increases the size of some AST nodes (notably `yyStackEntry`), it improves correctness and makes source structure more explicit for later phases.	2025-07-24 14:40:19 +05:30
Ihor Andrianov	37dd5da436	clippy	2025-07-16 15:05:06 +03:00
Ihor Andrianov	6d4e542522	last set clause wins	2025-07-16 14:56:10 +03:00
Piotr Rzysko	000d70f1f3	Propagate info about hidden columns	2025-07-14 07:16:53 +02:00
Nikita Sivukhin	c9c5ef4e25	remove query_mode from ProgramBuilderOpts and from function arguments - mode never changes and ProgramBuilder is already created with the proper mode set	2025-07-02 13:24:12 +04:00
pedrocarlo	7e0225b1af	add some comments	2025-06-29 17:37:46 -03:00
pedrocarlo	738e2cc06c	do not emit ephemeral plan when doing a SeekRowId + emit Delete instruction when rowid in set clause	2025-06-29 17:12:24 -03:00
Pekka Enberg	725c3e4ddc	Rename `limbo_sqlite3_parser` crate to `turso_sqlite3_parser`	2025-06-29 12:34:46 +03:00
Pekka Enberg	2fc5c0ce5c	Switch to runtime flag for enabling indexes Makes it easier to test the feature: ``` $ cargo run -- --experimental-indexes Limbo v0.0.22 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database limbo> CREATE TABLE t(x); limbo> CREATE INDEX t_idx ON t(x); limbo> DROP INDEX t_idx; ```	2025-06-26 10:07:28 +03:00
Nils Koch	2827b86917	chore: fix clippy warnings	2025-06-23 19:52:13 +01:00
pedrocarlo	6596ee28a8	introduce EphemeralTable query destination	2025-06-20 16:30:21 -03:00
pedrocarlo	e53a290a48	move ephemeral table logic to update plan and reuse select logic for ephemeral index	2025-06-20 16:30:21 -03:00
pedrocarlo	9048ad398b	modify loop functions to accommodate ephemeral tables	2025-06-20 16:29:10 -03:00
Pere Diaz Bou	f91d2c5e99	fix disable in write cases	2025-06-17 19:33:23 +02:00
Pere Diaz Bou	b5f2f375b8	disable alter, delete, create index, insert and update for indexes	2025-06-17 19:33:23 +02:00
Levy A.	de2ac89ad2	feat: complete ALTER TABLE implementation	2025-06-11 14:17:36 -03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	d2a287f67f	Add Schema reference to Resolver - needed for adhoc subquery planning	2025-05-27 19:12:47 +03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for 
each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
pedrocarlo	53bf5d5ef5	adjust translate functions to take a program instead of `Option<ProgramBuilder>` + remove any Init emission in translate functions + use epilogue in all places necessary	2025-05-21 16:41:10 -03:00
pedrocarlo	517c7c81cd	refactor to include optional program builder argument	2025-05-21 12:47:51 -03:00
Levy A.	023a116b0d	feat: initial implementation of `ALTER TABLE` only supporting renaming tables	2025-05-08 09:24:56 -03:00
Jussi Saurio	306e097950	Merge 'Fix bug: we cant remove order by terms from the head of the list' from Jussi Saurio we had an incorrect optimization in `eliminate_orderby_like_groupby()` where it could remove e.g. the first term of the ORDER BY if it matched the first GROUP BY term and the result set was naturally ordered by that term. this is invalid. see e.g.: ```sql main branch - BAD: removes the `ORDER BY id` term because the results are naturally ordered by id. However, this results in sorting the entire thing by last name only! limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌──────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├──────┼───────────┼───────────┤ │ 6235 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 8043 │ Zuniga │ 1 │ ├──────┼───────────┼───────────┤ │ 944 │ Zimmerman │ 1 │ └──────┴───────────┴───────────┘ after fix - GOOD: limbo> select id, last_name, count(1) from users GROUP BY 1,2 order by id, last_name desc limit 3; ┌────┬───────────┬───────────┐ │ id │ last_name │ count (1) │ ├────┼───────────┼───────────┤ │ 1 │ Foster │ 1 │ ├────┼───────────┼───────────┤ │ 2 │ Salazar │ 1 │ ├────┼───────────┼───────────┤ │ 3 │ Perry │ 1 │ └────┴───────────┴───────────┘ I also refactored sorters to always use the ast `SortOrder` instead of boolean vectors, and use the `compare_immutable()` utility we use inside btrees too. Closes #1365	2025-05-03 12:48:08 +03:00
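
The row-value rule from commits 31c73f3c9a and 6d4e542522 (`.. SET (a, b) = (1, 2)` expands to per-column assignments, and the last assignment to a repeated column wins) can be sketched roughly as follows. This is only an illustration under stated assumptions: `flatten_set_clauses` is a hypothetical helper, not a function from the Turso codebase, and values are simplified to `i64`.

```rust
/// Hypothetical helper (not part of Turso): flattens row-value SET clauses
/// such as `(a, b) = (1, 2)` into per-column assignments.
/// When a column is assigned more than once, the last value wins.
fn flatten_set_clauses(
    clauses: &[(Vec<&str>, Vec<i64>)],
) -> Result<Vec<(String, i64)>, String> {
    let mut out: Vec<(String, i64)> = Vec::new();
    for (cols, vals) in clauses {
        // A row value must supply exactly one value per column.
        if cols.len() != vals.len() {
            return Err(format!(
                "{} columns assigned {} values",
                cols.len(),
                vals.len()
            ));
        }
        for (col, val) in cols.iter().zip(vals.iter()) {
            match out.iter().position(|(c, _)| c.as_str() == *col) {
                // Column already assigned: overwrite, so the last rhs prevails.
                Some(i) => out[i].1 = *val,
                // First assignment to this column: keep first-seen order.
                None => out.push(((*col).to_string(), *val)),
            }
        }
    }
    Ok(out)
}
```

Under these assumptions, passing the single clause `(a, a) = (1, 2)` yields one assignment, `a = 2`, and a clause whose column and value counts differ is rejected.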

71 Commits
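
Commit 08b2e685d5 describes reading a materialized view as combining the persisted ZSet (rowid, values, weight) with uncommitted transaction data by merging weights. A minimal sketch of that idea follows. It assumes both sides are keyed by rowid in a `BTreeMap`, that values are a single `String`, and that a row is visible only when its net weight is positive. The names `ZSet` and `merge_visible` are hypothetical, and the real cursor performs this merge lazily during seek rather than by cloning the whole set.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a ZSet: rowid -> (values, weight).
type ZSet = BTreeMap<i64, (String, i64)>;

/// Sums the weights of the persisted rows and the transaction-local delta,
/// then returns only the rows whose net weight is positive, in rowid order.
fn merge_visible(persisted: &ZSet, tx_delta: &ZSet) -> Vec<(i64, String)> {
    let mut merged: ZSet = persisted.clone();
    for (rowid, (values, weight)) in tx_delta {
        // Rows that exist only in the transaction start from weight 0.
        let entry = merged
            .entry(*rowid)
            .or_insert_with(|| (values.clone(), 0));
        entry.1 += *weight;
    }
    merged
        .into_iter()
        .filter(|(_, (_, weight))| *weight > 0)
        .map(|(rowid, (values, _))| (rowid, values))
        .collect()
}
```

In this sketch a delta entry with weight -1 cancels a persisted row with weight +1, so a row deleted inside the transaction disappears from the merged view without the persisted data being touched.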