turso

mirror of https://github.com/aljazceru/turso.git synced 2025-12-28 05:24:22 +01:00

Author	SHA1	Message	Date
Preston Thorpe	b09dcceeef	Merge 'Fixes views' from Glauber Costa This is a collection of fixes for materialized views ahead of adding support for JOINs. It is mostly issues with how we assume there is a single table, with a single delta, but we have to send more than one. Those are things that are just objectively wrong, so I am sending it separately to make the JOIN PR smaller. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #3009	2025-09-12 07:43:32 -04:00
Pekka Enberg	d80814fa2c	core/schema: Optimize get_dependent_materialized_views() when no views Eliminates get_dependent_materialized_views() overhead when there are no views. Note that we need to optimize the case when there are views as well because this ends up being pretty hot in write-intensive workloads.	2025-09-12 07:22:18 +03:00
Glauber Costa	8997670936	include dbsp tables in the list of tables that cannot be modified	2025-09-11 05:30:46 -07:00
Jussi Saurio	e3bd00883b	Fix creation of automatic indexes indexes with the naming scheme "sqlite_autoindex_<tblname>_<number>" are automatically created when a table is created with UNIQUE or PRIMARY KEY definitions. these indexes must map to the table definition SQL in definition order, i.e. sqlite_autoindex_foo_1 must be the first instance of UNIQUE or PRIMARY KEY and so on. this commit fixes our autoindex creation / parsing so that this invariant is upheld.	2025-09-11 14:11:30 +03:00
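The naming invariant described in the commit above can be observed in stock SQLite. The following is a minimal sketch using Python's built-in `sqlite3` module, not Turso itself; the table `foo`, its columns, and the helper `autoindexes` are made up for illustration.

```python
import sqlite3


def autoindexes(create_sql, table):
    """Return {index_name: [column, ...]} for the indexes on `table`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND tbl_name = ? ORDER BY name",
            (table,),
        )
    ]
    result = {}
    for name in names:
        # PRAGMA index_info rows are (seqno, cid, column_name).
        result[name] = [r[2] for r in conn.execute(f'PRAGMA index_info("{name}")')]
    conn.close()
    return result


# Hypothetical table: a PRIMARY KEY first, then a UNIQUE column.
mapping = autoindexes(
    "CREATE TABLE foo (a TEXT PRIMARY KEY, b TEXT UNIQUE, c TEXT)", "foo"
)
for name, cols in mapping.items():
    print(name, cols)
```

The numeric suffix follows the order in which the constraints appear in the table definition, which is the property the commit says Turso's autoindex creation and parsing must uphold.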
Jussi Saurio	f17997fc5d	Extract methods for populating indices/views from schema	2025-09-11 09:51:46 +03:00
Jussi Saurio	07944e23b5	Extract common logic for handling sqlite_schema rows	2025-09-11 09:45:40 +03:00
Jussi Saurio	a0613ef781	Avoid allocating and then immediately fallbacking errors in affinity On the syscall IO backend, on TPC-H query 12, the _dominating_ part of the stack trace is trying to construct affinities from a character, failing, allocating an error&string, and then immediately falling back to Blob affinity and dropping the error&string. Since I'm on vacation I won't spend cycles on figuring out why we are passing an incorrect affinity in `flags.get_affinity()` and instead make this lazy PR just to improve performance and stop doing silly things :]	2025-09-05 18:34:23 +03:00
Glauber Costa	08b2e685d5	Persistence for DBSP-based materialized views This fairly long commit implements persistence for materialized view. It is hard to split because of all the interdependencies between components, so it is a one big thing. This commit message will at least try to go into details about the basic architecture. Materialized Views as tables ============================ Materialized views are now a normal table - whereas before they were a virtual table. By making a materialized view a table, we can reuse all the infrastructure for dealing with tables (cursors, etc). One of the advantages of doing this is that we can create indexes on view columns. Later, we should also be able to write those views to separate files with ATTACH write. Materialized Views as Zsets =========================== The contents of the table are a ZSet: rowid, values, weight. Readers will notice that because of this, the usage of the ZSet data structure dwindles throughout the codebase. The main difference between our materialized ZSet and the standard DBSP ZSet, is that obviously ours is backed by a BTree, not a Hash (since SQLite tables are BTrees) Aggregator State ================ In DBSP, the aggregator nodes also have state. To store that state, there is a second table. The table holds all aggregators in the view, and there is one table per view. That is __turso_internal_dbsp_state_{view_name}. The format of that table is similar to a ZSet: rowid, serialized_values, weight. We serialize the values because there will be many aggregators in the table. We can't rely on a particular format for the values. The Materialized View Cursor ============================ Reading from a Materialized View essentially means reading from the persisted ZSet, and enhancing that with data that exists within the transaction. Transaction data is ephemeral, so we do not materialize this anywhere: we have a carefully crafted implementation of seek that takes care of merging weights and stitching the two sets together.	2025-09-05 07:04:33 -05:00
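The idea of merging weights between the persisted ZSet and uncommitted transaction data can be sketched as below. This is an eager, in-memory illustration only: the commit describes a seek-based cursor over a BTree, which this does not reproduce. The function `merge_zsets`, the tuple layout, and the rule that a row is visible when its net weight is positive are assumptions made for the example.

```python
from collections import defaultdict


def merge_zsets(persisted, tx_delta):
    """Combine two lists of (rowid, values, weight) entries.

    Weights of identical (rowid, values) pairs are summed; only entries
    whose net weight is positive are treated as visible (an assumption).
    """
    weights = defaultdict(int)
    for rowid, values, weight in persisted:
        weights[(rowid, values)] += weight
    for rowid, values, weight in tx_delta:
        weights[(rowid, values)] += weight
    return sorted(
        (rowid, values, w) for (rowid, values), w in weights.items() if w > 0
    )


# Hypothetical data: the transaction updates row 2 and inserts row 3.
persisted = [(1, ("alice", 30), 1), (2, ("bob", 25), 1)]
tx_delta = [
    (2, ("bob", 25), -1),  # retract the old version of row 2
    (2, ("bob", 26), 1),   # insert the new version of row 2
    (3, ("carol", 41), 1),
]
visible = merge_zsets(persisted, tx_delta)
```

An update appears as a retraction (weight -1) plus an insertion (weight +1), so the old version of row 2 nets to zero and drops out of the merged view.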
Preston Thorpe	2ea2be6f85	Merge 'prevent modification to system tables.' from Glauber Costa SQLite does not allow us to modify system tables, but we do. Let's fix it. Reviewed-by: Preston Thorpe <preston@turso.tech> Reviewed-by: Avinash Sajjanshetty (@avinassh) Closes #2855	2025-09-04 19:57:04 -04:00
Glauber Costa	032eabb3a4	prevent modification to system tables. SQLite does not allow us to modify system tables, but we do. Let's fix it.	2025-09-04 17:34:47 -05:00
TcMits	bfff05faba	merge main	2025-09-02 18:25:20 +07:00
TcMits	6e87b08d64	faster type_from_name	2025-09-01 14:38:38 +07:00
TcMits	37f33dc45f	add eq/contains/starts_with/ends_with_ignore_ascii_case	2025-08-31 16:18:42 +07:00
Levy A.	5b378e3730	feat: add `AlterColumn` instruction also refactor `RenameColumn` to reuse the logic from `AlterColumn`	2025-08-30 03:10:39 -03:00
Pere Diaz Bou	48e5ad7a55	core/schema: get_dependent_materialized_views_unnormalized If we get a table name for in memory structure, it's safe to assume it's already normalized.	2025-08-28 13:11:40 +02:00
themixednuts	80eca66be9	fix: normalize quotes in update fixes: #2744	2025-08-23 03:17:03 -05:00
Levy A.	07975603d3	fix: incorrect sql statement in parser test	2025-08-21 15:24:01 -03:00
Levy A.	4ba1304fb9	complete parser integration	2025-08-21 15:23:59 -03:00
Levy A.	186e2f5d8e	switch to new parser	2025-08-21 15:19:16 -03:00
Jussi Saurio	215485d403	Add Table::get_column_by_name method	2025-08-21 16:40:10 +03:00
rajajisai	89cd3fe196	`notnull` is now set based on the `nullable` field instead of being hardcoded.	2025-08-19 21:49:04 -04:00
Glauber Costa	03eeabef18	fix pragma table_info for views We were not generating table_info for views. This PR fixes it. We were so far storing columns as strings with just their names - since this is all we needed - but we will move now to store Columns. We need to convert the names to Column anyway for table_info to work.	2025-08-16 08:03:57 -05:00
Glauber Costa	5ab6f78f6b	Implement views Views (non materialized) are relatively simple, since they are just query aliases. We can expand them as if they were subqueries.	2025-08-13 14:14:03 -05:00
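The "query alias" behavior of non-materialized views can be checked against stock SQLite. This sketch uses Python's built-in `sqlite3` module rather than Turso; the `users` table and the `adults` view are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO users VALUES (1, 'alice', 30), (2, 'bob', 17), (3, 'carol', 41);
    CREATE VIEW adults AS SELECT id, name FROM users WHERE age >= 18;
    """
)

# Query through the view.
via_view = conn.execute("SELECT name FROM adults ORDER BY id").fetchall()

# Same query with the view's definition expanded inline as a subquery.
via_subquery = conn.execute(
    "SELECT name FROM (SELECT id, name FROM users WHERE age >= 18) AS adults "
    "ORDER BY id"
).fetchall()

print(via_view == via_subquery)
conn.close()
```

Both queries return the same rows, which is why a planner can treat a view reference as if the subquery had been written in its place.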
Glauber Costa	337f27a433	rename some structures to mention materialized views A lot of the structures we have - like the ones under Schema, are specific for materialized views. In preparation to adding normal views, rename them, so things are less confusing.	2025-08-13 14:13:16 -05:00
pedrocarlo	85e86d427b	cleanups - use io.block in many functions and `return_if_io`	2025-08-13 08:32:38 +03:00
Glauber Costa	145d6eede7	Implement very basic views using DBSP This is just the bare minimum that I needed to convince myself that this approach will work. The only views that we support are slices of the main table: no aggregations, no joins, no projections. drop view is implemented. view population is implemented. deletes, inserts and updates are implemented. much like indexes before, a flag must be passed to enable views.	2025-08-10 23:34:04 -05:00
Pere Diaz Bou	752a876f9a	change every Rc to Arc in schema internals	2025-07-28 10:51:17 +02:00
Pere Diaz Bou	d273de483f	comment clone for schema	2025-07-28 10:50:50 +02:00
Pere Diaz Bou	6ec80b3364	clone everything in schema	2025-07-28 10:27:45 +02:00
Pekka Enberg	fd2a7f9098	core: Switch to unreachable for invalid enum variants The parser unfortunately outputs Stmt, which has some enum variants that we never actually encounter in some parts of the core. Switch to unreachable instead of todo.	2025-07-28 09:52:20 +03:00
Pekka Enberg	669b231714	Merge 'parser: Distinguish quoted identifiers and unify Id into Name enum' from bit-aloo Closes: #1947 This PR replaces the `Name(pub String)` struct with a `Name` enum that explicitly models how the name appeared in the source either as an unquoted identifier (`Ident`) or a quoted string (`Quoted`). In the process, the separate `Id` wrapper type has been coalesced into the `Name` enum, simplifying the AST and reducing duplication in identifier handling logic. While this increases the size of some AST nodes (notably `yyStackEntry`). cc: @levydsa Reviewed-by: Levy A. (@levydsa) Reviewed-by: Preston Thorpe (@PThorpe92) Closes #2251	2025-07-25 12:08:54 +03:00
meteorgan	c48a5ef538	we don't need read_tx return IOResult anymore	2025-07-24 23:19:33 +08:00
bit-aloo	9a54ef214e	parser: Distinguish quoted identifiers and unify Id into Name enum This commit replaces the `Name(pub String)` struct with a `Name` enum that explicitly models how the name appeared in the source either as an unquoted identifier (`Ident`) or a quoted string (`Quoted`). In the process, the separate `Id` wrapper type has been coalesced into the `Name` enum, simplifying the AST and reducing duplication in identifier handling logic. While this increases the size of some AST nodes (notably `yyStackEntry`), it improves correctness and makes source structure more explicit for later phases.	2025-07-24 14:40:19 +05:30
Jussi Saurio	022f679fab	chore: make every CREATE TABLE stmt in entire repo have 1 space after tbl name `BTreeTable::to_sql` makes us incompatible with SQLite by losing e.g. the original whitespace provided during the CREATE TABLE command. For now let's fix our tests by regex-replacing every CREATE TABLE in the entire repo to have exactly 1 space after the table name in the CREATE TABLE statement.	2025-07-22 11:35:21 +03:00
Jussi Saurio	13d40c6a73	schema: fix extra whitespace in BTreeTable::from_sql	2025-07-22 11:11:08 +03:00
Nils Koch	05a9acf8c5	wrap special column names with [] in BTreeTable to_sql	2025-07-20 21:20:59 +01:00
Levy A.	0ea7849dca	feat: IOExt utility trait	2025-07-19 01:40:42 -03:00
pedrocarlo	9690eb41c2	`make_from_btree` should wait for IO to complete if we do not want to use a state machine	2025-07-17 15:34:42 -03:00
Diego Reis	d0af54ae77	refactor: Change CursorResult to IOResult The reasoning here is to treat I/O operations (Either is "Done" or yields to IO) with the same generic type.	2025-07-15 20:52:25 -03:00
Jussi Saurio	beaf393476	Merge 'Treat table-valued functions as tables' from Piotr Rżysko First step toward resolving https://github.com/tursodatabase/limbo/issues/1643. ### This PR With this change, the following two queries are considered equivalent: ```sql SELECT value FROM generate_series(5, 50); SELECT value FROM generate_series WHERE start = 5 AND stop = 50; ``` Arguments passed in parentheses to the virtual table name are now matched to hidden columns. Additionally, I fixed two bugs related to virtual tables. ### TODO (I'll handle this in a separate PR) Column references are still not supported as table-valued function arguments. The only difference is that previously, a query like: ```sql SELECT one.value, series.value FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series; ``` would cause a panic. Now, it returns a proper error message instead. Adding support for column references is more nuanced for two main reasons: * We need to ensure that in joins where a TVF depends on other tables, those other tables are processed first. For example, in: ```sql SELECT one.value, series.value FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one; ``` the one table must be processed by the top-level loop, and series must be nested. * For outer joins involving TVFs, the arguments must be treated as `ON` predicates, not `WHERE` predicates. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1727	2025-07-15 12:23:45 +03:00
Piotr Rzysko	30ae6538ee	Treat table-valued functions as tables With this change, the following two queries are considered equivalent: ```sql SELECT value FROM generate_series(5, 50); SELECT value FROM generate_series WHERE start = 5 AND stop = 50; ``` Arguments passed in parentheses to the virtual table name are now matched to hidden columns. Column references are still not supported as table-valued function arguments. The only difference is that previously, a query like: ```sql SELECT one.value, series.value FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series; ``` would cause a panic. Now, it returns a proper error message instead. Adding support for column references is more nuanced for two main reasons: - We need to ensure that in joins where a TVF depends on other tables, those other tables are processed first. For example, in: ```sql SELECT one.value, series.value FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one; ``` the one table must be processed by the top-level loop, and series must be nested. - For outer joins involving TVFs, the arguments must be treated as ON predicates, not WHERE predicates.	2025-07-14 07:16:53 +02:00
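The matching of parenthesized arguments to hidden columns can be sketched as a simple positional binding. The helpers `bind_tvf_args` and `as_where_clause` are hypothetical and are not Turso's implementation; the hidden column order `start, stop, step` is taken from SQLite's `generate_series`.

```python
def bind_tvf_args(hidden_columns, args):
    """Pair positional arguments with hidden columns, in declaration order."""
    if len(args) > len(hidden_columns):
        raise ValueError("too many arguments for table-valued function")
    # Trailing hidden columns without an argument are left unconstrained.
    return list(zip(hidden_columns, args))


def as_where_clause(constraints):
    """Render the bound arguments as equality predicates (literals only)."""
    return " AND ".join(f"{col} = {val!r}" for col, val in constraints)


# generate_series(5, 50) binds start = 5 and stop = 50; step is unbound.
constraints = bind_tvf_args(("start", "stop", "step"), (5, 50))
where = as_where_clause(constraints)
print(f"SELECT value FROM generate_series WHERE {where};")
```

This covers only literal arguments. As the commit notes, column references as arguments need join-ordering support, and in outer joins the arguments would have to become `ON` predicates rather than `WHERE` predicates.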
Piotr Rzysko	000d70f1f3	Propagate info about hidden columns	2025-07-14 07:16:53 +02:00
Krishna Vishal	a79fe458db	Fix merge conflicts and adapt schema.rs to use `RecordCursor`	2025-07-14 03:28:55 +05:30
Jussi Saurio	a48b6d049a	Another post-rebase clippy round with 1.88.0	2025-07-12 19:10:56 +03:00
Levy A.	a1e418c999	fix tests	2025-07-11 15:04:28 -03:00
Levy A.	b1341113d7	clippy	2025-07-11 15:04:28 -03:00
Levy A.	b008c787b7	faster type substr comparison	2025-07-11 15:04:28 -03:00
Levy A.	c300a01120	fix: add space between column name and type	2025-07-11 15:04:28 -03:00
Levy A.	cc17211189	direct btree calls	2025-07-11 15:04:28 -03:00
Levy A.	c145577bce	fix: use `ty_str` for SQL conversion	2025-07-11 15:04:28 -03:00


192 Commits