turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-14 04:24:20 +01:00

Author	SHA1	Message	Date
Jussi Saurio	28cd14ff1c	Merge 'Fix labeler' from Jussi Saurio Closes #1538	2025-05-20 16:34:16 +03:00
Jussi Saurio	1dc7518551	Fix labeler: checkout repo and add issues:write perm	2025-05-20 16:32:53 +03:00
Jussi Saurio	c4548b51f1	Merge 'Optimization: lift common subexpressions from OR terms' from Jussi Saurio
```sql
-- This PR does effectively this transformation:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where (
    p_partkey = l_partkey
    and p_brand = 'Brand#22'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 8 and l_quantity <= 8 + 10
    and p_size between 1 and 5
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
) or (
    p_partkey = l_partkey
    and p_brand = 'Brand#23'
    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
    and l_quantity >= 10 and l_quantity <= 10 + 10
    and p_size between 1 and 10
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
) or (
    p_partkey = l_partkey
    and p_brand = 'Brand#12'
    and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
    and l_quantity >= 24 and l_quantity <= 24 + 10
    and p_size between 1 and 15
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
);
-- Same query with common conjuncts (ANDs) extracted:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where p_partkey = l_partkey
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
    and (
        (
            p_brand = 'Brand#22'
            and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            and l_quantity >= 8 and l_quantity <= 8 + 10
            and p_size between 1 and 5
        ) or (
            p_brand = 'Brand#23'
            and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
            and l_quantity >= 10 and l_quantity <= 10 + 10
            and p_size between 1 and 10
        ) or (
            p_brand = 'Brand#12'
            and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
            and l_quantity >= 24 and l_quantity <= 24 + 10
            and p_size between 1 and 15
        )
    );
```
This allows Limbo's optimizer to 1. recognize `p_partkey=l_partkey` as an index constraint on `part`, and 2. filter out `lineitem` rows before joining. With this optimization, Limbo completes TPC-H `19.sql` nearly as fast as SQLite on my machine. Without it, Limbo takes forever. This branch: `939ms` Main: `uh, i started running it a few minutes ago and it hasnt finished, and i dont feel like waiting i guess` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1520	2025-05-20 14:33:49 +03:00
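Note on c4548b51f1: the transformation described above can be sketched in a few lines. This is a minimal illustration, not Limbo's optimizer code. It assumes each OR branch is a flat list of AND-ed conjuncts and models them as plain strings; `Conjuncts`, `conj`, and `lift_common_conjuncts` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Hypothetical simplification: one OR branch is a set of AND-ed conjuncts,
/// modelled as strings instead of real expression trees.
type Conjuncts = BTreeSet<String>;

/// Helper for building a branch from string literals.
fn conj(items: &[&str]) -> Conjuncts {
    items.iter().map(|s| s.to_string()).collect()
}

/// Rewrites `(a AND x) OR (a AND y)` as `a AND (x OR y)`:
/// returns the conjuncts shared by every branch plus the per-branch remainders.
fn lift_common_conjuncts(branches: &[Conjuncts]) -> (Conjuncts, Vec<Conjuncts>) {
    let Some((first, rest)) = branches.split_first() else {
        return (Conjuncts::new(), Vec::new());
    };
    // A conjunct can be lifted only if it appears in every OR branch.
    let common: Conjuncts = first
        .iter()
        .filter(|c| rest.iter().all(|b| b.contains(*c)))
        .cloned()
        .collect();
    // What is left of each branch once the shared conjuncts are removed.
    let remainders = branches
        .iter()
        .map(|b| b.difference(&common).cloned().collect())
        .collect();
    (common, remainders)
}
```

For the three branches of the TPC-H query above, the lifted set would contain `p_partkey = l_partkey`, the `l_shipmode` test, and the `l_shipinstruct` test. A caller would also need to handle a branch whose remainder is empty, since that makes the remaining OR trivially true.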
Jussi Saurio	14058357ad	Merge 'refactor: replace Operation::Subquery with Table::FromClauseSubquery' from Jussi Saurio
Previously the Operation enum consisted of:
- Operation::Scan
- Operation::Search
- Operation::Subquery
Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search.
```
SELECT * FROM (SELECT ...) sub;
-- the subquery here was previously interpreted as Operation::Subquery on a Table::Pseudo,
-- with a lot of special handling for Operation::Subquery in different code paths
-- now it's an Operation::Scan on a Table::FromClauseSubquery
```
No functional changes (intended, at least!) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1529	2025-05-20 14:31:42 +03:00
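Note on 14058357ad: the shape of the refactor can be sketched as below. Only the names `Operation::Scan`, `Operation::Search`, and `Table::FromClauseSubquery` come from the commit message; the `BTree` variant, all fields, and `describe` are hypothetical and are not taken from the codebase.

```rust
/// Pared-down stand-in for the Table enum. Fields are assumptions.
#[derive(Debug)]
enum Table {
    BTree { name: String },
    /// A derived table produced by a subquery in the FROM clause.
    FromClauseSubquery { alias: String },
}

/// After the refactor only two access methods remain.
#[derive(Debug)]
enum Operation {
    Scan,
    Search,
}

/// The kind of source is decided by the table, the access method by the
/// operation, so neither match needs a subquery special case.
fn describe(table: &Table, op: &Operation) -> String {
    let source = match table {
        Table::BTree { name } => format!("b-tree table {name}"),
        Table::FromClauseSubquery { alias } => format!("FROM-clause subquery {alias}"),
    };
    match op {
        Operation::Scan => format!("scan of {source}"),
        Operation::Search => format!("search of {source}"),
    }
}
```

Under this model, `SELECT * FROM (SELECT ...) sub` is a `Scan` over `Table::FromClauseSubquery { alias: "sub" }`.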
Jussi Saurio	63457bda14	Adjust logic not to delete WhereTerms, since 'consumed' property was introduced	2025-05-20 14:28:05 +03:00
Jussi Saurio	6790b7479c	Optimization: lift common subexpressions from OR terms
```sql
-- This PR does effectively this transformation:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where (
    p_partkey = l_partkey
    and p_brand = 'Brand#22'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 8 and l_quantity <= 8 + 10
    and p_size between 1 and 5
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
) or (
    p_partkey = l_partkey
    and p_brand = 'Brand#23'
    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
    and l_quantity >= 10 and l_quantity <= 10 + 10
    and p_size between 1 and 10
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
) or (
    p_partkey = l_partkey
    and p_brand = 'Brand#12'
    and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
    and l_quantity >= 24 and l_quantity <= 24 + 10
    and p_size between 1 and 15
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
);
-- Same query with common conjuncts (ANDs) extracted:
select sum(l_extendedprice* (1 - l_discount)) as revenue
from lineitem, part
where p_partkey = l_partkey
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
    and (
        (
            p_brand = 'Brand#22'
            and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            and l_quantity >= 8 and l_quantity <= 8 + 10
            and p_size between 1 and 5
        ) or (
            p_brand = 'Brand#23'
            and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
            and l_quantity >= 10 and l_quantity <= 10 + 10
            and p_size between 1 and 10
        ) or (
            p_brand = 'Brand#12'
            and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
            and l_quantity >= 24 and l_quantity <= 24 + 10
            and p_size between 1 and 15
        )
    );
```
	2025-05-20 14:25:15 +03:00
Jussi Saurio	9d3aca6e8f	Fix compile error after merge	2025-05-20 14:19:32 +03:00
Jussi Saurio	57d8f20135	Merge 'Add collation column to Index struct' from Jussi Saurio Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1532	2025-05-20 14:18:17 +03:00
Pekka Enberg	e102cd0be5	Merge 'Add support for DISTINCT aggregate functions' from Jussi Saurio Reviewable commit by commit. CI failures are not related. Adds support for e.g. `select first_name, sum(distinct age), count(distinct age), avg(distinct age) from users group by 1` Implementation details: - Creates an ephemeral index per distinct aggregate, and jumps over the accumulation step if a duplicate is found Closes #1507	2025-05-20 13:58:57 +03:00
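Note on e102cd0be5: the "skip accumulation on duplicate" idea can be sketched as follows. This is an illustration for a single group only. A `HashSet` stands in for the ephemeral index, and `sum_distinct` and `count_distinct` are hypothetical helpers, not functions from the codebase.

```rust
use std::collections::HashSet;

/// sum(DISTINCT x) for one group; the HashSet plays the role of the
/// ephemeral index that records values already seen.
fn sum_distinct(values: &[i64]) -> i64 {
    let mut seen = HashSet::new();
    let mut sum = 0;
    for &v in values {
        // `insert` returns false for a duplicate, so we jump over accumulation.
        if !seen.insert(v) {
            continue;
        }
        sum += v;
    }
    sum
}

/// count(DISTINCT x) for one group, using the same duplicate check.
fn count_distinct(values: &[i64]) -> usize {
    let mut seen = HashSet::new();
    values.iter().filter(|v| seen.insert(**v)).count()
}
```

Each distinct aggregate keeps its own set here, mirroring the "one ephemeral index per distinct aggregate" description in the message.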
Jussi Saurio	3121c6cdd3	Replace Operation::Subquery with Table::FromClauseSubquery
Previously the Operation enum consisted of:
- Operation::Scan
- Operation::Search
- Operation::Subquery
Which was always a dumb hack because what we really are doing is an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...) derived from a subquery appearing in the FROM clause. Hence, refactor the relevant data structures so that the Table enum now contains a new variant: Table::FromClauseSubquery And the Operation enum only consists of Scan and Search. No functional changes (intended, at least!)	2025-05-20 12:56:30 +03:00
Jussi Saurio	9c710b5292	Add collation column to Index struct	2025-05-20 12:52:54 +03:00
Jussi Saurio	32aac8e9ef	Merge 'Feature: Collate' from Pedro Muniz I was implementing `ALTER TABLE .. RENAME TO`, and I noticed that `COLLATE` was necessary for it to work. This is a relatively big PR as to properly implement `COLLATE`, I needed to add a field to a couple of instructions that are emitted frequently, and there is a lot of boilerplate that is required when you do such a change. My main source of reference was this site from SQLite: https://sqlite.org/datatype3.html#collation. It gives a good description of the precedence of collation in certain expressions. I did write a couple of tests that I thought caught the edge cases of `COLLATE`, but honestly, I may have missed a few. I would appreciate some help later to write more tests. `Collate` basically just compares two `TEXT` values according to some comparison function. If both values are not `TEXT`, just fall back to the normal comparison we are already doing. `Collate` happens in four main places: - `Collate` Expression modifier - `Binary` Expression - `Column` Expression - `Order By` and `Group By` In `Binary`, `Order By`, `Group By` expressions, the collation sequence for the comparisons can be derived explicitly with the use of the `COLLATE` keyword, or implicitly if there is a `COLLATE` definition in `CREATE TABLE`. If neither is present it defaults to `Binary` collation. For the `Column` expression, it tries to use collation in `CREATE TABLE` column definition. If not present it defaults to `Binary` collation. Lastly, there was some repetition on how the `Binary` expression was being translated, so I removed that part. As mentioned in the `COMPAT.md`, I did not implement custom collation sequences yet, as it would deter me from properly implementing. I have some ideas of how I can extend my current implementation to support that with FFI, but I think that is best served for a different PR. Closes #1367	2025-05-20 10:52:11 +03:00
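Note on 32aac8e9ef: a minimal sketch of text comparison under SQLite's three built-in collations and of the precedence described above (explicit `COLLATE`, then the column definition, then `Binary`). The enum name `CollationSeq` appears in commit 510c70e919, but the variant names, `compare_text`, and `resolve_collation` are assumptions for illustration.

```rust
use std::cmp::Ordering;

/// Assumed variants, matching SQLite's built-in BINARY, NOCASE and RTRIM.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
enum CollationSeq {
    #[default]
    Binary,
    NoCase,
    Rtrim,
}

/// Compares two TEXT values under the given collation.
fn compare_text(lhs: &str, rhs: &str, collation: CollationSeq) -> Ordering {
    match collation {
        // Plain byte-wise comparison.
        CollationSeq::Binary => lhs.as_bytes().cmp(rhs.as_bytes()),
        // Only ASCII letters are case-folded, as in SQLite's NOCASE.
        CollationSeq::NoCase => lhs.to_ascii_lowercase().cmp(&rhs.to_ascii_lowercase()),
        // Like Binary, but trailing spaces are ignored.
        CollationSeq::Rtrim => lhs.trim_end_matches(' ').cmp(rhs.trim_end_matches(' ')),
    }
}

/// An explicit COLLATE wins over the column's declared collation;
/// with neither present the default (Binary) is used.
fn resolve_collation(
    explicit: Option<CollationSeq>,
    column: Option<CollationSeq>,
) -> CollationSeq {
    explicit.or(column).unwrap_or_default()
}
```

Non-`TEXT` operands are outside this sketch; per the message they fall back to the normal comparison.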
pedrocarlo	52533cab40	only pass collations for index in cursor + adhere to order of columns in index	2025-05-19 15:22:55 -03:00
pedrocarlo	22b6b88f68	fix rebase type errors	2025-05-19 15:22:55 -03:00
pedrocarlo	819fd0f496	use any error method instead, as limbo and sqlite error message differ slightly	2025-05-19 15:22:55 -03:00
pedrocarlo	5b15d6aa32	Get the table correctly from the connection instead of table_references + test to confirm unique constraint	2025-05-19 15:22:55 -03:00
pedrocarlo	4a3119786e	refactor BtreeCursor and Sorter to accept Vec of collations	2025-05-19 15:22:55 -03:00
pedrocarlo	f28ce2b757	add collations to btree cursor	2025-05-19 15:22:55 -03:00
pedrocarlo	5bd47d7462	post rebase adjustments to accommodate new instructions that were created before the merge conflicts	2025-05-19 15:22:15 -03:00
pedrocarlo	cc86c789d6	Correct Rtrim	2025-05-19 15:22:15 -03:00
pedrocarlo	6d7a73fd60	More tests	2025-05-19 15:22:15 -03:00
pedrocarlo	bf1fe9e0b3	Actually fixed group by and order by collation	2025-05-19 15:22:15 -03:00
pedrocarlo	0df6c87f07	Fixed Group By collation	2025-05-19 15:22:14 -03:00
pedrocarlo	bba9689674	Fixed matching bug for defining collation context to use	2025-05-19 15:22:14 -03:00
pedrocarlo	a818b6924c	Removed repeated binary expression translation. Adjusted the set_collation to capture additional context of whether it was set by a Collate expression or not. Added some tests to prove those modifications were necessary.	2025-05-19 15:22:14 -03:00
pedrocarlo	f8854f180a	Added collation to create table columns	2025-05-19 15:22:14 -03:00
pedrocarlo	d0a63429a6	Naive implementation of collate for queries. Not implemented for column constraints	2025-05-19 15:22:14 -03:00
pedrocarlo	b5b1010e7c	set binary collation as default	2025-05-19 15:22:14 -03:00
pedrocarlo	510c70e919	Create CollationSeq enum and functions. Move strum to workspace dependency to avoid version mismatch with Parser	2025-05-19 15:22:14 -03:00
Pekka Enberg	4cf9305947	Merge 'bindings/javascript: Add Statement.iterate() method' from Diego Reis I still didn't find a good way to implement variadic functions, we should have some sort of wrapper in JS layer but it didn't work so well for me so far. But once done it will be easily transferable to any function. It also should probably be async, but AFAIC napi doesn't have a straight way to implement async iterators. Closes #1515	2025-05-19 20:44:40 +03:00
Pekka Enberg	95ea92faca	Merge 'Improve debug build validation speed' from Pere Diaz Bou Various things to improve speed of long fuzz test execution time: * remove unnecessary debug_validate_cell calls * Add SortedVec for keys in fuzz tests * Validate btree's depth in fuzz test every 1K inserts to not overload test with validations. We add `VALIDATE_BTREE` env variable to enable validation on every insert in case it is needed. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1521	2025-05-19 20:42:48 +03:00
Pekka Enberg	5bd85774cf	Merge 'Update README.md' from Yusheng Guo A syntax error. Closes #1522	2025-05-19 20:42:25 +03:00
Yusheng Guo	810beeea93	Update README.md A syntax error.	2025-05-19 18:29:57 +08:00
Jussi Saurio	d2b1be8af7	Merge 'optimizer: fix order by removal logic' from Jussi Saurio 1. `group_by_contains_all` was incorrect - it was not checking that all order by columns are in group by; it was instead checking that all group by columns are in order by, which is absolutely incorrect for the intended purpose. 2. remove ORDER BY clause if GROUP BY clause can sort the rows in the same way. Test failures are not related Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1511	2025-05-19 11:29:17 +03:00
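Note on d2b1be8af7: the direction of the containment check is the whole bug, so a small sketch may help. The function name comes from the commit message; the signature over column-name slices is an assumption made for illustration.

```rust
/// Returns true when every ORDER BY column also appears in GROUP BY.
/// Iterating over `group_by` and testing against `order_by` would be the
/// reversed check that the message describes as incorrect.
fn group_by_contains_all(group_by: &[&str], order_by: &[&str]) -> bool {
    order_by.iter().all(|col| group_by.contains(col))
}
```

With `GROUP BY a, b ORDER BY a` the check passes; with `GROUP BY a ORDER BY a, b` it fails, because `b` is not a grouping column.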
Jussi Saurio	b7b4f6a390	Merge 'Mark WHERE terms as consumed instead of deleting them' from Jussi Saurio We've run into trouble in multiple places due to the fact that we delete terms from the where clause (e.g. when a constant condition is removed, or the term becomes part of an index seek key). A simpler solution is to add a flag indicating that the term is consumed (used), so that it is not translated in the main loop anymore when WHERE clause terms are evaluated. note: CI failures are unrelated Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1477	2025-05-19 11:28:09 +03:00
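Note on b7b4f6a390: a minimal sketch of marking a term as consumed instead of deleting it. The `consumed` flag is named in the log; `WhereTerm`'s other field and both helper functions are hypothetical.

```rust
/// Simplified WHERE term; the expression is a string for illustration.
struct WhereTerm {
    expr: String,
    consumed: bool,
}

/// Marks a term as used (for example as an index seek key) without
/// removing it, so the positions of the other terms stay stable.
fn consume(terms: &mut [WhereTerm], index: usize) {
    terms[index].consumed = true;
}

/// The main loop translates only the terms that are still unconsumed.
fn terms_to_evaluate(terms: &[WhereTerm]) -> Vec<&str> {
    terms
        .iter()
        .filter(|t| !t.consumed)
        .map(|t| t.expr.as_str())
        .collect()
}
```

Because nothing is removed, any index into the term list that was computed earlier remains valid after a term is consumed.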
Pere Diaz Bou	f2d0d61962	copilot nice suggestions :)	2025-05-19 09:59:28 +02:00
Pere Diaz Bou	5eab588115	improve debug build validation speed Various things: * remove unnecessary debug_validate_cell calls * Add SortedVec for keys in fuzz tests * Validate btree's depth in fuzz test every 1K inserts to not overload test with validations. We add `VALIDATE_BTREE` env variable to enable validation on every insert in case it is needed.	2025-05-19 09:53:15 +02:00
Jussi Saurio	092462fa74	fix build	2025-05-19 07:29:02 +03:00
Jussi Saurio	7c6a4410d2	Merge '(btree): Implement support for handling offset-based payload access with overflow support' from Krishna Vishal This PR adds a new function `read_write_payload_with_offset` to support reading and writing payload data at specific offsets, handling both local content and overflow pages. This is a port of SQLite's `accessPayload` function in `btree.c` and will be essential for supporting incremental blob I/O in the coming PRs. - Added a state machine called `PayloadOverflowWithOffset` to make the procedure reentrant. - Correctly processes both local payload data and payload stored in overflow pages Testing: - Reading and writing to a column with no overflow pages. - Reading and writing at an offset with overflow pages (spanning 10 pages) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1476	2025-05-18 22:58:10 +03:00
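Note on 7c6a4410d2: the read half of offset-based payload access can be sketched as below. This is a heavily simplified model, not the `read_write_payload_with_offset` function itself: overflow pages are given as in-memory byte vectors instead of a linked chain of pages, there is no reentrant state machine, and `read_payload_at` is a hypothetical name.

```rust
/// Reads `amount` bytes starting at `offset` from a payload whose first part
/// is stored locally and whose remainder spills into overflow pages.
/// Returns None if the requested range runs past the end of the payload.
fn read_payload_at(
    local: &[u8],
    overflow_pages: &[Vec<u8>],
    mut offset: usize,
    amount: usize,
) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(amount);
    let chunks = std::iter::once(local).chain(overflow_pages.iter().map(|p| p.as_slice()));
    for chunk in chunks {
        if out.len() == amount {
            break;
        }
        // Skip whole chunks that lie entirely before the requested offset.
        if offset >= chunk.len() {
            offset -= chunk.len();
            continue;
        }
        // Copy as much as this chunk can contribute, then continue at the
        // start of the next chunk.
        let take = (amount - out.len()).min(chunk.len() - offset);
        out.extend_from_slice(&chunk[offset..offset + take]);
        offset = 0;
    }
    (out.len() == amount).then_some(out)
}
```

A write path would walk the chunks the same way and copy in the other direction.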
Jussi Saurio	3185aabd20	Merge 'Cli config 2' from Pedro Muniz As there were many merge conflicts for the other PR, I rewrote the code and condensed it here. ORIGINAL PR TEXT: Provides the code to almost close https://github.com/tursodatabase/limbo/issues/1251 . The JsonSchema is derived, but I am still not sure how to automate the distribution to SchemaStore for autocomplete. I added some docs for those who want to see the config file description. I still am not sure how to automate this documentation. Maybe some macro magic? Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1430	2025-05-18 22:56:22 +03:00
Jussi Saurio	372850756d	Merge 'Fix updating single value' from Pedro Muniz Closes #1482. I needed to change the `key_exists_in_index` function because it zips the values from the records it is comparing, but if one of the records is empty or not of the same length, the `all` function could return true incorrectly. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1514	2025-05-18 22:51:11 +03:00
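Note on 372850756d: the `zip` plus `all` pitfall mentioned above is easy to reproduce. The sketch below uses plain integer slices in place of records, and both function names are hypothetical. The length check is one possible guard, not necessarily the change made in the PR.

```rust
/// `zip` stops at the shorter input, and `all` on an empty iterator is true,
/// so this reports a match whenever one side is empty or a prefix of the other.
fn prefix_equal_buggy(a: &[i64], b: &[i64]) -> bool {
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}

/// Comparing lengths first rules out the vacuous and prefix-only matches.
fn records_equal(a: &[i64], b: &[i64]) -> bool {
    a.len() == b.len() && a.iter().zip(b.iter()).all(|(x, y)| x == y)
}
```

For example, an empty record compared against `[1, 2]` is a match for the first function and a mismatch for the second.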
pedrocarlo	fd51c0a970	invalidate records not necessary for fix	2025-05-18 16:43:25 -03:00
Jussi Saurio	071940f9a7	Merge 'Autoindex fix' from Pedro Muniz Closes #1508. There were two small issues to fix: 1. We were not checking whether the unique column name declared in the composite declaration exists in the IndexMap of columns. This resolved the first issue, seen with the statement `create table t4(a, unique(b));`. 2. We forgot to add the column_name to the HashSet of columns.
```rust
Some(PrimaryKeyDefinitionType::Simple { column, .. }) => {
    let mut columns = HashSet::new();
    columns.insert(std::mem::take(column));
    // Have to also insert the current column_name we are iterating over in primary_key_column_results
    columns.insert(column_name.clone()); // <-- Fix here
    primary_key_definition = Some(PrimaryKeyDefinitionType::Composite { columns });
}
```
The rest of the modifications are just some small simplifications for readability and avoiding some clones Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1512	2025-05-18 22:41:02 +03:00
pedrocarlo	c8b768f1ea	add tests	2025-05-18 12:43:11 -03:00
pedrocarlo	7f081c1ac9	remove transmute. Just iterate over columns. No need for unsafe	2025-05-18 12:32:49 -03:00
Jussi Saurio	06c4a5dea9	Merge 'use temporary db in sqlite3 wal tests to fix later tests failing' from Preston Thorpe This prevents the new wal checkpoint tests in `sqlite3/tests/compat` from writing/creating `test` table to `testing/testing.db`, which is queried in later tests which fail for having an extra table. There is another issue with failing tests related to the new `count` impl that I am in the process of fixing as well, but that will be a separate PR. Closes #1513	2025-05-18 09:48:23 +03:00
Diego Reis	bc88b7cb65	bind/js: Formatting	2025-05-18 00:51:49 -03:00
Diego Reis	9f6e242e42	bind/js: Partially implements iterate() method The API still is sync and isn't variadic	2025-05-18 00:51:23 -03:00
pedrocarlo	af1f9492ef	fix updating single value	2025-05-17 19:43:24 -03:00
PThorpe92	6d70e6d048	Add reset db to Makefile to create clean testing db between tests that perform writes	2025-05-17 16:23:17 -04:00


4599 Commits