turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-15 13:04:20 +01:00

Author	SHA1	Message	Date
jussisaurio	dddf850111	Merge 'Update clap to 4.5' from Pekka Enberg The Github dependabot complains about anstream, which comes through `clap`: https://github.com/tursodatabase/limbo/security/dependabot/8 Closes #444	2024-12-11 17:08:38 +02:00
jussisaurio	eb9374aebf	Merge 'Add support for CASE expressions.' from Alex Miller There's two forms of case: CASE (WHEN [bool expr] THEN [value])+ (ELSE [value])? END which checks a series of boolean conditions, and: CASE expr (WHEN [expr] THEN [value])+ (ELSE [value])? END Which checks a series of equality conditions. This implements support for both. Note that the ELSE is optional, and will be equivalent to `ELSE null` if not specified. sqlite3 gives the implementation as: ``` sqlite> explain select case a WHEN a THEN b WHEN c THEN d ELSE 0 END from casetest; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 OpenRead 0 3 0 4 0 root=3 iDb=0; casetest 2 Rewind 0 15 0 0 3 Column 0 0 2 0 r[2]= cursor 0 column 0 4 Column 0 0 3 0 r[3]= cursor 0 column 0 5 Ne 3 8 2 BINARY-8 83 if r[2]!=r[3] goto 8 6 Column 0 1 1 0 r[1]= cursor 0 column 1 7 Goto 0 13 0 0 8 Column 0 2 3 0 r[3]= cursor 0 column 2 9 Ne 3 12 2 BINARY-8 83 if r[2]!=r[3] goto 12 10 Column 0 3 1 0 r[1]= cursor 0 column 3 11 Goto 0 13 0 0 12 Integer 0 1 0 0 r[1]=0 13 ResultRow 1 1 0 0 output=r[1] 14 Next 0 3 0 1 15 Halt 0 0 0 0 16 Transaction 0 0 2 0 1 usesStmtJournal=0 17 Goto 0 1 0 0 ``` and after this patch, limbo gives: ``` addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 19 0 0 Start at 19 1 OpenReadAsync 0 4 0 0 table=casetest, root=4 2 OpenReadAwait 0 0 0 0 3 RewindAsync 0 0 0 0 4 RewindAwait 0 18 0 0 Rewind table casetest 5 Column 0 0 2 0 r[2]=casetest.a 6 Column 0 0 3 0 r[3]=casetest.a 7 Ne 2 3 10 0 if r[2]!=r[3] goto 10 8 Column 0 1 1 0 r[1]=casetest.b 9 Goto 0 15 0 0 10 Column 0 2 3 0 r[3]=casetest.c 11 Ne 2 3 14 0 if r[2]!=r[3] goto 14 12 Column 0 3 1 0 r[1]=casetest.d 13 Goto 0 15 0 0 14 Integer 0 1 0 0 r[1]=0 15 ResultRow 1 1 0 0 output=r[1] 16 NextAsync 0 0 0 0 17 NextAwait 0 5 0 0 18 Halt 0 0 0 0 19 Transaction 0 0 0 0 20 Goto 0 1 0 0 ``` And then as there's 
nowhere to annotate this new support in COMPAT.md, I added a corresponding heading for SELECT expressions and what is/isn't supported. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #425	2024-12-11 17:05:41 +02:00
Pekka Enberg	617f95c7b6	Update clap to 4.5 The Github dependabot complains about anstream, which comes through `clap`: https://github.com/tursodatabase/limbo/security/dependabot/8	2024-12-11 14:39:27 +02:00
Pekka Enberg	c03839cc6a	Merge 'Upgrade pprof to 0.14' from Pekka Enberg Github's dependabot complains that the current version has an unsoundness issue so let's bump to a newer version: https://github.com/tursodatabase/limbo/security/dependabot/10 Closes #440	2024-12-11 14:07:26 +02:00
Pekka Enberg	ab07c77036	Upgrade pprof to 0.14 Github's dependabot complains that the current version has an unsoundness issue so let's bump to a newer version: https://github.com/tursodatabase/limbo/security/dependabot/10	2024-12-11 11:21:09 +02:00
Pekka Enberg	b8aca48a0f	Update CHANGELOG	2024-12-11 10:45:02 +02:00
Pekka Enberg	04f196113a	Merge 'Add last_insert_rowid() function' from Krishna Vishal - Changed `Cursor` trait to be able to get access to `root_page` - SQLite only updates last_insert_rowid for non-schema inserts. So we check if the `InsertAwait` is not for `root_page` before updating rowid In SQLite it looks like this: ``` sqlite> EXPLAIN SELECT last_insert_rowid(); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 4 0 0 1 Function 0 0 1 last_insert_rowid(0) 0 2 ResultRow 1 1 0 0 3 Halt 0 0 0 0 4 Goto 0 1 0 0 ``` In limbo it will look like this: ``` limbo> EXPLAIN SELECT last_insert_rowid(); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 4 0 0 Start at 4 1 Function 0 2 1 last_insert_rowid 0 r[1]=func() 2 ResultRow 1 1 0 0 output=r[1] 3 Halt 0 0 0 0 4 Transaction 0 0 0 0 5 Goto 0 1 0 0 ``` Closes #427	2024-12-11 10:44:34 +02:00
krishvishal	1e89b17462	Ran cargo fmt	2024-12-11 14:08:33 +05:30
krishvishal	b23df24703	Added tests for `last_insert_rowid()`	2024-12-11 14:01:04 +05:30
Pekka Enberg	f7e9d3a25a	Update CHANGELOG	2024-12-11 09:18:36 +02:00
Pekka Enberg	ca272ba937	Merge 'Support JOIN USING and NATURAL JOIN' from Jussi Saurio Closes #360 Closes #361 Closes #422	2024-12-11 09:17:51 +02:00
Pekka Enberg	eda1f5396c	Merge 'Add octet_length scalar function' from Kacper Kołodziej Adds `octet_length` scalar function. Part of solution for: #144 Closes #430	2024-12-11 07:44:04 +02:00
Pekka Enberg	1024432b11	Merge 'Fix: length function characters counting' from Kacper Kołodziej `String::len()` function returns number of bytes. Here we need to use `chars().count()` to count real characters as specified in https://www.sqlite.org/lang_corefunc.html#length Fixes #431 Closes #429	2024-12-11 07:42:47 +02:00
Pekka Enberg	9a004a1742	Merge 'Fixed typo' from Wincent Balin Closes #435	2024-12-11 07:41:46 +02:00
Pekka Enberg	e3d9082feb	scripts/merge-pr.py: Fix pull from a fork	2024-12-11 07:41:33 +02:00
Alex Miller	88c862ce4d	Comments, resolve label better, make tests more fun	2024-12-10 19:59:54 -08:00
Wincent Balin	3f747feb5b	Fixed typo	2024-12-11 03:42:21 +01:00
Kacper Kołodziej	e4d31cbe34	add tests for octet_length scalar function	2024-12-10 22:56:38 +01:00
Kacper Kołodziej	d4bff2c93e	add octet_length scalar function	2024-12-10 22:56:38 +01:00
Kacper Kołodziej	660d3e8d07	fix: count characters in string in length function `length` function should count characters, not bytes. https://www.sqlite.org/lang_corefunc.html#length	2024-12-10 22:48:50 +01:00
Kacper Kołodziej	e68a86532a	tests: length function with multibyte characters Depending on encoding, some characters have more than one byte. Add failing test to verify if current implementation of scalar function `length` takes that into account.	2024-12-10 22:47:22 +01:00
krishvishal	134b5576ad	Ran cargo fmt	2024-12-09 22:55:54 +05:30
krishvishal	7e2928a5f1	Feature: last_insert_rowid() - Changed `Cursor` trait to be able to get access to `root_page` - SQLite only updates last_insert_rowid for non-schema inserts. So we check if the `InsertAwait` is not for `root_page` before updating rowid	2024-12-09 22:48:42 +05:30
jussisaurio	fe88d45e5e	Add more comments to push_predicate/push_predicates	2024-12-09 17:50:29 +02:00
jussisaurio	840caed2f7	Fix bug with multiway joins that include the same table multiple times	2024-12-09 17:50:29 +02:00
jussisaurio	7924f9b64d	consider all joined tables instead of just previous in natural/using	2024-12-09 17:50:29 +02:00
jussisaurio	4f027035de	tests for multiple joins	2024-12-09 17:50:29 +02:00
jussisaurio	81b6605453	support NATURAL JOIN	2024-12-09 17:50:29 +02:00
jussisaurio	bed932c186	Support join USING	2024-12-09 17:50:29 +02:00
Pekka Enberg	f9b300a608	Update CHANGELOG	2024-12-09 17:31:24 +02:00
Pekka Enberg	ba1f7cd16f	Merge 'feat(core/translate): support HAVING' from Jussi Saurio support the HAVING clause. note that sqlite (and i think standard sql?) supports HAVING even without GROUP BY, but `sqlite3-parser` doesn't. also fixes some issues with the PartialOrd implementation of OwnedValue and the implementations of `concat` and `round` which i discovered due to my HAVING tcl tests failing Closes #420	2024-12-09 17:30:40 +02:00
Pekka Enberg	98a8dc58b1	Update CHANGELOG	2024-12-09 17:29:15 +02:00
Pekka Enberg	36f9565910	Merge 'feat(wasm): add get and iterate func' from Jean Arhancet Add `get` and `iterate` functions to the wasm module Closes #421	2024-12-09 17:28:43 +02:00
krishvishal	1e23af7d24	Added `last_insert_rowid()` function. Need to fix its behavior. Problem is probably with `Cursor` implementation.	2024-12-09 17:41:28 +05:30
Alex Miller	f7bb7f8dee	Fix typo and improve comment	2024-12-08 14:20:23 -08:00
Alex Miller	c2e3957d73	I misunderstood what a constant instruction was	2024-12-08 14:12:45 -08:00
Alex Miller	eb00226cfe	Add support for CASE expressions. There's two forms of case: CASE (WHEN [bool expr] THEN [value])+ (ELSE [value])? END which checks a series of boolean conditions, and: CASE expr (WHEN [expr] THEN [value})+ (ELSE [value])? END Which checks a series of equality conditions. This implements support for both. Note that the ELSE is optional, and will be equivalent to `ELSE null` if not specified. sqlite3 gives the implementation as: sqlite> explain select case a WHEN a THEN b WHEN c THEN d ELSE 0 END from casetest; addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 OpenRead 0 3 0 4 0 root=3 iDb=0; casetest 2 Rewind 0 15 0 0 3 Column 0 0 2 0 r[2]= cursor 0 column 0 4 Column 0 0 3 0 r[3]= cursor 0 column 0 5 Ne 3 8 2 BINARY-8 83 if r[2]!=r[3] goto 8 6 Column 0 1 1 0 r[1]= cursor 0 column 1 7 Goto 0 13 0 0 8 Column 0 2 3 0 r[3]= cursor 0 column 2 9 Ne 3 12 2 BINARY-8 83 if r[2]!=r[3] goto 12 10 Column 0 3 1 0 r[1]= cursor 0 column 3 11 Goto 0 13 0 0 12 Integer 0 1 0 0 r[1]=0 13 ResultRow 1 1 0 0 output=r[1] 14 Next 0 3 0 1 15 Halt 0 0 0 0 16 Transaction 0 0 2 0 1 usesStmtJournal=0 17 Goto 0 1 0 0 and after this patch, limbo gives: addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 18 0 0 Start at 18 1 OpenReadAsync 0 4 0 0 table=casetest, root=4 2 OpenReadAwait 0 0 0 0 3 RewindAsync 0 0 0 0 4 RewindAwait 0 17 0 0 Rewind table casetest 5 Column 0 0 2 0 r[2]=casetest.a 6 Column 0 0 3 0 r[3]=casetest.a 7 Ne 2 3 10 0 if r[2]!=r[3] goto 10 8 Column 0 1 1 0 r[1]=casetest.b 9 Goto 0 14 0 0 10 Column 0 2 3 0 r[3]=casetest.c 11 Ne 2 3 14 0 if r[2]!=r[3] goto 14 12 Column 0 3 1 0 r[1]=casetest.d 13 Goto 0 14 0 0 14 ResultRow 1 1 0 0 output=r[1] 15 NextAsync 0 0 0 0 16 NextAwait 0 5 0 0 17 Halt 0 0 0 0 18 Transaction 0 0 0 0 19 Integer 0 1 0 0 r[1]=0 20 Goto 0 1 0 0 And then as there's nowhere to annotate this new support in 
COMPAT.md, I added a corresponding heading for SELECT expressions and what is/isn't supported.	2024-12-08 14:09:03 -08:00
jussisaurio	9bc3ccc394	fmt	2024-12-03 19:11:08 +02:00
jussisaurio	885136a511	Merge 'fix(core/translate): fix bug with multiway joins and clean up left join implementation' from Jussi Saurio There was a bug where this kind of query (a 3-way join with two seeks and only one scan loop) would emit a wrong jump target for DecrJumpZero: ``` limbo> explain select u.first_name, u2.last_name, p.name from users u join users u2 on u.id=u2.id join products p on u2.id = p.id limit 3; addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 21 0 0 Start at 21 1 OpenReadAsync 0 2 0 0 table=u, root=2 2 OpenReadAwait 0 0 0 0 3 OpenReadAsync 1 2 0 0 table=u2, root=2 4 OpenReadAwait 0 0 0 0 5 OpenReadAsync 2 3 0 0 table=p, root=3 6 OpenReadAwait 0 0 0 0 7 RewindAsync 0 0 0 0 8 RewindAwait 0 18 0 0 Rewind table u 9 RowId 0 1 0 0 r[1]=u.rowid 10 SeekRowid 1 1 18 0 if (r[1]!=u2.rowid) goto 18 11 RowId 1 2 0 0 r[2]=u2.rowid 12 SeekRowid 2 2 18 0 if (r[2]!=p.rowid) goto 18 13 Column 0 1 3 0 r[3]=u.first_name 14 Column 1 2 4 0 r[4]=u2.last_name 15 Column 2 1 5 0 r[5]=p.name 16 ResultRow 3 3 0 0 output=r[3..5] 17 DecrJumpZero 6 18 0 0 if (--r[6]==0) goto 18 <--- this should go to Halt!!! 18 NextAsync 0 0 0 0 19 NextAwait 0 9 0 0 20 Halt 0 0 0 0 21 Transaction 0 0 0 0 22 Integer 3 6 0 0 r[6]=3 23 Goto 0 1 0 0 ``` due to incorrect label bookkeeping. fixed the bookkeeping, plus cleaned up unnecessary crap from the left join bookkeeping at the same time. Closes #423	2024-12-03 19:02:28 +02:00
jussisaurio	ca25a73c95	a little bit more explanation about left join handling	2024-11-30 20:54:22 +02:00
jussisaurio	83f8ea1b13	Fix bug with multiway joins and clean up left join implementation	2024-11-30 20:47:48 +02:00
jussisaurio	3e9883bfbd	update COMPAT	2024-11-30 10:06:37 +02:00
jussisaurio	3f80e41e7a	support HAVING	2024-11-30 10:05:13 +02:00
jussisaurio	fceb9ac62b	Merge 'core: (another) refactor of read path query processing logic' from Jussi Saurio # (another) refactor of read path query processing logic This PR rewrites our select query processing architecture by moving away from the stateful operator-based execution model, back to a more direct bytecode generation approach that, IMO, is easier to follow. A large part of the bytecode emission itself (`program.emit_insn(...)`) is just copy-pasted from the old implementation (after all, it did _work_), but just structured differently. ## Main Changes 1. Removed the `step()` state machine from operators. Previously, each operator had internal state tracking its execution progress, and parent operators would call `.step()` on their children until they needed to do something else. Reading the code and trying to follow the execution was not very easy, and the abstraction was also too general: there was a lot of unnecessary pattern matching and special casing to make query execution fit the model, when honestly the evaluation of a SELECT without any CTEs or subqueries etc can only go a few different ways. 2. Because of the above change, the main codegen function `emit_program()` now contains a series of linear conditional steps instead of kicking off the state machines with `root_operator.step()`. These steps are just things like: "open the cursors", "open the loops", "emit a record into either the main output or a sorter", etc. 3. The `Plan` struct now (again) contains most of the familiar SELECT query components (WHERE clause, GROUP BY, ORDER BY, etc.) rather than having all of them embedded in a tree of operators. The operator tree now ONLY consists of operators that read from a source table in some way -- so it could just be called a join tree, I guess. 4. There's now `plan.result_columns` which is _ALWAYS_ evaluated to get the final results of a SELECT. 
Previously the operator state machine thing had a hodgepodge of different ways of arriving at the result row. 5. Removed operators: - Removed Filter operator (even in the previous version the Filter operator -- which is really the where clause -- had its predicates pushed down to the table loops, and it didn't really ever exist in the bytecode emission phase anymore) - Removed Projection operator (`plan.result_columns`) - Removed Limit operator (`plan.limit`) - Removed Aggregate operator (`plan.group_by` and `plan.aggregates`) - Removed Order operator (`plan.order_by`) 6. Added `ast::Expr::Column` to the vendored sqlite3 parser -- column resolution is now done as early as possible. This eliminates repeated string comparisons during execution. I.e. no need for `resolve_ident_table()` etc 7. Simplified expression result caching by removing the complex, and frankly weird, ExpressionResultCache apparatus. The refactored code handles this by tracking which cursor to read columns from at a given time, and copies values from existing registers if the expression is a computation that has already been done in a previous step of the execution. For example in: ``` limbo> select concat(u.first_name, '-LOL'), sum(u.age) from users u group by concat(u.first_name, '-LOL') order by sum(u.age) desc limit 10; Michael-LOL\|11204 David-LOL\|8758 Robert-LOL\|8109 Jennifer-LOL\|7700 John-LOL\|7299 Christopher-LOL\|6397 James-LOL\|5921 Joseph-LOL\|5711 Brian-LOL\|5059 William-LOL\|5047 ``` the query execution engine knows that `concat(u.first_name, '-LOL')` is the second column of the `ORDER_BY` sorter without any complex caching. HACK: For deduplicating expressions in ORDER BY and the SELECT body, the code still relies on expression `==` equality to make those decisions which sucks (e.g. `sum(x) != SUM(x)` -- I've marked the parts where this is used with a TODO, we should have a custom expression equality comparison function instead...). 
This is not a correctness-breaking thing, but still. ## In short - No more state machines - The operator tree is now only a "join tree", pretty much - No weird general purpose `ExpressionResultCache` - More direct mapping between SQL operations and generated bytecode -- there's really no harm in carrying the "group by" etc concepts in the bytecode generation phase instead of burying them inside Operators - When a ResultRow is emitted, it is _always_ done by evaluating `plan.result_columns`, instead of the special-casing and hacks that existed previously - 600+ LOC removed Closes #416	2024-11-30 10:03:58 +02:00
JeanArhancet	5693cd1ae0	feat(wasm): add get and iterate func	2024-11-29 21:48:20 +01:00
jussisaurio	84742b81fa	Obsolete comment	2024-11-27 22:43:36 +02:00
jussisaurio	da811dc403	add doc comments for members of Plan struct	2024-11-27 19:30:07 +02:00
jussisaurio	db462530f1	metadata instead of m	2024-11-27 19:27:36 +02:00
jussisaurio	7d569aee1f	fix stupid comment	2024-11-26 18:37:06 +02:00
jussisaurio	1b34698872	add comments and rename some misleading label variables	2024-11-26 18:28:19 +02:00
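
The `length` fix and the `octet_length` addition in the log above both turn on the difference between counting characters and counting bytes in a Rust string. Below is a minimal sketch of that difference, not the code from those commits: the helper names `char_length` and `octet_length` are hypothetical, and "character" here means a Unicode scalar value, which is what `chars()` iterates over.

```rust
/// Hypothetical helper: number of characters (Unicode scalar values),
/// which is what the commit message says `length` should count for text.
fn char_length(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical helper: number of bytes in the UTF-8 encoding,
/// which is what `String::len()` / `str::len()` returns.
fn octet_length(s: &str) -> usize {
    s.len()
}

fn main() {
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    let s = "h\u{e9}llo";
    println!("chars = {}, bytes = {}", char_length(s), octet_length(s));
}
```

For pure ASCII input the two counts agree, so a test needs at least one multibyte character to tell the two functions apart.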


1255 Commits