Commit Graph

920 Commits

Author SHA1 Message Date
psvri
2b879a4f40 Fix arithmetic operations for text values 2025-01-04 00:34:04 +05:30
Pekka Enberg
90d01f468f Merge 'Support uncorrelated FROM clause subqueries' from Jussi Saurio
I will warn that this PR is quite big out of necessity, since subqueries
are, as the name implies, queries within queries, so everything that
works with a regular query should also work with a subquery, roughly
speaking.
---
- Adds support for:
    * uncorrelated subqueries in FROM clause (i.e. appear as a "table",
and do not refer to outer tables). Example of this at the end of the PR
description.
    * column and subquery aliasing (`select sub.renamed from (select
name as renamed from products) sub`)
    * inner and outer filtering of subqueries (`select sub.name from
(select name from products where name = 'joe') sub`, and,  `select
sub.name from (select name from products) sub where sub.name = 'joe'`)
    * joining between regular tables and subqueries
    * joining between multiple subqueries
    * in general working with subqueries should roughly equal working
with regular tables
- Main idea: subqueries are just wrappers of a `SelectPlan` that never
emit ResultRows, instead they `Yield` control back to the parent query,
and the parent query can copy the subquery result values into a
ResultRow. New variant `SourceOperator::Subquery` that wraps a subquery
`SelectPlan`.
- Plans can now not only refer to btree tables (`select p.name from
products`) but also subqueries (`select sub.foo from (select name as foo
from products) sub`. Hence this PR also adds support for column aliases
which didn't exist before.
    * An `Expr::Column` that refers to a regular table will result in an
`Insn::Column` (i.e. a read from disk/memory) whereas an `Expr::Column`
that refers to a subquery will result in an `Insn::Copy` (from register
to register) instead
- Subquery handling is entirely unoptimized, there's no predicate
pushdown from outer query to subqueries, or elimination of redundant
subqueries (e.g. in the trivial example `SELECT * FROM (SELECT * FROM
users) sub` the subquery can just be entirely removed)
---
This PR does not add support (yet) for:
- subqueries in result columns: `SELECT t.foo, (SELECT .......) as
column_from_subquery FROM t`
- subqueries in WHERE clauses e.g. `SELECT * FROM t1 WHERE t1.foo IN
(SELECT ...)`
- subquery-related optimizations, of which there are plenty available.
No analysis is done regarding e.g. whether predicates on the outer query
level could be pushed into the subquery, or whether the subquery could
be entirely eliminated. Both of the above can probably be done fairly
easily for a bunch of trivial cases.
---
Example bytecode with comments added:
```
limbo> EXPLAIN SELECT p.name, sub.funny_name FROM products p JOIN (
  select id, concat(name, '-lol') as funny_name from products
) sub USING (id) LIMIT 3;

addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     31    0                    0   Start at 31

// Coroutine implementation starts at insn 2, jump immediately to 14
1     InitCoroutine      1     14    2                    0

2     OpenReadAsync      0     3     0                    0   table=products, root=3
3     OpenReadAwait      0     0     0                    0
4     RewindAsync        0     0     0                    0
5     RewindAwait        0     13    0                    0   Rewind table products
6       RowId            0     2     0                    0   r[2]=products.rowid
7       Column           0     1     4                    0   r[4]=products.name
8       String8          0     5     0     -lol           0   r[5]='-lol'
9       Function         0     4     3     concat         0   r[3]=func(r[4..5])

// jump back to main loop of query (insn 20)
10      Yield            1     0     0                    0

11    NextAsync          0     0     0                    0
12    NextAwait          0     6     0                    0
13    EndCoroutine       1     0     0                    0
14    OpenReadAsync      1     3     0                    0   table=p, root=3
15    OpenReadAwait      0     0     0                    0
16    RewindAsync        1     0     0                    0
17    RewindAwait        1     30    0                    0   Rewind table p

// Since this subquery is the inner loop of the join, reinitialize it on every iteration of the outer loop
18      InitCoroutine    1     0     2                    0

// Jump back to the subquery implementation to assign another row into registers
19      Yield            1     28    0                    0

20      RowId            1     8     0                    0   r[8]=p.rowid

// Copy sub.id
21      Copy             2     9     0                    0   r[9]=r[2]

// p.id == sub.id?
22      Ne               8     9     27                   0   if r[8]!=r[9] goto 27
23      Column           1     1     6                    0   r[6]=p.name

// copy sub.funny_name
24      Copy             3     7     0                    0   r[7]=r[3]

25      ResultRow        6     2     0                    0   output=r[6..7]
26      DecrJumpZero     10    30    0                    0   if (--r[10]==0) goto 30
27      Goto             0     19    0                    0
28    NextAsync          1     0     0                    0
29    NextAwait          1     18    0                    0
30    Halt               0     0     0                    0
31    Transaction        0     0     0                    0
32    Integer            3     10    0                    0   r[10]=3
33    Goto               0     1     0                    0
```

Closes #566
2025-01-02 11:15:07 +02:00
psvri
e7d4fa0a53 Minor clippy fixes 2025-01-01 16:11:52 +05:30
Jussi Saurio
df6c8c9dd1 comment about yield instruction 2025-01-01 08:22:47 +02:00
Jussi Saurio
776ffc6131 assert instead of fallback 2025-01-01 08:21:20 +02:00
Jussi Saurio
3e5be21707 remove commented out code 2025-01-01 08:18:49 +02:00
Jussi Saurio
2b5b54c44e clippy 2025-01-01 07:56:39 +02:00
Jussi Saurio
6633a3c66a condense comment 2025-01-01 07:56:39 +02:00
Jussi Saurio
80b9da95c0 replace termination_label_stack with much simpler LoopLabels 2025-01-01 07:56:39 +02:00
PThorpe92
ed95007298 Separate exec insns to individual functions 2024-12-31 07:55:40 -05:00
PThorpe92
45eeee1589 Add comment and match case for improperly called values 2024-12-31 07:54:01 -05:00
PThorpe92
d572089b80 Refactor out repetitive agg_func code in vdbe 2024-12-31 07:53:55 -05:00
Jussi Saurio
2066475e03 feat: subqueries in FROM clause 2024-12-31 14:18:29 +02:00
Pekka Enberg
0aabcddf18 ext/uuid: Convert uuid4() to external function 2024-12-31 13:56:32 +02:00
Pekka Enberg
33dbd6c892 core: External functions 2024-12-31 13:56:32 +02:00
Pekka Enberg
858aecfea2 core: Drop Clone and PartialEq from Func enum
We don't need them anywhere and they make it hard to introduce
GenericFunction.
2024-12-31 13:51:20 +02:00
Pekka Enberg
dca47f62ea core: Don't use Weak reference for connection database
The database object is a way to represent state that's shared across
multiple connections. We don't want to release that object until all
connections are closed.
2024-12-31 13:51:20 +02:00
Pekka Enberg
3046757d09 core/translate: Move prepare_delete_plan() to delete.rs
The planner.rs file is pretty big. Let's make it smaller by moving more
of delete handling to delete.rs.
2024-12-31 11:40:35 +02:00
Pekka Enberg
cb5d86ed8e core/translate: Move prepare_select_plan() to select.rs
The planner.rs file is pretty big. Let's make it smaller by moving more
of select handling to select.rs.
2024-12-31 11:38:13 +02:00
Pekka Enberg
dad3a5b069 core/translate: Move translate_insert() to top
The translate_insert() function is the entry point to translating an
INSERT statement so let's make it the first function in insert.rs.
2024-12-31 11:33:17 +02:00
Pekka Enberg
f6149d3bd7 core/translate: Kill unused lifetimes 2024-12-31 11:33:17 +02:00
Pekka Enberg
f8d408e9a8 Merge 'enable sqpoll by default in io_uring' from Preston Thorpe
EDIT: will continue iterating on these ideas, as discussed on discord.
for now, this has been changed to just enable `IORING_ENABLE_SQPOLL` by
default. This is supported sin `5.11`, and I believe the last debian
release < that reached EOL in July, so shouldn't be an issue.

Closes #557
2024-12-31 10:37:39 +02:00
Pekka Enberg
80e12dfbf7 Merge 'Fixes glob giving wrong results in some cases ' from Vrishabh
Fixes #577
With the previous implementation we weren't escaping the regex meta
characters . And in certain cases glob had a different meaning than
regex.
For e.g , the below shows a glob pattern with its regex equivalent
- `[][]` translates to `[\]\[]`
- `[^][]` translates to `[^\]\[]`

Closes #578
2024-12-31 10:31:49 +02:00
psvri
3ac3fdf0a2 Fix glob 2024-12-30 17:02:31 +05:30
PThorpe92
4e77840ee5 Setup io_uring with sqpoll enabled 2024-12-29 15:34:17 -05:00
Lauri Virtanen
db384c6828 Simplify init of delimiter_exprs
Clippy:
https://rust-lang.github.io/rust-clippy/master/index.html#needless_late_init
2024-12-29 19:22:45 +02:00
Lauri Virtanen
10065bda35 Don't use inaccurate pi values, make clippy happy 2024-12-29 19:22:45 +02:00
Lauri Virtanen
55916fdd9d Remove function parameter as it's only used in recursion 2024-12-29 19:22:45 +02:00
Lauri Virtanen
854005b977 Run cargo clippy --fix && cargo fmt 2024-12-29 19:22:28 +02:00
PThorpe92
f6cd707544 Add clippy CI, fix or ignore warnings where appropriate 2024-12-29 10:25:41 -05:00
Pekka Enberg
f87dc7cacc Merge 'Support like function with escape' from Vrishabh
I have added support for like function with escape i.e like(X,Y,Z) .
There is good opportunity to refactor/cleanup the like operations which
can be done in another PR, as I wanted to keep the changes small .

Closes #568
2024-12-29 16:58:06 +02:00
adamnemecek
97647ff056 Clean up code to use Self
Closes #556
2024-12-29 10:07:38 +02:00
psvri
1922b8ea38 Support like function with escape 2024-12-28 13:55:12 +05:30
PThorpe92
ddf229c432 Update COMPAT.md 2024-12-27 15:42:13 -05:00
PThorpe92
f08d62b446 Add Remainder vdbe oppcode 2024-12-27 15:39:02 -05:00
Pekka Enberg
3bc554f27c core: Remove unused import 2024-12-27 18:39:38 +02:00
Pekka Enberg
5b8f00cb8d Merge 'Fixes like function when pattern has regex meta chars' from Vrishabh
Fixes #552
In our construct regex function, we were not escaping the required
characters properly which was causing the failure.
Limbo output with this branch
```
limbo> select like('\%A', '\A');
1
limbo> select like('A$%', 'A$');
1
limbo> select like('%a.a', 'aaaa');
0
```

Closes #553
2024-12-27 18:35:47 +02:00
Pekka Enberg
8352dcd582 Merge 'Fix sqlite_version() out of bound panics' from Diego Reis
#560
Changes to `translate_expr` function:
* [`core/translate/expr.rs`](diffhunk://#diff-
371865d5d7b8bcaed649413c687492e61e94f21387cd9b2c47d989a033888c8bL1558-
R1560): Changed the `amount` parameter in the `Insn::Copy` instruction
from `1` to `0`.
Enhancements to the testing framework:
* [`testing/scalar-functions.test`](diffhunk://#diff-
a046d58ab24eee8207f0ce3199f8d0a609edcef9c24b8ed7f242f7a60e6c1e61R812-
R815): Added a new test `do_execsql_test_regex` to validate that the
`sqlite_version` function returns a valid output.
* [`testing/tester.tcl`](diffhunk://#diff-
316cca92d85df3f78558cc3e60d7420c1fd19a23ecf2bbea534db93ab08ea3ecR29-
R45): Introduced a new procedure `do_execsql_test_regex` to support
regex-based validation of SQL outputs.

Closes #561
2024-12-27 18:34:47 +02:00
Diego Reis
2d0c16c428 Fix sqlite_version() out of bound 2024-12-27 11:39:33 -03:00
김선우
ad2d515ffd Merge branch 'main' into feature/delete-planning 2024-12-27 23:21:35 +09:00
Pekka Enberg
b2f96ddfbd core/translate: Remove unnecessary mut 2024-12-27 10:55:31 +02:00
Pekka Enberg
9680471876 core: Remove unreachable pragma patterns 2024-12-27 10:55:31 +02:00
Pekka Enberg
464508bb29 core/vdbe: Kill unused next_free_register() 2024-12-27 10:55:31 +02:00
Pekka Enberg
244326ee57 core: Remove unused imports 2024-12-27 10:55:31 +02:00
Pekka Enberg
f2ecebc357 Rename RowResult to StepResult
The name "row result" is confusing because it really *is* a result from
a step() call. The only difference is how a row is represented as we
return from VDBE or from a statement.

Therefore, rename RowResult to StepResult.
2024-12-27 10:20:41 +02:00
Pekka Enberg
db0c59413b Merge 'Implement json_array_length' from Peter Sooley
In line with [other work](#127) for JSON support, this PR adds support
for [`json_array_length`](https://www.sqlite.org/json1.html#jarraylen).
This includes a first pass at supporting the JSON path for accessing
values within the JSON.
I've added tests in rust and tcl.
![image](https://github.com/user-
attachments/assets/0d0e3319-317b-4783-bff4-241eb2902255)

Closes #555
2024-12-27 10:08:22 +02:00
Pekka Enberg
5065074617 Merge 'core: disk serialization changes to align with sqlite' from Jussi Saurio
This PR's genesis is from investigating #532, but I still can't reliably
reproduce it on either `main` or this branch so I don't know if this PR
_fixes_ anything, but I guess it aligns us more with sqlite anyway
---
Anyway: I looked at DBs created with limbo and with sqlite using
[ImHex](https://github.com/WerWolv/ImHex) and the differences seem to
be:
1. SQLite uses varint according to [the
spec](https://www.sqlite.org/fileformat.html#record_format), whereas
limbo always encodes integers as i64
2. Limbo adds 4 bytes of zeros for overflow page pointer (even in cases
where the cell doesnt overflow)
3. Limbo adds a space after `CREATE TABLE name` before the `(` even when
user doesn't specify it?
I implemented the following:
- Fix 1: Varint serialization of i8, i16, i24, i32, i48 and i64
according to payload, instead of always using i64
- Fix 2: Removed the 4 bytes reserved for overflow page pointer in non-
overflow cases

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #550
2024-12-27 10:06:52 +02:00
Pekka Enberg
937779b8c0 Merge 'core/btree: small refactoring + documentation tweaks' from Jussi Saurio
small follow up to https://github.com/tursodatabase/limbo/pull/539
contains:
- Variable renaming and comments to `btreecursor.insert_into_cell()`
- New utility methods `pagecontent.header_size()`,
`pagecontent.cell_pointer_array_size()`,
`pagecontent.unallocated_region_start()` and
`pagecontent.unallocated_region_size()`
- Refactor of `btreecursor.compute_free_space()` (plus comments and
variable renaming)
- Rename `pagecontent.cell_get_raw_pointer_region()` to
`pagecontent.cell_pointer_array_offset_and_size()` and remove its usage
in `btreecursor.defragment_page()`

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #543
2024-12-27 10:06:30 +02:00
Peter Sooley
28244b10d6 implement json_array_length 2024-12-26 15:08:11 -08:00
psvri
12e49da1d0 fmt fixes 2024-12-26 22:42:46 +05:30