Data does not match the predicate when using an index, e.g. `select id,
age from users where age > 90 limit 1;` will return a row with age 90.
The reason is that the current index seek compares whole records
directly, but an index record is longer than the key record (because it
also contains the primary key), so the Gt comparison is invalid.
Since only single-column indexes are currently supported
(https://github.com/tursodatabase/limbo/pull/350), only the first value
of the record is currently used for comparison.
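A minimal sketch of why the whole-record comparison goes wrong, and of the first-value comparison described above. The `Value` enum, the `Record` alias, and both helper functions are hypothetical stand-ins for illustration, not Limbo's actual types:

```rust
use std::cmp::Ordering;

/// Simplified column value (hypothetical; not Limbo's real value type).
#[derive(Debug, Clone, PartialEq, PartialOrd)]
enum Value {
    Integer(i64),
    Text(String),
}

/// An index record holds the indexed column followed by the rowid,
/// so it is longer than a seek key built from the predicate alone.
type Record = Vec<Value>;

/// Compare only the first value of each record, which is sufficient
/// while indexes are limited to a single column.
fn compare_first_column(index_record: &[Value], seek_key: &[Value]) -> Option<Ordering> {
    index_record.first()?.partial_cmp(seek_key.first()?)
}

/// `indexed_value > key`: true only when the indexed value is strictly greater.
fn matches_gt(index_record: &[Value], seek_key: &[Value]) -> bool {
    compare_first_column(index_record, seek_key) == Some(Ordering::Greater)
}
```

With an index record of `[Integer(90), Integer(7)]` and a key of `[Integer(90)]`, a lexicographic comparison of the full vectors treats the longer record as greater, while `matches_gt` compares 90 with 90 and rejects the row.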
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #593
I will warn that this PR is quite big out of necessity, since subqueries
are, as the name implies, queries within queries, so everything that
works with a regular query should also work with a subquery, roughly
speaking.
---
- Adds support for:
  * uncorrelated subqueries in FROM clause (i.e. appear as a "table",
    and do not refer to outer tables). Example of this at the end of the
    PR description.
  * column and subquery aliasing (`select sub.renamed from (select
    name as renamed from products) sub`)
  * inner and outer filtering of subqueries (`select sub.name from
    (select name from products where name = 'joe') sub`, and, `select
    sub.name from (select name from products) sub where sub.name = 'joe'`)
  * joining between regular tables and subqueries
  * joining between multiple subqueries
  * in general working with subqueries should roughly equal working
    with regular tables
- Main idea: subqueries are just wrappers of a `SelectPlan` that never
emit ResultRows, instead they `Yield` control back to the parent query,
and the parent query can copy the subquery result values into a
ResultRow. New variant `SourceOperator::Subquery` that wraps a subquery
`SelectPlan`.
- Plans can now not only refer to btree tables (`select p.name from
products p`) but also subqueries (`select sub.foo from (select name as foo
from products) sub`). Hence this PR also adds support for column aliases
which didn't exist before.
  * An `Expr::Column` that refers to a regular table will result in an
    `Insn::Column` (i.e. a read from disk/memory) whereas an `Expr::Column`
    that refers to a subquery will result in an `Insn::Copy` (from register
    to register) instead
- Subquery handling is entirely unoptimized: there's no predicate
pushdown from outer query to subqueries, or elimination of redundant
subqueries (e.g. in the trivial example `SELECT * FROM (SELECT * FROM
users) sub` the subquery can just be entirely removed)
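The `Insn::Column` versus `Insn::Copy` distinction above can be sketched roughly as follows. The `ColumnSource` enum, its fields, the simplified `Insn` variants, and `translate_column` are hypothetical names for illustration; the actual translation code in the PR may be structured quite differently:

```rust
/// Where a column reference in the plan points (hypothetical shape).
enum ColumnSource {
    /// A regular btree table, read through a cursor.
    BTreeTable { cursor_id: usize },
    /// A subquery whose result columns have already been yielded into
    /// consecutive registers starting at `result_start_reg`.
    Subquery { result_start_reg: usize },
}

/// Simplified versions of the two instructions involved.
#[derive(Debug, PartialEq)]
enum Insn {
    /// Read column `column` from the cursor into register `dest`.
    Column { cursor_id: usize, column: usize, dest: usize },
    /// Copy register `src_reg` into register `dst_reg`.
    Copy { src_reg: usize, dst_reg: usize },
}

/// Pick the instruction for a column reference based on its source.
fn translate_column(source: &ColumnSource, column: usize, dest: usize) -> Insn {
    match source {
        ColumnSource::BTreeTable { cursor_id } => Insn::Column {
            cursor_id: *cursor_id,
            column,
            dest,
        },
        ColumnSource::Subquery { result_start_reg } => Insn::Copy {
            src_reg: *result_start_reg + column,
            dst_reg: dest,
        },
    }
}
```

Under these assumptions, a subquery whose results start at register 2 resolves its second column to a copy from register 3, which is the shape of the `Copy 3 7` instruction in the example bytecode at the end of this PR description.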
---
This PR does not add support (yet) for:
- subqueries in result columns: `SELECT t.foo, (SELECT .......) as
column_from_subquery FROM t`
- subqueries in WHERE clauses e.g. `SELECT * FROM t1 WHERE t1.foo IN
(SELECT ...)`
- subquery-related optimizations, of which there are plenty available.
No analysis is done regarding e.g. whether predicates on the outer query
level could be pushed into the subquery, or whether the subquery could
be entirely eliminated. Both of the above can probably be done fairly
easily for a bunch of trivial cases.
---
Example bytecode with comments added:
```
limbo> EXPLAIN SELECT p.name, sub.funny_name FROM products p JOIN (
select id, concat(name, '-lol') as funny_name from products
) sub USING (id) LIMIT 3;
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 31 0 0 Start at 31
// Coroutine implementation starts at insn 2, jump immediately to 14
1 InitCoroutine 1 14 2 0
2 OpenReadAsync 0 3 0 0 table=products, root=3
3 OpenReadAwait 0 0 0 0
4 RewindAsync 0 0 0 0
5 RewindAwait 0 13 0 0 Rewind table products
6 RowId 0 2 0 0 r[2]=products.rowid
7 Column 0 1 4 0 r[4]=products.name
8 String8 0 5 0 -lol 0 r[5]='-lol'
9 Function 0 4 3 concat 0 r[3]=func(r[4..5])
// jump back to main loop of query (insn 20)
10 Yield 1 0 0 0
11 NextAsync 0 0 0 0
12 NextAwait 0 6 0 0
13 EndCoroutine 1 0 0 0
14 OpenReadAsync 1 3 0 0 table=p, root=3
15 OpenReadAwait 0 0 0 0
16 RewindAsync 1 0 0 0
17 RewindAwait 1 30 0 0 Rewind table p
// Since this subquery is the inner loop of the join, reinitialize it on every iteration of the outer loop
18 InitCoroutine 1 0 2 0
// Jump back to the subquery implementation to assign another row into registers
19 Yield 1 28 0 0
20 RowId 1 8 0 0 r[8]=p.rowid
// Copy sub.id
21 Copy 2 9 0 0 r[9]=r[2]
// p.id == sub.id?
22 Ne 8 9 27 0 if r[8]!=r[9] goto 27
23 Column 1 1 6 0 r[6]=p.name
// copy sub.funny_name
24 Copy 3 7 0 0 r[7]=r[3]
25 ResultRow 6 2 0 0 output=r[6..7]
26 DecrJumpZero 10 30 0 0 if (--r[10]==0) goto 30
27 Goto 0 19 0 0
28 NextAsync 1 0 0 0
29 NextAwait 1 18 0 0
30 Halt 0 0 0 0
31 Transaction 0 0 0 0
32 Integer 3 10 0 0 r[10]=3
33 Goto 0 1 0 0
```
Closes #566
This PR cleans up a lot of the repetitive code that required matching
the `AggFunc` cases over and over 😄
Most of the repetition was caused by the binary math operators, so I
just added it for those for now.
Reviewed-by: Jussi Saurio <penberg@iki.fi>
Reviewed-by: Pere Diaz Bou <preston@unlockedlabs.org>
Closes #575
This pull request adds support for external functions, which are
functions provided by extensions. The main difference to how things are
today is that extensions can register functions to a symbol table at
runtime instead of specifying an enum variant.
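A minimal sketch of the runtime-registration idea, assuming a name-to-function map. `SymbolTable`, `ExternalFunc`, `Value`, and the `answer` function are hypothetical names for illustration and are not taken from the PR:

```rust
use std::collections::HashMap;

/// Simplified SQL value (hypothetical; not Limbo's real value type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Text(String),
}

/// Signature an extension function is assumed to have in this sketch.
type ExternalFunc = fn(&[Value]) -> Value;

/// Functions are looked up by name at runtime instead of being
/// hard-coded as enum variants.
#[derive(Default)]
struct SymbolTable {
    functions: HashMap<String, ExternalFunc>,
}

impl SymbolTable {
    /// Called by an extension when it is loaded.
    fn register(&mut self, name: &str, func: ExternalFunc) {
        self.functions.insert(name.to_string(), func);
    }

    /// Called when a function name is not a built-in.
    fn resolve(&self, name: &str) -> Option<ExternalFunc> {
        self.functions.get(name).copied()
    }
}

/// Example extension function: ignores its arguments and returns 42.
fn answer(_args: &[Value]) -> Value {
    Value::Integer(42)
}
```

An extension would call `register("answer", answer)` once, after which `resolve("answer")` returns a function pointer that can be invoked with the evaluated arguments.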
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #567
The database object is a way to represent state that's shared across
multiple connections. We don't want to release that object until all
connections are closed.
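One way to get this lifetime in Rust is reference counting, sketched below. This is an assumption about the general approach, not a description of the PR's code; `Database`, `Connection`, and `connect` here are simplified, hypothetical stand-ins:

```rust
use std::sync::Arc;

/// State shared by every connection to the same database (hypothetical).
struct Database {
    path: String,
}

/// Each connection keeps the shared state alive through its own `Arc`.
struct Connection {
    db: Arc<Database>,
}

impl Database {
    fn open(path: &str) -> Arc<Database> {
        Arc::new(Database {
            path: path.to_string(),
        })
    }
}

/// Create a connection that holds an extra reference to the database.
fn connect(db: &Arc<Database>) -> Connection {
    Connection { db: Arc::clone(db) }
}
```

With this shape the `Database` is dropped only after the original handle and every `Connection` have been dropped, which matches the requirement stated above.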
EDIT: will continue iterating on these ideas, as discussed on Discord.
For now, this has been changed to just enable `IORING_ENABLE_SQPOLL` by
default. This is supported since kernel `5.11`, and I believe the last
Debian release older than that reached EOL in July, so it shouldn't be
an issue.
Closes #557
- add creating the same table two times to the list of checked
properties as a failure property example
Reviewed-by: Pekka Enberg <penberg@iki.fi>
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #554
Suggesting adding a simple guide on how to start contributing a SQL
function.
It would be easier if I had something like this when I started
contributing to Limbo, maybe it will be for others as well.
I'm aware that this will be partly out-of-date over time due to
refactors. However, I think the gist of modifying different layers
(Parser, VDBE bytecode program,...) and how they interact with each
other stays. That part is what I hope this guide can convey to a new
contributor.
What do you think?
Closes #565
Fixes #577
With the previous implementation we weren't escaping the regex
metacharacters. In certain cases glob syntax also has a different
meaning than regex syntax.
For example, the list below shows glob patterns with their regex
equivalents:
- `[][]` translates to `[\]\[]`
- `[^][]` translates to `[^\]\[]`
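A minimal sketch of such a translation, assuming the usual glob rules (`*`, `?`, and bracket classes where a `]` directly after `[` or `[^` is literal). `glob_to_regex` is a hypothetical helper written for illustration, not the function from this PR, and it returns an unanchored pattern string:

```rust
/// Translate a glob pattern into a regex pattern string (sketch).
fn glob_to_regex(glob: &str) -> String {
    let chars: Vec<char> = glob.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            '*' => out.push_str(".*"),
            '?' => out.push('.'),
            '[' => {
                // Locate the end of the bracket class.
                let mut j = i + 1;
                let negated = j < chars.len() && chars[j] == '^';
                if negated {
                    j += 1;
                }
                let body_start = j;
                // In glob, a ']' in first position is a literal member.
                if j < chars.len() && chars[j] == ']' {
                    j += 1;
                }
                while j < chars.len() && chars[j] != ']' {
                    j += 1;
                }
                if j >= chars.len() {
                    // Unterminated class: treat '[' as a literal character.
                    out.push_str("\\[");
                } else {
                    out.push('[');
                    if negated {
                        out.push('^');
                    }
                    for &c in &chars[body_start..j] {
                        // Escape characters that are special inside a regex class.
                        if matches!(c, '[' | ']' | '\\' | '^') {
                            out.push('\\');
                        }
                        out.push(c);
                    }
                    out.push(']');
                    i = j; // skip to the closing ']'
                }
            }
            c => {
                // Escape regex metacharacters that are plain text in glob.
                if matches!(c, '.' | '+' | '(' | ')' | '|' | '^' | '$' | '{' | '}' | '\\' | ']') {
                    out.push('\\');
                }
                out.push(c);
            }
        }
        i += 1;
    }
    out
}
```

A caller that needs whole-string matching would wrap the result in `^` and `$` before compiling it with a regex engine.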
Closes #578
There are no semantic changes in this PR; the clippy command came from
@pereman2's suggestion in #542
There was more to fix than I previously thought. I originally set out to
refactor out some of the logic in `vdbe::step`, but with some actual
semantic changes. That file: `vdbe/mod.rs` is so full that it required
moving the `Insn` enum to another file, so I figured I would just put
some non-semantic changes all together so it's easier to review and get
that done first... and figured I'd fix some clippy warnings while I was
at it. Also adjusted the actions to `checkout@v3`.
The project is obviously so early that there are going to be a decent
amount of things like unused fields or methods, which is why I was
originally not really pro clippy.. but seeing how many genuinely good
improvements it recommended, I think it's probably the right way to go.
Closes #563