TLDR: no need to call either of:
program.emit_insn_with_label_dependency() -> just call program.emit_insn()
program.defer_label_resolution() -> just call program.resolve_label()
Changes:
- make BranchOffset an explicit enum (Label, Offset, Placeholder)
- remove program.emit_insn_with_label_dependency() - label dependency is automatically detected
- for label to offset mapping, use a hashmap from label(negative i32) to offset (positive u32)
- resolve all labels in program.build()
- remove program.defer_label_resolution() - all labels are resolved in build()
#525
- Adds the necessary `ScalarFunc` variants to support the `changes()` &
`total_changes()` SQLite function.
- Adds the necessary fields to the `Connection` struct to track changes.
- Modify the `InsertAwait` OpCode behaviour to affect the changes
counter.
Closes#589
Implements the `json_extract` function.
In the meantime, the json path has already been implemented by
@petersooley in https://github.com/tursodatabase/limbo/pull/555 which is
a requirement for `json_extract`.
However, this PR takes a different approach and parses the JSON path
using the JSON grammar, because there are a lot of quirks in how a JSON
`key` can look (see the JSON grammar in the Pest file).
The downside is that it allocates more memory than the current
implementation, but might be easier to maintain in the long run.
I included a lot of tests with some quirky behavior of the
`json_extract` (some of them still need some work). I also noticed that
these changed between sqlite versions (had `SQLite 3.43.2` locally and
`3.45` gave different results). Due to this, I'm not sure how much value
there is in trying to be fully compatible with SQLite. Perhaps the
approach taken by @petersooley solves 99% of use-cases?
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#524
I will warn that this PR is quite big out of necessity, since subqueries
are, as the name implies, queries within queries, so everything that
works with a regular query should also work with a subquery, roughly
speaking.
---
- Adds support for:
* uncorrelated subqueries in FROM clause (i.e. appear as a "table",
and do not refer to outer tables). Example of this at the end of the PR
description.
* column and subquery aliasing (`select sub.renamed from (select
name as renamed from products) sub`)
* inner and outer filtering of subqueries (`select sub.name from
(select name from products where name = 'joe') sub`, and, `select
sub.name from (select name from products) sub where sub.name = 'joe'`)
* joining between regular tables and subqueries
* joining between multiple subqueries
* in general working with subqueries should roughly equal working
with regular tables
- Main idea: subqueries are just wrappers of a `SelectPlan` that never
emit ResultRows, instead they `Yield` control back to the parent query,
and the parent query can copy the subquery result values into a
ResultRow. New variant `SourceOperator::Subquery` that wraps a subquery
`SelectPlan`.
- Plans can now not only refer to btree tables (`select p.name from
products`) but also subqueries (`select sub.foo from (select name as foo
from products) sub`. Hence this PR also adds support for column aliases
which didn't exist before.
* An `Expr::Column` that refers to a regular table will result in an
`Insn::Column` (i.e. a read from disk/memory) whereas an `Expr::Column`
that refers to a subquery will result in an `Insn::Copy` (from register
to register) instead
- Subquery handling is entirely unoptimized, there's no predicate
pushdown from outer query to subqueries, or elimination of redundant
subqueries (e.g. in the trivial example `SELECT * FROM (SELECT * FROM
users) sub` the subquery can just be entirely removed)
---
This PR does not add support (yet) for:
- subqueries in result columns: `SELECT t.foo, (SELECT .......) as
column_from_subquery FROM t`
- subqueries in WHERE clauses e.g. `SELECT * FROM t1 WHERE t1.foo IN
(SELECT ...)`
- subquery-related optimizations, of which there are plenty available.
No analysis is done regarding e.g. whether predicates on the outer query
level could be pushed into the subquery, or whether the subquery could
be entirely eliminated. Both of the above can probably be done fairly
easily for a bunch of trivial cases.
---
Example bytecode with comments added:
```
limbo> EXPLAIN SELECT p.name, sub.funny_name FROM products p JOIN (
select id, concat(name, '-lol') as funny_name from products
) sub USING (id) LIMIT 3;
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 31 0 0 Start at 31
// Coroutine implementation starts at insn 2, jump immediately to 14
1 InitCoroutine 1 14 2 0
2 OpenReadAsync 0 3 0 0 table=products, root=3
3 OpenReadAwait 0 0 0 0
4 RewindAsync 0 0 0 0
5 RewindAwait 0 13 0 0 Rewind table products
6 RowId 0 2 0 0 r[2]=products.rowid
7 Column 0 1 4 0 r[4]=products.name
8 String8 0 5 0 -lol 0 r[5]='-lol'
9 Function 0 4 3 concat 0 r[3]=func(r[4..5])
// jump back to main loop of query (insn 20)
10 Yield 1 0 0 0
11 NextAsync 0 0 0 0
12 NextAwait 0 6 0 0
13 EndCoroutine 1 0 0 0
14 OpenReadAsync 1 3 0 0 table=p, root=3
15 OpenReadAwait 0 0 0 0
16 RewindAsync 1 0 0 0
17 RewindAwait 1 30 0 0 Rewind table p
// Since this subquery is the inner loop of the join, reinitialize it on every iteration of the outer loop
18 InitCoroutine 1 0 2 0
// Jump back to the subquery implementation to assign another row into registers
19 Yield 1 28 0 0
20 RowId 1 8 0 0 r[8]=p.rowid
// Copy sub.id
21 Copy 2 9 0 0 r[9]=r[2]
// p.id == sub.id?
22 Ne 8 9 27 0 if r[8]!=r[9] goto 27
23 Column 1 1 6 0 r[6]=p.name
// copy sub.funny_name
24 Copy 3 7 0 0 r[7]=r[3]
25 ResultRow 6 2 0 0 output=r[6..7]
26 DecrJumpZero 10 30 0 0 if (--r[10]==0) goto 30
27 Goto 0 19 0 0
28 NextAsync 1 0 0 0
29 NextAwait 1 18 0 0
30 Halt 0 0 0 0
31 Transaction 0 0 0 0
32 Integer 3 10 0 0 r[10]=3
33 Goto 0 1 0 0
```
Closes#566
Fixes#577
With the previous implementation we weren't escaping the regex meta
characters . And in certain cases glob had a different meaning than
regex.
For e.g , the below shows a glob pattern with its regex equivalent
- `[][]` translates to `[\]\[]`
- `[^][]` translates to `[^\]\[]`
Closes#578
I have added support for like function with escape i.e like(X,Y,Z) .
There is good opportunity to refactor/cleanup the like operations which
can be done in another PR, as I wanted to keep the changes small .
Closes#568
Fixes#552
In our construct regex function, we were not escaping the required
characters properly which was causing the failure.
Limbo output with this branch
```
limbo> select like('\%A', '\A');
1
limbo> select like('A$%', 'A$');
1
limbo> select like('%a.a', 'aaaa');
0
```
Closes#553
Implements [json_array](https://sqlite.org/json1.html#jarray).
As a side quest, this PR also fixes an issue with the `CHAR` function
which didn't work properly if the parameters were non-leaf AST nodes.
The PR is quite big, because as I mentioned in https://github.com/tursod
atabase/limbo/issues/127#issuecomment-2541307979 we had to modify
`OwnedValue::Text` to support a `subtype` parameter, which is what
SQLite does.
Closes#504