Purpose of this PR
- Add support for Java (as Java has an extensive community)
- Enable Limbo to be provided as a Java library in the future.
Changes
- Added `bindings/java` directory.
- Created `src` package for Java (Gradle) and `rs_src` for Rust JNI
code.
- Implemented basic functionality to gather feedback and proceed with
further development.
- Some features just print out the result for testing purposes.
Future Work
- Integrate CI to publish the library.
- Enhance error handling mechanisms.
- Implement additional features and functionality.
- Add test code after we decide on which features to provide.
Issue
https://github.com/tursodatabase/limbo/issues/615
Closes #613
No functional changes, just move almost everything out of `emitter.rs`
into smaller modules with more distinct responsibilities. Also, from
`expr.rs`, move `translate_aggregation` into `aggregation.rs` and
`translate_aggregation_groupby` into `group_by.rs`
Closes #610
In sqlite3, before an arithmetic operation is performed, it first checks
whether the operation would overflow. If it would, it converts the
arguments to floats and then performs the operation, as [per code](https://github.com/sqlite/sqlite/blob/ded37f337b7b2e916657a83732aaec40eb146282/src/vdbe.c#L1875).
I have implemented the same behaviour for limbo.
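A minimal sketch of the check-then-fall-back idea described above, written
with Rust's checked integer arithmetic. The `Value` enum and the function
names are hypothetical and only illustrate the approach; they are not
taken from the limbo code.

```rust
/// Hypothetical value type: an operation yields an integer unless it overflows.
#[derive(Debug, PartialEq)]
enum Value {
    Integer(i64),
    Float(f64),
}

/// Adds two integers. If the exact sum does not fit in an i64,
/// both operands are converted to f64 and added as floats instead.
fn add_with_overflow_fallback(lhs: i64, rhs: i64) -> Value {
    match lhs.checked_add(rhs) {
        Some(sum) => Value::Integer(sum),
        None => Value::Float(lhs as f64 + rhs as f64),
    }
}

/// Same pattern for multiplication.
fn mul_with_overflow_fallback(lhs: i64, rhs: i64) -> Value {
    match lhs.checked_mul(rhs) {
        Some(product) => Value::Integer(product),
        None => Value::Float(lhs as f64 * rhs as f64),
    }
}
```

With this sketch, `add_with_overflow_fallback(1, 2)` stays an integer,
while `add_with_overflow_fallback(i64::MAX, 1)` yields a `Value::Float`.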
Closes #612
Closes #448
Adds support for:
- Automatically creating index on the PRIMARY KEY if the pk is not a
rowid alias
- Parsing the automatically created index into memory, for use in
queries
* `testing/testing_norowidalias.db` now uses the PK indexes and some
tests were failing -- looks like taking the index into use revealed some
bugs in our codegen :) I fixed those in later commits.
Does not add support for:
- Inserting to the index during writes to the table
Closes #588
We had not implemented arithmetic operations for text values. This PR
implements them and aligns the behavior with sqlite3.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #605
Changes:
---
Instead of passing around:
1. `SymbolTable` (for resolving functions dynamically provided by
extensions), and
2. `precomputed_exprs_to_registers` (for resolving registers
containing results from already-computed expressions),
add `Resolver` struct to `TranslateCtx` that encapsulates both.
This introduces some lifetime annotation spam unfortunately, but maybe
we can also migrate to using references instead of cloning elsewhere as
well, since we generally do a lot of copying of expressions right now
for convenience.
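A rough sketch of what such a `Resolver` could look like, to show where
the lifetime annotations come from. `SymbolTable`, `Expr` and all method
names here are simplified, hypothetical stand-ins, not the actual limbo
definitions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the table of extension-provided functions
/// (function name -> argument count, for illustration only).
struct SymbolTable {
    functions: HashMap<String, usize>,
}

/// Hypothetical, heavily simplified expression type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
}

/// Bundles both lookups behind one handle that borrows instead of cloning.
struct Resolver<'a> {
    symbol_table: &'a SymbolTable,
    precomputed_exprs_to_registers: Vec<(&'a Expr, usize)>,
}

impl<'a> Resolver<'a> {
    fn new(symbol_table: &'a SymbolTable) -> Self {
        Resolver { symbol_table, precomputed_exprs_to_registers: Vec::new() }
    }

    /// Looks up a function provided by an extension.
    fn resolve_function(&self, name: &str) -> Option<usize> {
        self.symbol_table.functions.get(name).copied()
    }

    /// Remembers that `expr` has already been computed into register `reg`.
    fn cache_expr(&mut self, expr: &'a Expr, reg: usize) {
        self.precomputed_exprs_to_registers.push((expr, reg));
    }

    /// Returns the register holding an already-computed expression, if any.
    fn resolve_cached_expr_reg(&self, expr: &Expr) -> Option<usize> {
        self.precomputed_exprs_to_registers
            .iter()
            .find(|entry| entry.0 == expr)
            .map(|entry| entry.1)
    }
}

/// The translation context then carries one field instead of two arguments.
struct TranslateCtx<'a> {
    resolver: Resolver<'a>,
}
```

Because the resolver only borrows the symbol table and the expressions,
every type that holds it (here `TranslateCtx`) needs the `'a` parameter
as well.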
---
Use far fewer arguments to functions - mainly just passing `program`,
`t_ctx` and `plan` around.
Closes #609
I plan to do a series of cleanups to `emitter.rs` which is a chonker
module right now. This first one is just making some variable names
more consistent and refactoring some metadata fields into function-local
variables.
Closes #604
#525
- Adds the necessary `ScalarFunc` variants to support the `changes()` &
`total_changes()` SQLite functions.
- Adds the necessary fields to the `Connection` struct to track changes.
- Modifies the `InsertAwait` opcode behaviour to update the changes
counter.
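The change tracking described above could be sketched roughly as follows.
The field and method names are assumptions made for illustration; the
actual `Connection` struct in limbo may look different.

```rust
/// Hypothetical, trimmed-down connection that only tracks change counters.
#[derive(Default)]
struct Connection {
    /// Rows changed by the most recent statement (what `changes()` reports).
    last_change: i64,
    /// Rows changed since the connection was opened (`total_changes()`).
    total_changes: i64,
}

impl Connection {
    /// Resets the per-statement counter when a new statement starts.
    fn begin_statement(&mut self) {
        self.last_change = 0;
    }

    /// Would be called by the insert opcode once a row has been written.
    fn record_row_change(&mut self) {
        self.last_change += 1;
        self.total_changes += 1;
    }

    fn changes(&self) -> i64 {
        self.last_change
    }

    fn total_changes(&self) -> i64 {
        self.total_changes
    }
}
```

In this sketch, two inserts in one statement followed by one insert in a
second statement leave `changes()` at 1 and `total_changes()` at 3.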
Closes #589
* Add SQLITE_FLAGS env variable handling to compatibility tests as
SQLite does not handle `-q` flag
* Fix inconsistent SQLITE_NOTFOUND error code and add SQLITE_CANTOPEN
code
* Improve compatibility tests to display errors instead of hanging
indefinitely
Closes #599
I was trying to get limbo to a point where we can run the [ClickBench](https://github.com/ClickHouse/ClickBench/blob/main/sqlite/benchmark.sh)
benchmark and found that the `.import` command was not supported in the
cli. This PR adds support for the `.import` command, which takes the same
parameters as in the sqlite cli.
Note that not all options of sqlite's `.import` are implemented yet in
this PR.
Reviewed-by: Preston Thorpe <preston@unlockedlabs.org>
Closes #598
Implements the `json_extract` function.
In the meantime, the json path has already been implemented by
@petersooley in https://github.com/tursodatabase/limbo/pull/555 which is
a requirement for `json_extract`.
However, this PR takes a different approach and parses the JSON path
using the JSON grammar, because there are a lot of quirks in how a JSON
`key` can look (see the JSON grammar in the Pest file).
The downside is that it allocates more memory than the current
implementation, but might be easier to maintain in the long run.
I included a lot of tests with some quirky behavior of the
`json_extract` (some of them still need some work). I also noticed that
these changed between sqlite versions (had `SQLite 3.43.2` locally and
`3.45` gave different results). Due to this, I'm not sure how much value
there is in trying to be fully compatible with SQLite. Perhaps the
approach taken by @petersooley solves 99% of use-cases?
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #524
Data does not match the predicate when using an index, e.g. `select id,
age from users where age > 90 limit 1;` returns a row with age 90.
The reason is that the index seek currently compares whole records, but
the index record is longer than the key record (because it also contains
the primary key), so the Gt comparison is invalid.
Since only single-column indexes are currently supported
(https://github.com/tursodatabase/limbo/pull/350), only the first value
of the record is now used for comparison.
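To illustrate the problem, here is a small sketch that models records as
integer slices and assumes whole records compare lexicographically, the
way Rust slices do. Both function names are hypothetical and only contrast
the two comparison strategies.

```rust
/// Buggy variant: compares the whole index record with the seek key.
/// A longer record that starts with the key counts as "greater".
fn whole_record_gt(index_record: &[i64], key: &[i64]) -> bool {
    index_record > key
}

/// Fixed variant for single-column indexes: only the first value
/// (the indexed column) takes part in the comparison.
fn first_column_gt(index_record: &[i64], key: &[i64]) -> bool {
    match (index_record.first(), key.first()) {
        (Some(indexed), Some(wanted)) => indexed > wanted,
        _ => false,
    }
}
```

For an index record `[90, 17]` (age 90, primary key 17) and the key
`[90]`, `whole_record_gt` reports a match for `age > 90`, whereas
`first_column_gt` does not.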
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #593
I will warn that this PR is quite big out of necessity, since subqueries
are, as the name implies, queries within queries, so everything that
works with a regular query should also work with a subquery, roughly
speaking.
---
- Adds support for:
* uncorrelated subqueries in FROM clause (i.e. appear as a "table",
and do not refer to outer tables). Example of this at the end of the PR
description.
* column and subquery aliasing (`select sub.renamed from (select
name as renamed from products) sub`)
* inner and outer filtering of subqueries (`select sub.name from
(select name from products where name = 'joe') sub`, and, `select
sub.name from (select name from products) sub where sub.name = 'joe'`)
* joining between regular tables and subqueries
* joining between multiple subqueries
* in general working with subqueries should roughly equal working
with regular tables
- Main idea: subqueries are just wrappers of a `SelectPlan` that never
emit ResultRows, instead they `Yield` control back to the parent query,
and the parent query can copy the subquery result values into a
ResultRow. New variant `SourceOperator::Subquery` that wraps a subquery
`SelectPlan`.
- Plans can now not only refer to btree tables (`select p.name from
products`) but also subqueries (`select sub.foo from (select name as foo
from products) sub`). Hence this PR also adds support for column aliases
which didn't exist before.
* An `Expr::Column` that refers to a regular table will result in an
`Insn::Column` (i.e. a read from disk/memory) whereas an `Expr::Column`
that refers to a subquery will result in an `Insn::Copy` (from register
to register) instead
- Subquery handling is entirely unoptimized, there's no predicate
pushdown from outer query to subqueries, or elimination of redundant
subqueries (e.g. in the trivial example `SELECT * FROM (SELECT * FROM
users) sub` the subquery can just be entirely removed)
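The difference between reading a column from a regular table and from a
subquery can be sketched as below. The `Insn` and `ColumnSource` types are
simplified, hypothetical versions made up for this illustration; only the
`Column` versus `Copy` distinction comes from the description above.

```rust
/// Simplified instruction set with only the two relevant variants.
#[derive(Debug, PartialEq)]
enum Insn {
    /// Read a column through a table cursor (disk/memory access).
    Column { cursor_id: usize, column: usize, dest: usize },
    /// Copy a value that the subquery already placed in a register.
    Copy { src_reg: usize, dst_reg: usize },
}

/// Where a column reference points to (hypothetical type).
enum ColumnSource {
    BTreeTable { cursor_id: usize },
    /// The subquery writes its result columns into consecutive registers.
    Subquery { result_start_reg: usize },
}

/// Picks the instruction for a column reference based on its source.
fn translate_column(source: &ColumnSource, column: usize, dest: usize) -> Insn {
    match source {
        ColumnSource::BTreeTable { cursor_id } => Insn::Column {
            cursor_id: *cursor_id,
            column,
            dest,
        },
        ColumnSource::Subquery { result_start_reg } => Insn::Copy {
            src_reg: *result_start_reg + column,
            dst_reg: dest,
        },
    }
}
```

Under these assumptions, a subquery whose results start at register 2
resolves its second column to a copy from register 3, with no cursor read
involved.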
---
This PR does not add support (yet) for:
- subqueries in result columns: `SELECT t.foo, (SELECT .......) as
column_from_subquery FROM t`
- subqueries in WHERE clauses e.g. `SELECT * FROM t1 WHERE t1.foo IN
(SELECT ...)`
- subquery-related optimizations, of which there are plenty available.
No analysis is done regarding e.g. whether predicates on the outer query
level could be pushed into the subquery, or whether the subquery could
be entirely eliminated. Both of the above can probably be done fairly
easily for a bunch of trivial cases.
---
Example bytecode with comments added:
```
limbo> EXPLAIN SELECT p.name, sub.funny_name FROM products p JOIN (
select id, concat(name, '-lol') as funny_name from products
) sub USING (id) LIMIT 3;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     31    0                    0   Start at 31
// Coroutine implementation starts at insn 2, jump immediately to 14
1     InitCoroutine      1     14    2                    0
2     OpenReadAsync      0     3     0                    0   table=products, root=3
3     OpenReadAwait      0     0     0                    0
4     RewindAsync        0     0     0                    0
5     RewindAwait        0     13    0                    0   Rewind table products
6     RowId              0     2     0                    0   r[2]=products.rowid
7     Column             0     1     4                    0   r[4]=products.name
8     String8            0     5     0     -lol           0   r[5]='-lol'
9     Function           0     4     3     concat         0   r[3]=func(r[4..5])
// jump back to main loop of query (insn 20)
10    Yield              1     0     0                    0
11    NextAsync          0     0     0                    0
12    NextAwait          0     6     0                    0
13    EndCoroutine       1     0     0                    0
14    OpenReadAsync      1     3     0                    0   table=p, root=3
15    OpenReadAwait      0     0     0                    0
16    RewindAsync        1     0     0                    0
17    RewindAwait        1     30    0                    0   Rewind table p
// Since this subquery is the inner loop of the join, reinitialize it on every iteration of the outer loop
18    InitCoroutine      1     0     2                    0
// Jump back to the subquery implementation to assign another row into registers
19    Yield              1     28    0                    0
20    RowId              1     8     0                    0   r[8]=p.rowid
// Copy sub.id
21    Copy               2     9     0                    0   r[9]=r[2]
// p.id == sub.id?
22    Ne                 8     9     27                   0   if r[8]!=r[9] goto 27
23    Column             1     1     6                    0   r[6]=p.name
// copy sub.funny_name
24    Copy               3     7     0                    0   r[7]=r[3]
25    ResultRow          6     2     0                    0   output=r[6..7]
26    DecrJumpZero       10    30    0                    0   if (--r[10]==0) goto 30
27    Goto               0     19    0                    0
28    NextAsync          1     0     0                    0
29    NextAwait          1     18    0                    0
30    Halt               0     0     0                    0
31    Transaction        0     0     0                    0
32    Integer            3     10    0                    0   r[10]=3
33    Goto               0     1     0                    0
```
Closes #566