mirror of https://github.com/aljazceru/turso.git synced 2026-02-09 02:04:22 +01:00

Go to file

Pekka Enberg 90d01f468f Merge 'Support uncorrelated FROM clause subqueries' from Jussi Saurio

I will warn that this PR is quite big out of necessity, since subqueries
are, as the name implies, queries within queries, so everything that
works with a regular query should also work with a subquery, roughly
speaking.
---
- Adds support for:
    * uncorrelated subqueries in FROM clause (i.e. appear as a "table",
and do not refer to outer tables). Example of this at the end of the PR
description.
    * column and subquery aliasing (`select sub.renamed from (select
name as renamed from products) sub`)
    * inner and outer filtering of subqueries (`select sub.name from
(select name from products where name = 'joe') sub`, and,  `select
sub.name from (select name from products) sub where sub.name = 'joe'`)
    * joining between regular tables and subqueries
    * joining between multiple subqueries
    * in general working with subqueries should roughly equal working
with regular tables
- Main idea: subqueries are just wrappers of a `SelectPlan` that never
emit ResultRows, instead they `Yield` control back to the parent query,
and the parent query can copy the subquery result values into a
ResultRow. New variant `SourceOperator::Subquery` that wraps a subquery
`SelectPlan`.
- Plans can now not only refer to btree tables (`select p.name from
products`) but also subqueries (`select sub.foo from (select name as foo
from products) sub`. Hence this PR also adds support for column aliases
which didn't exist before.
    * An `Expr::Column` that refers to a regular table will result in an
`Insn::Column` (i.e. a read from disk/memory) whereas an `Expr::Column`
that refers to a subquery will result in an `Insn::Copy` (from register
to register) instead
- Subquery handling is entirely unoptimized, there's no predicate
pushdown from outer query to subqueries, or elimination of redundant
subqueries (e.g. in the trivial example `SELECT * FROM (SELECT * FROM
users) sub` the subquery can just be entirely removed)
---
This PR does not add support (yet) for:
- subqueries in result columns: `SELECT t.foo, (SELECT .......) as
column_from_subquery FROM t`
- subqueries in WHERE clauses e.g. `SELECT * FROM t1 WHERE t1.foo IN
(SELECT ...)`
- subquery-related optimizations, of which there are plenty available.
No analysis is done regarding e.g. whether predicates on the outer query
level could be pushed into the subquery, or whether the subquery could
be entirely eliminated. Both of the above can probably be done fairly
easily for a bunch of trivial cases.
---
Example bytecode with comments added:
```
limbo> EXPLAIN SELECT p.name, sub.funny_name FROM products p JOIN (
  select id, concat(name, '-lol') as funny_name from products
) sub USING (id) LIMIT 3;

addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     31    0                    0   Start at 31

// Coroutine implementation starts at insn 2, jump immediately to 14
1     InitCoroutine      1     14    2                    0

2     OpenReadAsync      0     3     0                    0   table=products, root=3
3     OpenReadAwait      0     0     0                    0
4     RewindAsync        0     0     0                    0
5     RewindAwait        0     13    0                    0   Rewind table products
6       RowId            0     2     0                    0   r[2]=products.rowid
7       Column           0     1     4                    0   r[4]=products.name
8       String8          0     5     0     -lol           0   r[5]='-lol'
9       Function         0     4     3     concat         0   r[3]=func(r[4..5])

// jump back to main loop of query (insn 20)
10      Yield            1     0     0                    0

11    NextAsync          0     0     0                    0
12    NextAwait          0     6     0                    0
13    EndCoroutine       1     0     0                    0
14    OpenReadAsync      1     3     0                    0   table=p, root=3
15    OpenReadAwait      0     0     0                    0
16    RewindAsync        1     0     0                    0
17    RewindAwait        1     30    0                    0   Rewind table p

// Since this subquery is the inner loop of the join, reinitialize it on every iteration of the outer loop
18      InitCoroutine    1     0     2                    0

// Jump back to the subquery implementation to assign another row into registers
19      Yield            1     28    0                    0

20      RowId            1     8     0                    0   r[8]=p.rowid

// Copy sub.id
21      Copy             2     9     0                    0   r[9]=r[2]

// p.id == sub.id?
22      Ne               8     9     27                   0   if r[8]!=r[9] goto 27
23      Column           1     1     6                    0   r[6]=p.name

// copy sub.funny_name
24      Copy             3     7     0                    0   r[7]=r[3]

25      ResultRow        6     2     0                    0   output=r[6..7]
26      DecrJumpZero     10    30    0                    0   if (--r[10]==0) goto 30
27      Goto             0     19    0                    0
28    NextAsync          1     0     0                    0
29    NextAwait          1     18    0                    0
30    Halt               0     0     0                    0
31    Transaction        0     0     0                    0
32    Integer            3     10    0                    0   r[10]=3
33    Goto               0     1     0                    0
```

Closes #566

2025-01-02 11:15:07 +02:00

.github/workflows

Fix checkout action for clippy ci

2024-12-29 10:25:49 -05:00

bindings

Limbo 0.0.11

2024-12-31 10:43:24 +02:00

cli

Add clippy CI, fix or ignore warnings where appropriate

2024-12-29 10:25:41 -05:00

core

Merge 'Support uncorrelated FROM clause subqueries' from Jussi Saurio

2025-01-02 11:15:07 +02:00

docs/internals

make functions contrib doc better

2024-12-28 12:48:04 +07:00

macros

Add clippy CI, fix or ignore warnings where appropriate

2024-12-29 10:25:41 -05:00

perf/latency

Rename RowResult to StepResult

2024-12-27 10:20:41 +02:00

scripts

scripts/merge-pr.py: Fix pull from a fork

2024-12-11 07:41:33 +02:00

simulator

fix clippy warning

2024-12-30 00:41:21 -05:00

sqlite3

Add clippy CI, fix or ignore warnings where appropriate

2024-12-29 10:25:41 -05:00

test

Rename RowResult to StepResult

2024-12-27 10:20:41 +02:00

testing

feat: subqueries in FROM clause

2024-12-31 14:18:29 +02:00

vendored/sqlite3-parser

pragma: inital approach to handle pragma statements

2024-12-18 17:26:16 +01:00

.github.json

github: Fix Pekka's email address

2024-11-15 15:55:33 +02:00

.gitignore

ignore wal files

2024-12-14 17:13:02 +02:00

Cargo.lock

Limbo 0.0.11

2024-12-31 10:43:24 +02:00

Cargo.toml

Limbo 0.0.11

2024-12-31 10:43:24 +02:00

CHANGELOG.md

Limbo 0.0.11

2024-12-31 10:43:24 +02:00

COMPAT.md

Merge 'Support like function with escape' from Vrishabh

2024-12-29 16:58:06 +02:00

CONTRIBUTING.md

make functions contrib doc better

2024-12-28 12:48:04 +07:00

flake.lock

improve nix flake by moving to fenix

2024-07-16 19:52:39 -07:00

flake.nix

update flake to include tcl latest (addresses #243 ) and add core foundation lib and link flags for darwin

2024-08-01 18:00:46 -07:00

LICENSE.md

License

2024-05-07 16:33:44 -03:00

limbo.png

Limbo mascot

2024-07-05 09:51:56 +03:00

Makefile

Shrink shell help msg and replace hardcoded path for shell tests

2024-12-16 20:14:04 -05:00

PERF.md

Update PERF.md

2024-12-20 09:22:44 +02:00

Pipfile

Updated Pipfile

2024-07-12 13:07:34 -07:00

Pipfile.lock

Added Pipfile and Pipfile.lock

2024-07-12 12:38:56 -07:00

README.md

Update README.md

2024-12-14 11:36:22 +02:00

rust-toolchain.toml

Pin toolchain to Rust version 1.83

2024-12-15 10:54:52 +02:00

README.md

Limbo

Limbo is a work-in-progress, in-process OLTP database management system, compatible with SQLite.

Features

In-process OLTP database engine library
Asynchronous I/O support on Linux with io_uring
SQLite compatibility (status)
- SQL dialect support
- File format support
- SQLite C API
JavaScript/WebAssembly bindings (wip)
Support for Linux, macOS, and Windows

Getting Started

CLI

Install limbo with:

curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/penberg/limbo/releases/latest/download/limbo-installer.sh | sh

Then use the SQL shell to create and query a database:

$ limbo database.db
Limbo v0.0.6
Enter ".help" for usage hints.
limbo> CREATE TABLE users (id INT PRIMARY KEY, username TEXT);
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> INSERT INTO users VALUES (2, 'bob');
limbo> SELECT * FROM users;
1|alice
2|bob

JavaScript (wip)

Installation:

npm i limbo-wasm

Example usage:

import { Database } from 'limbo-wasm';

const db = new Database('sqlite.db');
const stmt = db.prepare('SELECT * FROM users');
const users = stmt.all();
console.log(users);

Python (wip)

pip install pylimbo

Example usage:

import limbo

con = limbo.connect("sqlite.db")
cur = con.cursor()
res = cur.execute("SELECT * FROM users")
print(res.fetchone())

Developing

Build and run limbo cli:

cargo run --package limbo --bin limbo database.db

Run tests:

cargo test

Test coverage report:

cargo tarpaulin -o html

Run benchmarks:

cargo bench

Run benchmarks and generate flamegraphs:

echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
cargo bench --bench benchmark -- --profile-time=5

FAQ

How is Limbo different from libSQL?

Limbo is a research project to build a SQLite compatible in-process database in Rust with native async support. The libSQL project, on the other hand, is an open source, open contribution fork of SQLite, with focus on production features such as replication, backups, encryption, and so on. There is no hard dependency between the two projects. Of course, if Limbo becomes widely successful, we might consider merging with libSQL, but that is something that will be decided in the future.

Publications

Pekka Enberg, Sasu Tarkoma, Jon Crowcroft Ashwin Rao (2024). Serverless Runtime / Database Co-Design With Asynchronous I/O. In EdgeSys ‘24. [PDF]
Pekka Enberg, Sasu Tarkoma, and Ashwin Rao (2023). Towards Database and Serverless Runtime Co-Design. In CoNEXT-SW ’23. [PDF] [Slides]

Contributing

We'd love to have you contribute to Limbo! Check out the contribution guide to get started.

License

This project is licensed under the MIT license.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Limbo by you, shall be licensed as MIT, without any additional terms or conditions.

Languages

Rust 76.8%

Tcl 6.6%

C 6.4%

Dart 2.4%

Java 2.3%

Other 5.3%

README.md Unescape Escape

Limbo

Features

Getting Started

CLI

JavaScript (wip)

Python (wip)

Developing

FAQ

How is Limbo different from libSQL?

Publications

Contributing

License

Contribution

README.md