Commit Graph

115 Commits

Author SHA1 Message Date
Nikita Sivukhin
c9c5ef4e25 remove query_mode from ProgramBuilderOpts and from function arguments
- mode never changes, and ProgramBuilder is already created with the proper mode set
2025-07-02 13:24:12 +04:00
Pekka Enberg
725c3e4ddc Rename limbo_sqlite3_parser crate to turso_sqlite3_parser 2025-06-29 12:34:46 +03:00
meteorgan
51764d882e fix comments 2025-06-27 11:50:19 +08:00
meteorgan
d4789d0a05 add tests 2025-06-27 11:50:19 +08:00
Pekka Enberg
2fc5c0ce5c Switch to runtime flag for enabling indexes
Makes it easier to test the feature:

```
$ cargo run -- --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```
2025-06-26 10:07:28 +03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00
pedrocarlo
e53a290a48 move ephemeral table logic to update plan and reuse select logic for ephemeral index 2025-06-20 16:30:21 -03:00
Pere Diaz Bou
dde93e8deb disable distinct without index_experimental
distinct uses indexes, therefore we need to disable it when indexes are not enabled
2025-06-17 19:33:23 +02:00
meteorgan
fd09675d8c clean up 2025-06-13 10:39:36 +03:00
meteorgan
6179d8de23 refactor compound select 2025-06-13 10:39:32 +03:00
Levy A.
de2ac89ad2 feat: complete ALTER TABLE implementation 2025-06-11 14:17:36 -03:00
Jussi Saurio
cc405dea7e Use new TableReferences struct everywhere 2025-05-29 11:44:56 +03:00
Jussi Saurio
d2a287f67f Add Schema reference to Resolver - needed for adhoc subquery planning 2025-05-27 19:12:47 +03:00
pedrocarlo
90e3c8483d tests with compound select 2025-05-25 19:15:28 -03:00
pedrocarlo
72c1f2f582 fix rebase issues and make code compile by cloning query type. Adjust the compound select behavior with insert 2025-05-25 19:13:40 -03:00
pedrocarlo
bb7da39c72 remove assumption that translate_select is always called from a top-level context + adjust insert to use translate_select when needed 2025-05-25 19:12:30 -03:00
pedrocarlo
15ffdd3e51 modify translate_select to return number of result columns 2025-05-25 19:02:17 -03:00
Jussi Saurio
07fa3a9668 Rename SelectQueryType to QueryDestination 2025-05-25 21:23:04 +03:00
Jussi Saurio
d893a55c55 UNION 2025-05-25 21:23:04 +03:00
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that this kind of query
transformation can happen with less pain.
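
As a rough illustration of the idea, the sketch below uses simplified
stand-in types. The field layout, `ColumnRef`, `resolve`, and
`fold_subquery_tables` are hypothetical and not taken from the codebase.
It shows a column reference that stays valid after a subquery's table
list is folded into its parent's:

```rust
// Hypothetical, simplified stand-ins for the planner types named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableInternalId(pub usize);

#[derive(Debug, Clone)]
pub struct TableReference {
    pub internal_id: TableInternalId,
    pub name: String,
}

// Stand-in for `Expr::Column`: it stores a stable id, not a list position.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColumnRef {
    pub table: TableInternalId,
    pub column: usize,
}

/// Resolve a column's table by searching for its stable id instead of
/// indexing into the list by position.
pub fn resolve<'a>(
    refs: &'a [TableReference],
    col: &ColumnRef,
) -> Option<&'a TableReference> {
    refs.iter().find(|t| t.internal_id == col.table)
}

/// Append the subquery's table references to the parent's list, roughly
/// what subquery unnesting would do. No expression rewriting is needed.
pub fn fold_subquery_tables(
    parent: &mut Vec<TableReference>,
    subquery: Vec<TableReference>,
) {
    parent.extend(subquery);
}
```

With `users` given id 1 and `orders` given id 2, a reference to
`o.amount` built against the subquery's list still resolves to `orders`
after folding, even though `orders` moved from position 0 to position 1.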
2025-05-25 20:26:17 +03:00
Jussi Saurio
f6443ae742 Support LIMIT with UNION ALL 2025-05-24 13:12:41 +03:00
Jussi Saurio
08bda9cc58 UNION ALL 2025-05-24 13:12:41 +03:00
meteorgan
3bf0ce7fb3 Add some comments for values statement 2025-05-23 22:11:34 +08:00
meteorgan
34e05ef974 make values work in subquery 2025-05-23 00:30:04 +08:00
meteorgan
0467d7e11b Support values statement and values in select 2025-05-23 00:29:54 +08:00
Jussi Saurio
76227ec274 Rename to Distinctness + add distinctness information to SelectPlan 2025-05-22 16:51:03 +03:00
pedrocarlo
53bf5d5ef5 adjust translate functions to take a program instead of Option<ProgramBuilder> + remove any Init emission in translate functions + use epilogue in all places necessary 2025-05-21 16:41:10 -03:00
pedrocarlo
517c7c81cd refactor to include optional program builder argument 2025-05-21 12:47:51 -03:00
Jussi Saurio
14058357ad Merge 'refactor: replace Operation::Subquery with Table::FromClauseSubquery' from Jussi Saurio
Previously the Operation enum consisted of:
- Operation::Scan
- Operation::Search
- Operation::Subquery
Which was always a dumb hack because what we really are doing is an
Operation::Scan on a "virtual"/"pseudo" table (overloaded names...)
derived from a subquery appearing in the FROM clause.
Hence, refactor the relevant data structures so that the Table enum now
contains a new variant:
Table::FromClauseSubquery
And the Operation enum only consists of Scan and Search.
```
SELECT * FROM (SELECT ...) sub;

-- the subquery here was previously interpreted as Operation::Subquery on a Table::Pseudo,
-- with a lot of special handling for Operation::Subquery in different code paths
-- now it's an Operation::Scan on a Table::FromClauseSubquery
```
No functional changes (intended, at least!)

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1529
2025-05-20 14:31:42 +03:00
Jussi Saurio
3121c6cdd3 Replace Operation::Subquery with Table::FromClauseSubquery
Previously the Operation enum consisted of:

- Operation::Scan
- Operation::Search
- Operation::Subquery

Which was always a dumb hack because what we really are doing is
an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...)
derived from a subquery appearing in the FROM clause.

Hence, refactor the relevant data structures so that the Table enum
now contains a new variant:

Table::FromClauseSubquery

And the Operation enum only consists of Scan and Search.

No functional changes (intended, at least!)
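
A minimal sketch of the resulting shape, assuming heavily simplified
enums. The variant payloads, the `BTree` variant, and `describe` are
hypothetical and exist only for illustration:

```rust
// Hypothetical, simplified versions of the two enums discussed above.
#[derive(Debug, PartialEq)]
pub enum Operation {
    Scan,
    Search,
}

#[derive(Debug, PartialEq)]
pub enum Table {
    // An ordinary stored table (assumed variant for this sketch).
    BTree { name: String },
    // Rows derived from a subquery appearing in the FROM clause.
    FromClauseSubquery { alias: String },
}

/// Describe an access path. A FROM-clause subquery no longer needs its
/// own operation; it is just a table that gets scanned.
pub fn describe(op: &Operation, table: &Table) -> String {
    match (op, table) {
        (Operation::Scan, Table::BTree { name }) => {
            format!("scan table {}", name)
        }
        (Operation::Search, Table::BTree { name }) => {
            format!("search table {}", name)
        }
        (Operation::Scan, Table::FromClauseSubquery { alias }) => {
            format!("scan rows produced by subquery {}", alias)
        }
        (Operation::Search, Table::FromClauseSubquery { alias }) => {
            format!("search rows produced by subquery {}", alias)
        }
    }
}
```

Keeping "what is being read" in `Table` and "how it is read" in
`Operation` means code that matches on `Operation` has one fewer case
to handle specially.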
2025-05-20 12:56:30 +03:00
Jussi Saurio
368c45e025 Add distinctness information to Aggregate struct 2025-05-17 15:33:55 +03:00
pedrocarlo
9f726dbe62 simplify simple count detection 2025-05-10 22:36:43 -03:00
pedrocarlo
977d09fd36 small fixes 2025-05-10 22:23:01 -03:00
pedrocarlo
e9b1631d3c fix is_simple_count detection 2025-05-10 22:23:01 -03:00
pedrocarlo
655ceeca45 correct count implementation 2025-05-10 22:23:01 -03:00
Jussi Saurio
37097e01ae GROUP BY: refactor logic to support cases where no sorting is needed 2025-05-08 12:39:26 +03:00
Jussi Saurio
330fedbc2f Add notion of join ordering to plan + make determining where to eval expr dynamic always 2025-05-03 15:32:06 +03:00
Jussi Saurio
3798b4aa8b use SortOrder in sorters always 2025-04-24 10:34:06 +03:00
Jussi Saurio
5a1cfb7d15 Add ColumnUsedMask struct to TableReference to track columns referenced in query 2025-04-15 15:13:31 +03:00
Jussi Saurio
457bded14d optimizer: refactor optimizer to support multicolumn index scans 2025-04-10 15:53:02 +03:00
PThorpe92
13e084351d Change parse_limit function to accept reference value to ast::Limit 2025-04-04 12:38:18 -04:00
Jussi Saurio
eca196a54b Support numeric column references in GROUP BY 2025-02-14 16:57:45 +02:00
Pekka Enberg
ac54c35f92 Switch to workspace dependencies
...makes it easier to specify a version, which is needed for `cargo publish`.
2025-02-12 17:28:04 +02:00
Pekka Enberg
5205c23eed Merge 'Initial support for WITH clauses (common table expressions)' from Jussi Saurio
Adds initial limited support for CTEs.
- No MATERIALIZED
- No RECURSIVE
- No named CTE columns
- Only SELECT statements supported inside CTE
Basically this kind of WITH clause can just be rewritten as a subquery,
so this PR adds some plumbing to rewrite them using the existing
subquery machinery.
It also introduces the concept of a `Scope` where a child query can
refer to its parent, useful for CTEs like:
```
do_execsql_test nested-subquery-cte {
    with nested_sub as (
        select concat(name, '!!!') as loud_hat
        from products where name = 'hat'
    ),
    sub as (
        select upper(nested_sub.loud_hat) as loudest_hat from nested_sub
    )
    select sub.loudest_hat from sub;
} {HAT!!!}
```
I think we need to expand the use of `Scope` to all of our identifier
resolutions (currently we don't explicitly have logic for determining
what a given query can see), but I didn't want to bloat the PR too much.
Hence, this implementation is probably full of all sorts of bugs, but
I've added equivalent tests for ALL the existing subquery tests,
rewritten in CTE form.
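
One way to picture the parent lookup is the sketch below. It is a
hypothetical model, not the PR's actual `Scope` type: the fields, the
string-valued definitions, and the method names are all assumptions.

```rust
use std::collections::HashMap;

// Hypothetical model of a name-resolution scope with an optional parent.
pub struct Scope<'a> {
    // Maps a visible table or CTE name to a description of what it is.
    tables: HashMap<String, String>,
    parent: Option<&'a Scope<'a>>,
}

impl<'a> Scope<'a> {
    /// Create an empty scope, optionally nested inside `parent`.
    pub fn new(parent: Option<&'a Scope<'a>>) -> Self {
        Scope { tables: HashMap::new(), parent }
    }

    /// Make `name` visible in this scope.
    pub fn define(&mut self, name: &str, definition: &str) {
        self.tables.insert(name.to_string(), definition.to_string());
    }

    /// Look up `name` here first, then walk outward through the parents.
    pub fn lookup(&self, name: &str) -> Option<&str> {
        match self.tables.get(name) {
            Some(definition) => Some(definition.as_str()),
            None => self.parent.and_then(|p| p.lookup(name)),
        }
    }
}
```

In the test above, the query body of `sub` would get a child scope whose
parent defines `nested_sub`, so `nested_sub.loud_hat` resolves, while the
parent scope cannot see names defined only in the child.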

Closes #920
2025-02-10 12:15:07 +02:00
Jussi Saurio
40a8dc14cd sqlite3-parser: separate boxed SelectInner struct 2025-02-09 12:54:30 +02:00
Jussi Saurio
781aa3b5d6 sqlite3-parser: box the having clause in GroupBy 2025-02-08 18:10:26 +02:00
Jussi Saurio
9e70e8fe02 Add basic CTE support 2025-02-08 14:50:05 +02:00
PThorpe92
d4c06545e1 Refactor vtable impl and remove Rc Refcell from module 2025-02-06 09:15:39 -05:00
Pekka Enberg
6ea7fa06d2 Merge 'prepare perf: make ProgramBuilder aware of plan to count/estimate required memory' from Jussi Saurio
Use knowledge of query plan to inform how much memory to initially
allocate for `ProgramBuilder` vectors
Some of them are exact, some are semi-random estimates
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [756.93 ns 758.11 ns 759.59 ns]
                        change: [-4.5974% -4.3153% -4.0393%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.4739 µs 1.4769 µs 1.4800 µs]
                        change: [-7.9364% -7.7171% -7.4979%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...`
                        time:   [3.7440 µs 3.7520 µs 3.7596 µs]
                        change: [-5.4627% -5.1578% -4.8445%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
```
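
The general technique can be sketched as follows. Every name here
(`PlanSummary`, `CapacityHints`, `estimate`, `Builder`) and the
estimation formula are hypothetical; the only real API used is the
standard library's `Vec::with_capacity`.

```rust
// Hypothetical summary of what is known about a plan before emitting code.
pub struct PlanSummary {
    pub table_count: usize,
    pub result_columns: usize,
}

// Hypothetical sizing hints derived from the plan.
pub struct CapacityHints {
    pub cursors: usize,
    pub instructions: usize,
}

/// Derive sizing hints from the plan. In this sketch the cursor count is
/// exact (one per table) and the instruction count is a rough guess.
pub fn estimate(plan: &PlanSummary) -> CapacityHints {
    CapacityHints {
        cursors: plan.table_count,
        instructions: 8 + 4 * plan.table_count + plan.result_columns,
    }
}

// Hypothetical builder holding the vectors that grow during translation.
pub struct Builder {
    pub cursors: Vec<String>,
    pub instructions: Vec<String>,
}

impl Builder {
    /// Preallocate the vectors so that pushes up to the hinted sizes do
    /// not trigger a reallocation.
    pub fn with_hints(hints: &CapacityHints) -> Self {
        Builder {
            cursors: Vec::with_capacity(hints.cursors),
            instructions: Vec::with_capacity(hints.instructions),
        }
    }
}
```

An underestimate only costs a later reallocation and an overestimate
only costs some unused memory, which is why rough guesses are acceptable
for the inexact fields.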

Closes #899
2025-02-05 18:24:16 +02:00
Jussi Saurio
795576b2ec don't eagerly allocate result column name strings 2025-02-05 17:53:23 +02:00