Commit Graph

24 Commits

Author SHA1 Message Date
Pekka Enberg
725c3e4ddc Rename limbo_sqlite3_parser crate to turso_sqlite3_parser 2025-06-29 12:34:46 +03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00
Levy A.
5d60d82499 fix: add default 2025-06-11 14:19:06 -03:00
Levy A.
7638b0dab7 fix: use default value on empty columns added via ALTER TABLE 2025-06-11 14:18:19 -03:00
Jussi Saurio
547ca6cf2a Fix incorrect usage of indexes with non-contiguous columns
Due to the left-prefix rule of indexes, for an index key to be usable,
it needs to:

- Use the columns in contiguous order (0, 1, 2...)
  * e.g. if WHERE refers to cols 0 and 2, only col 0 can be used
- Stop at the first range operator
  * e.g. if WHERE: col1 = 5 AND col2 > 5 AND col3 = 5, only col1 and col2
    can be used.

This wasn't properly tested, and resulted in simulator failures. Added
some regression tests for this behavior.
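
The two rules above can be sketched as a small prefix computation. This is
only an illustration under stated assumptions: `Op`, `Constraint` and
`usable_prefix_len` are hypothetical names and are not taken from the actual
optimizer code.

```rust
/// Kind of comparison a WHERE term applies to an index column (hypothetical).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Op {
    Eq,
    Range,
}

/// A WHERE constraint on the index column at position `index_col_pos` (hypothetical).
#[derive(Clone, Copy, Debug)]
struct Constraint {
    index_col_pos: usize,
    op: Op,
}

/// Returns how many leading index columns can take part in the seek key.
fn usable_prefix_len(constraints: &[Constraint]) -> usize {
    let mut len = 0;
    loop {
        // Look only at constraints on the next contiguous index column.
        let has = |op: Op| {
            constraints
                .iter()
                .any(|c| c.index_col_pos == len && c.op == op)
        };
        if has(Op::Eq) {
            // Equality keeps the prefix open for the next column.
            len += 1;
        } else if has(Op::Range) {
            // A range operator is usable, but nothing after it is.
            return len + 1;
        } else {
            // Gap in the column sequence: stop here.
            return len;
        }
    }
}
```

With equality constraints on columns 0 and 2 this returns 1, and with
`=`, `>`, `=` on the first three columns it returns 2, matching the two
examples in the list.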
2025-06-10 15:21:26 +03:00
Jussi Saurio
211b511189 Fix join optimizer tests 2025-05-29 11:44:56 +03:00
Jussi Saurio
cc405dea7e Use new TableReferences struct everywhere 2025-05-29 11:44:56 +03:00
Jussi Saurio
73e806ad84 Make WhereTerm::consumed a Cell<bool>
Currently in the main translation logic after planning and optimization,
we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except
for the fact that virtual table constraint resolution is done ad-hoc in
`init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed`
which is a boolean indicating that the term has been "used up" by the optimizer
and shouldn't be evaluated as a normal where clause condition anymore.

In the upcoming branch for WHERE clause subqueries, I want to store immutable
references to WHERE clause expressions in `Resolver`, but this is unfortunately
not possible if we still use the aforementioned mutable references.

Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows
us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed
once the virtual table constraint resolution is moved to an earlier part of the
query processing pipeline.
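
A minimal sketch of the idea, assuming a heavily simplified `WhereTerm`
(the `expr: String` field and both helper functions are stand-ins invented
for illustration, not the real definitions):

```rust
use std::cell::Cell;

/// Simplified stand-in for a WHERE clause term.
struct WhereTerm {
    /// Placeholder for the real expression AST.
    expr: String,
    /// Interior mutability: can be flipped through a shared reference.
    consumed: Cell<bool>,
}

/// Marks a term as consumed while only holding `&[WhereTerm]`.
fn mark_consumed(terms: &[WhereTerm], idx: usize) {
    terms[idx].consumed.set(true);
}

/// Terms that still need to be evaluated as ordinary conditions.
fn remaining_terms(terms: &[WhereTerm]) -> Vec<&str> {
    terms
        .iter()
        .filter(|t| !t.consumed.get())
        .map(|t| t.expr.as_str())
        .collect()
}
```

Because `mark_consumed` takes a shared slice, another component can keep
an immutable reference to a term's expression alive across the call, which
a `&mut Vec<WhereTerm>` parameter would not allow.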
2025-05-28 11:02:39 +03:00
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that this kind of query
transformation can happen with less pain.
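
A rough sketch of why a stable identifier survives the fold. The struct
layouts, `ColumnRef`, `fold_into_parent` and `resolve` are simplified
assumptions for illustration, not the definitions introduced by the PR:

```rust
/// Stable identifier, independent of the table's position in any list.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct TableInternalId(usize);

/// Simplified stand-in for a table reference in a plan.
struct TableReference {
    internal_id: TableInternalId,
    name: String,
}

/// Simplified stand-in for `Expr::Column`.
struct ColumnRef {
    table: TableInternalId,
    column: usize,
}

/// Folds a subquery's table references into the parent's list.
fn fold_into_parent(parent: &mut Vec<TableReference>, subquery: Vec<TableReference>) {
    parent.extend(subquery);
}

/// Looks a column's table up by id rather than by list position.
fn resolve<'a>(tables: &'a [TableReference], col: &ColumnRef) -> Option<&'a TableReference> {
    tables.iter().find(|t| t.internal_id == col.table)
}
```

In this sketch a `ColumnRef` created against the subquery's list still
resolves to `orders` after `fold_into_parent`, even though the position of
`orders` changed from 0 to 1.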
2025-05-25 20:26:17 +03:00
Jussi Saurio
a7b33b1509 schema: add Index::has_rowid 2025-05-20 14:22:17 +03:00
Jussi Saurio
9c710b5292 Add collation column to Index struct 2025-05-20 12:52:54 +03:00
pedrocarlo
0df6c87f07 Fixed Group By collation 2025-05-19 15:22:14 -03:00
Jussi Saurio
d584a1879b Mark WHERE terms as consumed instead of deleting them
We've run into trouble in multiple places due to the fact that
we delete terms from the where clause (e.g. when a constant condition
is removed, or the term becomes part of an index seek key).

A simpler solution is to add a flag indicating that the term is
consumed (used), so that it is not translated in the main loop
anymore when WHERE clause terms are evaluated.
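
As an illustration of one kind of trouble that deletion can cause (an
assumption, since the message does not list the specific failures): any
positional index into the WHERE clause becomes stale once a term is removed,
whereas a flag leaves positions intact. The types below, including the
`Constraint` with its `where_term_idx` field, are hypothetical.

```rust
/// Simplified stand-in for a WHERE clause term.
struct WhereTerm {
    expr: &'static str,
    consumed: bool,
}

/// Hypothetical constraint that remembers its source term by position.
struct Constraint {
    where_term_idx: usize,
}

/// Marks the constraint's source term as used instead of deleting it.
fn consume(terms: &mut [WhereTerm], c: &Constraint) {
    terms[c.where_term_idx].consumed = true;
}

/// Terms the main loop still has to translate as ordinary conditions.
fn terms_to_evaluate(terms: &[WhereTerm]) -> Vec<&'static str> {
    terms
        .iter()
        .filter(|t| !t.consumed)
        .map(|t| t.expr)
        .collect()
}
```

Since no element is ever removed, a `Constraint` created before other
terms are consumed still points at the same term afterwards.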
2025-05-17 15:44:12 +03:00
pedrocarlo
4dc1431428 handling edge case when passing a duplicate to a multi-column unique index 2025-05-14 11:46:24 -03:00
Jussi Saurio
176d9bd3c7 Prune bad plans earlier to avoid allocating useless JoinN structs 2025-05-14 09:42:26 +03:00
Jussi Saurio
eb983c88c6 reserve capacity for memo hashmap entries 2025-05-14 09:42:26 +03:00
Jussi Saurio
5e5788bdfe Reduce allocations 2025-05-14 09:42:26 +03:00
Jussi Saurio
d2fa91e984 avoid growing vec 2025-05-14 09:42:26 +03:00
Jussi Saurio
1d465e6d94 Remove unnecessary method 2025-05-14 09:42:26 +03:00
Jussi Saurio
9d50446ffb AccessMethod: simplify - get rid of AccessMethodKind as it can be derived 2025-05-14 09:42:26 +03:00
Jussi Saurio
a90358f669 TableMask: comments 2025-05-14 09:42:26 +03:00
Jussi Saurio
3442e4981d remove some unnecessary parameters 2025-05-14 09:42:26 +03:00
Jussi Saurio
c782616180 Refactor constraints so that WHERE clause is not needed in join reordering phase 2025-05-14 09:42:26 +03:00
Jussi Saurio
bd875e3876 optimizer module split 2025-05-14 09:42:26 +03:00