Closes #2470
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can
turn the LEFT JOIN into a plain JOIN because the NULL values produced for
unmatched rows will never be equal to 'foo'. In fact, we have this optimization
already.
However, there was a dumb bug where `WhereTerm`s involving this join still retained
their `from_outer_join` state, which forced those terms to be evaluated at the
original join index. That produces completely wrong bytecode if the join
optimizer decides to reorder the join as `s JOIN t` instead: effectively it will
evaluate `t.a=s.a` after table `s` is open but before table `t` is open.
This PR fixes that issue by clearing `from_outer_join` properly from the relevant
`WhereTerm`s.
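A minimal sketch of the idea, not the actual planner code: the `WhereTerm` struct below is a hypothetical, stripped-down stand-in with only the fields needed for illustration, and `clear_outer_join_marker` is an assumed helper name.

```rust
/// Hypothetical, simplified stand-in for the planner's `WhereTerm`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct WhereTerm {
    /// Textual form of the predicate, for illustration only.
    expr: String,
    /// `Some(idx)` pins evaluation of the term to the outer join at
    /// table index `idx`; `None` lets the optimizer place it freely.
    from_outer_join: Option<usize>,
}

/// Once the LEFT JOIN at `join_idx` has been turned into a plain JOIN,
/// release every term that was pinned to it. Terms pinned to other
/// outer joins are left untouched.
fn clear_outer_join_marker(terms: &mut [WhereTerm], join_idx: usize) {
    for term in terms.iter_mut() {
        if term.from_outer_join == Some(join_idx) {
            term.from_outer_join = None;
        }
    }
}
```

With the marker cleared, a term such as `t.a=s.a` can be scheduled at whichever loop is the first to have both tables open, regardless of the join order chosen.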
- even if the index search returns only 1 row, `next` is still called in the loop, so we can incorrectly process the same row values multiple times
- the following query failed with this optimization:
turso> CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT, c0 INT);
turso> CREATE UNIQUE INDEX idx_p1_0 ON t(c0);
turso> insert into t values (null, 'uu', -1);
turso> insert into t values (null, 'uu', -2);
turso> UPDATE t SET c0 = NULL WHERE c0 = -1;
turso> SELECT * FROM t
┌────┬────┬────┐
│ id │ k │ c0 │
├────┼────┼────┤
│ 1 │ uu │ │
├────┼────┼────┤
│ 2 │ uu │ │
└────┴────┴────┘
This solves an issue where an INSERT statement conflicts with
multiple indices. In that case, SQLite iterates the linked list
`pTab->pIndex` in order and handles the first conflict encountered.
The newest parsed index is always added to the head of the list.
To be compatible with this behavior, we also need to put the most
recently parsed index definition first in our indexes list for a given
table.
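As an illustration of why the ordering matters, here is a small sketch under simplifying assumptions: `Index`, `TableIndexes`, and `first_conflict` are hypothetical names, every index is treated as a single-column unique index, and rows are plain integer vectors.

```rust
use std::collections::VecDeque;

/// Hypothetical single-column unique index.
#[derive(Debug, Clone, PartialEq)]
struct Index {
    name: String,
    /// Position of the indexed column within a row.
    column: usize,
}

/// Hypothetical per-table index list, newest definition first.
#[derive(Default)]
struct TableIndexes {
    indexes: VecDeque<Index>,
}

impl TableIndexes {
    /// The most recently parsed index goes to the head of the list,
    /// mirroring how a new entry is linked at the head of `pTab->pIndex`.
    fn add_parsed(&mut self, index: Index) {
        self.indexes.push_front(index);
    }

    /// Walk the indexes in list order and report the first one whose key
    /// for `new_row` already exists in `existing`.
    fn first_conflict(&self, existing: &[Vec<i64>], new_row: &[i64]) -> Option<&str> {
        self.indexes
            .iter()
            .find(|idx| existing.iter().any(|row| row[idx.column] == new_row[idx.column]))
            .map(|idx| idx.name.as_str())
    }
}
```

When a new row collides on two indexes at once, the conflict reported is the one for the index that was defined last, which is the behavior this change aims to match.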
Currently, this is effectively a no-op because, at the optimization
stage, window function expressions are in the form
win_func(subquery_column1, subquery_column2, ...).
Nevertheless, expressions are rewritten to maintain consistency with
aggregates, which also hold cloned expressions from sources like result
columns. This ensures future changes in the optimizer won’t break window
function handling.
To be used in DBSP-based projections. This will compile an expression
to VDBE bytecode and execute it.
To do that we need to add a new type of Expression, which we call a
Register.
This is a way for us to pass parameters to a DBSP program that are
not columns or literals, but inputs from the DBSP deltas.
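To make the role of a register expression concrete, here is a rough sketch. It is a tree-walking evaluator rather than a compiler to VDBE bytecode, and the `Expr` enum, its variants, and `eval` are assumptions made purely for illustration.

```rust
/// Hypothetical miniature expression type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    /// Read the value that the caller placed in register `n`, for
    /// example a field taken from an incoming delta row.
    Register(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Evaluate `expr` against a register file. Returns `None` when a
/// register index is out of range or the arithmetic overflows.
fn eval(expr: &Expr, registers: &[i64]) -> Option<i64> {
    match expr {
        Expr::Literal(v) => Some(*v),
        Expr::Register(n) => registers.get(*n).copied(),
        Expr::Add(l, r) => eval(l, registers)?.checked_add(eval(r, registers)?),
        Expr::Mul(l, r) => eval(l, registers)?.checked_mul(eval(r, registers)?),
    }
}
```

The same expression can then be evaluated once per delta row simply by refilling the registers, without rebuilding the expression.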
The side of the binary expression no longer needs to be stored in
`ConstraintInfo`, since the optimizer now guarantees that it is always
on the right. As a result, only the index of the corresponding constraint
needs to be preserved.
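One way such a guarantee could be established is sketched below. This assumes that "it" refers to the constraining value operand, and all of the types (`Op`, `Operand`, `Binary`) and the `normalize` function are hypothetical, not the optimizer's real ones.

```rust
/// Hypothetical comparison operators.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Eq,
    Lt,
    Le,
    Gt,
    Ge,
}

impl Op {
    /// Operator to use after the two operands swap sides.
    fn mirrored(self) -> Op {
        match self {
            Op::Eq => Op::Eq,
            Op::Lt => Op::Gt,
            Op::Le => Op::Ge,
            Op::Gt => Op::Lt,
            Op::Ge => Op::Le,
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Column(String),
    Value(i64),
}

#[derive(Debug, Clone, PartialEq)]
struct Binary {
    lhs: Operand,
    op: Op,
    rhs: Operand,
}

/// Rewrite `value OP column` as `column OP' value`, so that later stages
/// can assume the constraining value is always on the right.
fn normalize(b: Binary) -> Binary {
    let value_on_left = matches!((&b.lhs, &b.rhs), (Operand::Value(_), Operand::Column(_)));
    if value_on_left {
        Binary { lhs: b.rhs, op: b.op.mirrored(), rhs: b.lhs }
    } else {
        b
    }
}
```

After normalization, a record describing the constraint only has to remember which constraint it refers to, not which side the value was on.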
This change connects virtual tables with the query optimizer.
The optimizer now considers virtual tables during join order search
and invokes their best_index callbacks to determine feasible access
paths.
Currently, this is not a visible change, since none of the existing
extensions return information indicating that a plan is invalid.
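A hedged sketch of how an "invalid plan" answer could feed into the search: `BestIndexOutcome` and `cheapest_feasible` are hypothetical names, and the cost model is reduced to a single number per candidate.

```rust
/// Hypothetical result of asking a virtual table about one access path.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BestIndexOutcome {
    Usable { estimated_cost: f64 },
    /// The table cannot be accessed with the offered constraints.
    Invalid,
}

/// Pick the cheapest candidate that was not rejected. Returns `None`
/// when every candidate is invalid.
fn cheapest_feasible<'a>(candidates: &[(&'a str, BestIndexOutcome)]) -> Option<&'a str> {
    let mut best: Option<(&'a str, f64)> = None;
    for (label, outcome) in candidates {
        if let BestIndexOutcome::Usable { estimated_cost } = outcome {
            let cheaper = match best {
                Some((_, cost)) => *estimated_cost < cost,
                None => true,
            };
            if cheaper {
                best = Some((*label, *estimated_cost));
            }
        }
    }
    best.map(|(label, _)| label)
}
```

As long as no extension reports an invalid plan, this filter never discards anything, which is consistent with the change not being visible yet.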
Different scan parameters are required for different table types.
Currently, index and iteration direction are only used by B-tree tables,
while the remaining table types don’t require any parameters. Planning
access to virtual tables, however, will require passing additional
information from the planner, such as the virtual table index (distinct
from a B-tree index) and the constraints that must be forwarded to the
`filter` method.
Previously, AccessMethod stored fields like `iter_dir`, `index`, and
`constraint_refs` directly, but these only applied to BTree tables.
Other table types (virtual tables, subqueries) either ignored these
fields or required different parameters entirely.
This change prepares the planner to handle virtual table access methods
with their own specialized parameters.
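The shape of the refactoring might look roughly like the following. The enum name `AccessMethodParams`, the `idx_num` field, and the `describe` helper are assumptions for illustration; only `iter_dir`, `index`, and `constraint_refs` come from the description above, and their types here are simplified.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum IterationDirection {
    Forwards,
    Backwards,
}

/// Hypothetical per-table-type scan parameters.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AccessMethodParams {
    /// Only B-tree tables carry a direction, an optional index, and
    /// references to the constraints used for seeking.
    BTreeTable {
        iter_dir: IterationDirection,
        index: Option<String>,
        constraint_refs: Vec<usize>,
    },
    /// Virtual tables carry their own index number plus the constraints
    /// to forward to `filter`.
    VirtualTable { idx_num: i32, constraints: Vec<usize> },
    /// Subqueries need no parameters at all.
    Subquery,
}

fn describe(params: &AccessMethodParams) -> String {
    match params {
        AccessMethodParams::BTreeTable { iter_dir, index, constraint_refs } => {
            let index_name = index.as_deref().unwrap_or("rowid");
            format!("btree via {} ({:?}, {} constraints)", index_name, iter_dir, constraint_refs.len())
        }
        AccessMethodParams::VirtualTable { idx_num, constraints } => {
            format!("vtab idx_num={} ({} constraints)", idx_num, constraints.len())
        }
        AccessMethodParams::Subquery => "subquery scan".to_string(),
    }
}
```

Modelling the parameters as an enum means a field that makes no sense for a given table type simply cannot be set for it.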
This commit replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source: either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).
In the process, the separate `Id` wrapper type has been coalesced into the
`Name` enum, simplifying the AST and reducing duplication in identifier
handling logic.
While this increases the size of some AST nodes (notably `yyStackEntry`),
it improves correctness and makes source structure more explicit for
later phases.
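A simplified sketch of what such an enum could look like. The variant names follow the description above, but `from_source` and `unquoted` are hypothetical helpers, and escaped quotes inside a quoted name are deliberately not handled.

```rust
/// How a name appeared in the SQL source.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Name {
    /// Unquoted identifier, e.g. `users`.
    Ident(String),
    /// Quoted form, stored together with its surrounding quote characters.
    Quoted(String),
}

impl Name {
    /// Classify raw source text by looking at its delimiters.
    fn from_source(text: &str) -> Name {
        let b = text.as_bytes();
        let quoted = b.len() >= 2
            && b[0] == b[b.len() - 1]
            && matches!(b[0], b'"' | b'`' | b'\'');
        if quoted {
            Name::Quoted(text.to_string())
        } else {
            Name::Ident(text.to_string())
        }
    }

    /// The name without its surrounding quotes, if it has any.
    fn unquoted(&self) -> &str {
        match self {
            Name::Ident(s) => s,
            // `get` avoids a panic if the stored text is unexpectedly short.
            Name::Quoted(s) => s.get(1..s.len().saturating_sub(1)).unwrap_or(""),
        }
    }
}
```

Keeping the quoting information in the type lets later phases decide, for example, whether a double-quoted token may fall back to a string literal.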
I ended up hitting #1974 today and wanted to fix it. I worked with
Claude to generate a more comprehensive set of queries that could fail
aside from just the insert query described in the issue. He got most of
them right - lots of cases were indeed failing. The ones that were
gibberish, he told me I was absolutely right for pointing out they were
bad.
But alas. With the test cases generated, we can work on fixing it. At the
place where the assertion was hit, all we need to do is return true (but we
assert that this is indeed a string literal; it shouldn't be anything else
at this point).
There are then just a couple of places where we need to make sure we
handle double quotes correctly. We already tested for single quotes in a
couple of places, but never for double quotes.
There is one funny corner case where you can just `select "col" from tbl`,
and if there is no column "col" on the table, that is treated as a
string literal. We handle that too.
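The fallback rule for that corner case can be sketched as follows. `Resolved` and `resolve_double_quoted` are hypothetical names, and `token` is assumed to be the text between the double quotes.

```rust
/// Hypothetical outcome of resolving a double-quoted token.
#[derive(Debug, PartialEq)]
enum Resolved {
    /// The token names the column at this position.
    Column(usize),
    /// No such column exists, so the token is read as a string literal.
    StringLiteral(String),
}

/// Prefer a column match (column names compare case-insensitively);
/// otherwise fall back to a string literal.
fn resolve_double_quoted(token: &str, columns: &[&str]) -> Resolved {
    match columns.iter().position(|c| c.eq_ignore_ascii_case(token)) {
        Some(pos) => Resolved::Column(pos),
        None => Resolved::StringLiteral(token.to_string()),
    }
}
```

So the same text `"col"` yields a column reference on one table and the string `col` on another, depending only on the schema.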
Fixes #1974
Previously, the test queries added in this commit would fail with:
thread 'main' panicked at core/schema.rs:129:34:
not implemented
stack backtrace:
0: rust_begin_unwind
at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/std/src/panicking.rs:665:5
1: core::panicking::panic_fmt
at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:74:14
2: core::panicking::panic
at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:148:5
3: limbo_core::schema::Table::get_root_page
at ./core/schema.rs:129:34
4: limbo_core::translate::main_loop::init_loop
at ./core/translate/main_loop.rs:260:44
5: limbo_core::translate::emitter::emit_query
at ./core/translate/emitter.rs:568:5
6: limbo_core::translate::emitter::emit_program_for_select
at ./core/translate/emitter.rs:496:5
7: limbo_core::translate::emitter::emit_program
at ./core/translate/emitter.rs:187:31
8: limbo_core::translate::select::translate_select
at ./core/translate/select.rs:82:5
9: limbo_core::translate::translate_inner
at ./core/translate/mod.rs:241:13
10: limbo_core::translate::translate
at ./core/translate/mod.rs:95:17
11: limbo_core::Connection::run_cmd
at ./core/lib.rs:416:31
12: <limbo_core::QueryRunner as core::iter::traits::iterator::Iterator>::next
at ./core/lib.rs:916:22
13: limbo::app::Limbo::run_query
at ./cli/app.rs:442:27
14: limbo::app::Limbo::handle_input_line
at ./cli/app.rs:544:13
15: limbo::main
at ./cli/main.rs:51:31
16: core::ops::function::FnOnce::call_once
Closes #1888. This PR fixes UPDATE translation by not emitting an
ephemeral plan when we are doing a `RowIdEq` search. Also, we should
delete the previous rowid when the rowid is in the set clause.
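To illustrate the second point, here is a toy model of a rowid-keyed table. It is only a sketch: a `BTreeMap` stands in for the table B-tree, and `update_rowid` is a hypothetical helper, not the emitted bytecode.

```rust
use std::collections::BTreeMap;

/// Move a row from `old_rowid` to `new_rowid`. The entry under the old
/// rowid must be deleted, otherwise the row would exist under both keys.
fn update_rowid(
    table: &mut BTreeMap<i64, String>,
    old_rowid: i64,
    new_rowid: i64,
) -> Result<(), String> {
    if !table.contains_key(&old_rowid) {
        return Err(format!("no row with rowid {}", old_rowid));
    }
    if old_rowid == new_rowid {
        return Ok(());
    }
    if table.contains_key(&new_rowid) {
        return Err(format!("rowid {} already exists", new_rowid));
    }
    // Delete the previous rowid, then insert the row under the new one.
    let row = table.remove(&old_rowid).expect("presence checked above");
    table.insert(new_rowid, row);
    Ok(())
}
```

Skipping the delete step would leave a stale copy of the row behind, which is the kind of result this fix is meant to prevent.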
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #1891
Makes it easier to test the feature:
```
$ cargo run -- --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```