I found a bug in queries using a like function in the where clause, ex:
`SELECT first_name, last_name FROM users WHERE like('Jas%', first_name)
= 1`. That panicked with the message:
```
thread 'main' panicked at core/vdbe/mod.rs:1226:33:
internal error: entered unreachable code: Like on non-text registers
```
This was caused by an off-by-one error in the vdbe code for executing a
`ScalarFunc::Like`. However, this only happened in where-like-fn
queries. Queries using the like operator (ex: `SELECT first_name,
last_name FROM users WHERE first_name LIKE 'Jas%'`) did not have this
problem.
I did some digging around and looked at the explains for these queries
from both limbo and sqlite. It turns out that, for binary expressions,
limbo positions the arguments in the registers differently, which is the
ultimate root cause of this problem.
For the where-like-fn query, before execution limbo's registers look
like this:
```
[Null, Null, Integer(1), Text("Jas%"), Text("Jason"), Null, Null]
             ^the rhs 1  ^pattern      ^haystack str
```
Sqlite's registers look something like this:
```
[Null, Null, Text("Jas%"), Text("Jason"), Integer(1), Null, Null]
             ^pattern      ^haystack str  ^the rhs 1
```
Ultimately, limbo's execution of the scalar like function always looked
in positions 2 and 3, but because we stored the right-hand side before
the like-fn arguments, there was an off-by-one error, and the non-text
register it found was the `Integer(1)`.
This PR changes the binary expression translation to allocate the right-
hand-side register *after* translating the left-hand-side, fixing the
off-by-one and matching sqlite's register layout.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #321
This PR adds a new crate to increase expressivity of tests with complex
things like btree node splitting.
To run tests:
```bash
cd core_tester
cargo test test_sequential_write -- --nocapture
```
This PR improves btree balancing with simple page splitting and some
minor refactors.
Closes #316
This PR adds a regex cache to `ProgramState` so that we can re-use
already constructed regexes while processing LIKE expressions. I didn't
find anywhere else that seemed like a good fit to put an execution-time
only cache like this, so let me know if there's a better spot.
To best match sqlite, I added the constant mask into the `Function`
instruction (this indicates whether the first argument to the function
was determined to be constant at compile time), and decide whether to
use the cache based on its value. I've left the value for
`constant_mask` as 0 on every other kind of `Function` instruction. That
seemed to be the safest choice, as that appears to be what has been
implicitly done up to this point. Happy to change that if you'd advise
otherwise.
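Roughly, the idea looks like the sketch below. It is only an illustration under stated assumptions: `CompiledLike` is a stdlib-only stand-in for a compiled regex, `exec_like` and the `compilations` counter are hypothetical, and treating bit 0 of `constant_mask` as "the first argument is constant" is my assumption about the encoding.

```rust
use std::collections::HashMap;

/// Stand-in for a compiled regex; the real cache would hold regex objects.
struct CompiledLike {
    tokens: Vec<char>,
}

impl CompiledLike {
    fn compile(pattern: &str) -> Self {
        CompiledLike { tokens: pattern.to_lowercase().chars().collect() }
    }

    /// `%` matches any run of characters, `_` matches exactly one.
    fn is_match(&self, text: &str) -> bool {
        let text: Vec<char> = text.to_lowercase().chars().collect();
        like_match(&self.tokens, &text)
    }
}

/// Naive recursive matcher over the remaining pattern and text.
fn like_match(p: &[char], t: &[char]) -> bool {
    match p.split_first() {
        None => t.is_empty(),
        Some((&'%', rest)) => (0..=t.len()).any(|i| like_match(rest, &t[i..])),
        Some((&'_', rest)) => !t.is_empty() && like_match(rest, &t[1..]),
        Some((c, rest)) => t.first() == Some(c) && like_match(rest, &t[1..]),
    }
}

/// Hypothetical execution-time state holding the cache.
#[derive(Default)]
struct ProgramState {
    like_cache: HashMap<String, CompiledLike>,
    compilations: usize, // only here to make cache hits observable
}

/// Uses the cache only when the pattern argument is marked constant (assumed: bit 0).
fn exec_like(state: &mut ProgramState, constant_mask: u32, pattern: &str, text: &str) -> bool {
    if constant_mask & 1 == 1 {
        if !state.like_cache.contains_key(pattern) {
            state.compilations += 1;
            state
                .like_cache
                .insert(pattern.to_string(), CompiledLike::compile(pattern));
        }
        state.like_cache[pattern].is_match(text)
    } else {
        // Non-constant pattern: compile every time and skip the cache.
        state.compilations += 1;
        CompiledLike::compile(pattern).is_match(text)
    }
}
```

With a constant pattern, evaluating the same LIKE against many rows compiles once; with `constant_mask` left at 0, every call compiles again, which matches the behavior before this change.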
Fixes #168
Closes #320
Mimics SQLite's EXPLAIN QUERY PLAN pretty printer.
Example run on a preexisting db:
```
$ cargo run /tmp/srn "explain query plan select * from t natural join t join t2 on t.id = 2*t2.id where t.id in (select * from t);"
QUERY PLAN
`--JOIN ON t.id = 2 * t2.id
   |--JOIN
   |  |--SCAN t FILTER t.id IN (SELECT * FROM t)
   |  `--SCAN t
   `--SCAN t2
```
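A tree printer in this style can be sketched as follows. This is not the code from the PR: `Node`, `node`, `render` and `explain_query_plan` are hypothetical names, and the three-column indentation is an assumption based on how SQLite draws its EXPLAIN QUERY PLAN tree.

```rust
/// Hypothetical plan node: a label plus child nodes.
struct Node {
    label: String,
    children: Vec<Node>,
}

/// Convenience constructor for building trees inline.
fn node(label: &str, children: Vec<Node>) -> Node {
    Node { label: label.to_string(), children }
}

/// Appends `n` and its subtree to `out`, one line per node.
fn render(n: &Node, prefix: &str, is_last: bool, out: &mut String) {
    let connector = if is_last { "`--" } else { "|--" };
    out.push_str(&format!("{}{}{}\n", prefix, connector, n.label));
    // Below a last child the column is blank; otherwise a `|` continues the branch.
    let child_prefix = format!("{}{}", prefix, if is_last { "   " } else { "|  " });
    for (i, child) in n.children.iter().enumerate() {
        render(child, &child_prefix, i + 1 == n.children.len(), out);
    }
}

/// Renders the whole plan under a `QUERY PLAN` header.
fn explain_query_plan(root: &Node) -> String {
    let mut out = String::from("QUERY PLAN\n");
    render(root, "", true, &mut out);
    out
}
```

Each node only needs to know whether it is the last child of its parent; the prefix string carries the state of every ancestor column.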
Fixes https://github.com/penberg/limbo/issues/295
Reader's guide to this PR:
The aim is a more structured and maintainable approach to generating bytecode from the query AST: each part of the query processing pipeline gets a clearer responsibility, which makes developing new functionality easier. E.g.:
- If you want to implement join reordering -> you do it in `Optimizer`
- If you want to implement `GROUP BY` -> you change `QueryPlanNode::Aggregate` to include it, parse it in `Planner` and handle the code generation for it in `Emitter`
The pipeline is:
`SQL text -> Parser -> Planner -> Optimizer -> Emitter`
and this pipeline generates:
`SQL text -> AST -> Logical Plan -> Optimized Logical Plan -> SQLite Bytecode`
---
Module structure:
`plan.rs`: defines the `Operator` enum. An `Operator` is a tree of other `Operators`, e.g. an `Operator::Join` has `left` and `right` children, etc.
`planner.rs`: Parses an `ast::Select` into a `Plan` which is mainly a wrapper for a root `Operator`
`optimizer.rs`: Makes a new `Plan` from an input `Plan` - does predicate pushdown, constant elimination and turns `Scan` nodes into `SeekRowId` nodes where applicable
`emitter.rs`: Generates bytecode instructions from an input `Plan`.
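To give a feel for the shape of the tree and of an optimizer rule, here is a heavily simplified sketch. The variants and fields below are assumptions for illustration (the real `Operator` in `plan.rs` has more variants and different fields), and `optimize` shows only the `Scan` to `SeekRowId` rewrite.

```rust
/// Simplified stand-in for the `Operator` enum; fields are assumptions.
#[derive(Debug, PartialEq)]
enum Operator {
    /// Full table scan, optionally constrained by `rowid = <constant>`.
    Scan { table: String, rowid_eq: Option<i64> },
    /// Point lookup by rowid.
    SeekRowId { table: String, rowid: i64 },
    /// Children are boxed because the enum is recursive.
    Join { left: Box<Operator>, right: Box<Operator> },
}

/// One optimizer rule: a scan constrained to a single rowid becomes a point lookup.
fn optimize(op: Operator) -> Operator {
    match op {
        Operator::Scan { table, rowid_eq: Some(rowid) } => {
            Operator::SeekRowId { table, rowid }
        }
        // Recurse into both children and rebuild the join.
        Operator::Join { left, right } => Operator::Join {
            left: Box::new(optimize(*left)),
            right: Box::new(optimize(*right)),
        },
        // Everything else is left unchanged.
        other => other,
    }
}
```

The optimizer takes a plan by value and returns a new one, which matches the "makes a new `Plan` from an input `Plan`" description above.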
---
Adds feature `EXPLAIN QUERY PLAN <stmt>` which shows the logical query plan instead of the bytecode plan
---
Other changes:
- Almost everything from `select.rs` removed; things like `translate_aggregation()` moved to `expr.rs`
- `where_clause.rs` removed, some things from it like `translate_condition_expr()` moved to `expr.rs`
- i.e.: there is nothing _new_ in `expr.rs`, stuff just moved there
---
Concerns:
- Perf impact: there's a lot more indirection than before (`Operator`s are very "traditional" trees where they refer to other operators via Boxes etc)
Closes #281