This PR is based on #1357 and further improves performance:
```sql
limbo> select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 3.728050958 s (this includes parsing/coloring of cli app)
```
Reviewed-by: Preston Thorpe (@PThorpe92)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1358
In left joins, even if the join condition is not matched, the system
must emit a row for every row of the outer table:
```
-- this must return t1.count() rows, with NULLs for all columns of t2
SELECT * FROM t1 LEFT JOIN t2 ON FALSE;
```
To achieve this, we set a "null flag" on the right table cursor which
tells our VDBE to emit NULLs for any columns of that cursor until the
flag is cleared.
Our logic for clearing the null flag was to do it in Next/Prev. However,
this is problematic for a few reasons:
- If the inner table of the left join is using SeekRowid, then Next/Prev
is never called on its cursor, so the null flag doesn't get cleared.
- If the inner table of the left join is using a non-covering index
seek, i.e. it iterates its rows using an index, but seeks to the main
table to fetch data, then Next/Prev is never called on the main table,
and the main table's null flag doesn't get cleared.
What this results in is NULL values incorrectly being emitted for the
inner table after the first correct NULL row, since the null flag is
correctly set to true, but never cleared.
This PR fixes the issue by clearing the null flag whenever seek() is
invoked on the cursor. Hence, the null flag is now cleared on:
- next()
- prev()
- seek()
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1364
In left joins, even if the join condition is not matched, the system
must emit a row for every row of the outer table:
-- this must return t1.count() rows, with NULLs for all columns of t2
SELECT * FROM t1 LEFT JOIN t2 ON FALSE;
Our logic for clearing the null flag was to do it in Next/Prev. However,
this is problematic for a few reasons:
- If the inner table of the left join is using SeekRowid, then Next/Prev
is never called on its cursor, so the null flag doesn't get cleared.
- If the inner table of the left join is using a non-covering index seek,
i.e. it iterates its rows using an index, but seeks to the main table
to fetch data, then Next/Prev is never called on the main table, and the
main table's null flag doesn't get cleared.
What this results in is NULL values incorrectly being emitted for the
inner table after the first correct NULL row, since the null flag is
correctly set to true, but never cleared.
This PR fixes the issue by clearing the null flag whenever seek() is
invoked on the cursor. Hence, the null flag is now cleared on:
- next()
- prev()
- seek()
Implements binary search to find the correct cell within a page,
specialized for table btrees only due to lack of energy at 8:30 PM
---
I used a [1GB TPC-H database](https://github.com/lovasoa/TPCH-
sqlite/releases/download/v1.0/TPC-H.db) for benchmarking and ran this
query which does a lot of seeks:
before
```sql
limbo> .timer on
limbo> select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 16.267797375 s (this includes parsing/coloring of cli app)
```
after
```sql
limbo> .timer on
limbo> select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 5.20604125 s (this includes parsing/coloring of cli app)
```
BTW sqlite completes this in 600 milliseconds so there's still a lot of
fuckiness somewhere.
---
UPDATE: refactored table btree seek (on leaf pages) to use binary search
too. I also updated the above numbers so that I ran each a few times and
took the lowest time i got for each. This is after binsearch on leaf
pages too:
```sql
limbo> select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 4.529645958 s (this includes parsing/coloring of cli app)
```
Closes#1357
closes#1185
## The Problem:
The underlying schema of virtual tables is hidden from the query
planner, and it currently has no way of optimizing select queries with
vtab table refs by using indexes or removing non-constant predicates.
All vtabs are currently rewound completely each time and any conditional
filtering is done in the vdbe layer instead of in the `VFilter`.
## The solution:
Add xBestIndex to the vtab module API to let extensions return some
`IndexInfo` that will allow the query planner to make better
optimizations and possibly omit conditionals
## Examples:
table `t`: vtab: (key, value)
table `t2`: table: (a,b)
### Join where vtab is outer table:

Properly pushes predicate to VFilter, which receives the idx_str
`key_eq` arg, telling it that there is a useable where clause on the key
"index"
### Join where vtab is inner table:

Constraint is not sent because it is marked as unusable
### Where clause on "indexed" column:

Pushed down and the predicate is omitted from the VDBE layer.
### Where clause on regular column:

No idx info received from BestIndex, VDBE handles conditional.
## TODO:
OrderBy info needs to be sent to xBestIndex and its not in a great
position in `open_loop` currently
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1264
### Feat:
- Adds support for descending indexes
### Testing:
- Augments existing compound key index seek fuzz test to test for
various combinations of ascending and descending indexed columns. To
illustrate, the test runs 10000 queries like this on 8 different tables
with all different asc/desc permutations of a three-column primary key
index:
```sql
query: SELECT * FROM t WHERE x = 2826 LIMIT 5
query: SELECT * FROM t WHERE x = 671 AND y >= 2447 ORDER BY x ASC, y DESC LIMIT 5
query: SELECT * FROM t WHERE x = 2412 AND y = 589 AND z >= 894 ORDER BY x DESC LIMIT 5
query: SELECT * FROM t WHERE x = 1217 AND y = 1437 AND z <= 265 ORDER BY x ASC, y ASC, z DESC LIMIT 5
query: SELECT * FROM t WHERE x < 138 ORDER BY x DESC LIMIT 5
query: SELECT * FROM t WHERE x = 1312 AND y = 2757 AND z > 39 ORDER BY x DESC, y ASC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x = 1829 AND y >= 1629 ORDER BY x ASC, y ASC LIMIT 5
query: SELECT * FROM t WHERE x = 2047 ORDER BY x DESC LIMIT 5
query: SELECT * FROM t WHERE x = 893 AND y > 432 ORDER BY y DESC LIMIT 5
query: SELECT * FROM t WHERE x = 1865 AND y = 784 AND z <= 785 ORDER BY x DESC, y DESC, z DESC LIMIT 5
query: SELECT * FROM t WHERE x = 213 AND y = 1475 AND z <= 2870 ORDER BY x ASC, y ASC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x >= 1780 ORDER BY x ASC LIMIT 5
query: SELECT * FROM t WHERE x = 1983 AND y = 602 AND z = 485 ORDER BY y ASC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x = 2311 AND y >= 31 ORDER BY y DESC LIMIT 5
query: SELECT * FROM t WHERE x = 81 AND y >= 1037 ORDER BY x ASC, y DESC LIMIT 5
query: SELECT * FROM t WHERE x < 2698 ORDER BY x ASC LIMIT 5
query: SELECT * FROM t WHERE x = 1503 AND y = 554 AND z >= 185 ORDER BY x DESC, y DESC, z DESC LIMIT 5
query: SELECT * FROM t WHERE x = 619 AND y > 1414 ORDER BY x DESC, y ASC LIMIT 5
query: SELECT * FROM t WHERE x >= 865 ORDER BY x DESC LIMIT 5
query: SELECT * FROM t WHERE x = 1596 AND y = 622 AND z = 62 ORDER BY x DESC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x = 1555 AND y = 1257 AND z < 1929 ORDER BY x ASC, y ASC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x > 2598 LIMIT 5
query: SELECT * FROM t WHERE x = 302 AND y = 2476 AND z < 2302 ORDER BY z DESC LIMIT 5
query: SELECT * FROM t WHERE x = 2197 AND y = 2195 AND z > 2089 ORDER BY y ASC, z DESC LIMIT 5
query: SELECT * FROM t WHERE x = 1030 AND y = 1717 AND z < 987 LIMIT 5
query: SELECT * FROM t WHERE x = 2899 AND y >= 382 ORDER BY y DESC LIMIT 5
query: SELECT * FROM t WHERE x = 62 AND y = 2980 AND z < 1109 ORDER BY x DESC, y DESC, z DESC LIMIT 5
query: SELECT * FROM t WHERE x = 550 AND y > 221 ORDER BY y DESC LIMIT 5
query: SELECT * FROM t WHERE x = 376 AND y = 1874 AND z < 206 ORDER BY y DESC, z ASC LIMIT 5
query: SELECT * FROM t WHERE x = 859 AND y = 2157 ORDER BY x DESC LIMIT 5
query: SELECT * FROM t WHERE x = 2166 AND y = 2079 AND z < 301 ORDER BY x DESC, y ASC LIMIT 5
```
the queries are run against both sqlite and limbo.
Reviewed-by: Pere Diaz Bou (@pereman2)
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1330
First attempt at closing #1212. Also with this PR, I added the option of
using `with` syntax for `TestLimboShell`. It automatically closes the
shell on error, and facilitates error handling overall. If this is
merged, I can update the other python tests to use `with` as well.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1230