Made a new PR based on @sivukhin 's PR #2869 that had a lot of
conflicts. You can check out the PR description from there.
## The main idea is:
Before, if we had an index on `x` and had a query like `WHERE x > 100
and x < 200`, the plan would be something like:
```
- Seek to first row where x > 100
- Then, for every row, discard the row if x >= 200
```
This is highly wasteful in cases where there are a lot of rows where `x
>= 200`. Since our index is sorted on `x`, we know that once we hit the
_first_ row where `x >= 200`, we can stop iterating entirely.
So, the new plan is:
```
- Seek to first row where x > 100
- Then, iterate rows until x >= 200, and then stop
```
This also improves the situation for multi-column indexes. Imagine index
on `(x,y)` and a condition like `WHERE x = 100 and y > 100 and y < 200`.
Before, the plan was:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and discard the row if y >= 200
- Stop when x > 100
```
This also suffers from a problem where if there are a lot of rows where
`x=100` and `y >= 200`, we go through those rows unnecessarily. The new
plan is:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and y < 200
- Stop when either x > 100 or y >= 200
```
Which prevents us from iterating rows like `x=100, y = 666`
unnecessarily because we know the index is sorted on `(x,y)` - once we
hit any row where `x>100` OR `x=100, y >= 200`, we can stop.
Closes#3644
Before, we just skipped evaluating `Id`, `Qualified` and
`DoublyQualified` if `referenced_tables` was `None`, leading to shit
like #3621. Let's eagerly return `"No such column"` parse errors in
these cases instead, and punch exceptions for cases where that doesn't
cleanly work
Top tip: use `Hide whitespace` toggle when inspecting the diff of this
PR
Closes#3621
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3626
There is currently a bug found in our materialized view implementation
that happens when we delete a row, and then re-insert another row with
the same primary key.
Our insert code needs to detect updates and generate a DELETE +
INSERT. But in this case, after the initial DELETE, the fresh insert
generates another delete.
We ended up with the wrong response for aggregations (and I am pretty
sure even filter-only views would manifest the bug as well), where
groups that should still be present just disappeared because of the
extra delete.
A new test case is added that fails without the fix.
Add short writes in the faulty_libc
As @PThorpe92 stated in #3209, this should be implemented here instead
of the memory io in the simulator. Running this in the stress test I
caught a logic bug in the try_pwritev_raw I will create a pr for that
small fix. I will close#3209 in favor of this pr.
Closes#3569
Fixes the following problems with COLLATE:
- Fix: incorrectly used e.g. `x COLLATE NOCASE = 'fOo'` as index
constraint on an index whose column was not case-insensitively collated
- Fix: various ephemeral indexes (in GROUP BY, ORDER BY, DISTINCT) and
subqueries did not retain proper collation information of columns
- Fix: collation of a given expression was not determined properly
according to SQLite's rules
Adds TCL tests and fuzz test
Closes#3476Closes#1524Closes#3305
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3538
Closes#3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.
Closes#3475
Fixes:
- `start_value` and `length_value` should be casted to integers
- proper handling of utf-8 characters
- do not need to cast blob to string, as substr in blobs refers to byte
indexes and not char-indexes
Closes#3465
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3436
Closes#2470
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can
remove the LEFT JOIN because NULL values will be equal to 'foo'. In fact, we have
this optimization already.
However, there was a dumb bug where `WhereTerm`s involving this join still retained
their `from_outer_join` state, resulting in forcing the evaluation of those terms
at the original join index, which results in completely wrong bytecode if the join
optimizer decides to reorder the join as `s JOIN t` instead. Effectively it will
evaluate `t.a=s.a` after table `s` is open but table `t` is not open yet.
This PR fixes that issue by clearing `from_outer_join` properly from the relevant
`WhereTerm`s.
closes#3285
maybe adding seperate func for it is stupid but i kept running into
issues with not closing some random `}` and it got annoying real quick
so i just moved tht into its own func. - and as i am using same logic in
3 places i think it's ok.
Closes#3412
Addition of compatibilty tests for `printf()`.
While doing this I found some differences in the current implementation,
so this fixes those too.
Closes#3438
SQLite supports complex expressions in group by columns - because of
course it does...
So we need to make sure that a column is created for this expression if
it doesn't exist already, and compute it, the same way we compute
pre-projections in the filter operator.
Fixes#3363Fixes#3366Fixes#3365