Prevents something like `WHERE x = 5 AND x = 5` from becoming a two
component index key.
Closes#3656
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#3658
This PR introduces sparse vectors support and jaccard distance
implementation.
Also, this PR restructure the code to have all vector operations in
separate files (they grow pretty quickly as new vector representations
added to the DB).
Closes#3647
Various little fixes to `Sorter` that reduce unnecessary work.
Makes TPC-H query 1 roughly 2x faster, which is a lot because it
originally took 30-40 seconds depending on the CI run
Closes#3645
I found an application in the open that expects sqlite_version() to
return a specific string (higher than 3.8...).
We had tons of those issues at Scylla, and the lesson was that you tell
your kids not to lie, but when life hits, well... you lie.
We'll add a new function, turso_version, that tells the truth.
Closes#3635
- On each interaction we assume that the value is NULL, so we need to
set it like so for every interaction in the list. So we force to not
emit this NULL as constant;
- Forces a copy so IN expressions works inside an aggregation
expression. Not ideal but it works, we should work more on the query
planner for sure.
I found an application in the open that expects sqlite_version() to
return a specific string (higher than 3.8...).
We had tons of those issues at Scylla, and the lesson was that you
tell your kids not to lie, but when life hits, well... you lie.
We'll add a new function, turso_version, that tells the truth.
Made a new PR based on @sivukhin 's PR #2869 that had a lot of
conflicts. You can check out the PR description from there.
## The main idea is:
Before, if we had an index on `x` and had a query like `WHERE x > 100
and x < 200`, the plan would be something like:
```
- Seek to first row where x > 100
- Then, for every row, discard the row if x >= 200
```
This is highly wasteful in cases where there are a lot of rows where `x
>= 200`. Since our index is sorted on `x`, we know that once we hit the
_first_ row where `x >= 200`, we can stop iterating entirely.
So, the new plan is:
```
- Seek to first row where x > 100
- Then, iterate rows until x >= 200, and then stop
```
This also improves the situation for multi-column indexes. Imagine index
on `(x,y)` and a condition like `WHERE x = 100 and y > 100 and y < 200`.
Before, the plan was:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and discard the row if y >= 200
- Stop when x > 100
```
This also suffers from a problem where if there are a lot of rows where
`x=100` and `y >= 200`, we go through those rows unnecessarily. The new
plan is:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and y < 200
- Stop when either x > 100 or y >= 200
```
Which prevents us from iterating rows like `x=100, y = 666`
unnecessarily because we know the index is sorted on `(x,y)` - once we
hit any row where `x>100` OR `x=100, y >= 200`, we can stop.
Closes#3644
Before, we just skipped evaluating `Id`, `Qualified` and
`DoublyQualified` if `referenced_tables` was `None`, leading to shit
like #3621. Let's eagerly return `"No such column"` parse errors in
these cases instead, and punch exceptions for cases where that doesn't
cleanly work
Top tip: use `Hide whitespace` toggle when inspecting the diff of this
PR
Closes#3621
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3626
SQLite surprisingly supports this:
select sqlite_version(*);
this gets translated at the parser level to sqlite_version(), and it
works for all functions that take 0 arguments.
Let's be compatible with SQLite and support the same thing.
Closes#3630
This PR introduces support for foreign key constraints, and the `PRAGMA
foreign_keys;`, and relevant opcodes: `FkCounter` and `FkIfZero`.
Extensive fuzz tests were added both for regular and composite
PK/rowid/unique index constraints, as well as some really weird
edgecases to make sure we our affinity handling is correct as well when
we trigger the constraints.
Foreign-key checking is driven by two VDBE ops: `FkCounter` and
`FkIfZero`, and
`FkCounter` is a running meter on the `Connection` for deferred FK
violations. When an `insert/delete/update` operation creates a potential
orphan (we insert a child row that doesn’t have a matching parent, or we
delete/update a parent that children still point at), this counter is
incremented. When a later operation fixes that (e.g. we insert the
missing parent or re-target the child), we decrement the counter. If any
is remaining at commit time, the commit fails. For immediate
constraints, on the violation path we emit Halt right away.
`FkIfZero` can either be used to guard a decrement of FkCounter to
prevent underflow, or can potentially (in the future) be used to avoid
work checking if any constraints need resolving.
NOTE: this PR does not implement `pragma defer_foreign_keys` for global
`deferred` constraint semantics. only explicit `col INT REFERENCES t(id)
DEFERRABLE INITIALLY DEFERRED` is supported in this PR.
This PR does not add support for `ON UPDATE|DELETE CASCADE`, only for
basic implicit `DO NOTHING` behavior.
~~NOTE: I did notice that, as referenced here: #3463~~
~~our current handling of unique constraints does not pass fuzz tests, I
believe only in the case of composite primary keys,~~ ~~because the fuzz
test for FK referencing composite PK is failing but only for UNIQUE
constraints, never (or as many times as i tried) for foreign key
constraints.~~
EDIT: all fuzzers are passing, because @sivukhin fixed the unique
constraint issue.
The reason that the `deferred` fuzzer is `#[ignore]`'d is because sqlite
uses sub-transactions, and even though the fuzzing only does 1 entry per
transaction... the fuzzer can lose track of _when_ it's in a transaction
and when it hits a FK constraint, and there is an error in both DB's, it
can just continue to do run regular statements, and then the eventual
ROLLBACK will revert different things in sqlite vs turso.. so for now,
we leave it `ignore`d
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3510
SQLite surprisingly supports this:
select sqlite_version(*);
this gets translated at the parser level to sqlite_version(), and it
works for all functions that take 0 arguments.
Let's be compatible with SQLite and support the same thing.
Fixes the following problems with COLLATE:
- Fix: incorrectly used e.g. `x COLLATE NOCASE = 'fOo'` as index
constraint on an index whose column was not case-insensitively collated
- Fix: various ephemeral indexes (in GROUP BY, ORDER BY, DISTINCT) and
subqueries did not retain proper collation information of columns
- Fix: collation of a given expression was not determined properly
according to SQLite's rules
Adds TCL tests and fuzz test
Closes#3476Closes#1524Closes#3305
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3538