Ongoing tests for [turso-go](https://github.com/tursodatabase/turso-go)
have unearthed a couple more issues
closes#3187
### Number 1:
We were getting something like:
```sql
sqlite_autoindex_`databases`_2
```
when creating autoindex for table in Gorm (gorm is notorious for
backticks everywhere), because of not normalizing the column name when
creating autoindex.
### Number 2:
When creating table with `PRIMARY KEY AUTOINCREMENT`, we were still
creating the index, but it wasn't properly handled in
`populate_indices`, because we are doing the following:
```rust
if column.primary_key && unique_set.is_primary_key {
if column.is_rowid_alias {
// rowid alias, no index needed
continue; // continues, but doesn't consume it..
}
```
So if we created such an index entry for the AUTOINCREMENT... we would
trip this:
```rust
assert!(automatic_indexes.is_empty(), "all automatic indexes parsed from sqlite_schema should have been consumed, but {} remain", automatic_indexes.len());
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3186
This PR fixes bugs found in the [turso-
go](https://github.com/tursodatabase/turso-go) driver with UPSERT clause
earlier, where `Gorm` will (obviously) use Expr::Variable's as well as
use quotes for `Expr::Qualified` in the tail end of an UPSERT statement.
Example:
```sql
INSERT INTO users (a,b,c) VALUES (?,?,?) ON CONFLICT (`users`.`a`) DO UPDATE SET b = `excluded`.`b`, a = ?;
```
and previously we were not properly calling `rewrite_expr`, which was
not properly setting the anonymous `Expr::Variable` to `__param_N` named
parameter, so it would ignore it completely, then return the wrong # of
parameters.
Also, we didn't handle quoted "`excluded`.`x`", so it would panic in the
optimizer that Qualified should have been rewritten earlier.
Closes#3157
This adds basic support for window functions. For now:
* Only existing aggregate functions can be used as window functions.
* Specialized window-specific functions (`rank`, `row_number`, etc.) are
not yet supported.
* Only the default frame definition is implemented:
`RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS`.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3079
We have not implemented them before because they require the raw
elements to be kept. It is easy to see why in the following example:
```
current_min = 3;
insert(2) => current_min = 2 // can be done without state
delete(2) => needs to look at the state to determine new min!
```
The aggregator state was a very simple key-value structure. To
accomodate for min/max, we will make it into a more complex table, where
we can encode a more complex structure.
The key insight is that we can use a primary key composed of:
```
1) storage_id
2) zset_id,
3) element
```
The storage_id and zset_id are our previous key, except they are now
exploded to support a larger range of storage_id. With more bits
available in the storage_id, we can encode information about which
column we are storing. For aggregations in multiple columns, we will
need to keep a different list of values for min/max!
The element is just the values of the columns.
Because this is a primary key, the data will be sorted in the btree. We
can then just do a prefix search in the first two components of the key
and easily find the min/max when needed.
This new format is also adequate for joins. Joins will just have a new
storage_id which encodes two "columns" (left side, right side).
Closes#3143
And also change the schema of the main table. I have come to see the
current key-value schema as inadequate for non-aggregate operators.
Calculating Min/Max, for example, doesn't feat in this schema because
we have to be able to track existing values and index them.
Another alternative is to keep one table per operator type, but this
quickly leads to an explosion of tables.
Currently, this is effectively a no-op because, at the optimization
stage, window function expressions are in the form
win_func(subquery_column1, subquery_column2, ...).
Nevertheless, expressions are rewritten to maintain consistency with
aggregates, which also hold cloned expressions from sources like result
columns. This ensures future changes in the optimizer won’t break window
function handling.
Adds initial support for window functions. For now, only existing
aggregate functions can be used as window functions—no specialized
window-specific functions are supported yet.
Currently, only the default frame definition is implemented:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS.
Two main reasons for this change:
* Improve readability by moving the logic for this special case closer
to the code that relies on it.
* Decouple AggFunc from the Aggregate struct. In the future, window
function processing will use AggFunc directly, without necessarily
depending on Aggregate.
Previously, while resetting accumulator registers, we would also
reset subsequent registers. This happened because the number of registers
to reset was computed as the sum of arguments rather than the number of
aggregate functions.
Retrying fsync() on error was historically not safe ("fsyncgate") and
Postgres still defaults to panicing on fsync(). Therefore, add a
"data_sync_retry" pragma (disabled by default) and use it to determine
whether to panic on fsync() error or not.
After this PR:
```
turso> EXPLAIN QUERY PLAN SELECT 1;
QUERY PLAN
`--SCAN CONSTANT ROW
turso> EXPLAIN QUERY PLAN SELECT 1 UNION SELECT 1;
QUERY PLAN
`--COMPOUND QUERY
|--LEFT-MOST SUBQUERY
| `--SCAN CONSTANT ROW
`--UNION USING TEMP B-TREE
`--SCAN CONSTANT ROW
turso> CREATE TABLE x(y);
turso> CREATE TABLE z(y);
turso> EXPLAIN QUERY PLAN SELECT * from x,z;
QUERY PLAN
|--SCAN x
`--SCAN z
turso> EXPLAIN QUERY PLAN SELECT * from x,z ON x.y = z.y;
QUERY PLAN
|--SCAN x
`--SEARCH z USING INDEX ephemeral_z_t2
turso>
```
Closes#3057
This is a collection of fixes for materialized views ahead of adding
support for JOINs.
It is mostly issues with how we assume there is a single table, with a
single delta, but we have to send more than one.
Those are things that are just objectively wrong, so I am sending it
separately to make the JOIN PR smaller.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#3009
Currently, when MVCC is enabled, every transaction mode supports
concurrent reads and writes, which makes it hard to adopt for existing
applications that use `BEGIN DEFERRED` or `BEGIN IMMEDIATE`.
Therefore, add support for `BEGIN CONCURRENT` transactions when MVCC is
enabled. The transaction mode allows multiple concurrent read/write
transactions that don't block each other, with conflicts resolved at
commit time. Furthermore, implement the correct semantics for `BEGIN
DEFERRED` and `BEGIN IMMEDIATE` by taking advantage of the pager level
write lock when transaction upgrades to write. This means that now
concurrent MVCC transactions are serialized against the legacy ones when
needed.
The implementation includes:
- Parser support for CONCURRENT keyword in BEGIN statements
- New Concurrent variant in TransactionMode to distinguish from regular
read/write transactions
- MVCC store tracking of exclusive transactions to support IMMEDIATE and
EXCLUSIVE modes alongside CONCURRENT
- Proper transaction state management for all transaction types in MVCC
This enables better concurrency for applications that can handle
optimistic concurrency control, while still supporting traditional
SQLite transaction semantics via IMMEDIATE and EXCLUSIVE modes.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#3021
Working on https://github.com/tursodatabase/turso/issues/2964 I came
upon `walk_expr_mut`, I don't think it existed last time I really spent
much time in the translator. So quickly went back and cleaned this up.
Currently, when MVCC is enabled, every transaction mode supports
concurrent reads and writes, which makes it hard to adopt for existing
applications that use `BEGIN DEFERRED` or `BEGIN IMMEDIATE`.
Therefore, add support for `BEGIN CONCURRENT` transactions when MVCC is
enabled. The transaction mode allows multiple concurrent read/write
transactions that don't block each other, with conflicts resolved at
commit time. Furthermore, implement the correct semantics for `BEGIN
DEFERRED` and `BEGIN IMMEDIATE` by taking advantage of the pager level
write lock when transaction upgrades to write. This means that now
concurrent MVCC transactions are serialized against the legacy ones when
needed.
The implementation includes:
- Parser support for CONCURRENT keyword in BEGIN statements
- New Concurrent variant in TransactionMode to distinguish from regular
read/write transactions
- MVCC store tracking of exclusive transactions to support IMMEDIATE and
EXCLUSIVE modes alongside CONCURRENT
- Proper transaction state management for all transaction types in MVCC
This enables better concurrency for applications that can handle
optimistic concurrency control, while still supporting traditional
SQLite transaction semantics via IMMEDIATE and EXCLUSIVE modes.
We have been ignoring the alias in the logical plan, but we have to keep
it. Implementing joins in particular is made hard without it, because it
is common that one has the same column name in different tables, just
differentiated by the alias
indexes with the naming scheme "sqlite_autoindex_<tblname>_<number>"
are automatically created when a table is created with UNIQUE or
PRIMARY KEY definitions.
these indexes must map to the table definition SQL in definition order,
i.e. sqlite_autoindex_foo_1 must be the first instance of UNIQUE or
PRIMARY KEY and so on.
this commit fixes our autoindex creation / parsing so that this invariant
is upheld.