Unfortunately, our current translation machinery cannot tell for sure
whether a subquery reference to an outer table 't1' has opened a table cursor,
an index cursor, or both.
For this reason, return a flag from `TableReferences::find_table_by_internal_id()`
that tells the caller whether the table is an outer query reference. Follow-up
commits will add logic to decide which cursor a subquery reads from when it
references a table from the outer query.
This optimization reuses an existing cursor when op_open_write() is
called on the same table/index (same root_page). This is safe because
the cursor position doesn't matter - op_rewind() is always called after
op_open_write() to position the cursor at the beginning of the
table/index before any operations are performed.
This change speeds up op_open_write() by avoiding unnecessary cursor
re-initialization.
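A minimal sketch of the reuse check, for illustration only. `Cursor`,
`CursorSlots`, and the idea of keying cursors by a numeric slot id are
hypothetical stand-ins, not the actual VDBE types; only the "same root_page
means reuse, rewind repositions afterwards" logic comes from the description
above.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a B-tree cursor; not the real engine type.
struct Cursor {
    root_page: usize,
    position: usize,
}

// Hypothetical per-program cursor slots, keyed by cursor id.
#[derive(Default)]
struct CursorSlots {
    cursors: HashMap<usize, Cursor>,
    // Counts how often a cursor was actually (re)initialized.
    initializations: usize,
}

impl CursorSlots {
    // Open a write cursor in slot `cursor_id`. If the slot already holds a
    // cursor on the same root page, keep it and skip re-initialization.
    fn open_write(&mut self, cursor_id: usize, root_page: usize) {
        if let Some(existing) = self.cursors.get(&cursor_id) {
            if existing.root_page == root_page {
                return;
            }
        }
        self.initializations += 1;
        self.cursors.insert(cursor_id, Cursor { root_page, position: 0 });
    }

    // Rewind always resets the position, which is why a stale position left
    // behind by a reused cursor is harmless.
    fn rewind(&mut self, cursor_id: usize) {
        if let Some(cursor) = self.cursors.get_mut(&cursor_id) {
            cursor.position = 0;
        }
    }
}
```

The sketch leaves the reused cursor's position untouched on purpose: the
safety argument rests entirely on the rewind that follows.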
Closes #3815
Trying to sometimes return an integer to match SQLite led to more problems
than I anticipated. The reason being, we can't *really* match SQLite's
behavior unless we know the type of *every* element in the sum. This is
not impossible, but it is very hard, for very little gain.
Fixes #3831
Closes #3832
The SQLite varint specification guarantees that a varint is at most 9 bytes long, but our version of write_varint initializes a buffer of 10 bytes. Change the buffer size to match the specification.
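To illustrate why 9 bytes is enough, here is a sketch of a SQLite-style varint encoder. The signature below is an assumption made for the example and is not necessarily the one used by our write_varint. The first eight bytes carry 7 bits each, and the ninth byte carries a full 8 bits, so 8 * 7 + 8 = 64 bits always fit.

```rust
/// Encode `value` as a SQLite-style varint into `out`.
/// Returns the number of bytes written (1 to 9).
fn write_varint(out: &mut [u8; 9], value: u64) -> usize {
    // If any of the top 8 bits are set, the full 9-byte form is required:
    // eight bytes of 7 bits each (high bit set) plus a final byte of 8 bits.
    if value & 0xff00_0000_0000_0000 != 0 {
        out[8] = value as u8;
        let mut v = value >> 8;
        for i in (0..8).rev() {
            out[i] = ((v & 0x7f) as u8) | 0x80;
            v >>= 7;
        }
        return 9;
    }

    // Otherwise emit 7-bit groups, least significant first, then reverse.
    // At most 8 groups are produced here because value < 2^56.
    let mut tmp = [0u8; 9];
    let mut n = 0;
    let mut v = value;
    loop {
        tmp[n] = ((v & 0x7f) as u8) | 0x80;
        n += 1;
        v >>= 7;
        if v == 0 {
            break;
        }
    }
    // The least significant group becomes the last byte: clear its
    // continuation bit.
    tmp[0] &= 0x7f;
    for i in 0..n {
        out[i] = tmp[n - 1 - i];
    }
    n
}
```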
Implements COUNT/SUM/AVG(DISTINCT) and SELECT DISTINCT for materialized
views. To do this we have to keep a list of the actual distinct values
(similarly to how we do for min/max). We then update the operator (and
issue deltas) only when there is a state transition (for example, if we
already count the value x = 1, and we see an insert for x = 1, we do
nothing).
SELECT DISTINCT (with no aggregator) is similar. We already have to keep
a list of the values anyway to power the aggregates. So we just issue
new deltas based on the transition, without updating the aggregator.
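The state-transition idea can be sketched as follows. `DistinctState` and its
fields are hypothetical names for this example, and values are simplified to
`i64`; the real operator state is not shown in this message. The point is that
a delta is emitted only when a value's multiplicity crosses between zero and
nonzero.

```rust
use std::collections::HashMap;

// Hypothetical per-group state: how many rows currently carry each value.
#[derive(Default)]
struct DistinctState {
    multiplicity: HashMap<i64, i64>,
    distinct_count: i64,
}

impl DistinctState {
    // Apply one row change (`weight` is +1 for an insert, -1 for a delete).
    // Returns the change to COUNT(DISTINCT x), which is nonzero only when
    // the value appears for the first time or disappears entirely.
    fn apply(&mut self, value: i64, weight: i64) -> i64 {
        let entry = self.multiplicity.entry(value).or_insert(0);
        let before = *entry;
        *entry += weight;
        let after = *entry;
        if after == 0 {
            self.multiplicity.remove(&value);
        }
        let delta = match (before > 0, after > 0) {
            (false, true) => 1,  // value became present
            (true, false) => -1, // value became absent
            _ => 0,              // no state transition, nothing to emit
        };
        self.distinct_count += delta;
        delta
    }
}
```

For SELECT DISTINCT without an aggregate, the same return value would decide
whether a row delta is issued, with `distinct_count` simply ignored.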
Closes #3808
Our foreign key constraint checks were checking for changes in PRIMARY
KEYs, but not unique indexes - which are in practice the same thing,
apart from the `INTEGER PRIMARY KEY` special case, where the PRIMARY KEY
is an alias for the rowid of the table.
Closes #3648
Closes #3652 (reimplements)
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3825
The `julian_day_converter` crate is GPL, which is problematic for apps
embedding Turso. Switch to SQLite's Julian date logic by porting the C
code to Rust.
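For reference, a sketch of the date-to-Julian-day step, written from the
integer formula as I understand it from SQLite's date.c. The function name and
signature are assumptions for this example, the time-of-day part is left out,
and this is not the ported code itself.

```rust
// Julian day number for a Gregorian calendar date at 00:00 UTC.
// Hypothetical helper; only the date part of the conversion is covered.
fn julian_day(year: i64, month: i64, day: i64) -> f64 {
    let (mut y, mut m) = (year, month);
    // Treat January and February as months 13 and 14 of the previous year.
    if m <= 2 {
        y -= 1;
        m += 12;
    }
    let a = y / 100;
    // Gregorian calendar correction.
    let b = 2 - a + a / 4;
    // Integer days contributed by whole years and whole months.
    let x1 = 36525 * (y + 4716) / 100;
    let x2 = 306001 * (m + 1) / 10000;
    (x1 + x2 + day + b) as f64 - 1524.5
}
```

For example, `julian_day(2000, 1, 1)` evaluates to 2451544.5, which is the
well-known Julian day for midnight at the start of 2000-01-01.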
If WAL is already enabled, let's just continue execution instead of
erroring out.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #3819
This PR implements a simple heap-sort approach for query plans like
`SELECT ... FROM t WHERE ... ORDER BY ... LIMIT N` in order to maintain a
small set of the top N elements in the ephemeral B-tree and avoid sorting and
materializing the whole dataset.
I removed all optimizations not related to this particular change in
order to keep the branch lightweight.
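The bounded top-N idea can be sketched like this. Note that the sketch uses
the standard library's `BinaryHeap` in place of the ephemeral B-tree the PR
actually uses, and `top_n_ascending` is a hypothetical name; it only shows
that at most N rows are retained at any time.

```rust
use std::collections::BinaryHeap;

// Keep only the `limit` smallest values, as for ORDER BY x ASC LIMIT n.
// `BinaryHeap` is a max-heap, so the root is always the current worst
// candidate and can be evicted cheaply.
fn top_n_ascending(rows: impl IntoIterator<Item = i64>, limit: usize) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    for row in rows {
        heap.push(row);
        if heap.len() > limit {
            // Drop the largest element so the heap never exceeds `limit`.
            heap.pop();
        }
    }
    // Ascending order, matching the ORDER BY direction.
    heap.into_sorted_vec()
}
```

Memory stays proportional to N rather than to the size of the input, which is
the benefit described above.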
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #3726
## Gist
This PR implements _statement subtransactions_, which means that a
single statement within an interactive transaction can individually be
rolled back.
## Background
The default constraint violation resolution strategy in SQLite is
`ABORT`, which means to roll back the statement that caused the conflict.
For example:
```sql
CREATE TABLE t(x UNIQUE);
INSERT INTO t VALUES (1);
BEGIN;
INSERT INTO t VALUES (2),(3); -- ok
INSERT INTO t VALUES (4),(1); -- conflict on 1, this statement should rollback
INSERT INTO t VALUES (5); -- ok
COMMIT; -- ok
SELECT * FROM t;
1
2
3
5
```
So far we haven't been able to support this due to lack of support for
subtransactions, and have used the `ROLLBACK` strategy, which means to
roll back the entire transaction on any constraint error.
## Problem
Although PRIMARY KEY and UNIQUE constraints allow defining the conflict
resolution strategy (e.g. `id INTEGER PRIMARY KEY ON CONFLICT
ROLLBACK`), FOREIGN KEY violations do not support this: they always use
`ABORT` i.e. statement subtransaction rollback. For this reason alone it
is important to implement this mechanism now rather than later, since we
already have FOREIGN KEY support implemented.
## Details
This PR implements statement subtransactions with _anonymous
savepoints_. This means that whenever a statement begins, it will open a
new savepoint which will write "page undo images" into a temporary file
called a _subjournal_. Whenever the statement marks a page as dirty, it
will write the before-image of the page into the subjournal so that its
modifications can be undone in the event of an ABORT (statement
rollback).
- Right now, only anonymous savepoints are supported, so the explicit
`SAVEPOINT` syntax is not.
- Due to the above, there can be only one savepoint open per pager, and
this is enforced with assertions.
- The subjournal file is currently entirely in memory. If it were not,
we would either have to block on IO or refactor many usages of code to
account for potentially pending completions.
- Constraint errors no longer cause transactions to abort, nor do they
cause the page cache to be cleared. Instead, subjournaled pages are
brought back into the page cache, which achieves the same effect at a
finer granularity.
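The mechanism above can be sketched with an in-memory model. `SketchPager`,
`begin_statement`, `write_page`, `release_statement`, and
`rollback_statement` are hypothetical names for this example and do not
mirror the real pager API; the sketch only shows the before-image bookkeeping
and the single-savepoint assertion.

```rust
use std::collections::HashMap;

type PageId = u32;

// Hypothetical in-memory pager model.
#[derive(Default)]
struct SketchPager {
    pages: HashMap<PageId, Vec<u8>>,
    // Before-images for the one open statement savepoint, if any.
    // `None` as a value means the page did not exist before the statement.
    subjournal: Option<HashMap<PageId, Option<Vec<u8>>>>,
}

impl SketchPager {
    fn begin_statement(&mut self) {
        assert!(self.subjournal.is_none(), "only one savepoint per pager");
        self.subjournal = Some(HashMap::new());
    }

    fn write_page(&mut self, id: PageId, contents: Vec<u8>) {
        if let Some(journal) = self.subjournal.as_mut() {
            // Record the before-image only the first time the page is
            // dirtied within this statement.
            let before = self.pages.get(&id).cloned();
            journal.entry(id).or_insert(before);
        }
        self.pages.insert(id, contents);
    }

    // Statement succeeded: discard the undo images.
    fn release_statement(&mut self) {
        self.subjournal = None;
    }

    // Statement hit ABORT: restore every page to its before-image.
    fn rollback_statement(&mut self) {
        if let Some(journal) = self.subjournal.take() {
            for (id, before) in journal {
                match before {
                    Some(image) => {
                        self.pages.insert(id, image);
                    }
                    None => {
                        self.pages.remove(&id);
                    }
                }
            }
        }
    }
}
```

Writes made by earlier statements in the same transaction are untouched by a
rollback, because only pages dirtied after `begin_statement` are journaled.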
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3792
Every transaction was reading page 1 from the WAL to check the schema
cookie in op_transaction, causing unnecessary WAL lookups.
This commit caches the schema_cookie in Pager as AtomicU64, similar to
how page_size and reserved_space are already cached. The cache is
updated when the header is read/modified and invalidated in
begin_read_tx() when WAL changes are detected from other connections.
This matches SQLite's approach of caching frequently accessed header
fields to avoid repeated page 1 reads. Improves write throughput by 5%
in our benchmarks.
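A rough sketch of the caching pattern, under stated assumptions:
`CachedHeader`, the `COOKIE_UNSET` sentinel, and the `page1_reads` counter are
hypothetical and exist only for this example. Storing the 32-bit cookie in an
`AtomicU64` leaves room for an out-of-range sentinel that marks the cache as
invalid.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sentinel: a real schema cookie is 32 bits wide, so this
// value can never collide with a cached cookie.
const COOKIE_UNSET: u64 = u64::MAX;

// Hypothetical model of the cached header field.
struct CachedHeader {
    cached_schema_cookie: AtomicU64,
    // Stands in for the cookie stored in the page 1 header.
    on_disk_cookie: u32,
    // Counts simulated page 1 reads.
    page1_reads: AtomicU64,
}

impl CachedHeader {
    fn new(on_disk_cookie: u32) -> Self {
        Self {
            cached_schema_cookie: AtomicU64::new(COOKIE_UNSET),
            on_disk_cookie,
            page1_reads: AtomicU64::new(0),
        }
    }

    // Return the schema cookie, reading "page 1" only on a cache miss.
    fn schema_cookie(&self) -> u32 {
        let cached = self.cached_schema_cookie.load(Ordering::Acquire);
        if cached != COOKIE_UNSET {
            return cached as u32;
        }
        self.page1_reads.fetch_add(1, Ordering::Relaxed);
        let cookie = self.on_disk_cookie;
        self.cached_schema_cookie.store(cookie as u64, Ordering::Release);
        cookie
    }

    // Called when changes from another connection are detected.
    fn invalidate(&self) {
        self.cached_schema_cookie.store(COOKIE_UNSET, Ordering::Release);
    }
}
```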
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #3727
We don't want something like `BEGIN IMMEDIATE` to start a subtransaction,
so instead we open one only if the statement is a write AND either:
a) the statement has >0 table_references, or
b) the statement is an INSERT (INSERT doesn't track table_references in
the same way as other program types).
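The condition above can be written as a small predicate. `StatementKind`,
`StatementInfo`, and `needs_statement_subtransaction` are hypothetical names
for this sketch; the real program metadata is structured differently.

```rust
// Hypothetical statement classification for this sketch.
#[derive(Clone, Copy, PartialEq)]
enum StatementKind {
    Insert,
    Other,
}

// Hypothetical summary of the facts the decision depends on.
struct StatementInfo {
    is_write: bool,
    kind: StatementKind,
    table_reference_count: usize,
}

// A subtransaction is opened only for write statements that either
// reference at least one table or are INSERTs.
fn needs_statement_subtransaction(stmt: &StatementInfo) -> bool {
    stmt.is_write
        && (stmt.table_reference_count > 0 || stmt.kind == StatementKind::Insert)
}
```

Under this rule a write statement with no table references that is not an
INSERT, such as `BEGIN IMMEDIATE`, does not open a subtransaction.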