Currently we have this:
program.alloc_cursor_id(Option<String>, CursorType)`
where the String is the table's name or alias ('users' or 'u' in
the query).
This is problematic because this can happen:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.
Instead introduce `CursorKey`, which is a combination of:
1. `TableInternalId`, and
2. index name (Option<String> -- in case of index cursors.
This should provide key uniqueness for cursors:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
This is the first step towards rollback, since we still don't spill
pages with WAL, we can simply invalidate page cache in case of failure.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1599
Currently in the main translation logic after planning and optimization,
we don't _really_ need to pass a `&mut Vec<WhereTerm>` around anymore,
except for the fact that virtual table constraint resolution is done ad-
hoc in `init_loop()`.
Even there, the only thing we mutate is `WhereTerm::consumed` which is a
boolean indicating that the term has been "used up" by the optimizer and
shouldn't be evaluated as a normal where clause condition anymore.
In the upcoming branch for WHERE clause subqueries, I want to store
immutable references to WHERE clause expressions in `Resolver`, but this
is unfortunately not possible if we still use the aforementioned mutable
references.
Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>`
which allows us to pass an immutable reference to `init_loop()`, and the
`Cell` can be removed once the virtual table constraint resolution is
moved to an earlier part of the query processing pipeline.
Closes#1597
Currently in the main translation logic after planning and optimization,
we don't _really_ need to pass a &mut Vec<WhereTerm> around anymore, except
for the fact that virtual table constraint resolution is done ad-hoc in
`init_loop()`. Even there, the only thing we mutate is `WhereTerm::consumed`
which is a boolean indicating that the term has been "used up" by the optimizer
and shouldn't be evaluated as a normal where clause condition anymore.
In the upcoming branch for WHERE clause subqueries, I want to store immutable
references to WHERE clause expressions in `Resolver`, but this is unfortunately
not possible if we still use the aforementioned mutable references.
Hence, we can temporarily make `WhereTerm::consumed` a `Cell<bool>` which allows
us to pass an immutable reference to `init_loop()`, and the `Cell` can be removed
once the virtual table constraint resolution is moved to an earlier part of the
query processing pipeline.
Currently we have some usages of LIMIT where the actual limit counter
is initialized next to the DecrJumpZero instruction, and then
`program.mark_last_insn_constant()` is used to hoist the counter
initialization to the beginning of the program.
This is very fragile, and already FROM clause subquery handling works
around this with a hack (removed in this PR), and (upcoming) WHERE clause
subqueries would also run into problems because of this, because the LIMIT
might need to be initialized once for every iteration of the subquery.
This PR removes those usages for LIMIT, and LIMIT processing is now more
intuitive:
- limit counter is now initialized at the start of the query processing
- a function init_limit() is extracted to do this for select/update/delete
1. allow calling op_null with Insn::BeginSubrtn
- BeginSubrtn is identical to Null, but named differently so that
its use in context is clearer
2. Insn::Return: add possibility to fallthrough on non-integer values as
per sqlite spec
Closes#1588
This pull request implements the `libsql_wal_get_frame()` API. To do
that, we also introduce a `wait_for_completion()` API in I/O dispatcher.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1533
Found this when reviewing #1528 locally and this was crashing
```sql
INSERT INTO t SELECT * FROM generate_series(1,10,1);
```
Reason was that `op_vopen` was not replacing the already allocated
cursor slot, but using `.insert()`
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1583
If we don't reset the state of `IdxDelete`, next `IdxDelete` will start
in `Deleting` state which is completely wrong since it should seek from
the start.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1584
Closes#1528 .
- Modified `translate_select` so that the caller can define if the
statement is top-level statement or a subquery.
- Refactored `translate_insert` to offload the translation of multi-row
VALUES and SELECT statements to `translate_select`
- I did not try to change much of `populate_column_registers` as I did
not want to break `translate_virtual_table_insert`. Ideally, I would
want to unite this remaining logic folding `populate_column_registers`
into `populate_columns_multiple_rows` and the
`translate_virtual_table_insert` into `translate_insert`. But, I think
this may be best suited for a separate PR.
## TODO
- ~Tests~ - *Done*
- ~Need to emit a temp table when we are selecting and inserting into
the Same Table -
https://github.com/sqlite/sqlite/blob/master/src/insert.c#L1369~ -
*Done*
- Optimization when table have the exact same schema - open an Issue
about it
- Virtual Tables do not benefit yet from this feature - open an Issue
about it
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1566
- Instead of using a confusing CheckpointStatus for many different things,
introduce the following statuses:
* PagerCacheflushStatus - cacheflush can result in either:
- the WAL being written to disk and fsynced
- but also a checkpoint to the main BD file, and fsyncing the main DB file
Reflect this in the type.
* WalFsyncStatus - previously CheckpointStatus was also used for this, even
though fsyncing the WAL doesn't checkpoint.
* CheckpointStatus/CheckpointResult is now used only for actual checkpointing.
- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
* This gets rid of a lot of if let Some(...) boilerplate
* For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:
```rust
Expr::Column { table: 0, column: 3, .. }
```
refers to the 0'th table in the `table_references` list.
This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.
If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:
```sql
-- Assume tables: users(id, age), orders(user_id, amount)
-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
(SELECT user_id, SUM(amount) as total
FROM orders o
WHERE o.amount > 100
GROUP BY o.user_id) sub
WHERE u.id = sub.user_id
-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:
SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;
-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```
We could ofc traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.
Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that this kind of query
transformations can happen with less pain.
Re-Opening #1076 because it had bit-rotted to a point of no return.
However it has improved. Now with Weak references and no incrementing Rc
strong counts.
This also includes a better test extension that returns info about the
other tables in the schema.

(theme doesn't show rows column)
Closes#1366
Fixes#1567
Probably also fixes#1485
Currently we are simply unable to read any WAL frames from disk once a
fresh process w/ Limbo is opened, since we never try to read anything
from disk unless we already have it in our in-memory frame cache.
This commit implements a crude way of reading entire WAL into memory as
a single buffer and reconstructing the frame cache.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1570