Closes#1622
I did an A/B test between SQLite and Limbo and they can restart the db
from each other, indicating that there isn't something very wrong with
our file format. Turns out it was with our reset logic without
truncating the file. I assumed it's safe to don't reset if we're in
`PASSIVE` mode, given the
[docs](https://www.sqlite.org/c3ref/wal_checkpoint_v2.html) and [source
code](https://github.com/sqlite/sqlite/blob/2bd9f69d40dd240c4122c6d02f1f
f447e7b5c098/src/wal.c#L2193).
It also does some small clean ups and fixes.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1647
This PR adds support for table-valued functions for PRAGMAs (see the
[PRAGMA functions section](https://www.sqlite.org/pragma.html)).
Additionally, it introduces built-in table-valued functions. I
considered using extensions for this, but there are several reasons in
favor of a dedicated mechanism:
* It simplifies the use of internal functions, structs, etc. For
example, when implementing `json_each` and `json_tree`, direct access to
internals was necessary:
https://github.com/tursodatabase/limbo/pull/1088
* It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/li
mbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my
hardware show that `pragma_table_info()` implemented as an extension is
2.5× slower than the built-in version.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1642
Closes#1628. Every function that calls `process_overflow_read` needs
to be reentrant. I did not change it here, but it would include
`get_prev_record` and `get_next_record`. Maybe `tablebtree_move_to` did
not need to use the state machine, but I included it as a safeguard.
Edit: Closes#1625 . When I implemented `restore_context`, I forgot to
add a `return_if_io` after calling it in `next` 🤦♂️
Edit: Closes#1617 . Just tested it and it also solves this bug.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1636
The salts values in the WAL header are (re)generated in every checkpoint (but in PASSIVE mode), so if we find a frame with mismatch it means it's a leftover from a previous checkpoint.
In preparation for `CREATE VIEW`, we need to have the original sql query
that was used to create the view. I'm using the scanner's offset to
slice into the original input, trimming the newlines, and passing it to
the translate function.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1621
Instrument trace_insn to debug print its the stack pc and instruction.
Also, disable rustyline logs for the CLI as it is too noisy to work
with.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1635
Again found when fuzzing nested where clause subqueries:
Aggregate registers need to be NULLed at the start because the same
registers might be reused on another invocation of a subquery, and if
they are not NULLed, the 2nd invocation of the same subquery will have
values left over from the first invocation.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1614
Found while fuzzing nested subqueries. Since subqueries result in nested
plans, it quickly revealed that there can be multiple `DeferredSeek`
instructions issued for different cursors, but our `ProgramState` only
supported one at a time.
Closes#1610
**Beef:** we need to distinguish between references to tables in the
current query scope (CTEs, FROM clause) and references to tables from
outer query scopes (inside subqueries, or inside CTEs that refer to
previous CTEs). We don't want to consider 'tables from outside' in the
join order of a subquery, but we want the subquery to be able to
_reference_ those tables in e.g. its WHERE clause.
This PR -- or at least some sort of equivalent of it -- is a requirement
for #1595.
---
This PR replaces the `Vec<TableReference>` we use with new data
structures:
- TableReferences struct, which holds both:
- joined_tables, and
- outer_query_refs
- JoinedTable:
- this is just a rename of the previous TableReference struct
- OuterQueryReference
- this is to distinguish from JoinedTable those cases where
e.g. a subquery refers to an outer query's table, or a CTE
refers to a previous CTE.
Both JoinedTable and OuterQueryReference can be referred to by
expressions,
but only JoinedTables are considered for join ordering optimization and
so
forth.
These data structures are then used everywhere, which resulted in a lot
of changes.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1580
- TableReferences struct, which holds both:
- joined_tables, and
- outer_query_refs
- JoinedTable:
- this is just a rename of the previous TableReference struct
- OuterQueryReference
- this is to distinguish from JoinedTable those cases where
e.g. a subquery refers to an outer query's table, or a CTE
refers to a previous CTE.
Both JoinedTable and OuterQueryReference can be referred to by expressions,
but only JoinedTables are considered for join ordering optimization and so
forth.
This commit does not compile.
Currently we have this:
program.alloc_cursor_id(Option<String>, CursorType)`
where the String is the table's name or alias ('users' or 'u' in
the query).
This is problematic because this can happen:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.
Instead introduce `CursorKey`, which is a combination of:
1. `TableInternalId`, and
2. index name (Option<String> -- in case of index cursors.
This should provide key uniqueness for cursors:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
This is the first step towards rollback, since we still don't spill
pages with WAL, we can simply invalidate page cache in case of failure.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1599