The `RowData` opcode is required to implement #1575.
I haven't found a ideal way to test this PR independently, but I
verified its functionality while working on #1575(to be committed soon),
and it performs effectively.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1756
This PR adds support for the instruction `IntegrityCk` which performs an
integrity check on the contents of a single table. Next PR I will try to
implement the rest of the integrity check where we would check indexes
containt correct amount of data and some more.
<img width="1151" alt="image" src="https://github.com/user-
attachments/assets/29d54148-55ba-480f-b972-e38587f0a483" />
Closes#1719
This PR adds support for table-valued functions for PRAGMAs (see the
[PRAGMA functions section](https://www.sqlite.org/pragma.html)).
Additionally, it introduces built-in table-valued functions. I
considered using extensions for this, but there are several reasons in
favor of a dedicated mechanism:
* It simplifies the use of internal functions, structs, etc. For
example, when implementing `json_each` and `json_tree`, direct access to
internals was necessary:
https://github.com/tursodatabase/limbo/pull/1088
* It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/li
mbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my
hardware show that `pragma_table_info()` implemented as an extension is
2.5× slower than the built-in version.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1642
Found while fuzzing nested subqueries. Since subqueries result in nested
plans, it quickly revealed that there can be multiple `DeferredSeek`
instructions issued for different cursors, but our `ProgramState` only
supported one at a time.
Closes#1610
Currently we have this:
program.alloc_cursor_id(Option<String>, CursorType)`
where the String is the table's name or alias ('users' or 'u' in
the query).
This is problematic because this can happen:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.
Instead introduce `CursorKey`, which is a combination of:
1. `TableInternalId`, and
2. index name (Option<String> -- in case of index cursors.
This should provide key uniqueness for cursors:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`
here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
- Instead of using a confusing CheckpointStatus for many different things,
introduce the following statuses:
* PagerCacheflushStatus - cacheflush can result in either:
- the WAL being written to disk and fsynced
- but also a checkpoint to the main BD file, and fsyncing the main DB file
Reflect this in the type.
* WalFsyncStatus - previously CheckpointStatus was also used for this, even
though fsyncing the WAL doesn't checkpoint.
* CheckpointStatus/CheckpointResult is now used only for actual checkpointing.
- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
* This gets rid of a lot of if let Some(...) boilerplate
* For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
The following code reproduces the leak, with memory usage increasing
over time:
```
#[tokio::main]
async fn main() {
let db = Builder::new_local(":memory:").build().await.unwrap();
let conn = db.connect().unwrap();
conn.execute("SELECT load_extension('./target/debug/liblimbo_series');", ())
.await
.unwrap();
loop {
conn.execute("SELECT * FROM generate_series(1,10,2);", ())
.await
.unwrap();
}
}
```
After switching to the system allocator, the leak becomes detectable
with Valgrind:
```
32,000 bytes in 1,000 blocks are definitely lost in loss record 24 of 24
at 0x538580F: malloc (vg_replace_malloc.c:446)
by 0x62E15FA: alloc::alloc::alloc (alloc.rs:99)
by 0x62E172C: alloc::alloc::Global::alloc_impl (alloc.rs:192)
by 0x62E1530: allocate (alloc.rs:254)
by 0x62E1530: alloc::alloc::exchange_malloc (alloc.rs:349)
by 0x62E0271: new<limbo_series::GenerateSeriesCursor> (boxed.rs:257)
by 0x62E0271: open_GenerateSeriesVTab (lib.rs:19)
by 0x425D8FA: limbo_core::VirtualTable::open (lib.rs:732)
by 0x4285DDA: limbo_core::vdbe::execute::op_vopen (execute.rs:890)
by 0x42351E8: limbo_core::vdbe::Program::step (mod.rs:396)
by 0x425C638: limbo_core::Statement::step (lib.rs:610)
by 0x40DB238: limbo::Statement::execute::{{closure}} (lib.rs:181)
by 0x40D9EAF: limbo::Connection::execute::{{closure}} (lib.rs:109)
by 0x40D54A1: example::main::{{closure}} (example.rs:26)
```
Interestingly, when using mimalloc, neither Valgrind nor mimalloc’s
internal statistics report the leak.
DeleteState had a bit too many unnecessary states so I removed them.
Usually we care about having a different state when I/O is triggered
requiring a state to be stored for later.
Furthermore, there was a bug with op_idx_delete where if balance is
triggered, op_idx_delete wouldn't be re-entrant. So a state machine was
added to prevent that from happening.