Commit Graph

598 Commits

Author SHA1 Message Date
Krishna Vishal
d3368a28bc fix merge conflicts 2025-07-14 03:28:55 +05:30
Krishna Vishal
9b315d1d7e Manually inline the record deserialization code for performance.
This is done because the compiler is refusing to inline even after
adding inline hint.
- Get refvalues from directly from registers without using
`make_record`
2025-07-14 03:28:54 +05:30
Jussi Saurio
a48b6d049a Another post-rebase clippy round with 1.88.0 2025-07-12 19:10:56 +03:00
Nils Koch
828d4f5016 fix clippy errors for rust 1.88.0 (auto fix) 2025-07-12 18:58:41 +03:00
Jussi Saurio
38650eee0e VDBE: fix op_insert re-entrancy
when updating last_insert_rowid we call return_if_io!(cursor.rowid())
which yields IO on large records. this causes op_insert to insert and
overwrite the same row many times. we need a state machine to ensure
that the insertion only happens once and the reading of rowid can
independently yield IO without causing a re-insert.
2025-07-09 14:26:40 +03:00
Jussi Saurio
c752058a97 VDBE: introduce state machine for op_idx_insert for more granular IO control
Separates cursor.key_exists_in_index() into a state machine. The problem with
the main branch implementation is this:

`return_if_io!(seek)`
`return_if_io!(cursor.record())`

The latter may yield on IO and cause the seek to start over, causing an infinite
loop. With an explicit state machine we can control and prevent this.
2025-07-09 11:43:18 +03:00
Pere Diaz Bou
232beddf62 vdbe: fix compilation 2025-07-08 16:15:29 +02:00
Pere Diaz Bou
8909e198ae set closed flag for connection to detect force zombies
Let's make sure we don't keep using a connection after it was dropped.
In case of executing a query that was closed we will try to rollback and
return early.
2025-07-08 15:19:20 +02:00
Nikita Sivukhin
d8fb321b16 treat ImmutableRecord as Value::Blob 2025-07-08 10:28:11 +04:00
pedrocarlo
6b60dd06c6 only rollback on write transaction 2025-07-07 12:10:54 -03:00
pedrocarlo
367002fb72 rename change_schema to schema_did_change 2025-07-07 11:58:16 -03:00
pedrocarlo
b85687658d change instrumentation level to INFO 2025-07-07 11:53:45 -03:00
pedrocarlo
5559c45011 more instrumentation + write counter should decrement if pwrite fails 2025-07-07 11:50:21 -03:00
pedrocarlo
b69472b5a3 pass correct change schema to step rollback 2025-07-07 11:50:21 -03:00
pedrocarlo
9632ab0a41 rollback transaction when we fail in step 2025-07-07 11:50:21 -03:00
pedrocarlo
897426a662 add error tracing to relevant functions + rollback transaction in step_end_write_txn + make move_to_root return result 2025-07-07 11:50:21 -03:00
pedrocarlo
56d87cb916 move disable behavior to connection instead of checkpoint 2025-07-03 12:05:53 -03:00
pedrocarlo
db005c81a0 add option to disable wal checkpoint 2025-07-03 12:04:17 -03:00
Pere Diaz Bou
6b16950488 fmt 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
d8658264d9 alter set cookie 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
abf1699dd2 set scheam version and update shared schema in txn 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
4d80b8237d write page1 on database initialization
Page 1 must be initialized and written as soon as possible without
marking page as dirty.
2025-06-26 14:44:23 +02:00
Pere Diaz Bou
22f9cd695d commit_txn track rollback case 2025-06-25 14:00:57 +02:00
Jussi Saurio
cc2e14b11c Read page 1 from pager always, no separate db_header 2025-06-24 14:41:49 -03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00
pedrocarlo
74beac5ea8 ephemeral table for update when rowid is being update 2025-06-20 16:28:10 -03:00
Jussi Saurio
c69047106c Merge 'Implement RowData opcode' from meteorgan
The `RowData` opcode is required to implement #1575.
I haven't found a ideal way to test this PR independently, but I
verified its functionality while working on #1575(to be committed soon),
and it performs effectively.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1756
2025-06-20 21:58:47 +03:00
meteorgan
5c7d1423e7 Support indent for Goto opcode when executing explain 2025-06-17 23:55:39 +08:00
meteorgan
4f742a3a0f Implement RowData opcode 2025-06-16 22:07:35 +08:00
Pekka Enberg
882c5ca168 Merge 'Simple integrity check on btree' from Pere Diaz Bou
This PR adds support for the instruction `IntegrityCk` which performs an
integrity check on the contents of a single table. Next PR I will try to
implement the rest of the integrity check where we would check indexes
containt correct amount of data and some more.
<img width="1151" alt="image" src="https://github.com/user-
attachments/assets/29d54148-55ba-480f-b972-e38587f0a483" />

Closes #1719
2025-06-16 13:46:26 +03:00
Pekka Enberg
90c1e3fc06 Switch Connection to use Arc instead of Rc
Connection needs to be Arc so that bindings can wrap it with `Mutex` for
multi-threading.
2025-06-16 10:43:19 +03:00
pedrocarlo
8dbf09bb42 betters instrumentation for btree operations 2025-06-11 23:34:32 -03:00
Pere Diaz Bou
9383ba207d introduce integrity_check pragma 2025-06-11 11:14:29 +02:00
Jussi Saurio
da2437408e get_new_rowid(): fix off by one - rowids start at 1 2025-06-10 14:16:26 +03:00
Jussi Saurio
2bac140d73 Remove SeekOp::EQ and encode eq_only in LE&GE - needed for iteration direction aware equality seeks 2025-06-10 14:16:26 +03:00
Pere Diaz Bou
77b6896eae implement lazy record and rowid in cursor
This also comments save_context for now
2025-06-10 14:16:26 +03:00
Pekka Enberg
c6ef19396d Merge 'Add support for pragma table-valued functions' from Piotr Rżysko
This PR adds support for table-valued functions for PRAGMAs (see the
[PRAGMA functions section](https://www.sqlite.org/pragma.html)).
Additionally, it introduces built-in table-valued functions. I
considered using extensions for this, but there are several reasons in
favor of a dedicated mechanism:
* It simplifies the use of internal functions, structs, etc. For
example, when implementing `json_each` and `json_tree`, direct access to
internals was necessary:
https://github.com/tursodatabase/limbo/pull/1088
* It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/li
mbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my
hardware show that `pragma_table_info()` implemented as an extension is
2.5× slower than the built-in version.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1642
2025-06-04 09:08:10 +03:00
Piotr Rzysko
6300deb77f Move VTabOpaqueCursor to vtab module 2025-06-01 07:45:57 +02:00
pedrocarlo
0757109676 instrument trace_insn 2025-05-30 11:33:22 -03:00
Pere Diaz Bou
d4f1b8e068 update i64::MAX comment 2025-05-30 14:02:05 +02:00
Pere Diaz Bou
da4190a23e Convert u64 rowid to i64
Rowids can be negative, therefore let's swap to i64
2025-05-30 13:07:31 +02:00
Jussi Saurio
5632a6046e Merge 'Fix: allow DeferredSeek on more than one cursor per program' from Jussi Saurio
Found while fuzzing nested subqueries. Since subqueries result in nested
plans, it quickly revealed that there can be multiple `DeferredSeek`
instructions issued for different cursors, but our `ProgramState` only
supported one at a time.

Closes #1610
2025-05-30 09:39:23 +03:00
Jussi Saurio
69133b3b2e Fix: allow DeferredSeek on more than one cursor per program 2025-05-29 16:05:47 +03:00
Jussi Saurio
cc405dea7e Use new TableReferences struct everywhere 2025-05-29 11:44:56 +03:00
Jussi Saurio
77ce4780d9 Fix ProgramBuilder::cursor_ref not having unique keys
Currently we have this:

program.alloc_cursor_id(Option<String>, CursorType)`

where the String is the table's name or alias ('users' or 'u' in
the query).

This is problematic because this can happen:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.

Instead introduce `CursorKey`, which is a combination of:

1. `TableInternalId`, and
2. index name (Option<String> -- in case of index cursors.

This should provide key uniqueness for cursors:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
2025-05-29 00:59:24 +03:00
Jussi Saurio
360b1fcdae Fix bug: op_vopen should replace cursor slot, not add new one 2025-05-27 10:52:36 +03:00
Jussi Saurio
3ba9f2ab97 Small cleanups to pager/wal/vdbe - mostly naming
- Instead of using a confusing CheckpointStatus for many different things,
  introduce the following statuses:
    * PagerCacheflushStatus - cacheflush can result in either:
      - the WAL being written to disk and fsynced
      - but also a checkpoint to the main BD file, and fsyncing the main DB file

      Reflect this in the type.
    * WalFsyncStatus - previously CheckpointStatus was also used for this, even
      though fsyncing the WAL doesn't checkpoint.
    * CheckpointStatus/CheckpointResult is now used only for actual checkpointing.

- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
  * This gets rid of a lot of if let Some(...) boilerplate
  * For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
2025-05-26 10:37:34 +03:00
Pekka Enberg
e3f71259d8 Rename OwnedValue -> Value
We have not had enough merge conflicts for a while so let's do a
tree-wide rename.
2025-05-15 09:59:46 +03:00
pedrocarlo
3526a206e4 support Unique properly by creating a vec of auto indices 2025-05-14 11:34:39 -03:00
Piotr Rzysko
977b6b331a Fix memory leak caused by unclosed virtual table cursors
The following code reproduces the leak, with memory usage increasing
over time:

```
#[tokio::main]
async fn main() {
    let db = Builder::new_local(":memory:").build().await.unwrap();
    let conn = db.connect().unwrap();

    conn.execute("SELECT load_extension('./target/debug/liblimbo_series');", ())
        .await
        .unwrap();

    loop {
        conn.execute("SELECT * FROM generate_series(1,10,2);", ())
            .await
            .unwrap();
    }
}
```

After switching to the system allocator, the leak becomes detectable
with Valgrind:

```
32,000 bytes in 1,000 blocks are definitely lost in loss record 24 of 24
   at 0x538580F: malloc (vg_replace_malloc.c:446)
   by 0x62E15FA: alloc::alloc::alloc (alloc.rs:99)
   by 0x62E172C: alloc::alloc::Global::alloc_impl (alloc.rs:192)
   by 0x62E1530: allocate (alloc.rs:254)
   by 0x62E1530: alloc::alloc::exchange_malloc (alloc.rs:349)
   by 0x62E0271: new<limbo_series::GenerateSeriesCursor> (boxed.rs:257)
   by 0x62E0271: open_GenerateSeriesVTab (lib.rs:19)
   by 0x425D8FA: limbo_core::VirtualTable::open (lib.rs:732)
   by 0x4285DDA: limbo_core::vdbe::execute::op_vopen (execute.rs:890)
   by 0x42351E8: limbo_core::vdbe::Program::step (mod.rs:396)
   by 0x425C638: limbo_core::Statement::step (lib.rs:610)
   by 0x40DB238: limbo::Statement::execute::{{closure}} (lib.rs:181)
   by 0x40D9EAF: limbo::Connection::execute::{{closure}} (lib.rs:109)
   by 0x40D54A1: example::main::{{closure}} (example.rs:26)
```

Interestingly, when using mimalloc, neither Valgrind nor mimalloc’s
internal statistics report the leak.
2025-05-05 21:26:23 +02:00