Commit Graph

149 Commits

Author SHA1 Message Date
Jussi Saurio
a5aeff9a3d Fix index insert accidentally double-inserting after balance 2025-06-10 14:16:26 +03:00
Jussi Saurio
844461d20b update and delete fixes 2025-06-10 14:16:26 +03:00
Jussi Saurio
d81f5f67bd insert spaghetti fixes 2025-06-10 14:16:26 +03:00
Jussi Saurio
1b4bef9c7c Fix op_idx_delete infinite seeking loop 2025-06-10 14:16:26 +03:00
Jussi Saurio
2bac140d73 Remove SeekOp::EQ and encode eq_only in LE&GE - needed for iteration direction aware equality seeks 2025-06-10 14:16:26 +03:00
Pere Diaz Bou
681df9b1eb fix get record 2025-06-10 14:16:26 +03:00
Pere Diaz Bou
77b6896eae implement lazy record and rowid in cursor
This also comments save_context for now
2025-06-10 14:16:26 +03:00
pedrocarlo
a9ed8dd288 namespace exec_min and exec_max to Value for reuse in simulator 2025-06-09 11:59:44 -03:00
pedrocarlo
39f85ffd03 namespace exec_like to Value 2025-06-09 11:39:55 -03:00
pedrocarlo
6c95a88533 namespace many functions to Value 2025-06-09 11:38:15 -03:00
Jussi Saurio
2075e5f3eb Fix UPDATE always inserting only nulls into non-unique indexes 2025-06-09 08:51:23 +03:00
Zaid Humayun
e994adfb40 Persisting database header and pointer map page to cache
This commit ensures that the metadata in the database header and the pointer map pages allocated are correctly persisted to the page cache. This was not being done earlier.
2025-06-06 23:14:25 +05:30
Zaid Humayun
1f5025541c addresses comment https://github.com/tursodatabase/limbo/pull/1600#discussion_r2115796655 by @jussisaurio
this commit ensures that ptrmap operations return a CursorResult so operation can be suspended & later retried
2025-06-06 23:14:25 +05:30
Zaid Humayun
5827a33517 Beginnings of AUTOVACUUM
This commit introduces AUTOVACUUM to Limbo. It introduces the concept of ptrmap pages and also adds some additional instructions that are required to make AUTOVACUUM PRAGMA work
2025-06-06 23:14:22 +05:30
Jussi Saurio
2087393d22 Merge 'Write database header via normal pager route' from meteorgan
Closes: #1613

Closes #1634
2025-06-04 09:39:14 +03:00
Pekka Enberg
c6ef19396d Merge 'Add support for pragma table-valued functions' from Piotr Rżysko
This PR adds support for table-valued functions for PRAGMAs (see the
[PRAGMA functions section](https://www.sqlite.org/pragma.html)).
Additionally, it introduces built-in table-valued functions. I
considered using extensions for this, but there are several reasons in
favor of a dedicated mechanism:
* It simplifies the use of internal functions, structs, etc. For
example, when implementing `json_each` and `json_tree`, direct access to
internals was necessary:
https://github.com/tursodatabase/limbo/pull/1088
* It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/li
mbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my
hardware show that `pragma_table_info()` implemented as an extension is
2.5× slower than the built-in version.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1642
2025-06-04 09:08:10 +03:00
meteorgan
f2bf6251cd write database header via normal pager route 2025-06-03 22:06:08 +08:00
Piotr Rzysko
4d35e36b77 Introduce virtual table types 2025-06-01 07:45:57 +02:00
Piotr Rzysko
b291179554 Extract cursor logic from VirtualTable into VirtualTableCursor 2025-06-01 07:45:57 +02:00
pedrocarlo
33480540f1 make cursor seek reentrant 2025-05-30 13:25:46 -03:00
Pere Diaz Bou
da4190a23e Convert u64 rowid to i64
Rowids can be negative, therefore let's swap to i64
2025-05-30 13:07:31 +02:00
Jussi Saurio
819a6138d0 Merge 'Fix: aggregate regs must be initialized as NULL at the start' from Jussi Saurio
Again found when fuzzing nested where clause subqueries:
Aggregate registers need to be NULLed at the start because the same
registers might be reused on another invocation of a subquery, and if
they are not NULLed, the 2nd invocation of the same subquery will have
values left over from the first invocation.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1614
2025-05-30 09:39:37 +03:00
Jussi Saurio
f8257df77b Fix: aggregate regs must be initialized as NULL at the start 2025-05-29 18:44:53 +03:00
Jussi Saurio
69133b3b2e Fix: allow DeferredSeek on more than one cursor per program 2025-05-29 16:05:47 +03:00
Pere Diaz Bou
28bd24b7d4 clear page cache on transaction failure
This is the first step towards rollback, since we still don't spill
pages with WAL, we can simply invalidate page cache in case of failure.
2025-05-28 15:54:28 +02:00
Jussi Saurio
ad0f2bb399 Merge 'Small VDBE insn tweaks' from Jussi Saurio
1. allow calling op_null with Insn::BeginSubrtn
    - BeginSubrtn is identical to Null, but named differently so that
its use in context is clearer
2. Insn::Return: add possibility to fallthrough on non-integer values as
per sqlite spec

Closes #1588
2025-05-27 20:19:31 +03:00
meteorgan
d9d3a5ecbb Use the SetCookie opcode to implement user_version pragma 2025-05-28 00:31:11 +08:00
Jussi Saurio
6914d61180 allow calling op_null with Insn::BeginSubrtn 2025-05-27 19:09:15 +03:00
Jussi Saurio
70965f4b28 Insn::Return: add possibility to fallthrough on non-integer values as per sqlite spec 2025-05-27 19:09:10 +03:00
Jussi Saurio
a88e1c38f3 Merge 'Fix bug: op_vopen should replace cursor slot, not add new one' from Jussi Saurio
Found this when reviewing #1528 locally and this was crashing
```sql
INSERT INTO t SELECT * FROM generate_series(1,10,1);
```
Reason was that `op_vopen` was not replacing the already allocated
cursor slot, but using `.insert()`

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1583
2025-05-27 12:50:11 +03:00
Pere Diaz Bou
312bb5205a Merge 'Reset idx delete state after successful finish' from Pere Diaz Bou
If we don't reset the state of `IdxDelete`, next `IdxDelete` will start
in `Deleting` state which is completely wrong since it should seek from
the start.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1584
2025-05-27 11:31:25 +02:00
Pere Diaz Bou
a5a8a52a07 reset-idx-delete-state 2025-05-27 10:47:21 +02:00
Jussi Saurio
360b1fcdae Fix bug: op_vopen should replace cursor slot, not add new one 2025-05-27 10:52:36 +03:00
Jussi Saurio
3ba9f2ab97 Small cleanups to pager/wal/vdbe - mostly naming
- Instead of using a confusing CheckpointStatus for many different things,
  introduce the following statuses:
    * PagerCacheflushStatus - cacheflush can result in either:
      - the WAL being written to disk and fsynced
      - but also a checkpoint to the main BD file, and fsyncing the main DB file

      Reflect this in the type.
    * WalFsyncStatus - previously CheckpointStatus was also used for this, even
      though fsyncing the WAL doesn't checkpoint.
    * CheckpointStatus/CheckpointResult is now used only for actual checkpointing.

- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
  * This gets rid of a lot of if let Some(...) boilerplate
  * For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
2025-05-26 10:37:34 +03:00
PThorpe92
c2ec6caae1 Finish integrating xConnect into vtable open api 2025-05-24 14:49:58 -04:00
Jussi Saurio
597020bc0c Merge 'Support values statement and values in select' from meteorgan
Close: #866
**limbo output**:
```
limbo> explain values(1, 2);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     5     0                    0   Start at 5
1     Integer            1     1     0                    0   r[1]=1
2     Integer            2     2     0                    0   r[2]=2
3     ResultRow          1     2     0                    0   output=r[1..2]
4     Halt               0     0     0                    0
5     Goto               0     1     0                    0

limbo> explain values(1, 2), (3, 4);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     InitCoroutine      1     9     2                    0
2     Integer            1     2     0                    0   r[2]=1
3     Integer            2     3     0                    0   r[3]=2
4     Yield              1     0     0                    0
5     Integer            3     2     0                    0   r[2]=3
6     Integer            4     3     0                    0   r[3]=4
7     Yield              1     0     0                    0
8     EndCoroutine       1     0     0                    0
9     InitCoroutine      1     0     2                    0
10    Yield              1     15    0                    0
11    Copy               2     4     0                    0   r[4]=r[2]
12    Copy               3     5     0                    0   r[5]=r[3]
13    ResultRow          4     2     0                    0   output=r[4..5]
14    Goto               0     10    0                    0
15    Halt               0     0     0                    0
16    Goto               0     1     0                    0

limbo> explain select * from (values(1, 2), (3, 4));
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     InitCoroutine      1     9     2                    0
2     Integer            1     2     0                    0   r[2]=1
3     Integer            2     3     0                    0   r[3]=2
4     Yield              1     0     0                    0
5     Integer            3     2     0                    0   r[2]=3
6     Integer            4     3     0                    0   r[3]=4
7     Yield              1     0     0                    0
8     EndCoroutine       1     0     0                    0
9     InitCoroutine      1     0     2                    0
10    Yield              1     15    0                    0
11    Copy               2     4     0                    0   r[4]=r[2]
12    Copy               3     5     0                    0   r[5]=r[3]
13    ResultRow          4     2     0                    0   output=r[4..5]
14    Goto               0     10    0                    0
15    Halt               0     0     0                    0
16    Transaction        0     0     0                    0   write=false
17    Goto               0     1     0                    0
```
**sqlite output**:
```
sqlite> explain values(1, 2);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     5     0                    0   Start at 5
1     Integer        1     1     0                    0   r[1]=1
2     Integer        2     2     0                    0   r[2]=2
3     ResultRow      1     2     0                    0   output=r[1..2]
4     Halt           0     0     0                    0
5     Goto           0     1     0                    0
sqlite> explain values(1, 2), (3, 4);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     16    0                    0   Start at 16
1     InitCoroutine  1     9     2                    0
2     Integer        1     4     0                    0   r[4]=1
3     Integer        2     5     0                    0   r[5]=2
4     Yield          1     0     0                    0
5     Integer        3     4     0                    0   r[4]=3
6     Integer        4     5     0                    0   r[5]=4
7     Yield          1     0     0                    0
8     EndCoroutine   1     0     0                    0
9     InitCoroutine  1     0     2                    0
10      Yield          1     15    0                    0   next row of 2-ROW VALUES CLAUSE
11      Copy           4     8     0                    2   r[8]=r[4]
12      Copy           5     9     0                    2   r[9]=r[5]
13      ResultRow      8     2     0                    0   output=r[8..9]
14    Goto           0     10    0                    0
15    Halt           0     0     0                    0
16    Goto           0     1     0                    0
sqlite>  explain select * from (values(1, 2), (3, 4));
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     16    0                    0   Start at 16
1     InitCoroutine  1     9     2                    0
2     Integer        1     4     0                    0   r[4]=1
3     Integer        2     5     0                    0   r[5]=2
4     Yield          1     0     0                    0
5     Integer        3     4     0                    0   r[4]=3
6     Integer        4     5     0                    0   r[5]=4
7     Yield          1     0     0                    0
8     EndCoroutine   1     0     0                    0
9     InitCoroutine  1     0     2                    0
10      Yield          1     15    0                    0   next row of 2-ROW VALUES CLAUSE
11      Copy           4     8     0                    2   r[8]=r[4]
12      Copy           5     9     0                    2   r[9]=r[5]
13      ResultRow      8     2     0                    0   output=r[8..9]
14    Goto           0     10    0                    0
15    Halt           0     0     0                    0
16    Goto           0     1     0                    0
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1549
2025-05-23 13:56:31 +03:00
Zaid Humayun
4312d371fb addresses comment https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102606810 by @jussisaurio
this commit changes the btree_destroy() signature to return an Option<usize>. This more closely resembles Rust semantics instead of passing a pointer to a usize.
However, I'm unsure if I'm handling the cursor result correctly
2025-05-23 00:46:05 +05:30
meteorgan
34e05ef974 make values work in subquery 2025-05-23 00:30:04 +08:00
Zaid Humayun
4072a41c9c Drop Table now uses an ephemeral table as a scratch table
Now when dropping a table, an ephemeral table is created as a scratch table. If a root page of some other table is moved into the page occupied by the root page of the table being dropped, that row is first written into an ephemeral table. Then on a next pass, it is deleted from the schema table and then re-inserted with the new root page.

This happens during AUTOVACUUM when deleting a root page will force the last root page to move into the slot being vacated by the root page of the table being deleted
2025-05-22 19:39:46 +05:30
Jussi Saurio
533a00eae3 Fix bug in op_decr_jump_zero() 2025-05-22 11:40:49 +03:00
Pere Diaz Bou
35f7317724 add default page cache 2025-05-21 14:11:21 +02:00
Alecco
4ef3c1d04d page_cache: fix insert and evict logic
insert() fails if key exists (there shouldn't be two) and panics if
it's different pages, and also fails if it can't make room for the page.

Replaced the limited pop_if_not_dirty() function with make_room_for().
It tries to evict many pages as requested spare capacity. It should come
handy later by resize() and Pager. make_room_for() tries to make room or
fails if it can't evict enough entries.

For make_room_for() I also tried with an all-or-nothing approach, so if
say a query requests a lot more than possible to make room for, it
doesn't evict a bunch of pages from the cache that might be useful. But
implementing this approach got very complicated since it needs to keep
exclusive PageRefs and collecting this caused segfaults. Might be worth
trying again in the future. But beware the rabbit hole.

Updated page cache test logic for new insert rules.

Updated Pager.allocate_page() to handle failure logic but needs further
work. This is to show new cache insert handling. There are many places
to update.

Left comments on callers of pager and page cache needing to update
error handling, for now.
2025-05-21 14:09:39 +02:00
Jussi Saurio
42dc824794 Fix: make OpenEphemeral use new_index() instead of new() 2025-05-20 14:22:17 +03:00
Jussi Saurio
35350a2368 Add IndexKeyInfo to btree 2025-05-20 14:22:17 +03:00
Pekka Enberg
e102cd0be5 Merge 'Add support for DISTINCT aggregate functions' from Jussi Saurio
Reviewable commit by commit. CI failures are not related.
Adds support for e.g. `select first_name, sum(distinct age),
count(distinct age), avg(distinct age) from users group by 1`
Implementation details:
- Creates an ephemeral index per distinct aggregate, and jumps over the
accumulation step if a duplicate is found

Closes #1507
2025-05-20 13:58:57 +03:00
pedrocarlo
52533cab40 only pass collations for index in cursor + adhere to order of columns in index 2025-05-19 15:22:55 -03:00
pedrocarlo
5b15d6aa32 Get the table correctly from the connection instead of table_references + test to confirm unique constraint 2025-05-19 15:22:55 -03:00
pedrocarlo
4a3119786e refactor BtreeCursor and Sorter to accept Vec of collations 2025-05-19 15:22:55 -03:00
pedrocarlo
f28ce2b757 add collations to btree cursor 2025-05-19 15:22:55 -03:00
pedrocarlo
5bd47d7462 post rebase adjustments to accomodate new instructions that were created before the merge conflicts 2025-05-19 15:22:15 -03:00