Commit Graph

2705 Commits

Author SHA1 Message Date
Jussi Saurio
360b1fcdae Fix bug: op_vopen should replace cursor slot, not add new one 2025-05-27 10:52:36 +03:00
Jussi Saurio
b72b99c973 Merge 'feature: INSERT INTO <table> SELECT' from Pedro Muniz
Closes #1528.
- Modified `translate_select` so that the caller can specify whether the
statement is a top-level statement or a subquery.
- Refactored `translate_insert` to offload the translation of multi-row
VALUES and SELECT statements to `translate_select`.
- I did not try to change much of `populate_column_registers` as I did
not want to break `translate_virtual_table_insert`. Ideally, I would
want to unite this remaining logic, folding `populate_column_registers`
into `populate_columns_multiple_rows` and
`translate_virtual_table_insert` into `translate_insert`. But I think
this may be best suited for a separate PR.
## TODO
- ~Tests~ - *Done*
- ~Need to emit a temp table when we are selecting and inserting into
the Same Table -
https://github.com/sqlite/sqlite/blob/master/src/insert.c#L1369~ -
*Done*
- Optimization when the tables have the exact same schema - open an issue
about it
- Virtual tables do not benefit from this feature yet - open an issue
about it

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1566
2025-05-27 10:50:26 +03:00
Jussi Saurio
3ba9f2ab97 Small cleanups to pager/wal/vdbe - mostly naming
- Instead of using a confusing CheckpointStatus for many different things,
  introduce the following statuses:
    * PagerCacheflushStatus - cacheflush can result in either:
      - the WAL being written to disk and fsynced
      - but also a checkpoint to the main DB file, and fsyncing the main DB file

      Reflect this in the type.
    * WalFsyncStatus - previously CheckpointStatus was also used for this, even
      though fsyncing the WAL doesn't checkpoint.
    * CheckpointStatus/CheckpointResult is now used only for actual checkpointing.

- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
  * This gets rid of a lot of if let Some(...) boilerplate
  * For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
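
The "non-optional WAL plus DummyWAL" point above can be illustrated with a small null-object sketch. This is not the code from the commit: the `Wal` trait, its two methods, `MemWal`, and the `Pager::write_page` helper are hypothetical names chosen for illustration, and only the idea (a do-nothing WAL replaces `Option<...>` checks) comes from the message.

```rust
// Hypothetical sketch: names and signatures are illustrative, not Limbo's actual API.

/// Minimal WAL interface for the purposes of this example.
trait Wal {
    /// Append a page image and return the frame number it was assigned.
    fn append_frame(&mut self, page_id: u32, data: &[u8]) -> u64;
    /// Number of frames currently held in the log.
    fn frame_count(&self) -> u64;
}

/// A WAL that keeps its frames in memory.
#[derive(Default)]
struct MemWal {
    frames: Vec<(u32, Vec<u8>)>,
}

impl Wal for MemWal {
    fn append_frame(&mut self, page_id: u32, data: &[u8]) -> u64 {
        self.frames.push((page_id, data.to_vec()));
        self.frames.len() as u64
    }

    fn frame_count(&self) -> u64 {
        self.frames.len() as u64
    }
}

/// A WAL that does nothing, e.g. for ephemeral structures that are never logged.
struct DummyWal;

impl Wal for DummyWal {
    fn append_frame(&mut self, _page_id: u32, _data: &[u8]) -> u64 {
        0 // nothing is recorded, so no frame number is assigned
    }

    fn frame_count(&self) -> u64 {
        0
    }
}

/// The pager always owns a WAL, so callers never match on an Option.
struct Pager {
    wal: Box<dyn Wal>,
}

impl Pager {
    fn write_page(&mut self, page_id: u32, data: &[u8]) -> u64 {
        // No `if let Some(wal) = ...` needed here.
        self.wal.append_frame(page_id, data)
    }
}
```

A pager built with `Box::new(DummyWal)` goes through the same code path as one built with `Box::new(MemWal::default())`; the difference is only in what the WAL implementation does with the frames.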
2025-05-26 10:37:34 +03:00
pedrocarlo
1410e57112 correct union result_row or yield emission + test 2025-05-26 01:06:26 -03:00
pedrocarlo
ee93316c46 fix num_values detection + emitting correct column for temp_table + tests 2025-05-25 19:15:28 -03:00
pedrocarlo
e3fd1e589e support using a INSERT SELECT that references the same table in both statements 2025-05-25 19:15:28 -03:00
pedrocarlo
90e3c8483d tests with compound select 2025-05-25 19:15:28 -03:00
pedrocarlo
72c1f2f582 fix rebase issues and make code compile by cloning query type. Adjust the compound select behavior with insert 2025-05-25 19:13:40 -03:00
pedrocarlo
c8144340a0 adjust proper ordering for value insert 2025-05-25 19:12:30 -03:00
pedrocarlo
810211b3d1 passing incorrect number of values to virtual table insert 2025-05-25 19:12:30 -03:00
pedrocarlo
4bcfc8ca60 create separate function to populate multiple columns in a multi-row VALUES clause or in an INSERT INTO <table> SELECT. Virtual Table insert is broken, need to fix it still 2025-05-25 19:12:30 -03:00
pedrocarlo
bb7da39c72 remove assumption that translate_select is always called from a top-level context + adjust insert to use translate_select when needed 2025-05-25 19:12:30 -03:00
pedrocarlo
fd9e0db5cc pass the owned ast to translate_insert + remove assumption of a list of values in populate_columns_insert 2025-05-25 19:02:17 -03:00
pedrocarlo
15ffdd3e51 modify translate_select to return number of result columns 2025-05-25 19:02:17 -03:00
Jussi Saurio
07fa3a9668 Rename SelectQueryType to QueryDestination 2025-05-25 21:23:04 +03:00
Jussi Saurio
d893a55c55 UNION 2025-05-25 21:23:04 +03:00
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that this kind of query
transformation can happen with less pain.
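
As a rough illustration of the idea, the sketch below uses simplified stand-ins for the planner types. The struct layouts, the `ColumnRef` type, and the `resolve` and `unnest_example` helpers are assumptions made for this example; only the names `TableInternalId` and `TableReference` come from the message above.

```rust
// Hypothetical sketch: simplified stand-ins for the planner types named above.

/// Stable identifier that does not depend on a table's position in any list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableInternalId(usize);

#[derive(Debug)]
struct TableReference {
    internal_id: TableInternalId,
    name: &'static str,
}

/// Stand-in for `Expr::Column`: it stores the stable id, not a list index.
#[derive(Debug)]
struct ColumnRef {
    table: TableInternalId,
    column: usize,
}

/// Resolve a column's table by id, independent of list order.
fn resolve<'a>(tables: &'a [TableReference], col: &ColumnRef) -> Option<&'a TableReference> {
    tables.iter().find(|t| t.internal_id == col.table)
}

/// Mimics the unnesting example: `orders` moves from the subquery's table
/// list into the parent's list, and `o.amount` stays valid without a rewrite.
fn unnest_example() -> (Vec<TableReference>, ColumnRef) {
    let mut parent = vec![
        TableReference { internal_id: TableInternalId(0), name: "users" },
        TableReference { internal_id: TableInternalId(1), name: "sub" },
    ];
    let subquery = vec![TableReference { internal_id: TableInternalId(2), name: "orders" }];

    // `o.amount`, created while planning the subquery.
    let o_amount = ColumnRef { table: TableInternalId(2), column: 1 };

    // Unnesting: drop the subquery reference and fold its tables into the parent.
    parent.retain(|t| t.name != "sub");
    parent.extend(subquery);

    (parent, o_amount)
}
```

After `unnest_example()`, `orders` sits at position 1 of the parent list, yet `resolve` still finds it through the id stored in `o_amount`, which is the property the positional scheme lacks.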
2025-05-25 20:26:17 +03:00
Jussi Saurio
b5ac095716 Fix off-by-one error in max_frame after WAL load 2025-05-25 19:34:51 +03:00
Jussi Saurio
f388bc571e Merge 'xConnect for virtual tables to query core db connection' from Preston Thorpe
Re-Opening #1076 because it had bit-rotted to a point of no return.
However it has improved. Now with Weak references and no incrementing Rc
strong counts.
This also includes a better test extension that returns info about the
other tables in the schema.
![image](https://github.com/user-attachments/assets/4292dc9c-121e-4ba2-8a51-4533bbcf2afd)
(theme doesn't show rows column)

Closes #1366
2025-05-25 14:37:38 +03:00
Jussi Saurio
621ae60ab5 Merge 'Reconstruct WAL frame cache when WAL is opened' from Jussi Saurio
Fixes #1567
Probably also fixes #1485
Currently we are simply unable to read any WAL frames from disk once a
fresh process w/ Limbo is opened, since we never try to read anything
from disk unless we already have it in our in-memory frame cache.
This commit implements a crude way of reading the entire WAL into memory as
a single buffer and reconstructing the frame cache.
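
A minimal sketch of what such a reconstruction could look like, assuming the standard SQLite WAL layout (a 32-byte file header, then frames made of a 24-byte frame header whose first four bytes are the big-endian page number, followed by one page of data). The function names and the `HashMap<u32, Vec<u64>>` cache shape are assumptions for this example, and salt and checksum validation are deliberately left out.

```rust
use std::collections::HashMap;

const WAL_HEADER_SIZE: usize = 32; // SQLite WAL file header
const FRAME_HEADER_SIZE: usize = 24; // SQLite WAL frame header

/// Scan a whole WAL buffer and map page number -> frame numbers (1-based, in log order).
/// Assumption: no salt/checksum validation; a truncated trailing frame is ignored.
fn rebuild_frame_cache(wal: &[u8], page_size: usize) -> HashMap<u32, Vec<u64>> {
    let mut cache: HashMap<u32, Vec<u64>> = HashMap::new();
    let frame_size = FRAME_HEADER_SIZE + page_size;
    let mut offset = WAL_HEADER_SIZE;
    let mut frame_no: u64 = 1;

    while offset + frame_size <= wal.len() {
        // The first 4 bytes of the frame header hold the page number (big-endian).
        let page_no = u32::from_be_bytes([
            wal[offset],
            wal[offset + 1],
            wal[offset + 2],
            wal[offset + 3],
        ]);
        cache.entry(page_no).or_default().push(frame_no);
        offset += frame_size;
        frame_no += 1;
    }
    cache
}

/// Build a synthetic WAL buffer (zeroed headers and page data) for experimenting.
fn synthetic_wal(page_numbers: &[u32], page_size: usize) -> Vec<u8> {
    let mut buf = vec![0u8; WAL_HEADER_SIZE];
    for &page_no in page_numbers {
        let mut frame = vec![0u8; FRAME_HEADER_SIZE + page_size];
        frame[..4].copy_from_slice(&page_no.to_be_bytes());
        buf.extend_from_slice(&frame);
    }
    buf
}
```

For a WAL built with `synthetic_wal(&[2, 5, 2], 512)`, the rebuilt cache records page 2 in frames 1 and 3 and page 5 in frame 2. A real implementation would also need to stop at the first frame whose salts or checksums do not match the WAL header.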

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1570
2025-05-25 14:35:47 +03:00
Jussi Saurio
385c0d8987 clippy stfu part 2: electric boogaloo 2025-05-25 10:32:23 +03:00
Jussi Saurio
64ef3f1343 simplify condition 2025-05-25 10:22:46 +03:00
Jussi Saurio
20e65c0125 bump max_loops to 100k 2025-05-25 10:21:41 +03:00
Pere Diaz Bou
a91f8aee78 Merge 'set non-shared cache by default' from Pere Diaz Bou
Shared cache requires more locking mechanisms. We still have
multithreading issues not related to shared cache, so it is wise to fix
those first and then incrementally add shared
cache back with locking in place.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1568
2025-05-25 08:50:21 +02:00
PThorpe92
98e1c0ddd4 Remove unused method for converting from ext type without ownership 2025-05-24 17:22:12 -04:00
PThorpe92
cf163f2dc0 Prevent double free in ext connection 2025-05-24 16:49:52 -04:00
PThorpe92
1cacbf1f0d Close statements in extension tests, and use mut pointers for stmt 2025-05-24 16:45:25 -04:00
PThorpe92
d63f9d8cff Make sure all resources are cleaned up properly in xconnect 2025-05-24 16:38:33 -04:00
PThorpe92
a4ed464ec4 Add some traces for errors in xconnect 2025-05-24 15:44:06 -04:00
PThorpe92
999205f896 Add documentation for connection api 2025-05-24 15:19:37 -04:00
Jussi Saurio
09df7e0c48 Merge 'TPC-H with criterion and nyrkio' from Pedro Muniz
Added a benchmark to bench TPC-H with criterion and have it uploaded to
Nyrkio. I did not delete the other tpc-h job because maybe someone uses
it and I'm not aware of it.

Closes #1560
2025-05-24 21:54:48 +03:00
PThorpe92
a2f8b2dfea Fix add check for invalid argv index for vtab constraints in main loop 2025-05-24 14:49:58 -04:00
PThorpe92
d11ef6b9c5 Add execute method to xConnect db interface for vtables 2025-05-24 14:49:58 -04:00
PThorpe92
c2ec6caae1 Finish integrating xConnect into vtable open api 2025-05-24 14:49:58 -04:00
PThorpe92
f61ccc78e8 Add from_ffi_ptr method to create OwnedValue from Ext type without taking ownership 2025-05-24 14:49:58 -04:00
PThorpe92
d51614a4fd Create extern functions to support vtab xConnect in core/ext 2025-05-24 14:49:57 -04:00
Jussi Saurio
fc45e0ec0d Reconstruct WAL frame cache when WAL is opened
Currently we are simply unable to read any WAL frames from disk
once a fresh process w/ Limbo is opened, since we never try to read
anything from disk unless we already have it in our in-memory
frame cache.

This commit implements a crude way of reading the entire WAL into memory
as a single buffer and reconstructing the frame cache.
2025-05-24 18:29:44 +03:00
Jussi Saurio
8ed5334ca7 tests/fuzz: add compound_select_fuzz() 2025-05-24 13:12:41 +03:00
Jussi Saurio
f6443ae742 Support LIMIT with UNION ALL 2025-05-24 13:12:41 +03:00
Jussi Saurio
08bda9cc58 UNION ALL 2025-05-24 13:12:41 +03:00
Pere Diaz Bou
54b1647148 set non-shared cache by default
Shared cache requires more locking mechanisms. We still have
multithreading issues not related to shared cache, so it is wise to fix
those first and then incrementally add shared
cache back with locking in place.
2025-05-24 11:59:54 +02:00
Jussi Saurio
0b2c3298aa Merge 'refactor: introduce walk_expr() and walk_expr_mut() to reduce repetitive pattern matching' from Jussi Saurio
We do a lot of
```rust
match expr {
  ...
}
```
just to find some specific case like `Expr::Column` deep inside an
expression tree. This PR introduces new helpers `walk_expr()` and
`walk_expr_mut()` that handle the tree-walking part, and the business
logic functions where this tree traversal is used can focus on the exact
enum variants of `Expr` that they are interested in.
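
The shape of such helpers can be sketched as follows. The `Expr` enum here is a heavily reduced, hypothetical stand-in for the real AST type, and the signatures of `walk_expr()` and `walk_expr_mut()` are assumptions; the PR's actual helpers may differ (for example in return types or traversal control).

```rust
// Hypothetical sketch: a tiny stand-in for the real `Expr` type.
#[derive(Debug)]
enum Expr {
    Column { table: usize, column: usize },
    Literal(i64),
    Binary(Box<Expr>, Box<Expr>),
    Unary(Box<Expr>),
}

/// Pre-order traversal: call `f` on every node of the tree.
fn walk_expr<F: FnMut(&Expr)>(expr: &Expr, f: &mut F) {
    f(expr);
    match expr {
        Expr::Binary(lhs, rhs) => {
            walk_expr(lhs, f);
            walk_expr(rhs, f);
        }
        Expr::Unary(inner) => walk_expr(inner, f),
        Expr::Column { .. } | Expr::Literal(_) => {}
    }
}

/// Same traversal, but `f` may modify each node in place.
fn walk_expr_mut<F: FnMut(&mut Expr)>(expr: &mut Expr, f: &mut F) {
    f(&mut *expr); // explicit reborrow so `expr` can still be matched below
    match expr {
        Expr::Binary(lhs, rhs) => {
            walk_expr_mut(lhs, f);
            walk_expr_mut(rhs, f);
        }
        Expr::Unary(inner) => walk_expr_mut(inner, f),
        Expr::Column { .. } | Expr::Literal(_) => {}
    }
}

/// Business logic only names the variant it cares about.
fn count_columns(expr: &Expr) -> usize {
    let mut n = 0;
    walk_expr(expr, &mut |e: &Expr| {
        if let Expr::Column { .. } = e {
            n += 1;
        }
    });
    n
}

/// Example rewrite: shift every column's table index by `offset`.
fn shift_tables(expr: &mut Expr, offset: usize) {
    walk_expr_mut(expr, &mut |e: &mut Expr| {
        if let Expr::Column { table, .. } = e {
            *table += offset;
        }
    });
}
```

With this split, adding a new `Expr` variant means updating the two walkers once, while callers such as `count_columns` and `shift_tables` stay unchanged.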

Closes #1564
2025-05-23 22:00:20 +03:00
Jussi Saurio
70433e100d Merge 'btree: fix infinite looping in backwards iteration of btree table' from Jussi Saurio
Closes #1562
Existing "fuzz test" (not really fuzz, but kinda) didn't catch this due
to `LIMIT 3` clause

Closes #1563
2025-05-23 21:46:16 +03:00
pedrocarlo
1c4af7d0aa change sample count to 10 2025-05-23 12:37:03 -03:00
Jussi Saurio
2e095e6d03 Merge 'Add some comments for values statement' from meteorgan
Follow-up to #1549.
This also addresses a warning in CI, as the package `sqlite3-parser`
has been renamed to `limbo_sqlite3_parser`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1565
2025-05-23 17:30:30 +03:00
meteorgan
3bf0ce7fb3 Add some comments for values statement 2025-05-23 22:11:34 +08:00
Jussi Saurio
1a937462b3 Merge 'core/pragma: Add support for update user_version' from Diego Reis
It also changes the type from u32 to i32 since
SQLite supports negative values
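
To see why the signed type matters: in the SQLite file format, the user version is stored as a 4-byte big-endian integer at offset 60 of the 100-byte database header. The helper names below are hypothetical and this is not the code from the PR, only a sketch of the signed round trip.

```rust
// Hypothetical helpers; only the header offset and encoding come from the SQLite file format.
const USER_VERSION_OFFSET: usize = 60;

/// Read the user version from a 100-byte database header as a signed value.
fn read_user_version(header: &[u8; 100]) -> i32 {
    let b = &header[USER_VERSION_OFFSET..USER_VERSION_OFFSET + 4];
    i32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Store a (possibly negative) user version into the header.
fn write_user_version(header: &mut [u8; 100], version: i32) {
    header[USER_VERSION_OFFSET..USER_VERSION_OFFSET + 4].copy_from_slice(&version.to_be_bytes());
}
```

Writing `-1` stores the bytes `FF FF FF FF`; reading them back as `i32` yields `-1` again, whereas a `u32` interpretation of the same bytes would report 4294967295.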

Closes #1559
2025-05-23 17:00:55 +03:00
Jussi Saurio
c18c6a00fa refactor: use walk_expr() in resolving vtab constraints 2025-05-23 16:28:56 +03:00
Jussi Saurio
fbfd2b2c38 refactor: use walk_expr_mut() in rewrite_expr() 2025-05-23 16:27:28 +03:00
Jussi Saurio
362347c474 refactor: use walk_expr() in determine_where_to_eval_expr() 2025-05-23 16:27:28 +03:00