```sql
limbo> select * from products where id between 1 and 3
UNION ALL
select * from products where id between 2 and 4;
┌───┬─────────┬──────┐
│ 1 │ hat │ 79.0 │
├───┼─────────┼──────┤
│ 2 │ cap │ 82.0 │
├───┼─────────┼──────┤
│ 3 │ shirt │ 18.0 │
├───┼─────────┼──────┤
│ 2 │ cap │ 82.0 │
├───┼─────────┼──────┤
│ 3 │ shirt │ 18.0 │
├───┼─────────┼──────┤
│ 4 │ sweater │ 25.0 │
└───┴─────────┴──────┘
limbo> select * from products where id between 1 and 3
UNION
select * from products where id between 2 and 4;
┌───┬─────────┬──────┐
│ 1 │ hat │ 79.0 │
├───┼─────────┼──────┤
│ 2 │ cap │ 82.0 │
├───┼─────────┼──────┤
│ 3 │ shirt │ 18.0 │
├───┼─────────┼──────┤
│ 4 │ sweater │ 25.0 │
└───┴─────────┴──────┘
limbo>
```
Like UNION ALL (#1541), UNION supports LIMIT but not OFFSET or ORDER
BY.
Augments `compound_select_fuzz()` to work with both UNION and UNION ALL.
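To illustrate the semantic difference the two queries above show, here is a minimal sketch in Rust. It is not Limbo's implementation: `union_all` and `union_distinct` are hypothetical helpers operating on in-memory rows, and the first-seen output order of `union_distinct` is an assumption made for the example, since SQL does not guarantee row order without ORDER BY.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// UNION ALL: concatenate both inputs and keep every row, duplicates included.
pub fn union_all<T: Clone>(left: &[T], right: &[T]) -> Vec<T> {
    left.iter().chain(right.iter()).cloned().collect()
}

/// UNION: concatenate both inputs but emit each distinct row only once.
/// Rows are emitted in first-seen order (an assumption of this sketch).
pub fn union_distinct<T: Clone + Eq + Hash>(left: &[T], right: &[T]) -> Vec<T> {
    let mut seen = HashSet::new();
    left.iter()
        .chain(right.iter())
        // `insert` returns false when the row was already seen, which drops it.
        .filter(|row| seen.insert((*row).clone()))
        .cloned()
        .collect()
}
```

With the `products` rows from the session above, `union_all` yields six rows and `union_distinct` yields four.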
Closes #1545
Closes #1557
Currently our "table id"/"table no"/"table idx" references always use
the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:
```rust
Expr::Column { table: 0, column: 3, .. }
```
refers to the 0'th table in the `table_references` list.
This is a fragile approach because it assumes the `table_references` list
is stable for the lifetime of the query processing. This has so far been
the case, but there exist certain query transformations, e.g. subquery
unnesting, that may fold new table references from a subquery (which has
its own table ref list) into the table reference list of the parent.
If such a transformation is made, then potentially all of the
`Expr::Column` references to tables will become invalid. Consider this
example:
```sql
-- Assume tables: users(id, age), orders(user_id, amount)
-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
(SELECT user_id, SUM(amount) as total
FROM orders o
WHERE o.amount > 100
GROUP BY o.user_id) sub
WHERE u.id = sub.user_id
-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:
SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;
-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```
We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.
Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that these kinds of query
transformations can happen with less pain. I used a separate newtype
struct for `TableInternalId` because it made the refactor a lot easier:
there was no need to spend time working out which `usize` is which.
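The idea can be sketched as follows. This is a simplified illustration, not the PR's actual code: `IdGen`, `TableRef`, `ColumnRef`, `resolve`, and `fold_example` are hypothetical names, and the real `TableReference` and `Expr::Column` carry more fields. The point is that a column reference stores an id handed out once per table reference, and lookup searches by id rather than by position.

```rust
/// Hypothetical stand-in for the PR's `TableInternalId` newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TableInternalId(usize);

/// Hands out ids that are unique across a query and all of its subqueries.
#[derive(Default)]
pub struct IdGen {
    next: usize,
}

impl IdGen {
    pub fn next_id(&mut self) -> TableInternalId {
        let id = TableInternalId(self.next);
        self.next += 1;
        id
    }
}

/// Simplified table reference: only the stable id and a name.
#[derive(Debug)]
pub struct TableRef {
    pub internal_id: TableInternalId,
    pub name: String,
}

/// Simplified column reference: points at a table by id, not by list index.
#[derive(Debug, Clone, Copy)]
pub struct ColumnRef {
    pub table: TableInternalId,
    pub column: usize,
}

/// Resolve a column's table by searching for its id.
pub fn resolve<'a>(tables: &'a [TableRef], col: ColumnRef) -> Option<&'a TableRef> {
    tables.iter().find(|t| t.internal_id == col.table)
}

/// Folds a subquery's table list into the parent's, reorders the result,
/// and returns the table names that `u.id` and `o.amount` resolve to.
pub fn fold_example() -> (String, String) {
    let mut ids = IdGen::default();
    let users = TableRef { internal_id: ids.next_id(), name: "users".to_string() };
    let orders = TableRef { internal_id: ids.next_id(), name: "orders".to_string() };

    let u_id = ColumnRef { table: users.internal_id, column: 0 };
    let o_amount = ColumnRef { table: orders.internal_id, column: 1 };

    let mut outer = vec![users]; // parent query's table list
    let inner = vec![orders]; // subquery's table list
    outer.extend(inner); // unnesting folds the subquery's tables into the parent
    outer.reverse(); // positions change, e.g. after join reordering

    (
        resolve(&outer, u_id).unwrap().name.clone(),
        resolve(&outer, o_amount).unwrap().name.clone(),
    )
}
```

Both references still resolve to the right tables after folding and reordering, with no expression rewriting. The trade-off in this sketch is a linear search per lookup instead of direct indexing.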
---
Potential follow-up: `join_order` can be removed from `SelectPlan`.
The `table_references` vec can simply be sorted after join reordering,
since the `Expr::Column` references will continue to be valid.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes #1561
Reopening #1076 because it had bit-rotted past the point of no return.
This version is improved: it uses `Weak` references and no longer
increments `Rc` strong counts.
This also includes a better test extension that returns info about the
other tables in the schema.

(theme doesn't show rows column)
Closes #1366
Fixes #1567
Probably also fixes #1485
Currently we are unable to read any WAL frames from disk when a fresh
process opens a database with Limbo, because we never try to read a
frame from disk unless it is already in our in-memory frame cache.
This commit implements a crude fix: read the entire WAL into memory as
a single buffer and reconstruct the frame cache from it.
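As a rough sketch of what such a reconstruction involves, the function below walks a WAL image held in one buffer and rebuilds a page-to-frames map. It relies on the SQLite WAL layout (32-byte file header with the page size as a big-endian u32 at offset 8, then frames made of a 24-byte header whose first four bytes are the page number, followed by the page data). Everything else is an assumption for illustration: `rebuild_frame_cache` and `build_wal` are hypothetical names, the shape of the cache is a guess, and salt, checksum, and commit-frame validation is omitted, which a real reader needs in order to stop at the last valid committed frame.

```rust
use std::collections::HashMap;

const WAL_HEADER_SIZE: usize = 32;
const FRAME_HEADER_SIZE: usize = 24;

fn read_u32_be(buf: &[u8], offset: usize) -> u32 {
    u32::from_be_bytes([buf[offset], buf[offset + 1], buf[offset + 2], buf[offset + 3]])
}

/// Rebuilds a map from page number to the 1-based frame numbers holding
/// versions of that page, in ascending order. Returns None if the buffer is
/// too short to contain a WAL header or the header's page size is zero.
/// NOTE: no salt/checksum/commit validation is done in this sketch.
pub fn rebuild_frame_cache(wal: &[u8]) -> Option<HashMap<u32, Vec<u64>>> {
    if wal.len() < WAL_HEADER_SIZE {
        return None;
    }
    let page_size = read_u32_be(wal, 8) as usize;
    if page_size == 0 {
        return None;
    }
    let frame_size = FRAME_HEADER_SIZE + page_size;
    let mut cache: HashMap<u32, Vec<u64>> = HashMap::new();
    // `chunks_exact` ignores a trailing partial frame, if any.
    for (i, frame) in wal[WAL_HEADER_SIZE..].chunks_exact(frame_size).enumerate() {
        let page_no = read_u32_be(frame, 0);
        cache.entry(page_no).or_default().push(i as u64 + 1);
    }
    Some(cache)
}

/// Helper for trying the sketch out: builds a zero-filled WAL image with one
/// frame per entry in `pages`.
pub fn build_wal(page_size: u32, pages: &[u32]) -> Vec<u8> {
    let mut wal = vec![0u8; WAL_HEADER_SIZE];
    wal[8..12].copy_from_slice(&page_size.to_be_bytes());
    for &page_no in pages {
        let mut frame = vec![0u8; FRAME_HEADER_SIZE + page_size as usize];
        frame[0..4].copy_from_slice(&page_no.to_be_bytes());
        wal.extend_from_slice(&frame);
    }
    wal
}
```

For a WAL whose frames hold pages 1, 2, 1, the map records page 1 in frames 1 and 3 and page 2 in frame 2; a reader would use the highest frame number for a page to find its newest version.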
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #1570
Shared cache requires more locking mechanisms. We still have
multi-threading issues unrelated to shared cache, so it is wise to fix
those first and then incrementally add shared cache back with locking
in place.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #1568
Added a benchmark that runs TPC-H with criterion and uploads the results
to Nyrkio. I did not delete the other TPC-H job because someone may be
using it without my knowing.
Closes #1560
Adds support for `UNION ALL` and introduces `Plan::CompoundSelect` so
that it can be extended to support `UNION`, `EXCEPT`, and `INTERSECT`
as well.
```tcl
do_execsql_test_on_specific_db {:memory:} select-union-all-1 {
CREATE TABLE t1(x INTEGER);
CREATE TABLE t2(x INTEGER);
CREATE TABLE t3(x INTEGER);
INSERT INTO t1 VALUES(1),(2),(3);
INSERT INTO t2 VALUES(4),(5),(6);
INSERT INTO t3 VALUES(7),(8),(9);
SELECT x FROM t1
UNION ALL
SELECT x FROM t2
UNION ALL
SELECT x FROM t3;
} {1
2
3
4
5
6
7
8
9}
do_execsql_test_on_specific_db {:memory:} select-union-all-with-filters {
CREATE TABLE t4(x INTEGER);
CREATE TABLE t5(x INTEGER);
CREATE TABLE t6(x INTEGER);
INSERT INTO t4 VALUES(1),(2),(3),(4);
INSERT INTO t5 VALUES(5),(6),(7),(8);
INSERT INTO t6 VALUES(9),(10),(11),(12);
SELECT x FROM t4 WHERE x > 2
UNION ALL
SELECT x FROM t5 WHERE x < 7
UNION ALL
SELECT x FROM t6 WHERE x = 10;
} {3
4
5
6
10}
```
Supports LIMIT. Currently does not support `WITH()`, `OFFSET` or `ORDER
BY` and explicitly returns a parse error if those are present.
Closes #1541