Commit Graph

9136 Commits

Author SHA1 Message Date
Glauber Costa
cb7c04ffad return error instead of panic for invalid syntax on views
I have accidentally typed "create materialized views", and noticed
that this panics, instead of returning an error. Fix it.
2025-09-19 03:59:28 -05:00
Glauber Costa
f149b40e75 Implement JOINs in the DBSP circuit
This PR improves the DBSP circuit so that it handles the JOIN operator.
The JOIN operator exposes a weakness of our current model: we usually
pass a list of columns between operators, and find the right column by
name when needed.

But with JOINs, many tables can have the same columns. The operators
will then find the wrong column (same name, different table), and
produce incorrect results.

To fix this, we must do two things:
1) Change the Logical Plan. It needs to track table provenance.
2) Fix the aggregators: they need to operate on indexes, not names.

For the aggregators, note that table provenance is the wrong
abstraction. The aggregator is likely working with a logical table that
is the result of previous nodes in the circuit. So we just need to be
able to tell it which index in the column array it should use.
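
The name-collision problem described above can be illustrated with a minimal sketch. The `Column` type and the helper functions below are hypothetical, invented for this example; they are not the actual Logical Plan or aggregator types. The sketch assumes a joined row laid out as `users.id, users.age, orders.id, orders.amount`.

```rust
// Hypothetical column descriptor: the table name is the "provenance".
#[derive(Debug, Clone, PartialEq)]
struct Column {
    table: &'static str,
    name: &'static str,
}

// Schema of a joined row: users.id, users.age, orders.id, orders.amount.
fn joined_schema() -> Vec<Column> {
    vec![
        Column { table: "users", name: "id" },
        Column { table: "users", name: "age" },
        Column { table: "orders", name: "id" },
        Column { table: "orders", name: "amount" },
    ]
}

// Name-only lookup: returns the first column with that name,
// no matter which table it came from.
fn resolve_by_name(schema: &[Column], name: &str) -> Option<usize> {
    schema.iter().position(|c| c.name == name)
}

// Provenance-aware lookup: the table disambiguates the column.
fn resolve_qualified(schema: &[Column], table: &str, name: &str) -> Option<usize> {
    schema
        .iter()
        .position(|c| c.table == table && c.name == name)
}

// An aggregator that only knows which index of the row to read.
// It never sees table or column names.
fn sum_at(rows: &[Vec<i64>], idx: usize) -> i64 {
    rows.iter().map(|row| row[idx]).sum()
}
```

With this layout, `resolve_by_name(&schema, "id")` yields index 0 even when the query meant `orders.id` (index 2). Resolving the index once, where provenance is known, and handing only that index to the aggregator avoids the ambiguity.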
2025-09-19 03:59:28 -05:00
Glauber Costa
9f3d119a5a move hashable row tests to dbsp.rs
The operator.rs file was so huge that we didn't even notice there was a
test block in the middle of the file that was testing things that had
long since moved to dbsp.rs (the HashableRow). Move the tests there now.
2025-09-19 03:59:28 -05:00
Glauber Costa
e2f0e372a1 move the join operator to its own file.
The code is becoming impossible to reason about with everything in
operator.rs
2025-09-19 03:59:28 -05:00
Glauber Costa
aa8fcdbe54 move the aggregate operator to its own file.
The code is becoming impossible to reason about with everything in
operator.rs
2025-09-19 03:59:24 -05:00
Glauber Costa
7178d8d31c move the project operator to its own file.
The code is becoming impossible to reason about with everything in
operator.rs
2025-09-19 03:57:11 -05:00
Glauber Costa
ee914fc543 move the filter operator to its own file.
The code is becoming impossible to reason about with everything in
operator.rs
2025-09-19 03:57:11 -05:00
Glauber Costa
9747d6c6b6 move the input operator to its own file.
The code is becoming impossible to reason about with everything in
operator.rs
2025-09-19 03:57:11 -05:00
Glauber Costa
6be5eb74d9 Implement the Join Operator
The join operator is also a stateful operator. It keeps the input deltas
stored in the state, for both the left and right branches of the join.

JOINs extract a join key, which consists of the values that were used in
the join's equality statement. That key is now our zset_id, and it points
to a collection of rows.
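
A minimal sketch of a stateful incremental join along these lines, under simplifying assumptions: rows are `Vec<i64>`, the join key is a single integer column, and each delta entry carries a weight (+1 for an insert, -1 for a delete). `JoinState`, `step` and `concat` are hypothetical names for this illustration, not the operator's real API.

```rust
use std::collections::HashMap;

type Row = Vec<i64>;
// A delta is a list of (row, weight) pairs.
type Delta = Vec<(Row, i64)>;

fn concat(left: &[i64], right: &[i64]) -> Row {
    let mut out = left.to_vec();
    out.extend_from_slice(right);
    out
}

// Both sides keep every delta seen so far, grouped by join key.
struct JoinState {
    left_key: usize,
    right_key: usize,
    left: HashMap<i64, Vec<(Row, i64)>>,
    right: HashMap<i64, Vec<(Row, i64)>>,
}

impl JoinState {
    fn new(left_key: usize, right_key: usize) -> Self {
        JoinState { left_key, right_key, left: HashMap::new(), right: HashMap::new() }
    }

    // Output delta = (dL join R_old) + (L_new join dR), where L_new = L_old + dL.
    fn step(&mut self, left_delta: &Delta, right_delta: &Delta) -> Delta {
        let mut out = Delta::new();
        // New left rows against the right state stored so far.
        for (l, lw) in left_delta {
            if let Some(matches) = self.right.get(&l[self.left_key]) {
                for (r, rw) in matches {
                    out.push((concat(l, r), lw * rw));
                }
            }
        }
        // Fold the left delta into the left state.
        for (l, w) in left_delta {
            self.left.entry(l[self.left_key]).or_default().push((l.clone(), *w));
        }
        // New right rows against the updated left state.
        for (r, rw) in right_delta {
            if let Some(matches) = self.left.get(&r[self.right_key]) {
                for (l, lw) in matches {
                    out.push((concat(l, r), lw * rw));
                }
            }
        }
        // Fold the right delta into the right state.
        for (r, w) in right_delta {
            self.right.entry(r[self.right_key]).or_default().push((r.clone(), *w));
        }
        out
    }
}
```

Multiplying the weights means a delete on either side (weight -1) produces a retraction of the previously emitted joined row, without re-reading the base tables.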
2025-09-19 03:57:11 -05:00
Glauber Costa
2e7a45559b add joins to the logical plan 2025-09-19 03:57:11 -05:00
Glauber Costa
5b4a6e5c2d view: catch all tables mentioned, instead of just one.
Ahead of the implementation of JOINs, we need to evolve the
IncrementalView, which currently only accepts a single base table,
to keep a list of tables mentioned in the statement.
2025-09-19 03:57:11 -05:00
Glauber Costa
0b3317d449 extract columns from all tables in case of joins.
Our code for view needs to extract the list of columns used in the view.
We currently extract only from "the base table", but once we have joins,
we need a more complex structure, that keeps the mapping of
(tables, columns).

This actually affects both views and materialized views: for views, the
queries with joins work just fine, because views are just aliases for
a query. But the list of columns returned by pragma table_info on the
view is incorrect. We add a test to make sure it is fixed.

For materialized views, we add extensive tests to make sure that the
columns are extracted correctly.
2025-09-19 03:57:11 -05:00
Pekka Enberg
635ac1c8be Merge 'whopper: Gracefully handle file size limits in simulator' from Avinash Sajjanshetty
fixes #3199
I love running whopper in `chaos` mode. However, sometimes when a file
grows past the 1GB limit, it panics because whopper does not allow larger
files.
This patch changes this behaviour slightly:
1. Increased the file limit to 8GB and also added a soft limit of 6GB
2. The simulator shares a hashmap of `<file_name, file_size>` with the
IO
3. The IO layer updates the logical file size whenever it does inserts
4. In the sim we check the file size and if it exceeds the soft limit,
we stop
5. In chaos mode, this will move on to the next test
The panics can still happen, but they are largely reduced now.
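
The steps above could be sketched roughly as follows. This is an illustration only: `SimIo`, `record_write` and `over_soft_limit` are hypothetical names, and the soft limit is assumed here to be 6 GiB (the description says "6GB" without stating the exact unit).

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Assumed value for the soft limit; the exact unit is not stated in the PR text.
const SOFT_LIMIT: u64 = 6 * 1024 * 1024 * 1024;

// Map of file name -> logical file size, shared by the simulator and the IO layer.
type FileSizes = Arc<Mutex<HashMap<String, u64>>>;

struct SimIo {
    sizes: FileSizes,
}

impl SimIo {
    // Called by the IO layer on every write: the logical size is the
    // furthest byte offset written so far.
    fn record_write(&self, file: &str, offset: u64, len: u64) {
        let mut sizes = self.sizes.lock().unwrap();
        let size = sizes.entry(file.to_string()).or_insert(0);
        *size = (*size).max(offset + len);
    }
}

// Called by the simulator between steps: stop the current test
// gracefully instead of running into the hard limit and panicking.
fn over_soft_limit(sizes: &FileSizes) -> bool {
    sizes.lock().unwrap().values().any(|&size| size > SOFT_LIMIT)
}
```

Because the check only runs between simulator steps, a single large write can still jump from below the soft limit to above the hard limit, which is consistent with the note that panics are reduced rather than eliminated.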

Closes #3216
2025-09-19 11:27:01 +03:00
Avinash Sajjanshetty
56a5069647 end sim test if the size exceeds soft limit 2025-09-19 12:56:47 +05:30
Avinash Sajjanshetty
6c49af34e4 Update IO to track file sizes 2025-09-19 12:47:26 +05:30
Jussi Saurio
867dbfc6ae Merge 'Fix MVCC concurrency bugs' from Jussi Saurio
Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #3214
2025-09-19 10:00:43 +03:00
Jussi Saurio
22a8992b6b mvcc: un-ignore mvcc fuzz test 2025-09-19 09:18:20 +03:00
Jussi Saurio
7410200a9f mvcc: add higher weight for BEGIN CONCURRENT in fuzz test 2025-09-19 09:18:20 +03:00
Jussi Saurio
8555d81a62 mvcc: keep existing begin timestamp when upgrading mv tx to exclusive 2025-09-19 09:18:20 +03:00
Jussi Saurio
c289ab90bc mvcc: fix trying to end pager tx in rollback 2025-09-19 09:18:20 +03:00
Jussi Saurio
ed06c7c423 mvcc: fix hang when non-concurrent tx holds write lock 2025-09-19 09:18:20 +03:00
Jussi Saurio
30596de741 mvcc: don't set tx state to commit before actually committing 2025-09-19 09:18:20 +03:00
Jussi Saurio
c90e729d5f Merge 'core/storage: Wrap Pager::header_ref_state in RwLock' from Pekka Enberg
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3212
2025-09-19 09:16:08 +03:00
Jussi Saurio
5aa788691b Merge 'Fix math functions compatibility issues' from Levy A.
Adds `round`, `hex`, `unhex`, `abs`, `lower`, `upper`, `sign` and `log`
(with base) to the expression fuzzer.
Rounding with the precision argument still has some incompatibilities.

Closes #3160
2025-09-19 09:15:11 +03:00
Jussi Saurio
44988c2eb7 Merge 'core/mvcc: Kill noop storage' from Pekka Enberg
We don't need it for anything.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3213
2025-09-19 09:13:53 +03:00
Pekka Enberg
3f35267b7c core/mvcc: Kill noop storage
We don't need it for anything.
2025-09-19 08:52:57 +03:00
Pekka Enberg
508858dac6 core/storage: Wrap Pager::header_ref_state in RwLock 2025-09-19 08:38:45 +03:00
Pekka Enberg
01c4f22a42 Merge 'simulator: Fix shrinking' from Pedro Muniz
When shrinking, the depending table should start with the
dependencies from the overall property. I broke this in my
previous Sim Connections PR.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3203
2025-09-19 08:29:49 +03:00
Pekka Enberg
0ce6469a4b Merge 'Fix some Rust compilation warnings' from Samuel Marks
Nothing fancy yet; assuming you merge this, I'll do this one next:
```
warning: function pointer comparisons do not produce meaningful results since their addresses are not guaranteed to be unique
   --> core/types.rs:403:5
    |
398 | #[derive(Debug, Clone, PartialEq)]
    |                        --------- in this derive macro expansion
...
402 |     pub step_fn: StepFunction,
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^
403 |     pub finalize_fn: FinalizeFunction,
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: the address of the same function can vary between different codegen units
    = note: furthermore, different functions could have the same address after being merged together
    = note: for more information visit <https://doc.rust-lang.org/nightly/core/ptr/fn.fn_addr_eq.html>
```
And fix a test failure that I resolved in Python (specific to macOS
hosts). Basically this PR is putting my toe in the water to see how open
you are to contribs!
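
For reference, one way to silence the warning quoted above is to replace the derived `PartialEq` with a manual implementation that compares addresses explicitly. The sketch below is an assumption about what such a fix could look like, not the change that was eventually made; `AggFunc` and the function signatures are placeholders. The warning's own note points to `std::ptr::fn_addr_eq`, which expresses the same comparison on toolchains that provide it.

```rust
// Placeholder signatures; the real StepFunction and FinalizeFunction differ.
type StepFunction = fn(i64, i64) -> i64;
type FinalizeFunction = fn(i64) -> i64;

#[derive(Debug, Clone)]
struct AggFunc {
    step_fn: StepFunction,
    finalize_fn: FinalizeFunction,
}

// Manual impl instead of #[derive(PartialEq)]: the address comparison is
// spelled out, so the intent is explicit. As the warning notes, function
// addresses are not guaranteed to be unique or stable across codegen units.
impl PartialEq for AggFunc {
    fn eq(&self, other: &Self) -> bool {
        self.step_fn as usize == other.step_fn as usize
            && self.finalize_fn as usize == other.finalize_fn as usize
    }
}

fn add(acc: i64, x: i64) -> i64 {
    acc + x
}

fn identity(acc: i64) -> i64 {
    acc
}
```

A clone of a value always compares equal to the original, since the pointer values are copied; whether two distinct functions compare unequal is exactly what the compiler declines to guarantee.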

Closes #3211
2025-09-19 08:28:53 +03:00
Jussi Saurio
bab814948c Merge 'core/mvcc: LogicalLog simple append serializer' from Pere Diaz Bou
This is WIP, but since I have it working on a single connection, maybe
it is worth reviewing right now.
I haven't done any header work yet for the on-disk format, but
instead I've basically added the skeleton for logical log transactions
on MVCC to work.
**before merging this pr I have to remove cherry picked commit from
https://github.com/tursodatabase/turso/pull/3156**

Closes #3158
2025-09-19 08:08:07 +03:00
Samuel Marks
e333f151ba [*.rs] Resolve warnings (mostly "hiding a lifetime that's elided elsewhere is confusing") 2025-09-18 22:47:43 -05:00
Preston Thorpe
5819832c4e Merge 'translate/insert: fix program.result_columns when inserting multiple rows' from Preston Thorpe
closes #3206 and https://github.com/tursodatabase/turso-go/issues/21
What happens is that the coroutine overwrites the program state, so the
`RETURNING` result columns are from the `translate_select` that happens
when we insert multiple rows.
```rust
let result = translate_select(schema, select, syms, program, query_destination, connection)?;
// overwrites program.result_columns
```
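
The overwrite can be reproduced in miniature. The sketch below uses a hypothetical `Program` struct and stand-in functions; it does not mirror the real translator, and the ordering shown in `insert_fixed` is only one possible remedy, not necessarily the one this PR applies.

```rust
// Hypothetical, heavily reduced stand-in for the program builder.
#[derive(Default)]
struct Program {
    result_columns: Vec<String>,
}

// Stand-in for translating the nested SELECT that feeds a multi-row
// INSERT: as a side effect it replaces program.result_columns.
fn translate_inner_select(program: &mut Program) {
    program.result_columns = vec!["column1".to_string()];
}

// Problematic ordering: RETURNING columns are set first and then
// clobbered by the nested translation.
fn insert_buggy(program: &mut Program, returning: Vec<String>) {
    program.result_columns = returning;
    translate_inner_select(program);
}

// One possible ordering that avoids the clobbering: set the RETURNING
// columns only after the nested translation has finished.
fn insert_fixed(program: &mut Program, returning: Vec<String>) {
    translate_inner_select(program);
    program.result_columns = returning;
}
```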

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3208
2025-09-18 17:34:42 -04:00
PThorpe92
c941955444 Fix issue with result columns being inappropriate for inserting multiple rows 2025-09-18 14:35:12 -04:00
Pere Diaz Bou
abaf2118a3 clippy 2025-09-18 19:26:46 +02:00
Pere Diaz Bou
402f171ce4 fix compilation error on logical_log 2025-09-18 19:20:37 +02:00
Pere Diaz Bou
242c48c813 core/mvcc: logical log fix offset addition 2025-09-18 19:14:56 +02:00
Pere Diaz Bou
b40e699c8c core/mvcc: don't end pager tx on logical log 2025-09-18 19:12:45 +02:00
Pere Diaz Bou
4f5833c681 fmt 2025-09-18 18:40:13 +02:00
Pere Diaz Bou
0fd704d00f core/mvcc: begin_tx with logical log don't use pager 2025-09-18 18:39:57 +02:00
Pere Diaz Bou
3e7a074f82 core/mvcc: fix review comments 2025-09-18 18:27:57 +02:00
Pere Diaz Bou
ef341338dc core/mvcc: rebase fix 2025-09-18 18:24:55 +02:00
Pere Diaz Bou
ff3c79d5d7 remove mvvmode and set logical log as default 2025-09-18 18:22:25 +02:00
Pere Diaz Bou
0e5b0fe8c4 perf/throughput/turso: add io option 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
e6eb3adcbd core/mvcc/logical-log: sync 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
d53c64e84b core/schema: parse schema rows for MVCC transactions 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
a0555c254d core/mvcc/logical-log: change schema on update 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
ba798076a0 perf/throughput/turso: add env-filter 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
91c04133e9 perf/throughput/turso: allow logical log benchmark 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
50c18ada1c core/mvcc: logical log update header on commit 2025-09-18 18:22:06 +02:00
Pere Diaz Bou
e2824835dc fix all open_file use cases for mvcc mode 2025-09-18 18:22:05 +02:00