Commit Graph

8127 Commits

Author SHA1 Message Date
bit-aloo
ffcadd00ae evaluate limit or offset expr 2025-08-26 19:56:12 +05:30
bit-aloo
28439efd09 make offset and limit Expr 2025-08-26 19:56:11 +05:30
bit-aloo
ea3ab2a9c7 add ifNeg op_code 2025-08-26 19:55:42 +05:30
Jussi Saurio
66d00915d7 Merge 'Improve documentation of page pinning' from Jussi Saurio
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2797
2025-08-26 17:18:25 +03:00
Jussi Saurio
bf98bf4576 Merge 'Fix missing functions after revert' from Pedro Muniz
When we reverted #2789, we had already merged #2793. That PR used
some helper methods that were created in #2789, so I just added them
back here and fixed the simulator Dockerfile.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2803
2025-08-26 17:18:06 +03:00
pedrocarlo
aa025c9798 fix missing functions after revert 2025-08-26 10:13:45 -03:00
Pekka Enberg
26ba09c45f Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz"
This reverts commit 71c1b357e4, reversing
changes made to 6bc568ff69 because it
actually makes things slower.
2025-08-26 14:58:21 +03:00
Pekka Enberg
b7bf4b55ed Merge 'Fail CI run if Turso output differs from SQLite in TPC-H queries' from Jussi Saurio
We had missed #2794 because we don't get any visible feedback on PRs if
we return wrong results from TPC-H, so let's make that happen here.

Closes #2798
2025-08-26 14:36:22 +03:00
Jussi Saurio
e65742e5ff Fail CI if tursodb output differs from sqlite in tpc-h queries 2025-08-26 11:30:37 +03:00
Jussi Saurio
bf58d179db Improve documentation of page pinning 2025-08-26 10:13:25 +03:00
Pekka Enberg
5dd1bca4d3 Merge 'Decouple SQL generation from Simulator crate' from Pedro Muniz
Decouple the SQL generation code from the simulator code so that it can
potentially be reused for fuzzing in other crates, and create a
`GenerationContext` trait so that it becomes easier to create
`Simulation Profiles`. Ideally, in further PRs, I want to expand the
`GenerationContext` trait so we can guide the generation with context
from the simulation profile.
Depends on #2789.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2793
2025-08-26 09:41:58 +03:00
Pekka Enberg
3176df64a2 Merge 'Fix: return NULL for rowid() when cursor's null flag is on' from Jussi Saurio
Fixes TPC-H query 13, which returned an incorrect result. In this
specific case, we were returning non-null `IdxRowid` values for the
right-hand side table even when there was no match with the left-hand
side table, meaning the join produced matches even in cases where there
shouldn't have been any.
Closes #2794

Closes #2795
2025-08-26 09:33:49 +03:00
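
The expected behavior described above can be checked against stock SQLite. The following is a minimal sketch using Python's built-in `sqlite3` module; the `customer` and `orders` tables are hypothetical stand-ins and not the TPC-H schema. For a left-hand row with no match, the right-hand table's `rowid` should come back as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1);
""")

# 'bob' has no orders, so o.rowid must be NULL for his row.
rows = conn.execute("""
    SELECT c.name, o.rowid
    FROM customer c LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

for name, order_rowid in rows:
    print(name, order_rowid)
```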
Jussi Saurio
3905f0af46 Add regression test for issue 2794 2025-08-26 09:21:58 +03:00
Jussi Saurio
e52f807c7d Fix: return NULL for rowid() when cursor's null flag is on
Fixes TPC-H query 13, which returned an incorrect result. In this specific
case, we were returning non-null `IdxRowid` values for the right-hand side
table even when there was no match with the left-hand side table, meaning
the join produced matches even in cases where there shouldn't have been any.

Closes #2794
2025-08-26 09:08:48 +03:00
Pekka Enberg
ec73b809a9 antithesis-tests: Enable multi-threading 2025-08-26 08:37:35 +03:00
Pekka Enberg
114ece0375 Merge 'Make fill_cell_payload() safe for async IO and cache spilling' from Jussi Saurio
## Make fill_cell_payload() safe for async IO and cache spilling
### Problems:
1. fill_cell_payload() is not re-entrant because it can yield IO
   when allocating a new overflow page, resulting in the loss of some
   of the input data.
2. fill_cell_payload() in its current form is not safe for cache
   spilling because the previous overflow page in the chain of allocated
   overflow pages can be evicted by a spill caused by the next overflow
   page allocation, invalidating the page pointer and causing corruption.
3. fill_cell_payload() uses raw pointers and `unsafe` as a workaround
   from a previous time when we used to clone `WriteState`, resulting in
   hard-to-read code.
### Solutions:
1. Introduce a new substate to the fill_cell_payload state machine to
   handle re-entrancy with respect to allocating overflow pages.
2. Always pin the current overflow page so that it cannot be evicted
   during the overflow chain construction. Also pin the regular page the
   overflow chain is attached to, because it is accessed immediately
   after fill_cell_payload is done.
3. Remove all explicit usages of `unsafe` from `fill_cell_payload`
   (although our pager is of course still extremely unsafe under the
   hood :] )

Note that solution 2 addresses a problem that arose in the development
of page cache spilling, which is not yet implemented, but will be soon.
### Miscellanea:
1. Renamed a bunch of variables to be clearer
2. Added more comments about what is happening in fill_cell_payload

Closes #2737
2025-08-26 08:36:46 +03:00
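
The re-entrancy idea in solution 1 can be illustrated with a toy model. This is only a sketch under stated assumptions and not the actual Rust implementation: `FillState`, `FakeAllocator`, `fill_payload`, and `PAGE_CAPACITY` are hypothetical names, and page pinning is not modeled. The point is that progress lives in a state object, so a call that has to yield for IO can be re-entered later without losing input.

```python
from dataclasses import dataclass, field

PAGE_CAPACITY = 4  # hypothetical number of payload bytes per overflow page


@dataclass
class FillState:
    offset: int = 0  # bytes of payload already copied into pages
    pages: list = field(default_factory=list)  # overflow chain built so far


class FakeAllocator:
    """Pretends every allocation needs one IO round-trip before it completes."""

    def __init__(self):
        self.ready = False

    def try_allocate(self):
        if not self.ready:
            self.ready = True  # the simulated IO completes before the next call
            return None
        self.ready = False
        return bytearray()


def fill_payload(state, payload, allocator):
    """Return 'io' if the caller must re-enter later, 'done' when finished."""
    while state.offset < len(payload):
        page = allocator.try_allocate()
        if page is None:
            return "io"  # nothing is lost: state.offset still marks our progress
        chunk = payload[state.offset:state.offset + PAGE_CAPACITY]
        page.extend(chunk)
        state.pages.append(page)
        state.offset += len(chunk)
    return "done"


state = FillState()
allocator = FakeAllocator()
payload = b"hello world"
yields = 0
while fill_payload(state, payload, allocator) == "io":
    yields += 1  # a real caller would wait for the IO to complete here
result = b"".join(bytes(p) for p in state.pages)
```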
Pekka Enberg
6e78c23ce7 Merge 'Remove Windows IO in place of Generic IO' from Preston Thorpe
Generic IO and Windows IO were identical, since we don't do anything
Windows-specific (maybe when someone eventually wants to implement an IO
back-end for the Windows `IO_RING` API, we can bring it back :) ).
This uses the exact impl of `WindowsIO`, simply renaming it to
Generic IO, because I don't think GenericIO had ever been used.

Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #2790
2025-08-26 08:33:04 +03:00
Pekka Enberg
8f11311473 Merge 'Improve encryption API' from Avinash Sajjanshetty
This patch brings a bunch of quality-of-life improvements to encryption:
1. Previously, we allowed any string to be used as a key. I have
updated the pragma to `PRAGMA hexkey=''` so that the key is given in
hex. I have also renamed it from `key`, because that name will be used
for the passphrase.
2. Added `PRAGMA cipher` so that users can now select which cipher they
want to use (for now, either `aegis256` or `aes256gcm`).
3. We now set the encryption context when both cipher and key are set.
I also updated tests to reflect this.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2779
2025-08-26 08:32:29 +03:00
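
As an illustration of why a hex-encoded key is easier to validate than an arbitrary string, here is a minimal sketch. `parse_hexkey` is a hypothetical helper, not part of the project, and the 32-byte length is an assumption based on both listed ciphers using 256-bit keys.

```python
def parse_hexkey(hexkey: str, key_len: int = 32) -> bytes:
    """Decode a hex string into raw key bytes and enforce the expected length."""
    try:
        key = bytes.fromhex(hexkey)
    except ValueError as exc:
        raise ValueError("key must be a hexadecimal string") from exc
    if len(key) != key_len:
        raise ValueError(f"expected {key_len} key bytes, got {len(key)}")
    return key


key = parse_hexkey("ab" * 32)  # 64 hex digits decode to 32 bytes
```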
Pekka Enberg
71c1b357e4 Merge 'Remove double indirection in the Parser' from Pedro Muniz
Sometimes in the parser we had enum variants that contained
`Vec<Box<Expr>>`, which is unnecessary and inefficient.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2789
2025-08-26 08:31:17 +03:00
Pekka Enberg
6bc568ff69 Merge 'Update TPC-H running instructions in PERF.md' from Alex Miller
Closes #2756

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2792
2025-08-26 08:29:42 +03:00
pedrocarlo
8010b7d0c7 make simulator use sql_generation crate as dependency 2025-08-25 22:59:31 -03:00
pedrocarlo
d3240844ec refactor Core to remove the double indirection 2025-08-25 22:59:31 -03:00
pedrocarlo
0c1228b484 add Generation context trait to decouple Simulator specific code 2025-08-25 22:59:31 -03:00
pedrocarlo
642060f283 refactor sql_generation/model/query 2025-08-25 22:59:31 -03:00
pedrocarlo
0285bdd72c copy generation code from simulator 2025-08-25 22:59:31 -03:00
pedrocarlo
b16f96b507 create sql_generation crate 2025-08-25 22:59:31 -03:00
Alex Miller
34da6611c1 Update TPC-H running instructions in PERF.md
Closes #2756
2025-08-25 17:43:42 -07:00
Preston Thorpe
401d8e5f74 Merge 'ci: fix merge-pr issue to escape command-line backticks' from Ceferino Patino
Resolves #2778.
I think this issue probably occurs because backticks in a shell command
cause the shell to try to run whatever is inside them as a command.
Added `shlex.quote` to the commit title to hopefully fix this.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2788
2025-08-25 20:33:15 -04:00
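
For context, `shlex.quote` from Python's standard library wraps a string in single quotes when it contains shell metacharacters, so backticks are passed through literally instead of being treated as command substitution. The title and the `echo` command below are made-up examples, not the project's merge script.

```python
import shlex

title = "Fix `rowid()` handling"  # hypothetical PR title containing backticks

quoted = shlex.quote(title)
command = f"echo {quoted}"  # safe to hand to a shell: the backticks stay literal

# Splitting the command the way a POSIX shell would recovers the original title.
recovered = shlex.split(command)[1]
print(command)
```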
PThorpe92
8c64b772e7 Use previous WindowsIO impl as generic IO 2025-08-25 19:04:14 -04:00
C4 Patino
75c85e6284 ci: fix merge-pr issue to escape command-line backticks 2025-08-25 17:59:58 -05:00
pedrocarlo
5108c72a28 remove box from Vec<Box<Expr>> 2025-08-25 19:47:46 -03:00
PThorpe92
177c717f25 Remove windows IO in place of Generic IO 2025-08-25 18:47:21 -04:00
Preston Thorpe
76f2fb5a93 Merge 'Truncate the WAL on last connection close' from Preston Thorpe
`sqlite3.h`
```c
/* Usually, when a database in [WAL mode] is closed or detached from a
** database handle, SQLite checks if there are other connections to the
** same database, and if there are no other database connection (if the
** connection being closed is the last open connection to the database),
** then SQLite performs a [checkpoint] (in truncate mode) before closing the connection and
** deletes the WAL file.
...
```
Currently, the WAL grows unbounded, and because we don't have a `shm`
file, we do not trust `nbackfills`, and we read (and backfill) the
entire WAL every time we open it. So unless a user issues a manual
`PRAGMA wal_checkpoint(truncate);`, this will severely degrade
performance, at least for the first cacheflush each time a database is
opened.
SQLite, when closing the final connection, will automatically run a
checkpoint in truncate mode. We should do this as well :)

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2761
2025-08-25 17:19:42 -04:00
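
The effect of a truncate-mode checkpoint can be observed with stock SQLite. This is a small sketch using Python's `sqlite3` module against a temporary database; it shows SQLite's behavior, not this project's implementation, and the file and table names are arbitrary.

```python
import os
import sqlite3
import tempfile


def wal_sizes_around_truncate():
    """Return the WAL file size before and after a TRUNCATE checkpoint."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        wal_path = db_path + "-wal"
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA journal_mode=WAL").fetchall()
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        before = os.path.getsize(wal_path)
        # Checkpoint all frames into the database file, then truncate the WAL.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        after = os.path.getsize(wal_path)
        conn.close()
        return before, after


before, after = wal_sizes_around_truncate()
print(before, after)
```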
PThorpe92
2d661e3304 Apply review suggestions, add logging 2025-08-25 16:56:43 -04:00
PThorpe92
748e339f68 Make clippy happy 2025-08-25 16:52:34 -04:00
PThorpe92
1b514e6d0f Only checkpoint final remaining DB connection, and use Truncate mode 2025-08-25 16:52:29 -04:00
Pekka Enberg
e57f59d744 Merge 'Fix several issues with integrity_check' from Jussi Saurio
Things that were just wrong:
1. No pages other than the root page were checked, because no looping
was done. Add a loop.
2. Rightmost child page was never added to page stack. Add it.
New integrity check features:
- Add overflow pages to stack as well
- Check that no page is referenced more than once in the tree

Closes #2781
2025-08-25 19:05:32 +03:00
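
The traversal described above can be sketched on a toy page model. This is an illustration only: the dictionary layout (`children`, `rightmost`, `overflow`) and the function name `check_tree` are assumptions made for the example and do not mirror the real page format.

```python
def check_tree(pages, root):
    """Walk every reachable page and report pages referenced more than once."""
    seen = set()
    errors = []
    stack = [root]
    while stack:  # loop until the stack is empty, not just for the root page
        page_no = stack.pop()
        if page_no in seen:
            errors.append(f"page {page_no} referenced more than once")
            continue
        seen.add(page_no)
        page = pages[page_no]
        stack.extend(page.get("children", []))
        if page.get("rightmost") is not None:
            stack.append(page["rightmost"])  # rightmost child is tracked separately
        stack.extend(page.get("overflow", []))  # overflow pages are checked too
    return seen, errors


# Page 4 is (incorrectly) used as an overflow page by both page 2 and page 3.
pages = {
    1: {"children": [2], "rightmost": 3},
    2: {"overflow": [4]},
    3: {"overflow": [4]},
    4: {},
}
seen, errors = check_tree(pages, root=1)
```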
Pekka Enberg
6baa4cd1c0 Merge 'DBSP projection' from Pekka Enberg
This PR implements the ProjectOperator for DBSP circuits.

Closes #2773
2025-08-25 19:05:20 +03:00
Pekka Enberg
9a748fb816 Merge 'sqlite3: Implement sqlite3_malloc() and sqlite3_free()' from Pekka Enberg
Closes #2783
2025-08-25 18:15:52 +03:00
Preston Thorpe
4301f1e0e6 Merge 'Use vectored I/O for appending WAL frames' from Preston Thorpe
This PR adds a method `append_frames_vectored` that takes N frames and
optionally the `db size` which will need to be set for the last (commit)
frame, and it calculates the checksums and submits them as a single
`pwritev` call, drastically reducing the number of syscalls needed for
each write operation.

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2751
2025-08-25 10:59:25 -04:00
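
To show what a vectored write buys, here is a minimal sketch using `os.pwritev`, which Python exposes on platforms that support it. `append_frames` and the frame contents are hypothetical; the real method also computes checksums, which this example omits.

```python
import os
import tempfile


def append_frames(fd, offset, frames):
    """Write all frames at `offset`, using one vectored call where available."""
    if hasattr(os, "pwritev"):
        # One syscall for N buffers instead of N separate writes.
        return os.pwritev(fd, frames, offset)
    # Fallback assumption for platforms without pwritev: join and write once.
    return os.pwrite(fd, b"".join(frames), offset)


frames = [b"frame-1|", b"frame-2|", b"frame-3|"]
with tempfile.TemporaryFile() as f:
    written = append_frames(f.fileno(), 32, frames)
    data = os.pread(f.fileno(), written, 32)
```

Note that a vectored write may in principle write fewer bytes than requested, so production code would check the return value and retry the remainder.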
Pekka Enberg
e3cfd1b68e Merge 'sqlite3: Implement sqlite3_next_stmt()' from Pekka Enberg
Closes #2780
2025-08-25 17:51:56 +03:00
Pekka Enberg
9f6468ec82 sqlite3: Implement sqlite3_malloc() and sqlite3_free() 2025-08-25 17:51:07 +03:00
Pekka Enberg
e3ffc82a1d core/incremental: Fix expression compiler to use new parser 2025-08-25 17:48:20 +03:00
Pekka Enberg
8eab179a53 parser/ast: Add Register AST node 2025-08-25 17:48:17 +03:00
Glauber Costa
ffab4a89a2 addressed review comments from Jussi 2025-08-25 17:48:17 +03:00
Glauber Costa
097510216e implement the projector operator for DBSP
My goal with this patch is to be able to implement the ProjectOperator
for DBSP circuits using VDBE for expression evaluation.

*Not* doing so is dangerous for the following reason: we will end up
with different, subtle, and incompatible behavior between SQLite
expressions if they are used in views versus outside of views.

In fact, even our prototype had them: our projection tests, which
used to pass, were actually wrong =) (sqlite would return something
different if those functions were executed outside the view context)

For optimization reasons, we single out trivial expressions: they don't
have to go through VDBE. Trivial expressions are expressions that only
involve Columns, Literals, and simple operators on elements of the same
type. Even type coercion takes this out of the realm of trivial.

Everything that is not trivial is then translated with translate_expr,
in the same way SQLite does, and then compiled with VDBE.

We can, over time, make this process much better. There are essentially
infinite opportunities for optimization here. But for now, the main
warts are:
* VDBE execution needs a connection
* There is no good way in VDBE to pass parameters to a program.
* It is almost trivial to pollute the original connection. For example,
  we need to issue HALT for the program to stop, but seeing that halt
  will usually cause the program to try and halt the original program.

Subprograms, like the ones we use in triggers are a possible solution,
but they are much more expensive to execute, especially given that our
execution would essentially have to have a program with no other role
than to wrap the subprogram.

Therefore, what I am doing is:
* There is an in-memory database inside the projection operator (an
  obvious optimization is to share it with *all* projection operators).
* We obtain a connection to that database when the operator is created
* We use that connection to execute our VDBE, which offers a clean, safe
  and isolated way to execute the expression.
* We feed the values to the program manually by editing the registers
  directly.
2025-08-25 17:48:17 +03:00
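
The kind of subtle mismatch the message above warns about is easy to reproduce with stock SQLite. The expressions below are illustrative examples chosen for this sketch, not the cases that failed in the project's tests: SQLite's integer division and string-to-number coercion differ from what a naive host-language evaluator would produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite semantics: integer division, numeric-prefix coercion, NULL propagation.
sqlite_results = conn.execute("SELECT 5 / 2, '3abc' + 4, 1 + NULL").fetchone()

# A naive evaluator written in the host language gives a different answer
# for the first expression, and would raise an error for the second.
naive_division = 5 / 2

print(sqlite_results, naive_division)
```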
Glauber Costa
38def26704 Add expr_compiler
To be used in DBSP-based projections. This will compile an expression
to VDBE bytecode and execute it.

To do that we need to add a new type of Expression, which we call a
Register.

This is a way for us to pass parameters to a DBSP program that are
not columns or literals, but inputs from the DBSP deltas.
2025-08-25 17:48:17 +03:00
Glauber Costa
911b4c38a6 do not ignore silent failures from view creation
We have an issue at the moment that when a materialized view fails
to be created, we just swallow the error and leave the database in
a funny state.

We have can_create_view() to detect those issues early, but not all
errors can be detected that early.
2025-08-25 17:48:17 +03:00
Jussi Saurio
8cae10f744 Fix several issues with integrity_check
Things that were just wrong:

1. No pages other than the root page were checked, because no looping
was done. Add a loop.
2. Rightmost child page was never added to page stack. Add it.

New integrity check features:

- Add overflow pages to stack as well
- Check that no page is referenced more than once in the tree
2025-08-25 16:51:57 +03:00
PThorpe92
37a7ec7477 Update append_frames_vectored to use new encryption_ctx and apply review 2025-08-25 09:50:57 -04:00