Commit Graph

10673 Commits

Author SHA1 Message Date
Glauber Costa
7e4bacca55 remove join operator
I am 100% sure they are total bullshit by now, since we don't implement
the join operator yet. The code evolved a lot, and in every turn there
are issues with aggregators, projectors, filters... some subtle, some
not so subtle.

We keep having to patch join slightly as we make changes to the API, but
we don't truly exercise whether or not they keep working because there
is no support for them in the views. Therefore: let's remove it. We'll
bring it back later.
2025-08-27 11:18:54 -05:00
Glauber Costa
05b275f865 remove min/max and add more tests for other aggregations
min/max require O(N) storage because of deletions. It is easy to see
why: if you *add* a new row, you can quickly and incrementally check
if it is smaller / larger than the previous accumulator.

But when you *delete* a row you can't do that and have to check the
previous values.

Feldera uses something called "traces" which to me look a lot like
indexes. When we implement materialization, this is easy to do. But to
avoid having something broken, we'll just disable min / max until then.
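
The storage argument above can be sketched as follows. This is a minimal illustration, not the operator code from this commit: `IncrementalSum` and `IncrementalMin` are hypothetical names, and a weight of +1 / -1 is assumed to stand for insert / delete.

```rust
use std::collections::BTreeMap;

/// SUM needs O(1) state: a deletion simply subtracts the value.
#[derive(Default)]
struct IncrementalSum {
    total: i64,
}

impl IncrementalSum {
    /// `weight` is +1 for an insert and -1 for a delete (assumed convention).
    fn apply(&mut self, value: i64, weight: i64) {
        self.total += value * weight;
    }
}

/// MIN has to remember every live value (here as a multiset keyed by value),
/// because deleting the current minimum requires finding the next-smallest one.
#[derive(Default)]
struct IncrementalMin {
    counts: BTreeMap<i64, i64>,
}

impl IncrementalMin {
    fn apply(&mut self, value: i64, weight: i64) {
        let count = self.counts.entry(value).or_insert(0);
        *count += weight;
        if *count <= 0 {
            // No live copies of this value remain.
            self.counts.remove(&value);
        }
    }

    fn min(&self) -> Option<i64> {
        // BTreeMap keeps keys sorted, so the first key is the minimum.
        self.counts.keys().next().copied()
    }
}

fn main() {
    let mut sum = IncrementalSum::default();
    let mut min = IncrementalMin::default();
    for v in [5, 3, 8] {
        sum.apply(v, 1);
        min.apply(v, 1);
    }
    // Deleting 3 (the current minimum) only works because 5 and 8 were kept.
    sum.apply(3, -1);
    min.apply(3, -1);
    println!("sum = {}, min = {:?}", sum.total, min.min());
}
```

The ordered map here plays a role loosely similar to an index, which is presumably why materialization would make this cheap to support.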
2025-08-27 11:18:54 -05:00
Glauber Costa
6e2bd364ee fix issue with rowids and deletions
The operator itself should handle deletions and updates that change
the rowid by consolidating its state.

Our current materialized views track state themselves, so we don't
see this problem now. But it becomes apparent once we switch the
views to use circuits.
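
One way to picture the consolidation mentioned above, as a sketch under assumptions: changes are modelled as weighted rows (+1 insert, -1 delete), and an update that changes the rowid is modelled as a delete of the old row plus an insert of the new one. The `Row` alias and `consolidate` function are hypothetical, not names from the codebase.

```rust
use std::collections::BTreeMap;

/// (rowid, payload); a stand-in for a real row.
type Row = (i64, String);

/// Sums the weights of identical rows and drops rows whose net weight is zero.
fn consolidate(changes: &[(Row, i64)]) -> Vec<(Row, i64)> {
    let mut weights: BTreeMap<Row, i64> = BTreeMap::new();
    for (row, weight) in changes {
        *weights.entry(row.clone()).or_insert(0) += *weight;
    }
    weights.into_iter().filter(|(_, w)| *w != 0).collect()
}

fn main() {
    let changes = vec![
        ((1, "a".to_string()), 1),  // insert row with rowid 1
        ((1, "a".to_string()), -1), // update: delete the old rowid...
        ((2, "a".to_string()), 1),  // ...and insert under the new rowid
    ];
    // Only the row under the new rowid survives consolidation.
    println!("{:?}", consolidate(&changes));
}
```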
2025-08-27 11:18:54 -05:00
Glauber Costa
dbe29e4bab fix aggregator operator
It needs to keep track of the old values to emit retractions (when the
aggregation changes, remove old value, insert new)
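
A minimal sketch of the retraction behaviour described above, assuming a single ungrouped COUNT and an output encoded as `(value, weight)` pairs. `CountAggregator` is a hypothetical name for illustration only.

```rust
/// Keeps the previously emitted aggregate so it can be retracted later.
#[derive(Default)]
struct CountAggregator {
    current: Option<i64>,
}

impl CountAggregator {
    /// Applies a change to the count and returns the output delta:
    /// (old value, -1) retracts the stale result, (new value, +1) inserts the fresh one.
    fn apply(&mut self, delta: i64) -> Vec<(i64, i64)> {
        let old = self.current;
        let new = old.unwrap_or(0) + delta;
        let mut out = Vec::new();
        if old == Some(new) {
            // The aggregate did not change, so nothing is emitted.
            return out;
        }
        if let Some(old) = old {
            out.push((old, -1));
        }
        out.push((new, 1));
        self.current = Some(new);
        out
    }
}

fn main() {
    let mut agg = CountAggregator::default();
    println!("{:?}", agg.apply(2)); // first result: insert only
    println!("{:?}", agg.apply(1)); // retract the old count, insert the new one
}
```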
2025-08-27 11:18:54 -05:00
Glauber Costa
c776e4eefb First implementation of Logical plan
This is a first pass on logical plans. The idea is that the DBSP
compiler will have an easier time operating on a logical plan, that
exposes linear algebra operators, than on SQL expr.

To keep this simple, we only support filters, aggregates and projections
for now, and will add more later as we agree on the core of the
implementation.

To make sure that the implementation is reasonable, I tried my best to
generate a couple of logical plans using Datafusion and check whether we
were generating something similar.

Our plans are not the same as Datafusion's, though. There are two
important differences:

* SQLite is weird, and it allows columns that are not part of the group
  by statement to appear in aggregated statements. For example:
  select a, count(b) from table group by c; <== that "a" is usually not
  permitted and datafusion will reject it. SQLite will be happy to
  accept it

* Datafusion will not generate a projection on queries like this:
  select sum(hex(a)) from table, and just keep the complex expression
  hex(a) inside the aggregation. For DBSP to work well, we'll need an
  explicit aggregation there.
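
To illustrate the second difference, here is a sketch that reads the bullet as calling for an explicit projection beneath the aggregate, so that the aggregate only ever sees a plain column. That reading is an assumption, and the `Plan` / `Expr` types, the `hoist_aggregate_argument` function, and the `__agg_arg0` alias are all hypothetical, not the types introduced by this commit.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Column(String),
    Call { name: String, args: Vec<Expr> },
}

#[derive(Debug, PartialEq)]
enum Plan {
    Scan { table: String },
    Projection { exprs: Vec<(Expr, String)>, input: Box<Plan> },
    Aggregate { func: String, arg: Expr, input: Box<Plan> },
}

/// Moves a complex aggregate argument into an explicit projection below the
/// aggregate, leaving the aggregate with a simple column reference.
fn hoist_aggregate_argument(plan: Plan) -> Plan {
    match plan {
        Plan::Aggregate { func, arg, input } => {
            if matches!(arg, Expr::Column(_)) {
                // Already a plain column: nothing to hoist.
                return Plan::Aggregate { func, arg, input };
            }
            let alias = "__agg_arg0".to_string();
            Plan::Aggregate {
                func,
                arg: Expr::Column(alias.clone()),
                input: Box::new(Plan::Projection {
                    exprs: vec![(arg, alias)],
                    input,
                }),
            }
        }
        other => other,
    }
}

fn main() {
    // select sum(hex(a)) from t
    let plan = Plan::Aggregate {
        func: "sum".to_string(),
        arg: Expr::Call {
            name: "hex".to_string(),
            args: vec![Expr::Column("a".to_string())],
        },
        input: Box::new(Plan::Scan { table: "t".to_string() }),
    };
    println!("{:#?}", hoist_aggregate_argument(plan));
}
```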

Because there are no users yet, I am marking this as #[cfg(test)], but
I wanted to put this out there ASAP.
2025-08-27 11:18:54 -05:00
Avinash Sajjanshetty
9e663c7f46 Add IOContext to carry encryption/checksum ctx 2025-08-27 21:33:05 +05:30
Pekka Enberg
e6ae7f990d Merge 'improve sync engine' from Nikita Sivukhin
This PR makes plenty of changes to the sync engine among which the main
ones are:
1. Now, we have only single database file which we use to directly
execute `pull` and `push` requests (so we don't have `draft` / `synced`
databases)
2. The last-write-wins strategy was fixed: before this PR, insert-insert
conflicts weren't resolved automatically
3. Now sync engine can apply arbitrary `transform` function to the
logical changes
4. Sync-engine-aware checkpoint was implemented. Now, a database created
by the sync engine has an explicit `checkpoint()` method which under the
hood uses an additional file to save the frames needed for the revert
operation during pull
5. The pull operation was separated into 2 phases internally: wait for
changes & apply changes
    * The problem is that the pull operation itself (e.g. apply) currently
requires an exclusive lock on the sync engine, and if the user wants to
pull & push independently this will be problematic (as pull will lock the
db and push will never succeed)
6. Added support for V1 pull protocol
7. Exposed a simple `stats()` method which returns the number of pending
CDC operations and the current WAL size

Closes #2810
2025-08-27 18:08:21 +03:00
Pekka Enberg
bf7f80a937 core/io: Switch Unix I/O to use libc::pwrite()
We use libc elsewhere for fault injection reasons, so let's do this
call-site too.
2025-08-27 17:56:23 +03:00
Nikita Sivukhin
b67f14c785 fix clippy 2025-08-27 15:57:38 +04:00
Nikita Sivukhin
6e124d927e fix clippy 2025-08-27 15:51:29 +04:00
Nikita Sivukhin
009aa479bf improve sync engine 2025-08-27 15:30:00 +04:00
Pekka Enberg
30c5473151 Merge 'Add some docs on encryption' from Avinash Sajjanshetty
Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: bit-aloo (@Shourya742)

Closes #2805
2025-08-27 13:27:00 +03:00
Pekka Enberg
73ef26125b Merge 'Remove Go bindings' from Preston Thorpe
Go bindings have moved to their own repo
https://github.com/tursodatabase/turso-go
Otherwise it is a pain in the ass to have a Go module as a subdirectory
of another repository. Now they can get the TLC that they need, as I
found out that they never worked on Windows, and the embedding wasn't
working properly in some cases. We want them to be plug and play.

Closes #2807
2025-08-27 13:26:45 +03:00
Pekka Enberg
eced1fe7db Merge 'core/storage: Micro-optimize Pager::commit_dirty_pages()' from Pekka Enberg
There's no need to call io.now() unless debug tracing is on. Let's
micro-optimize commit_dirty_pages() to avoid the unnecessary call.
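
The idea can be sketched roughly like this. It is not the actual `Pager` code: the `Clock` type, the `debug_tracing` flag, and this free-standing `commit_dirty_pages` signature are assumptions made to keep the example self-contained.

```rust
use std::cell::Cell;
use std::time::Instant;

/// Stand-in for the I/O layer's clock; counts how often it is queried.
struct Clock {
    calls: Cell<u32>,
}

impl Clock {
    fn now(&self) -> Instant {
        self.calls.set(self.calls.get() + 1);
        Instant::now()
    }
}

fn commit_dirty_pages(clock: &Clock, debug_tracing: bool, pages: &[u32]) -> usize {
    // Only read the clock when the elapsed time will actually be reported.
    let start = if debug_tracing { Some(clock.now()) } else { None };

    let written = pages.len(); // placeholder for the real page writes

    if let Some(start) = start {
        eprintln!("committed {} pages in {:?}", written, start.elapsed());
    }
    written
}

fn main() {
    let clock = Clock { calls: Cell::new(0) };
    commit_dirty_pages(&clock, false, &[1, 2, 3]);
    println!("clock reads without tracing: {}", clock.calls.get());
}
```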

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2808
2025-08-27 13:26:22 +03:00
Pekka Enberg
2921033b28 core/storage: Micro-optimize Pager::commit_dirty_pages()
There's no need to call io.now() unless debug tracing is on. Let's
micro-optimize commit_dirty_pages() to avoid the unnecessary call.
2025-08-27 11:12:43 +03:00
TcMits
1b048b2628 clippy+fmt 2025-08-27 15:08:32 +07:00
TcMits
bfff90b70e unrelated changes 2025-08-27 15:02:58 +07:00
TcMits
4ddfdb2a62 finish 2025-08-27 14:58:35 +07:00
TcMits
50bdaec6d0 merge main 2025-08-27 13:36:54 +07:00
TcMits
3cd43b9374 no need to return error in fmt 2025-08-27 13:22:40 +07:00
PThorpe92
37f71b84c8 Remove go github workflow 2025-08-26 19:41:53 -04:00
PThorpe92
bcdcd47358 Remove Go bindings (moved to their own repo tursodatabase/turso-go) 2025-08-26 19:13:17 -04:00
Preston Thorpe
d122105f8c Merge 'Rename Go driver to turso to not conflict with sqlite3' from Preston Thorpe
Also rename Limbo -> Turso while we are here..

Closes #2806
2025-08-26 14:32:09 -04:00
PThorpe92
2614a42294 Update package name in go CI 2025-08-26 14:18:57 -04:00
PThorpe92
4cf111e3c2 Rename Go driver to turso to not conflict with sqlite3, rename limbo->turso 2025-08-26 14:13:42 -04:00
Avinash Sajjanshetty
4d7b4bb711 Add some docs on encryption 2025-08-26 23:11:48 +05:30
Preston Thorpe
77476de547 Merge 'Refactor: Cell instead of RefCell to store CipherMode in connection' from Avinash Sajjanshetty
follow up PR for the review here:
https://github.com/tursodatabase/turso/pull/2779#discussion_r2299460752
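
For context, a minimal sketch of why `Cell` can replace `RefCell` for a small `Copy` value: `Cell` copies the value in and out, so there is no runtime borrow tracking and no possibility of a borrow panic. The `Connection` shape and the `CipherMode` variants below are hypothetical placeholders, not the real definitions.

```rust
use std::cell::Cell;

/// Hypothetical variants; the real enum may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CipherMode {
    None,
    Aes256Gcm,
}

struct Connection {
    // Cell works here because CipherMode is Copy.
    cipher_mode: Cell<CipherMode>,
}

impl Connection {
    fn new() -> Self {
        Connection { cipher_mode: Cell::new(CipherMode::None) }
    }

    /// Interior mutability through a shared reference, without borrow guards.
    fn set_cipher_mode(&self, mode: CipherMode) {
        self.cipher_mode.set(mode);
    }

    fn cipher_mode(&self) -> CipherMode {
        self.cipher_mode.get()
    }
}

fn main() {
    let conn = Connection::new();
    conn.set_cipher_mode(CipherMode::Aes256Gcm);
    println!("{:?}", conn.cipher_mode());
}
```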

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2804
2025-08-26 11:39:05 -04:00
Avinash Sajjanshetty
caa00e31f8 Use Cell instead of RefCell because it's nice 2025-08-26 20:00:13 +05:30
bit-aloo
881b986302 improve the limit exprs test with foreach block 2025-08-26 19:56:25 +05:30
bit-aloo
05267454dc remove redundant step during limit/offset evaluation and add test coverage for most of the datatypes and some expressions 2025-08-26 19:56:25 +05:30
bit-aloo
51d40092db add empty table references, and error out if table references are present in limit/offset 2025-08-26 19:56:25 +05:30
bit-aloo
a3b87cd97f add review comments 2025-08-26 19:56:25 +05:30
bit-aloo
9bebc9b5c7 clippy'ed 2025-08-26 19:56:25 +05:30
bit-aloo
50d9b0f7e3 ignore limit for value less than 9 2025-08-26 19:56:25 +05:30
bit-aloo
a16bee4574 move to new parser 2025-08-26 19:56:24 +05:30
bit-aloo
26d71603ac add a test to test limit expressiveness 2025-08-26 19:56:12 +05:30
bit-aloo
7e5043edfc add null and boolean limit handling 2025-08-26 19:56:12 +05:30
bit-aloo
1e5275682d make early exit for value less than or equal to zero 2025-08-26 19:56:12 +05:30
bit-aloo
ffcadd00ae evaluate limit or offset expr 2025-08-26 19:56:12 +05:30
bit-aloo
28439efd09 make offset and limit Expr 2025-08-26 19:56:11 +05:30
Jussi Saurio
d56868703d Add assertion: we read a page with the correct id 2025-08-26 17:25:54 +03:00
bit-aloo
ea3ab2a9c7 add ifNeg op_code 2025-08-26 19:55:42 +05:30
Jussi Saurio
66d00915d7 Merge 'Improve documentation of page pinning' from Jussi Saurio
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2797
2025-08-26 17:18:25 +03:00
Jussi Saurio
bf98bf4576 Merge 'Fix missing functions after revert' from Pedro Muniz
When we reverted #2789, we had already merged #2793. That PR used
some helper methods that were created in #2789, so I just added them
back here and fixed the simulator Dockerfile.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2803
2025-08-26 17:18:06 +03:00
pedrocarlo
aa025c9798 fix missing functions after revert 2025-08-26 10:13:45 -03:00
Pekka Enberg
26ba09c45f Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz"
This reverts commit 71c1b357e4, reversing
changes made to 6bc568ff69 because it
actually makes things slower.
2025-08-26 14:58:21 +03:00
Pekka Enberg
b7bf4b55ed Merge 'Fail CI run if Turso output differs from SQLite in TPC-H queries' from Jussi Saurio
We had missed #2794 because we don't get any visible feedback on PRs if
we return wrong results from TPC-H, so let's make that happen here.

Closes #2798
2025-08-26 14:36:22 +03:00
Jussi Saurio
e65742e5ff Fail CI if tursodb output differs from sqlite in tpc-h queries 2025-08-26 11:30:37 +03:00
Jussi Saurio
bf58d179db Improve documentation of page pinning 2025-08-26 10:13:25 +03:00
Pekka Enberg
5dd1bca4d3 Merge 'Decouple SQL generation from Simulator crate' from Pedro Muniz
Decouple the SQL generation code from the simulator code, so that it can
potentially be reused for fuzzing in other crates, and create a
`GenerationContext` trait so that it becomes easier to create
`Simulation Profiles`. Ideally, in further PRs, I want to expand the
`GenerationContext` trait so we can guide the generation with context
from the simulation profile.
Depends on #2789.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2793
2025-08-26 09:41:58 +03:00