Commit Graph

9807 Commits

Author SHA1 Message Date
bit-aloo
460b87fdfb Refactor simulator logger initialization
- Changed `init_logger()` to return `anyhow::Result<()>`
- Removed deprecated usage of `with_ansi`
2025-10-02 19:18:33 +05:30
bit-aloo
889ae2cd78 Remove log and env_logger in favor of tracing
- Deleted `log` and `env_logger` from simulator dependencies
- Migrated remaining `log::error!` and `log::trace!` calls to `tracing` macros
2025-10-02 19:09:09 +05:30
Pekka Enberg
641c3a73d0 Merge 'sim: add Profile::SimpleMvcc' from Jussi Saurio
- max 2 connections
- max 1 table
- no faults
- no indexes
Makes discovering very basic bugs in MVCC much easier

Closes #3517
2025-10-02 12:12:15 +03:00
Jussi Saurio
fa6ee6b850 Merge 'Fix: JOIN USING should pick columns from left table, not right' from Jussi Saurio
Closes #3468
Closes #3479

Closes #3485
2025-10-02 10:16:38 +03:00
Jussi Saurio
d3c9ef3a5c sim: add Profile::SimpleMvcc
- max 2 connections
- max 1 table
- no faults
- no indexes

Makes discovering very basic bugs in MVCC much easier
2025-10-02 10:14:31 +03:00
Jussi Saurio
7360edc169 Merge 'mvcc: dont try to end pager tx on connection close' from Jussi Saurio
closes #3487

Closes #3491
2025-10-02 10:06:23 +03:00
Pekka Enberg
17e07e620a Merge 'fix/vdbe: reset op_transaction state properly' from Jussi Saurio
essentially after the first runthrough of `op_transaction` per a given
`ProgramState`, we weren't resetting the instruction state to `Start` at
all, which means we didn't do any transaction state checking/updating
after that.
PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:
https://github.com/tursodatabase/turso-go/issues/28

Closes #3516
2025-10-02 09:25:23 +03:00
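The bug described in the entry above can be illustrated with a minimal sketch. Only `op_transaction`, `ProgramState` and the `Start` state come from the commit message; the `OpTransactionState` enum, the `CheckSchema` step, the `tx_checks` counter and the `reset_on_finish` flag are hypothetical names introduced for illustration, not the real VDBE code.

```rust
/// Hypothetical sub-state of a multi-step instruction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpTransactionState {
    Start,
    CheckSchema,
}

/// Hypothetical stand-in for the per-program execution state.
pub struct ProgramState {
    pub op_transaction_state: OpTransactionState,
    /// Counts how many times the transaction check actually ran.
    pub tx_checks: u32,
}

/// Runs the instruction once. With `reset_on_finish == false` the state is
/// left at `CheckSchema`, so later runs skip the `Start` branch entirely.
pub fn op_transaction(state: &mut ProgramState, reset_on_finish: bool) {
    loop {
        match state.op_transaction_state {
            OpTransactionState::Start => {
                // Transaction state checking/updating happens only here.
                state.tx_checks += 1;
                state.op_transaction_state = OpTransactionState::CheckSchema;
            }
            OpTransactionState::CheckSchema => {
                if reset_on_finish {
                    // The fix: return to `Start` before leaving.
                    state.op_transaction_state = OpTransactionState::Start;
                }
                return;
            }
        }
    }
}
```

Running the instruction twice without the reset performs the check once; with the reset it performs the check on every run.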
Jussi Saurio
f48165eb72 fix/vdbe: reset op_transaction state properly
essentially after the first runthrough of `op_transaction` per a
given `ProgramState`, we weren't resetting the instruction state
to `Start` at all, which means we didn't do any transaction state
checking/updating.

PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:

https://github.com/tursodatabase/turso-go/issues/28
2025-10-02 08:40:41 +03:00
Pekka Enberg
f4bb9f1a66 Merge 'bindings/rust: don't panic if user provides invalid parameter' from Jussi Saurio
Now returns e.g.:
```rust
SqlExecutionFailure(
  "Invalid argument supplied: Unknown parameter ':email' for query 'INSERT INTO users (email, created_at) VALUES (?, ?)'.
  Make sure you're using the correct parameter syntax - named: (:foo), positional: (?, ?)"
)
```
instead of unwrapping a `None` value and panicking

Closes #3515
2025-10-02 08:32:52 +03:00
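As a rough sketch of the pattern in the entry above (return an error instead of unwrapping a `None`), assuming a hypothetical `parameter_index` helper and a simplified `Error` enum; only the `SqlExecutionFailure` name and the message wording are taken from the commit message.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical error type modelled on the quoted message.
#[derive(Debug, PartialEq)]
pub enum Error {
    SqlExecutionFailure(String),
}

/// Looks up the index of a named parameter (e.g. ":email").
/// `known` is an assumed map from parameter name to index.
pub fn parameter_index(
    known: &HashMap<String, usize>,
    name: &str,
    sql: &str,
) -> Result<usize, Error> {
    // `ok_or_else` turns the missing entry into an error value,
    // where `.unwrap()` would have panicked.
    known.get(name).copied().ok_or_else(|| {
        Error::SqlExecutionFailure(format!(
            "Invalid argument supplied: Unknown parameter '{}' for query '{}'",
            name, sql
        ))
    })
}
```

The caller can then propagate the failure with `?` and report it to the user.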
Jussi Saurio
f06aac6192 bindings/rust: don't panic if user provides invalid parameter
Now returns e.g.:

```rust
SqlExecutionFailure(
  "Invalid argument supplied: Unknown parameter ':email' for query 'INSERT INTO users (email, created_at) VALUES (?, ?)'.
  Make sure you're using the correct parameter syntax - named: (:foo), positional: (?, ?)"
)
```

instead of unwrapping a `None` value and panicking
2025-10-02 07:42:52 +03:00
Jussi Saurio
a9d782e319 Merge 'Add encryption internals docs' from Avinash Sajjanshetty
preview - https://github.com/tursodatabase/turso/blob/8d2ef700c9b087a7e2904c25052e4365395b33b3/docs/manual.md#encryption-1

Closes #3461
2025-10-02 07:04:16 +03:00
Jussi Saurio
3c9a6993e3 Merge 'core/storage: Apple platforms support' from Charly Delaroche
Closes #3507
2025-10-02 07:01:56 +03:00
Jussi Saurio
0e1a0e34a6 Merge 'Allow workflow_dispatch for all CI to allow for re-running jobs' from Preston Thorpe
Currently it is mad annoying when 1 job fails and you want a green
check, you have to force push and run them all over again -_-

Closes #3511
2025-10-02 07:01:04 +03:00
Jussi Saurio
e65eae764c Merge 'Resolve appropriate column name for rowid alias/PK' from Preston Thorpe
closes https://github.com/tursodatabase/turso/issues/3512

Closes #3513
2025-10-02 06:59:18 +03:00
Jussi Saurio
9e4ea6ea34 Merge 'core/mvcc/logical-log: fail in read_more_data if couldn't read enough' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3500
2025-10-02 06:58:28 +03:00
Jussi Saurio
bb4e54ca73 Merge 'fix/mvcc: deserialize table_id as i64' from Jussi Saurio
Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3492
2025-10-02 06:58:01 +03:00
Jussi Saurio
30e6524c4e Fix: JOIN USING should pick columns from left table, not right
Closes #3468
Closes #3479
2025-10-02 06:56:52 +03:00
Jussi Saurio
c0da38e24a Merge 'Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN' from Jussi Saurio
Closes #3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.

Closes #3475
2025-10-02 06:56:07 +03:00
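The fix described in the entry above might look roughly like the following sketch. `WhereTerm` and `from_outer_join` are named in the commit message; the field type (`Option<usize>` holding the id of the outer-joined table), the `expr` field and the `clear_outer_join_marker` function are assumptions made for illustration.

```rust
/// Simplified, hypothetical WHERE term.
#[derive(Debug, Clone, PartialEq)]
pub struct WhereTerm {
    pub expr: String,
    /// Assumed representation: id of the right-hand table of the LEFT JOIN
    /// whose ON clause produced this term, if any.
    pub from_outer_join: Option<usize>,
}

/// Called when the LEFT JOIN on `table_id` is rewritten as an INNER JOIN.
/// Clearing the marker lets the optimizer evaluate the term wherever the
/// reordered join makes all referenced tables available.
pub fn clear_outer_join_marker(terms: &mut [WhereTerm], table_id: usize) {
    for term in terms.iter_mut() {
        if term.from_outer_join == Some(table_id) {
            term.from_outer_join = None;
        }
    }
}
```

Terms that belong to other outer joins keep their marker, so only the rewritten join is affected.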
Jussi Saurio
78cccdd87a Merge 'Substr fix UTF-8' from Pedro Muniz
Fixes:
- `start_value` and `length_value` should be cast to integers
- proper handling of utf-8 characters
- do not need to cast blob to string, as substr in blobs refers to byte
indexes and not char-indexes

Closes #3465
2025-10-02 06:55:38 +03:00
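The character-versus-byte distinction in the entry above can be sketched as follows. This is a simplified illustration, not the actual implementation: the function names are hypothetical, and it only handles `start >= 1` and `len >= 0`, returning `None` for zero or negative arguments, which have their own rules in SQLite.

```rust
/// Converts 1-based (start, len) into (items to skip, items to take).
/// Zero or negative arguments are out of scope for this sketch.
fn window(start: i64, len: i64) -> Option<(usize, usize)> {
    if start < 1 || len < 0 {
        return None;
    }
    Some(((start - 1) as usize, len as usize))
}

/// Text: positions count Unicode scalar values, not bytes.
pub fn substr_text(s: &str, start: i64, len: i64) -> Option<String> {
    let (skip, take) = window(start, len)?;
    Some(s.chars().skip(skip).take(take).collect())
}

/// Blob: positions count raw bytes; no conversion to a string is needed.
pub fn substr_blob(b: &[u8], start: i64, len: i64) -> Option<Vec<u8>> {
    let (skip, take) = window(start, len)?;
    Some(b.iter().skip(skip).take(take).copied().collect())
}
```

For the input `héllo`, where `é` occupies two bytes in UTF-8, the text variant with start 2 and length 3 yields `éll`, while the blob variant yields the two bytes of `é` followed by `l`.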
PThorpe92
efac598232 Resolve appropriate column name for rowid alias/PK 2025-10-01 21:49:42 -04:00
Preston Thorpe
b310411997 Merge 'printf should truncate floats' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3308

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3415
2025-10-01 19:31:39 -04:00
PThorpe92
a35f8a427f Allow workflow_dispatch for all CI to allow for re-running individual jobs 2025-10-01 19:01:10 -04:00
Preston Thorpe
4066718979 Merge 'Reject unsupported FROM clauses in UPDATE' from Mikaël Francoeur
Before, FROM clauses were simply ignored:
```
turso> update t set a = b from (select random() as b);
  × Parse error: no such column: b
```
Now, they will be rejected with a clear message. It also makes it
clearer that they need to be implemented:
```
turso> update t set a = b from (select random() as b);
  × Parse error: FROM clause is not supported in UPDATE
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3509
2025-10-01 17:17:39 -04:00
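A minimal sketch of the check described in the entry above, assuming a hypothetical `UpdateStmt` AST node with an optional `from` field; only the error text is taken from the commit message.

```rust
/// Hypothetical, heavily simplified UPDATE statement node.
pub struct UpdateStmt {
    pub table: String,
    /// Raw text of the FROM clause, if the statement had one.
    pub from: Option<String>,
}

/// Rejects statements that carry a FROM clause, so the clause is reported
/// as unsupported before translation continues.
pub fn check_update(stmt: &UpdateStmt) -> Result<(), String> {
    if stmt.from.is_some() {
        return Err("FROM clause is not supported in UPDATE".to_string());
    }
    Ok(())
}
```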
Avinash Sajjanshetty
ca0d738f4d Add encryption internals docs 2025-10-02 00:14:28 +05:30
Mikaël Francoeur
6307774201 reject FROM clauses 2025-10-01 14:20:23 -04:00
Charly Delaroche
5856dc8733 core/storage: Apple platforms support 2025-10-01 09:59:22 -07:00
Pekka Enberg
bbd2c812c2 github: Reduce macOS workflows
We're getting hit by macOS runner limits so let's reduce the need for them a bit.
2025-10-01 19:16:55 +03:00
Pekka Enberg
c4121441bf Merge 'simulator: reopen database with mvcc and indexes when necessary' from Pedro Muniz
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3503
2025-10-01 19:15:53 +03:00
Pekka Enberg
d217cbeb18 github: Switch Python build to macos-latest
Let's see if we get less throttled like this.
2025-10-01 19:13:08 +03:00
pedrocarlo
b624e449bc simulator: reopen database with mvcc when necessary 2025-10-01 11:38:11 -03:00
Pekka Enberg
4666544ea6 Turso 0.2.0-pre.13 2025-10-01 16:40:53 +03:00
Pekka Enberg
02023ce821 Merge 'core/storage: Switch page cache queue to linked list' from Pekka Enberg
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive doubly-
linked list. This eliminates the page cache initialization overhead from
connection establishment, but also reduces memory usage to entries that
are actually used. Furthermore, the approach allows us to grow the page
cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```

Closes #3456
2025-10-01 16:39:47 +03:00
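To illustrate why a linked queue avoids the upfront cost described in the entry above, here is a small sketch. It is not the intrusive list from the patch: links live in a side map keyed by page id, and `PageQueue` and its methods are hypothetical names. The property it shares with the description is that creating the queue allocates nothing and memory grows only with the entries actually inserted.

```rust
use std::collections::HashMap;

/// Links for one queued page (non-intrusive: stored outside the page).
struct Node {
    prev: Option<u64>,
    next: Option<u64>,
}

/// Doubly-linked queue of page ids; front = most recently used.
pub struct PageQueue {
    nodes: HashMap<u64, Node>,
    head: Option<u64>,
    tail: Option<u64>,
}

impl PageQueue {
    /// Cheap to create: an empty `HashMap` does not allocate.
    pub fn new() -> Self {
        Self { nodes: HashMap::new(), head: None, tail: None }
    }

    pub fn len(&self) -> usize {
        self.nodes.len()
    }

    /// Inserts `id` at the front, moving it there if already queued.
    pub fn push_front(&mut self, id: u64) {
        self.remove(id);
        let old_head = self.head;
        self.nodes.insert(id, Node { prev: None, next: old_head });
        match old_head {
            Some(h) => self.nodes.get_mut(&h).unwrap().prev = Some(id),
            None => self.tail = Some(id),
        }
        self.head = Some(id);
    }

    /// Unlinks `id`; returns false if it was not queued.
    pub fn remove(&mut self, id: u64) -> bool {
        let node = match self.nodes.remove(&id) {
            Some(n) => n,
            None => return false,
        };
        match node.prev {
            Some(p) => self.nodes.get_mut(&p).unwrap().next = node.next,
            None => self.head = node.next,
        }
        match node.next {
            Some(n) => self.nodes.get_mut(&n).unwrap().prev = node.prev,
            None => self.tail = node.prev,
        }
        true
    }

    /// Removes and returns the least recently used page id.
    pub fn pop_back(&mut self) -> Option<u64> {
        let id = self.tail?;
        self.remove(id);
        Some(id)
    }
}
```

A pre-allocated vector pays for its full capacity when the connection is opened; here a short-lived connection that touches three pages only ever holds three nodes.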
Pere Diaz Bou
b89df44339 fmt 2025-10-01 14:31:30 +02:00
Pere Diaz Bou
dc0245d758 core/mvcc/logical-log: fail in read_more_data if couldn't read enough 2025-10-01 14:25:22 +02:00
Pekka Enberg
981a762fd7 Merge 'Improve throughput benchmarks' from Pekka Enberg
Closes #3493
2025-10-01 15:24:03 +03:00
Pekka Enberg
4d77786b53 Merge 'Beta' from Pekka Enberg
Reviewed-by: Glauber Costa <glommer@gmail.com>

Closes #3484
2025-10-01 15:23:28 +03:00
Jussi Saurio
8166680ad8 Merge 'make connect() method optional and call it implicitly on first query execution' from Nikita Sivukhin
- mostly needed for Drizzle - because other clients with ESM can just
use await connect(...) wrapper

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3462
2025-10-01 15:19:07 +03:00
Pekka Enberg
2b168cf7b0 core/storage: Switch page cache queue to linked list
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).

Therefore, replace the pre-allocated vector with an intrusive
doubly-linked list. This eliminates the page cache initialization
overhead from connection establishment, but also reduces memory usage to
entries that are actually used. Furthermore, the approach allows us to
grow the page cache with much less overhead.

The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.

Before:

```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```

After:

```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```
2025-10-01 14:41:35 +03:00
Pekka Enberg
51f4f1fb8b perf/throughput: Add plotting scripts
This adds a few helper scripts to plot throughput results.
2025-10-01 14:08:26 +03:00
Pekka Enberg
3fcb0581ec perf/throughput: Fix thread pool size in Turso benchmark
Replace #[tokio::main] with explicit Runtime builder to set the number
of tokio worker threads to match the benchmark thread count. This
ensures proper thread control and avoids interference from default
tokio thread pool sizing.
2025-10-01 14:08:26 +03:00
Jussi Saurio
ee6b943586 Merge 'fix/mvcc: set log offset to end of file after recovery finishes' from Jussi Saurio
otherwise we start overwriting existing log entries
Closes #3495

Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3496
2025-10-01 13:52:12 +03:00
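The problem in the entry above can be shown with an in-memory sketch. `LogicalLog`, `recover` and `append` are hypothetical names, and a `Vec<u8>` stands in for the log file; the point is only that the write offset must sit at the end of the recovered data before new records are appended.

```rust
/// Hypothetical in-memory stand-in for an append-only log file.
pub struct LogicalLog {
    pub bytes: Vec<u8>,
    /// Position where the next record will be written.
    pub offset: usize,
}

impl LogicalLog {
    /// After recovery, the offset points at the end of the existing data.
    pub fn recover(bytes: Vec<u8>) -> Self {
        let offset = bytes.len();
        Self { bytes, offset }
    }

    /// Writes `record` at the current offset and advances it.
    pub fn append(&mut self, record: &[u8]) {
        let end = self.offset + record.len();
        if self.bytes.len() < end {
            self.bytes.resize(end, 0);
        }
        self.bytes[self.offset..end].copy_from_slice(record);
        self.offset = end;
    }
}
```

If the offset were left at 0 after recovery, the first `append` would overwrite the start of the existing entries instead of extending the log.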
Jussi Saurio
480d066147 Merge 'mvcc: dont use mv store for ephemeral tables' from Jussi Saurio
not sure how these would even work with mvcc - either way, an ephemeral
table uses an ephemeral database file and pager so i don't think putting
its writes into MV store makes sense
TBH i have no idea if there are any weird interactions here but the code
we have now for sure does not work
Closes #3486

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3490
2025-10-01 13:50:30 +03:00
Jussi Saurio
760ebe1370 Merge 'Add Database::indexes_enabled()' from Jussi Saurio
Needed Elsewhere ™️ for fixing simulator `reopen_database()`

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3488
2025-10-01 13:50:11 +03:00
Jussi Saurio
b2f9854b1c Add more documentation for WhereTerm::from_outer_join 2025-10-01 13:42:36 +03:00
Jussi Saurio
11bb5f9507 Merge 'Simulator: Concurrent transactions' from Pedro Muniz
Depends on #3272.
First big step towards: #1851
- Add ignore error flag to `Interaction` to ignore parse errors when
needed, and still properly report other errors from intermediate
queries.
- adjusted shrinking to accommodate transaction statements from
different connections and properly remove extensional queries from some
properties
- MVCC: generates `Begin Concurrent` and `Commit` statements that are
interleaved to test snapshot isolation between connection transactions.
- MVCC: if the next interactions are going to contain a DDL statement,
we first commit all transactions and execute the DDL statements serially

Closes #3278
2025-10-01 12:53:32 +03:00
Jussi Saurio
e9f0c59bcc fix/mvcc: set log offset to end of file after recovery finishes
otherwise we start overwriting existing log entries
2025-10-01 12:46:24 +03:00
Pekka Enberg
63895dfecd perf/throughput: Simplify benchmark output to CSV format
Remove verbose output from rusqlite benchmark and output only CSV
format: system,threads,batch_size,compute,throughput

This makes it easier to parse and plot benchmark results.
2025-10-01 11:06:27 +03:00
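A sketch of emitting one row in the column order named in the entry above. The `BenchResult` struct, the field types and the two-decimal throughput precision are assumptions; only the header string comes from the commit message.

```rust
/// Column order taken from the commit message.
pub const CSV_HEADER: &str = "system,threads,batch_size,compute,throughput";

/// Hypothetical record for one benchmark run.
pub struct BenchResult {
    pub system: String,
    pub threads: usize,
    pub batch_size: usize,
    /// Compute time per transaction (unit assumed: microseconds).
    pub compute: u64,
    /// Inserts per second.
    pub throughput: f64,
}

/// Formats one run as a CSV row matching `CSV_HEADER`.
pub fn to_csv_row(r: &BenchResult) -> String {
    format!(
        "{},{},{},{},{:.2}",
        r.system, r.threads, r.batch_size, r.compute, r.throughput
    )
}
```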
Pekka Enberg
eeb14b25c6 perf/throughput: Replace think time with CPU-bound compute time
Replace the sleep-based --think parameter with a --compute parameter
that uses a busy loop to simulate realistic CPU or GPU bound business
logic (e.g., parsing, data aggregation, or ML inference). The compute
time is now specified in microseconds instead of milliseconds for
finer granularity.
2025-10-01 11:06:27 +03:00
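The difference between sleeping and a busy loop, as described in the entry above, can be sketched like this. The `busy_compute` function is a hypothetical illustration, not the benchmark's code: it keeps the CPU occupied until the requested number of microseconds has elapsed, where a sleep would yield the thread.

```rust
use std::time::{Duration, Instant};

/// Spins on the CPU for at least `micros` microseconds and returns the
/// time actually spent.
pub fn busy_compute(micros: u64) -> Duration {
    let start = Instant::now();
    let budget = Duration::from_micros(micros);
    while start.elapsed() < budget {
        // Hint to the CPU that this is a spin loop; the thread stays runnable.
        std::hint::spin_loop();
    }
    start.elapsed()
}
```

Because the thread never blocks, the simulated work competes for cores with the database threads, which is closer to CPU-bound application logic than an idle wait.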
Jussi Saurio
bcb941f33b fix/mvcc: deserialize table_id as i64 2025-10-01 10:26:23 +03:00
Jussi Saurio
c395e051cb mvcc: dont try to end pager tx on connection close 2025-10-01 10:17:41 +03:00