MVCC bootstrap connection got stuck into an infinite statement reparsing
loop because the bootstrap procedure happened before the on-disk schema
was deserialized.
closes#3518Closes#3522
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.
Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This
eliminates the unnecessary atomic operations while maintaining the same
semantics.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3520
MVCC bootstrap connection got stuck into an infinite statement
reparsing loop because the bootstrap procedure happened before the
on-disk schema was deserialized.
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.
Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This eliminates
the unnecessary atomic operations while maintaining the same semantics.
essentially after the first runthrough of `op_transaction` per a given
`ProgramState`, we weren't resetting the instruction state to `Start´ at
all, which means we didn't do any transaction state checking/updating
after that.
PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:
https://github.com/tursodatabase/turso-go/issues/28Closes#3516
essentially after the first runthrough of `op_transaction` per a
given `ProgramState`, we weren't resetting the instruction state
to `Start´ at all, which means we didn't do any transaction state
checking/updating.
PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:
https://github.com/tursodatabase/turso-go/issues/28
Now returns e.g.:
```rust
SqlExecutionFailure(
"Invalid argument supplied: Unknown parameter ':email' for query 'INSERT INTO users (email, created_at) VALUES (?, ?)'.
Make sure you're using the correct parameter syntax - named: (:foo), positional: (?, ?)"
)
```
instead of unwrapping a None value and panicing
Closes#3515
Now returns e.g.:
```rust
SqlExecutionFailure(
"Invalid argument supplied: Unknown parameter ':email' for query 'INSERT INTO users (email, created_at) VALUES (?, ?)'.
Make sure you're using the correct parameter syntax - named: (:foo), positional: (?, ?)"
)
```
instead of unwrapping a None value and panicing
Closes#3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.
Closes#3475
Fixes:
- `start_value` and `length_value` should be casted to integers
- proper handling of utf-8 characters
- do not need to cast blob to string, as substr in blobs refers to byte
indexes and not char-indexes
Closes#3465
Before, FROM clauses were simply ignored:
```
turso> update t set a = b from (select random() as b);
× Parse error: no such column: b
```
Now, they will be rejected with a clear message. It also makes it
clearer that they need to be implemented:
```
turso> update t set a = b from (select random() as b);
× Parse error: FROM clause is not supported in UPDATE
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3509
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive doubly-
linked list. This eliminates the page cache initialization overhead from
connection establishment, but also reduces memory usage to entries that
are actually used. Furthermore, the approach allows us to grow the page
cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```
Closes#3456
- mostly needed for Drizzle - because other clients with ESM can just
use await connect(...) wrapper
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#3462
The page cache implementation uses a pre-allocated vector (`entries`)
with fixed capacity, along with a custom hash map and freelist. This
design requires expensive upfront allocation when creating a new
connection, which severely impacted performance in workloads that open
many short-lived connections (e.g., our concurrent write benchmarks that
create a new connection per transaction).
Therefore, replace the pre-allocated vector with an intrusive
doubly-linked list. This eliminates the page cache initialization
overhead from connection establishment, but also reduces memory usage to
entries that are actually used. Furthermore, the approach allows us to
grow the page cache with much less overhead.
The patch improves concurrent write throughput benchmark by 4x for
single-threaded performance.
Before:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 3.82s (26173.63 inserts/sec)
```
After:
```
$ write-throughput --threads 1 --batch-size 100 -i 1000 --mode concurrent
Running write throughput benchmark with 1 threads, 100 batch size, 1000 iterations, mode: Concurrent
Database created at: write_throughput_test.db
Thread 0: 100000 inserts in 0.90s (110848.46 inserts/sec)
```