Commit Graph

10667 Commits

Author SHA1 Message Date
RS2007
bdf720d205 adding regression test for duplicate cte 2025-10-31 23:15:11 +05:30
RS2007
7fff8daaa5 Fixing clippy error 2025-10-31 23:14:08 +05:30
RS2007
1f576593ec adding duplicate cte name checks in planner.rs 2025-10-31 23:14:08 +05:30
Pekka Enberg
41aa49c7b5 Merge 'Fix self-insert SUM when table uses INTEGER PRIMARY KEY' from Duy Dang
Close #3868

Closes #3870
2025-10-31 17:01:22 +02:00
Pekka Enberg
cdd9ec3438 Merge 'bindings/java: Implement setObject(int, Object) in JDBC4PreparedStatement' from Orange banana
## Purpose
* Implement `setObject(int, Object)` to support binding of common Java
types to SQL parameters in JDBC4.
* This implementation currently covers only standard JDBC4 supported
types. LOB and stream bindings are not yet implemented.
## Changes
* Implemented JDBC4PreparedStatement#setObject(int, Object) handling for
  * `String`, `Integer`, `Long`, `Boolean`, `Double`, `Float`, `Byte`,
`Short`
  * `byte[]`, `Date`, `Time`, `Timestamp`, `BigDecimal`
* Added validation for unsupported types (`Blob`, `Clob`, `InputStream`,
`Reader`)
* Added corresponding unit test `testSetObjectCoversAllSupportedTypes`
to verify correctness
## Note
* Additional work (e.g., LOB/Stream handling) will be addressed
separately once driver support is available.
## Related Issue
#615

Reviewed-by: Kim Seon Woo (@seonWKim)

Closes #3864
2025-10-31 17:00:31 +02:00
Pekka Enberg
11f95253a4 Merge 'Update Java package version in scripts/update-version.py' from Pekka Enberg
Closes #3873
2025-10-31 15:59:46 +02:00
Pekka Enberg
8ae49b0dad Add Java badge to README.md 2025-10-31 13:57:33 +02:00
Pekka Enberg
8ee5b5621e Update Java package version in scripts/update-version.py 2025-10-31 13:43:45 +02:00
Orange flavored banana
5fef79d9f6 feat(jdbc): remove unnecessary java.sql prefixes in setObject 2025-10-31 10:38:30 +09:00
Duy Dang
733dc762ed Fix self-insert SUM when table uses INTEGER PRIMARY KEY 2025-10-31 03:34:10 +07:00
Pekka Enberg
331ba14e7c Turso 0.3.0 2025-10-30 18:16:12 +02:00
Pekka Enberg
a4d43d51d4 Update CHANGELOG.md 2025-10-30 18:15:59 +02:00
Pekka Enberg
c91b66ba38 Turso 0.3.0-pre.7 2025-10-30 18:15:14 +02:00
Pekka Enberg
128f2f1ca5 Merge 'Add 'make test-single'' from Jussi Saurio
e.g. `make test-single TEST=subquery.test`
Plus: chmod +x to all tcl tests in testing folder

Closes #3865
2025-10-30 14:19:46 +02:00
Pekka Enberg
43b5ea5363 Merge 'antithesis: Upload config image in GitHub Actions workflow' from Pekka Enberg
The Antithesis config image was not being uploaded during CI runs, only
the workload image. This caused experiment failures when the config
image expired from the registry after 6 months of inactivity.

Closes #3863
2025-10-30 12:57:42 +02:00
Jussi Saurio
7e65657ab0 Add 'make test-single'
e.g. `make test-single TEST=subquery.test`

Plus: chmod +x to all tcl tests in testing folder
2025-10-30 11:38:56 +02:00
Orange flavored banana
4cd007f2eb Test(jdbc): Add coverage for setObject(int, Object) 2025-10-30 15:35:31 +09:00
Pekka Enberg
d71a33a188 antithesis: Upload config image in GitHub Actions workflow
The Antithesis config image was not being uploaded during CI runs, only
the workload image. This caused experiment failures when the config
image expired from the registry after 6 months of inactivity.
2025-10-30 07:49:44 +02:00
Orange flavored banana
53ab453015 Feat(jdbc): Implement setObject(int, Object) in JDBC4PreparedStatement 2025-10-30 09:54:42 +09:00
Pekka Enberg
84a367b00e Merge 'Implement wasNull tracking in ResultSet getter methods' from 김민석
## Summary
Implemented comprehensive wasNull tracking and refactored getter methods
in JDBC4ResultSet to ensure JDBC specification compliance and improve
code maintainability.
### Changes
Added wasNull tracking to all getter methods: Covers primitive types,
objects, dates/times, streams, and BigDecimal
Refactored columnLabel getters to use delegation pattern: Eliminates
code duplication and ensures consistent wasNull behavior
### Bug Fixes & Code Quality
- Fixed getString(String) to return null instead of empty string for
null values
- Added @Nullable annotation to getBytes(String) to fix NullAway error
- Preserved String parsing in getDate(String) for TEXT-formatted dates
- Extracted timezone offset calculation to helper method
### Testing
Added comprehensive tests for wasNull tracking, columnLabel getters,
stream methods, and null handling

Closes #3838
2025-10-29 18:10:42 +02:00
Pekka Enberg
d6f6cb3524 Merge 'perf/throughput: Improve reproducibility' from Pekka Enberg
Improve reproducibility by documenting the steps needed to run the
benchmarks and generate the plots. Also simplify plot generation a bit.

Closes #3843
2025-10-29 18:10:34 +02:00
Pekka Enberg
50ad2f801a Turso 0.3.0-pre.6 2025-10-29 17:54:10 +02:00
Pekka Enberg
eaff2d135f Merge 'Fix database state going back in time after sync' from Nikita Sivukhin
This PR fixes sync engine bug which leads to the state of db going back
in time.
The mistake was made in the pull operation which before fetched
information about last commited changes to the remote separately. This
crates a problem since pull already works with fixed WAL updates
received earlier from remote - and this WAL update can be inconsistent
with more fresh value of last_change_id fetched from remote.
The fix is to use only WAL update and "extract" necessary information
from it. In order to do that sync now read meta sync table while pull
operation is in progress (at the moment when local changes are rolled
back and remote changes already applied) and do not use any external
source to consume that information.
Also, this PR fixes bug in the JS tursodatabase client and reset
statement in the finally block opposed to the previous approach to reset
statement at the beginning. The problem with previous approach were in
cases when client do not fully consumed the statement (e.g. abort
iteration and take only one row) in which case the statement will be
kept active and can prevent another write transaction from starting or
just occupy place as a read transaction.

Closes #3860
2025-10-29 17:53:45 +02:00
Jussi Saurio
79442b3da6 Merge 'translate: disallow correlated subqueries in HAVING and ORDER BY' from Jussi Saurio
These are supported by SQLite, but we cannot handle them correctly yet.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3861
2025-10-29 16:05:43 +02:00
Jussi Saurio
6cf2072b51 translate: disallow correlated subqueries in HAVING and ORDER BY
These are supported by SQLite, but we cannot handle them correctly yet.
2025-10-29 15:37:19 +02:00
Nikita Sivukhin
d013876998 cargo fmt 2025-10-29 16:46:51 +04:00
Nikita Sivukhin
9e04687108 add one more test 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
e5b11a3278 uncomment tests 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
e27b0d5d6b add more tests 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
82d54999b1 fix pull operation in sync engine
- before we fetched pull generation and last_change_id from the remote during pull - which is incorrect because fetched information can be inconsistent with WAL updates we received from the server (latest server state can be in "future" compared to the WAL updates we got since we can make push in parallel with updates pull operation)
- now we read information about "server state" (pull generation, last_change_id) directly from the local DB right after we applied changes from the remote which get us consistent view on the state considering WAL updates we got
- also fetching remote in the pull is bad - since pull block writes and network call with unpredictable latency poorly affect writes to the database
2025-10-29 16:24:05 +04:00
Nikita Sivukhin
b01cec2ba4 wip 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
4c98861590 adjust logs 2025-10-29 16:24:05 +04:00
Nikita Sivukhin
7e63135abb reset statement after execution 2025-10-29 16:24:05 +04:00
Jussi Saurio
96990e1168 Merge 'Stmt reset cursors' from Nikita Sivukhin
This PR reset cursor state in the `stmt.reset()` method because under
the hood statement caches some BTree state which can be no longer valid
at the moment of next statement run.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3859
2025-10-29 14:04:52 +02:00
Jussi Saurio
7a7cc832d6 Merge 'reset move_to_right_state cached state in case of quick balancing' from Nikita Sivukhin
Reset cached value for `move_to_right_state` in case of `balance_quick`.
I don't know if it's possible to hit this situation with current
generation of VM programs - so don't know what test I can add here.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3858
2025-10-29 14:04:40 +02:00
Nikita Sivukhin
c8be79ca94 cargo fmt 2025-10-29 15:15:45 +04:00
Nikita Sivukhin
a2d11f9263 reset cursors when statement is reseted 2025-10-29 15:13:00 +04:00
Nikita Sivukhin
35c323730c add test to reproduce the bug with cached cursors for statement in between of different runs
thread 'query_processing::test_read_path::test_stmt_reset' panicked at core/storage/sqlite3_ondisk.rs:754:9:
assertion failed: self.page_type() == PageType::TableLeaf
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test query_processing::test_read_path::test_stmt_reset ... FAILED
2025-10-29 15:13:00 +04:00
Nikita Sivukhin
9629e2f26a reset move_to_right_state cached state in case of quick balancing 2025-10-29 14:58:42 +04:00
Jussi Saurio
4bf8ad8cfd Merge 'Support subqueries in all positions of a SELECT statement' from Jussi Saurio
Follow-up to #3847.
Adds support for subqueries in all other positions of a SELECT (the
result list, GROUP BY, ORDER BY, HAVING, LIMIT, OFFSET).
Turns out I am a sql noob and didn't realize that correlated subqueries
are supported in basically all positions except LIMIT/OFFSET, so added
support for those too + accompanying TCL tests.
Thankfully the abstractions introduced in #3847 carry over to this very
well so the code change is relatively small (over half of the diff is
tests and a lot of the remaining diff is just moving logic around).

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3852
2025-10-29 10:19:39 +02:00
Jussi Saurio
fcb927ed24 Merge 'Initialize LIMIT after after ORDER BY / GROUP BY initialization' from Jussi Saurio
Closes #3853
Currently LIMIT 0 jumps to "after the main loop", and it is done before
ORDER BY and GROUP BY cursor have had a chance to be initialized, which
causes a panic.
Simplest fix for now is to delay the LIMIT initialization.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3854
2025-10-29 10:17:05 +02:00
Jussi Saurio
29fe3b585a Add more tests and disable correlated IN-subqueries in HAVING position
I discovered a flaw in our current translation that makes queries of type
HAVING foo IN (SELECT ...) not work properly - in these cases we need to
defer translation of the subquery until later.

I will fix this in a future PR because I suspect it's not trivial.
2025-10-29 09:57:55 +02:00
Jussi Saurio
ad723b615f Merge 'index_method: fully integrate into query planner' from Nikita Sivukhin
This PR completely integrate custom indices to the query planner.
In order to do that new `Cursor::IndexMethod` is introduced with few
correlated changes in the VM implementation:
1. Added special `IndexMethod{Create,Destroy,Query}` opcodes to handle
index method creation, deletion and query
2. `Next` , `IdxRowid` , `IdxInsert`, `IdxDelete` opcodes updated to
properly handle new cursor case

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3827
2025-10-29 09:42:37 +02:00
Pekka Enberg
f10431d24f perf/throughput: Improve reproducibility
Improve reproducibility by documenting the steps needed to run the
benchmarks and generate the plots. Also simplify plot generation a bit.
2025-10-28 15:08:53 +02:00
Pekka Enberg
067c4f624b Turso 0.3.0-pre.5 2025-10-28 14:49:34 +02:00
Pekka Enberg
dae2930743 Merge 'core: Switch to FxHash to improve performance' from Pekka Enberg
The default Rust hash map is slow for integer keys. Switch to FxHash
instead to reduce executed instructions for, for example, throughput
benchmark.
Before:
```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,106875.21

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,908.02 msec task-clock                       #    0.310 CPUs utilized
            30,508      context-switches                 #   10.491 K/sec
               261      cpu-migrations                   #   89.752 /sec
               813      page-faults                      #  279.572 /sec
    20,655,313,128      instructions                     #    1.73  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,930,088,949      cycles                           #    4.102 GHz
     2,845,040,381      stalled-cycles-frontend          #   23.85% frontend cycles idle
     3,814,652,892      branches                         #    1.312 G/sec
        54,760,600      branch-misses                    #    1.44% of all branches

       9.372979876 seconds time elapsed

       2.276835000 seconds user
       0.530135000 seconds sys
```
After:
```
penberg@turing:~/src/tursodatabase/turso/perf/throughput/turso$ perf stat ../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000
Turso,1,100,0,108663.84

 Performance counter stats for '../../../target/release/write-throughput --threads 1 --batch-size 100 --compute 0 -i 10000':

          2,838.65 msec task-clock                       #    0.308 CPUs utilized
            30,629      context-switches                 #   10.790 K/sec
               351      cpu-migrations                   #  123.650 /sec
               818      page-faults                      #  288.165 /sec
    19,887,102,451      instructions                     #    1.72  insn per cycle
                                                  #    0.14  stalled cycles per insn
    11,593,166,024      cycles                           #    4.084 GHz
     2,830,298,617      stalled-cycles-frontend          #   24.41% frontend cycles idle
     3,764,334,333      branches                         #    1.326 G/sec
        53,157,766      branch-misses                    #    1.41% of all branches

       9.218225731 seconds time elapsed

       2.231889000 seconds user
       0.508785000 seconds sys

```

Closes #3837
2025-10-28 14:49:09 +02:00
Pekka Enberg
76da008bc2 Merge 'bindings/rust: Enable mimalloc as global allocator' from Pekka Enberg
This improves performance by using mimalloc for memory allocation in the
Rust bindings.

Closes #3839
2025-10-28 14:49:02 +02:00
Pekka Enberg
810ed8ad60 Merge 'Don't allow autovacuum to be flipped on non-empty databases' from Pavan Nambi
Turso incorrectly creates the first table in an autovacuumed table in
page 2.
(Note: this is on collaboration with @LeMikaelF)
SQLite does not allow enabling or disabling auto-vacuum after the first
table has been created
(https://sqlite.org/pragma.html#pragma_auto_vacuum). This is because the
sequence of the pages in the databases is different when auto-vacuum is
enabled, because the first b-tree page must be page 3 instead of 2, to
make room for the first [Pointer Map
page](https://sqlite.org/fileformat.html#pointer_map_or_ptrmap_pages).
But Turso doesn't currently consider this, which can lead to data loss.
The simplest way to reproduce this is to create an autovacuumed
databases with either `pragma auto_vacuum=full` so that autovacuum runs
on each commit, and then create a table with some data. Turso will
incorrectly create the new table on page 2. After this, every time a new
page is created, either through a page split or because a new table is
created, Turso will write a 5-byte pointer in page 2, starting from the
top of the page, thereby overwriting existing data.
For example, let's start with a clean database and the first bytes of
page 2. It starts with `0d`, the discriminator for a leaf page
([source](https://www.sqlite.org/fileformat.html#b_tree_pages)). The
next interesting number is the number of cells contained in this page
(`01`) at offset 5.
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');

$ dbtotxt /tmp/a.db
| size 8192 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 0d 00 00 00 01 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
| end a.db
```
Pointer map pages are located every N pages, starting from page 2, and
contain a list of 5-byte pointers that represent the parent page of a
certain page. So whenever Turso or SQLite needs to add a page, it will
overwrite 5 bytes of page 2. This means that for data loss to occur, it
is sufficient to add a single page to the database, for example by
creating a table. Offset 5 will then be zeroed out:
```
$ cargo run -- /tmp/a.db
turso> create table t(a);
turso> insert into t values ('myvalue');
turso> pragma auto_vacuum=full;
turso> create table tt(a);

$ dbtotxt /tmp/a.db
| size 12288 pagesize 4096 filename a.db
| page 1 offset 0
# ...snip...
| page 2 offset 4096
|      0: 01 00 00 00 00 0f f5 00 0f f5 00 00 00 00 00 00   ................
|   4080: 00 00 00 00 00 09 01 02 1b 6d 79 76 61 6c 75 65   .........myvalue
```
Creating more tables, or adding more B-tree pages, will keep overwriting
the rest of the page, until the cells themselves are also overwritten.
## Reproducing the issue in the simulator
We have been unable to reproduce this exact corruption mode in the
simulator, but patching it shows many failure modes, all of which don't
occur with the unpatched simulator. The following seeds are failing. The
following seeds are showing the issue when the patched simulator is ran
against `main`:
- `11522841279124073062`, with "Assertion 'table inquisitive_graham_159
should contain all of its expected values' failed: table
inquisitive_graham_159 does not contain the expected values, the
simulator model has more rows than the database"
- `7057400018220918989`, `16028085350691325843`, `7721542713659053944`,
and `203017821863546118`, with "Failed to read ptrmap key=XXX"
- `12533694709304969540`, `18357088553315413457`, `3108945730906932377`,
with "Integrity Check Failed: Cell N in page 2 is out of range."
    - `4757352625344646473`, with "dirty pages should be empty for read
txn"
- `7083498604824302257`, with "header_size: 6272, header_len_bytes: 2,
payload.len(): 13"
- `17881876827470741581`, with "ParseError("no such table:
focused_historians_416")"
- `2092231500503735693`, with "range end index 4789 out of range for
slice of length 4096"
- `7555257419378470845`, with malformed database schema
(imaginative_ontivero\u{1})"
- `12905270229511147245`, with "index out of bounds: the len is 4096 but
the index is 4096"
## Fixing the issue
- When DB is opened, we read the `auto_vacuum` state, instead of
assuming `auto_vacuum=none`.
- Don't allow auto_vacuum to be flipped on non-empty databases as if we
allow this it could cause overlap with existing bits.(ptrmap could
overwrite existing data)
- Modify integrity check to avoid reporting that page 2 is orphaned in
auto-vacuumed databases.
Fixes #3752

Closes #3830
2025-10-28 14:48:35 +02:00
Jussi Saurio
ec1eac2943 Include subqueries in all positions in subquery fuzz test 2025-10-28 14:32:55 +02:00
Jussi Saurio
ca70df21ac Update COMPAT.md 2025-10-28 13:11:12 +02:00