Commit Graph

10819 Commits

Author SHA1 Message Date
Nikita Sivukhin
a25e3e76eb wip 2025-11-12 13:21:34 +04:00
Nikita Sivukhin
6f7edcaddd agent review fixes 2025-11-12 12:32:45 +04:00
Nikita Sivukhin
be12ca01aa add is_hole / punch_hole optional methods to IO trait and remove is_hole method from Database trait 2025-11-12 12:04:42 +04:00
Nikita Sivukhin
b73ff13b88 add simple implementation of Sparse IO 2025-11-12 12:04:12 +04:00
Nikita Sivukhin
d519945098 make ArenaBuffer unsafe Send + Sync 2025-11-12 10:54:40 +04:00
Nikita Sivukhin
33375697d1 add partial database storage implementation 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
a855a657aa report network stats 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
02275a6fa1 fix js bindings 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
98db727a99 integrate extra io stepping logic to the JS bindings 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
f3dc19cb00 UNSAFE: make Completion to be Send + Sync 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
d42b5c7bcc wip 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
95f31067fa add has_hole API in the DatabaseStorage trait 2025-11-12 10:53:25 +04:00
Nikita Sivukhin
34f1072071 add hooks to plug partial sync in the sync engine 2025-11-12 10:53:25 +04:00
Preston Thorpe
dad7feffca Merge 'Completion: make it Send + Sync' from Nikita Sivukhin
This PR makes Completion to be `Send` and also force internal callbacks
to be `Send`.
The reasons for that is following:
1. `io_uring` right now can execute completion at any moment potentially
on arbitrary thread, so we already implicitly rely on that property of
`Completion` and its callbacks
2. In case of partial sync
(https://github.com/tursodatabase/turso/pull/3931), there will be an
additional requirement for Completion to be Send as it will be put in
the separate queue associated with `DatabaseStorage` (which is Send +
Sync) processed in parallel with main IO
3. Generally, it sounds pretty natural in the context of async io to
have `Send` Completion so it can be safely transferred between threads
The approach in the PR is hacky as `Completion` made `Send` in a pretty
unsafe way. The main reason why Rust can't derive `Send` automatically
is following:
1. Many completions holds `Arc<Buffer>` internally which needs to be
marked with unsafe traits explicitly as it holds `ptr: NonNull<u8>`
2. `Completion` holds `CompletionInner` as `Arc` which internally holds
completion callback as `Box<XXXComplete>`, but because it's guarded by
`Arc` - Rust forces completion callback to also be Sync (not only Send)
and as we usually move Completion in the callback - we get a cycle here
and with current code Send for Completion implies Sync for Completion.
So, in order to fix this, PR marks `ArenaBuffer` as Send + Sync and
forces completion callbacks to be Send + Sync too. It's seems like
`Sync` requirement is theoretically unnecessary and `Send` should be
enough - but with current code organization Send + Sync looks like the
simplest approach.
Making `ArenaBuffer` Sync sounds almost correct, although I am worried
about read/write access to it as internally `ArenaBuffer` do not
introduce any synchronization of its reads/writes - so potentially we
already can hit some multi-threading bugs with io_uring do to
`ArenaBuffer` used from different threads (or maybe there are some
implicit memory barriers in another parts of the code which can
guarantee us that we will properly use `ArenaBuffer` - but this sounds
like a pure luck)

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3935
2025-11-11 20:10:52 -05:00
Nikita Sivukhin
78b6eeae80 cargo fmt 2025-11-11 22:47:25 +04:00
Nikita Sivukhin
5e09c4f0c0 make completion send + sync 2025-11-11 22:42:20 +04:00
Nikita Sivukhin
9a9aacaf32 fix compilation 2025-11-11 22:22:34 +04:00
Nikita Sivukhin
6e3b364bb5 make completion callbacks Send
- IO uring already use this because it can invoke callback on another thread
2025-11-11 21:44:12 +04:00
Pere Diaz Bou
c4d89662a8 Merge 'core/mvcc: use btree cursor to navigate rows' from Pere Diaz Bou
The current implementation is simple, we have a pointer called
`CursorPosition::Loaded` that points to a rowid and if it's poiting to
either btree or mvcc.
Moving with `next` will `peek` both btree and mvcc to ensure we load the
correct next value. This draws some inefficiencies for now as we could
simply skip one or other in different cases.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Combine MVCC index with a BTree-backed lazy cursor (including rootpage
mapping) and add row-version state checks, updating VDBE open paths and
tests.
>
> - **MVCC Cursor (`core/mvcc/cursor.rs`)**:
>   - Introduce hybrid cursor that merges MVCC index with `BTreeCursor`;
enhanced `CursorPosition` (tracks `in_btree`/`btree_consumed`).
>   - Implement state machine for `next`, coordinating MVCC/BTree
iteration and filtering via `RowVersionState`.
>   - `current_row()` now yields immutable records from BTree or MVCC;
add `read_mvcc_current_row`.
>   - Update `rowid`, `seek`, `rewind`, `last`, `seek_to_last`,
`exists`, `insert` to honor hybrid positioning.
> - **MVCC Store (`core/mvcc/database/mod.rs`)**:
>   - Add `RowVersionState` and `find_row_last_version_state`.
>   - Remove eager table initialization/scan helpers and `loaded_tables`
tracking.
>   - Add `get_real_table_id` for mapping negative IDs to physical root
pages.
> - **VDBE (`core/vdbe/execute.rs`)**:
>   - Route BTree cursor creation through
`maybe_transform_root_page_to_positive` and promote to `MvCursor`
without pager arg.
>   - Apply mapping in `OpenRead`, `OpenWrite`, `OpenDup`, and index
open paths.
> - **Tests (`core/mvcc/database/tests.rs`)**:
>   - Adjust to new cursor API; add coverage for BTree+MVCC iteration
and gaps after checkpoint/restart.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
b581519be4. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Closes #3829
2025-11-11 17:53:17 +01:00
Pere Diaz Bou
b581519be4 more clippy 2025-11-10 17:20:15 +01:00
Pere Diaz Bou
32469bad10 clippy mvcc 2025-11-10 17:13:34 +01:00
Pere Diaz Bou
a08b5f2239 core/mvcc: next and rewind skip btree rows that are in should be updated/deleted in mvcc 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
2fd4407a03 core/execute: map negative root page to positive if we can 2025-11-10 16:51:01 +01:00
Pere Diaz Bou
9004d4f3f1 core/mvcc: remove intialize of mvcc table 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
58f5b9c018 core/mvcc: is_btree_allocated fix 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
420447d6bd core/mvcc/tests: fix use read_mvcc_current_row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
198e0434d0 core/mvcc/cursor: current_row return either btree or mvcc 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
e78590b948 core/mvcc: add is_btree_allocated to MvccId 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
4b616d1fd8 core/mvcc/cursor: next use both btree cursor and mvcc cursor to decide on row 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7b7bf6738c core/mvcc/tests: test mixed btree mvcc cursor 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
7d930e3df3 core/mvcc/test: add test for restart after checkpoint 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
724bc94f96 core/mvcc/cursor: rewind with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
a7614267af core/mvcc/cursor: next with btree 2025-11-10 16:48:13 +01:00
Pere Diaz Bou
38f6d20def core/mvcc/cursor: CursorPosition::Loaded include if points to btree 2025-11-10 16:48:13 +01:00
Jussi Saurio
a47ac6cb96 Commit changes to workspace Cargo.lock 2025-11-10 11:58:09 +02:00
Jussi Saurio
d0da6b5d16 Merge 'Fix seek not applying correct affinity to seek expr' from Pedro Muniz
Depends on #3923 .
To have similar semantics to how `op_compare` works, we need to apply an
affinity to the values referenced in the `SeekKey` that is used for
seeking. This means keeping some affinity metadata for the `WhereTerms`
in the optimization phase, then before seeking, we emit an affinity
conversion.  Had to dig deep in the sqlite code to understand this
better.
Unfortunately, we cannot have just one compare function to rule them all
here, as we have a specialized/optimized compare code to handle records
that have not yet been deserialized.
Closes #3707

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3925
2025-11-10 11:28:29 +02:00
Jussi Saurio
b024fdb17d Merge 'core: update aegis' from Daeho Ro
It seems that the build on macos arm is failing with `aegis` v0.9.0.
So, here I update `aegis`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3561
2025-11-10 11:27:01 +02:00
pedrocarlo
176fa283bf add some ai generated queries to test for affinity related queries 2025-11-10 11:15:54 +02:00
pedrocarlo
32535ef4ed only emit affinity check on index seek + check if affinity is necessary at all 2025-11-10 11:15:54 +02:00
pedrocarlo
27e234f949 add affinity of the expr in the seek key, and emit affinity instruction before seeking 2025-11-10 11:15:54 +02:00
Pekka Enberg
e929c252b4 Merge 'bindings/java: implement stream binding methods (int, InputStream, int) in JDBC4PreparedStatement' from Orange banana
## Purpose
* Implement `setAsciiStream(int, InputStream, int)`,
`setUnicodeStream(int, InputStream, int)`, and `setBinaryStream(int,
InputStream, int)` methods in JDBC4PreparedStatemen
## Changes
* `setAsciiStream(int, InputStream, int)`: Reads ASCII bytes, converts
to `String` using `US_ASCII` and binds with `bindText()`.
* `setUnicodeStream(int, InputStream, int)`: Reads bytes as `UTF-8`
encoded text and binds with `bindText()`.
* `setBinaryStream(int, InputStream, int)`: Reads raw bytes and binds
with `bindBlob()`.
* Added consistent error handling and validation
  * null stream - `bindNull()`
  * Negative length - throws `SQLException`
  * Empty stream  - Empty String or Empty Array
  * I/O errors - throw `SQLException`
* Ensures consistency between `setXxxStream` and `getXxxStream` methods,
so data written and read use the same encoding.
## Related Issue
* #615

Reviewed-by: Kim Seon Woo (@seonWKim)

Closes #3917
2025-11-10 11:07:08 +02:00
Pekka Enberg
d872237ca8 Merge 'workflows: Add GITHUB_TOKEN to all Nyrkiö steps' from Henrik Ingo
Previously we didn't use GITHUB_TOKEN for anything. But now that PR
meta-data must be fetched with a extra GitHub API call, then PRs at
least will always nedd GITHUB_TOKEN.

Closes #3918
2025-11-10 09:03:38 +02:00
Pekka Enberg
b74ddf30f9 Merge 'extensions/vtabs: implement remaining opcodes' from Preston Thorpe
The only real benefit right now here is the ability to rename virtual
tables.
Then this now properly calls `VBegin` at the start of a vtab write
transaction, despite none of our extensions needing or implementing
transactions at this point.
```console
explain insert into t values ('key','value');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     VOpen              0     0     0                    0    t
2     VBegin             0     0     0                    0   
3     Null               0     1     0                    0   r[1]=NULL
4     Null               0     3     0                    0   r[3]=NULL
5     String8            0     4     0     key            0   r[4]='key'
6     String8            0     5     0     value          0   r[5]='value'
7     VUpdate            0     5     1                    0   args=r[1..5]
8     Close              0     0     0                    0   
9     Halt               0     0     0                    0   
10    Transaction        0     2     1                    0   iDb=0 tx_mode=Write
11    Goto               0     1     0                    0   
Exiting Turso SQL Shell.
```

Closes #3930
2025-11-10 09:03:07 +02:00
Pekka Enberg
7891be96fd Merge 'Refactor affinity conversions for reusability' from Pedro Muniz
Depends on #3920
Moves some code around so it is easier to reuse and less cluttered in
`execute.rs`, and changes how `compare` works. Instead of mutating some
register, we now just return the possible `ValueRef` representation of
that affinity. This allows other parts of the codebase to reuse this
logic without needing to have an owned `Value` or a `&mut Register`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3923
2025-11-10 09:02:22 +02:00
Pekka Enberg
2be515247f Merge 'Create AsValueRef trait to allow us to be agnostic over ownership of Value or ValueRef' from Pedro Muniz
Depends on #3919
Also change `op_compare` to reuse the same compare_immutable logic
First step to finish #2304

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3920
2025-11-10 09:01:59 +02:00
Pekka Enberg
4bb0edac5e Merge 'Move value functions to separate file' from Pedro Muniz
Makes it easier to visualize what is related to Value and what is
related to opcodes. This will also facilitate in my next PR to
generalize certain function over `Value` and `ValueRef` as listed in
#2304

Closes #3919
2025-11-10 09:01:29 +02:00
Preston Thorpe
49f9b74c56 Merge 'Avoid heavy macro' from Nikita Sivukhin
Rewrite `parse_modifier` function because its current version lead to
enormous amount of generated LLVM code which significantly increase
compilation time
```sh
$> cargo llvm-lines
  Lines                  Copies               Function name
  -----                  ------               -------------
  1322611                29544                (TOTAL)
   278720 (21.1%, 21.1%)     1 (0.0%,  0.0%)  turso_core::functions::datetime::parse_modifier
```
Before:
```sh
$> cargo check
warning: `turso_core` (lib) generated 2 warnings
    Finished `dev` profile [unoptimized] target(s) in 5.61s
```
After:
```sh
$> cargo check
warning: `turso_core` (lib) generated 2 warnings
    Finished `dev` profile [unoptimized] target(s) in 2.24s
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3929
2025-11-09 11:37:06 -05:00
PThorpe92
5c207618a7 Fix extensions py test 2025-11-09 11:35:57 -05:00
PThorpe92
b443b09516 Remove VRollback and VCommit as they are unused opcodes in sqlite 2025-11-09 11:27:09 -05:00
PThorpe92
94b6d254a9 Fix comment on vtab_txn_states 2025-11-09 11:08:52 -05:00