Commit Graph

6899 Commits

Author SHA1 Message Date
pedrocarlo
cf951e24cd add state machine for is_empty_table in preparation for IO Completion refactor 2025-07-31 11:49:12 -03:00
pedrocarlo
7012860800 create separate state machines file 2025-07-31 11:49:12 -03:00
Preston Thorpe
bd9df6262f Merge 'IN queries' from Glauber Costa

Implement IN queries.
IN is currently a todo!(), but my main motivation is that, while scavenging
through EXPLAIN output, I found that this pattern, at least in simple queries
like SELECT ... IN (1,2,3), uses the AddImm instruction we just added.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2342
2025-07-31 10:00:18 -04:00
Jussi Saurio
eeceefe49d Merge 'fix/wal: only rollback WAL if txn was write + fix start state for WalFile' from Jussi Saurio
Closes #2363
## What
The following sequence of actions is possible:
```
Some committed frames already exist in the WAL. shared.pages_in_frames.len() > 0.

Brand new connection does this:
BEGIN
^-- deferred, no read tx started yet, so its `self.start_pages_in_frames` is `0`
       because it's a brand new WalFile instance

ROLLBACK   <-- calls `wal.rollback()` and truncates `shared.pages_in_frames` to length `0`

PRAGMA wal_checkpoint();
^-- because `pages_in_frames` is empty, it doesn't actually
checkpoint anything but still sets shared.max_frame to 0, effectively causing data loss
```
## Fix
- Only call `wal.rollback()` for write transactions
- Set `start_pages_in_frames` correctly so that this doesn't happen even
if a regression starts calling `wal.rollback()` again
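
The two parts of the fix can be illustrated with a minimal sketch. This is a toy model, not the actual `WalFile` code: `ToyWal`, `TxKind` and `rollback_tx` are hypothetical names, and the shared state is owned by a single struct for simplicity.

```rust
/// Hypothetical transaction kinds; the names are illustrative only.
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Debug)]
enum TxKind {
    NoTx,
    Read,
    Write,
}

/// Toy stand-in for a connection's view of the WAL.
struct ToyWal {
    /// Pages that currently have frames in the WAL (shared state in reality).
    pages_in_frames: Vec<u32>,
    /// Length of `pages_in_frames` when this connection started.
    start_pages_in_frames: usize,
}

impl ToyWal {
    fn new(shared_pages: Vec<u32>) -> Self {
        // Second part of the fix: start from the current length rather than 0,
        // so that a stray rollback cannot discard already committed frames.
        let start = shared_pages.len();
        ToyWal { pages_in_frames: shared_pages, start_pages_in_frames: start }
    }

    /// Drops every page appended after the transaction started.
    fn rollback(&mut self) {
        self.pages_in_frames.truncate(self.start_pages_in_frames);
    }
}

/// First part of the fix: only write transactions roll back the WAL.
fn rollback_tx(wal: &mut ToyWal, tx: TxKind) {
    if tx == TxKind::Write {
        wal.rollback();
    }
}
```

With this shape, a `ROLLBACK` on a deferred transaction that never started a write leaves `pages_in_frames` untouched, and even a direct `rollback()` call only truncates back to the length observed at start.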

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2366
2025-07-31 16:16:20 +03:00
Jussi Saurio
998d288cb8 Merge 'vdbe: Disallow checkpointing in transaction' from Jussi Saurio
Closes #2358

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2365
2025-07-31 16:12:49 +03:00
Glauber Costa
9d41fa4489 implement IN patterns for non-conditional SELECT queries
Extracts the core logic of IN from the conditional version, and uses the
conditional metadata to determine the jump. Then uses the AddImm
operator we just added to force the integer conversion at the end (like
SQLite does).
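
A minimal sketch of how adding an immediate of 0 can force a register to an integer. The `Value` enum and `add_imm` function are hypothetical and simplified: text and blob values are omitted, NULL is treated as 0, and overflow wraps, all of which are assumptions of this sketch rather than statements about the real implementation.

```rust
/// Simplified register value; text and blob variants are left out of this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Real(f64),
}

/// Adds the constant `imm` to the register and always leaves an integer behind.
fn add_imm(reg: &mut Value, imm: i64) {
    let as_int = match *reg {
        // Assumption: NULL is converted to 0 before the addition.
        Value::Null => 0,
        Value::Integer(i) => i,
        // `as` truncates toward zero and saturates at the i64 bounds.
        Value::Real(f) => f as i64,
    };
    *reg = Value::Integer(as_int.wrapping_add(imm));
}
```

Calling `add_imm(&mut reg, 0)` therefore changes only the type of the register, which is the integer conversion described above.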
2025-07-31 08:11:41 -05:00
Glauber Costa
9e8ba5263b Implement the AddImm opcode
It is a simple opcode. The hard part was finding a sqlite statement
that uses it =)
2025-07-31 08:08:07 -05:00
Jussi Saurio
218c2e65ff Merge 'fix/bindings/rust: return errors instead of swallowing them and returning None' from Jussi Saurio
Closes #2359

Closes #2360
2025-07-31 15:44:34 +03:00
Jussi Saurio
981175d80a Merge 'fix/wal: make db_changed check detect cases where max frame happens to be the same' from Jussi Saurio
Closes #2361 (already been reopened once)
A `max_frame` check alone is not enough: connections can have the same
max frame even though the DB has changed. Compare checksums and checkpoint
sequences too.
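
As a rough sketch of the comparison, assuming a connection keeps a small snapshot of the WAL state it last observed. `WalSnapshot`, its field names and `db_changed` as written here are hypothetical, not the actual types in the codebase.

```rust
/// Hypothetical snapshot of the WAL state, as last seen by a connection
/// or as currently held in the shared WAL.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct WalSnapshot {
    max_frame: u64,
    /// Running WAL checksum (two 32-bit halves).
    last_checksum: (u32, u32),
    /// Incremented each time the WAL is checkpointed and restarted.
    checkpoint_seq: u32,
}

/// Returns true if the shared state differs from what the connection last saw.
/// Comparing `max_frame` alone would miss a WAL that was restarted and then
/// refilled to the same frame count.
fn db_changed(seen: &WalSnapshot, shared: &WalSnapshot) -> bool {
    seen.max_frame != shared.max_frame
        || seen.last_checksum != shared.last_checksum
        || seen.checkpoint_seq != shared.checkpoint_seq
}
```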

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2367
2025-07-31 15:44:04 +03:00
Jussi Saurio
62e804480e fix/wal: make db_changed check detect cases where max frame happens to be the same 2025-07-31 14:37:33 +03:00
Jussi Saurio
e88707c6fd fix/wal: only rollback WAL if txn was write 2025-07-31 14:18:43 +03:00
Jussi Saurio
9e1fca2eba vdbe: disallow checkpointing in interactive tx 2025-07-31 13:16:33 +03:00
Jussi Saurio
2cbbe1afa5 Merge 'fix/wal: reset page cache when another connection checkpointed in between' from Jussi Saurio
Closes #2361

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2362
2025-07-31 13:10:13 +03:00
Jussi Saurio
39dec647a7 fix/wal: reset page cache when another connection checkpointed in between 2025-07-31 12:44:22 +03:00
Jussi Saurio
6e2218c3ed fix/bindings/rust: return errors instead of swallowing them and returning None 2025-07-31 11:57:17 +03:00
Pekka Enberg
bac3add778 Merge 'Fix merge script to prevent incorrectly marking merged contributor PRs as closed' from Preston Thorpe
As it is now, the script incorrectly marks contributor PRs as `Closed`
instead of `Merged` after merging about 50% of the time.
We are super fortunate to have such an awesome community here working on
this project, let's make sure that all our contributors' GitHub profiles
reflect the work they are putting in 😄

Closes #2354
2025-07-31 11:03:50 +03:00
Pekka Enberg
d5c5839ee4 Merge 'Serverless JavaScript driver improvements' from Pekka Enberg
Closes #2349
2025-07-31 10:14:22 +03:00
Jussi Saurio
7d082ab614 small fix after header accessor refactor 2025-07-31 10:05:52 +03:00
Jussi Saurio
f619556344 Merge 'Direct DatabaseHeader reads and writes – with_header and with_header_mut' from Levy A.
This PR introduces two methods on the pager, very much inspired by
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` pass the closure a shared and a unique
reference, respectively, both transmuted from the `PageRef` buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
    let buf = self.as_ptr();
    buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but can also be used on their own when there are multiple reads and writes
to the header, where putting everything in a closure would add too much
nesting.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2234
2025-07-31 10:02:47 +03:00
Jussi Saurio
62d79e8c16 Merge 'refactor/btree: simplify get_next_record()/get_prev_record()' from Jussi Saurio
When traversing, we are only interested in the following things:
- Is the page a leaf or not
- Is the page an index or table page
- If not a leaf, what is the left child page
This means we don't have to read the entire cell, just the left child
page.
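
A small sketch of reading just the left child pointer, assuming the SQLite file format, where an interior cell begins with a 4-byte big-endian page number. The function name `left_child_page` is hypothetical and is not taken from the refactored code.

```rust
/// Returns the left child page number stored in the first 4 bytes of an
/// interior b-tree cell, or `None` if the cell is too short.
/// The rest of the cell (key, payload) is never decoded.
fn left_child_page(cell: &[u8]) -> Option<u32> {
    let b = cell.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}
```

Skipping the remainder of the cell avoids parsing varints and payloads that the traversal does not need.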

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2317
2025-07-31 10:02:08 +03:00
Jussi Saurio
99e20e46bb Merge 'Accumulate/batch vectored writes when backfilling during checkpoint' from Preston Thorpe
After significant digging into why the io_uring back-end was so much
slower (particularly for writes), it was determined that checkpointing in
particular was incredibly slow, for several reasons. One is that we
essentially end up calling `submit_and_wait` for every page.
This PR (which, of course, heavily conflicts with my other open PR) attempts
to remedy this by adding `pwritev` to the File trait for IO back-ends that
want to support it, and aggregating contiguous writes into a series of
`pwritev` calls instead of issuing them individually.
### Performance:
`make bench-vfs SQL="insert into products (name,price) values
(randomblob(4096), randomblob(2048));" N=1000`
# Update:
**main**
<img width="505" height="194" alt="image" src="https://github.com/user-attachments/assets/8e4a27af-0bb6-4e01-8725-00bc9f8a82d6" />
**this branch**
<img width="555" height="197" alt="image" src="https://github.com/user-attachments/assets/fad1f685-3cb0-4e06-aa9d-f797a0db8c63" />
On this updated branch, the same test (or any test with writes) is now
roughly as fast as the syscall IO back-end, and runs are often faster.
The following illustrates a checkpoint: every `count=N` where N > 1
represents M syscalls saved, where M = N - 1.
(roughly 850 syscalls saved)
<img width="590" height="534" alt="image" src="https://github.com/user-attachments/assets/a6171ac9-1192-4d3e-a6bf-eeda3f43af07" />
(if you are wondering about why it didn't add 12000-399 and 12400-417,
it's because there is a `512` page batch limit that was hit to prevent
hitting `IOV_MAX`, in the rare case that it's lower than 1024 and the
entire checkpoint is a single run)
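
The batching idea can be sketched as follows, assuming the pages to backfill arrive as sorted page numbers. `Run`, `group_contiguous`, `syscalls_saved` and `MAX_BATCH_PAGES` are hypothetical names for illustration; the real code works on page buffers and issues the `pwritev` calls itself.

```rust
/// Cap on pages per batch, mirroring the 512-page limit mentioned above.
const MAX_BATCH_PAGES: usize = 512;

/// A run of consecutive pages that could be written with one vectored write.
#[derive(Debug, PartialEq, Eq)]
struct Run {
    first_page: u64,
    count: usize,
}

/// Groups sorted page numbers into runs of consecutive pages,
/// starting a new run whenever there is a gap or the cap is reached.
fn group_contiguous(pages: &[u64], max_batch: usize) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &page in pages {
        if let Some(run) = runs.last_mut() {
            if run.count < max_batch && run.first_page + run.count as u64 == page {
                run.count += 1;
                continue;
            }
        }
        runs.push(Run { first_page: page, count: 1 });
    }
    runs
}

/// Each run of N pages replaces N single-page writes with one call,
/// so it saves N - 1 syscalls.
fn syscalls_saved(runs: &[Run]) -> usize {
    runs.iter().map(|r| r.count - 1).sum()
}
```

For example, pages 1, 2, 3, 7, 8, 10 group into three runs (3, 2 and 1 pages), turning six writes into three.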

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2278
2025-07-31 07:30:57 +03:00
PThorpe92
c35fa2416d Merge 'Add cli Dockerfile' from Pere Diaz Bou
Shamelessly vibe coded shit to add simple docker image to run the cli
:).
`docker build -f Dockerfile.cli -t turso-cli . && docker run -it turso-cli`

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2325
2025-07-31 00:05:30 -04:00
PThorpe92
9cba19309e Add .dockerignore and Makefile commands to support docker 2025-07-31 00:00:44 -04:00
PThorpe92
ab22dafbe1 Fix merge_pr.py script to avoid marking contributor PRs as closed 2025-07-30 22:49:57 -04:00
PThorpe92
07137c7aaf Merge 'Implement the Cast opcode' from Glauber Costa
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.
Instead, there is a single Cast opcode, that takes the affinity as a
parameter.
Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2352
2025-07-30 22:32:09 -04:00
PThorpe92
fcf634c82b Merge 'remove non-existent opcode' from Glauber Costa
$ egrep -rI "define OP" sqlite3.c | grep Cookie
sqlite3.c:#define OP_ReadCookie     99
sqlite3.c:#define OP_SetCookie     100

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2353
2025-07-30 22:22:43 -04:00
Glauber Costa
caec3f7c51 remove non-existent opcode
$ egrep -rI "define OP" sqlite3.c | grep Cookie
sqlite3.c:#define OP_ReadCookie     99
sqlite3.c:#define OP_SetCookie     100
2025-07-30 20:50:00 -05:00
Glauber Costa
4bd1582e7d Implement the Cast opcode
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.

Instead, there is a single Cast opcode, that takes the affinity as a
parameter.

Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.
2025-07-30 20:44:54 -05:00
PThorpe92
2e741641e6 Add test to assert we are backfilling all the rows properly with vectored writes 2025-07-30 19:42:54 -04:00
PThorpe92
ade1c182de Add is_full method to checkpoint batch 2025-07-30 19:42:54 -04:00
PThorpe92
693b71449e Clean up writev batching and apply suggestions 2025-07-30 19:42:53 -04:00
PThorpe92
ef69df7258 Apply review suggestions 2025-07-30 19:42:53 -04:00
PThorpe92
73882b97d6 Remove unnecessary collecting CQEs into an array in run_once, comments 2025-07-30 19:42:53 -04:00
PThorpe92
28283e4d1c Fix bench_vfs python script to use fresh db for each run 2025-07-30 19:42:52 -04:00
PThorpe92
efcffd380d Clean up io_uring writev implementation, add iovec and cqe cache 2025-07-30 19:42:52 -04:00
PThorpe92
689007cb74 Remove unrelated io_uring changes 2025-07-30 19:42:52 -04:00
PThorpe92
c0800ecc29 Update test to match cacheflush behavior 2025-07-30 19:42:51 -04:00
PThorpe92
b8e6cd5ae2 Fix taking page content from cached pages in checkpoint loop 2025-07-30 19:42:51 -04:00
PThorpe92
b04128b585 Fix write_pages_vectored to properly track completion 2025-07-30 19:42:50 -04:00
PThorpe92
0f94cdef03 Fix io_uring pwritev to properly handle partial writes 2025-07-30 19:42:50 -04:00
PThorpe92
88445328a5 Handle partial writes for pwritev calls in io_uring and fix JS bindings 2025-07-30 19:42:50 -04:00
PThorpe92
daec8aeb22 impl pwritev for simulator file 2025-07-30 19:42:49 -04:00
PThorpe92
5f01eaae35 Fix default io::File::pwritev impl 2025-07-30 19:42:49 -04:00
PThorpe92
62f004c898 Fix write counter for writev batching in checkpoint 2025-07-30 19:42:49 -04:00
PThorpe92
d189f66328 Add pwritev to wasm/js api 2025-07-30 19:42:48 -04:00
PThorpe92
7b2163208b batch backfilling pages when checkpointing 2025-07-30 19:42:48 -04:00
Levy A.
2bde1dbd42 fix: PageSize bounds check 2025-07-30 17:33:59 -03:00
Levy A.
fe66c61ff5 add usable_space to DatabaseHeader
we already have the `DatabaseHeader`, we don't need the cached result
2025-07-30 17:33:59 -03:00
Levy A.
e35fdb8263 feat: zero-copy DatabaseHeader 2025-07-30 17:33:59 -03:00
Pekka Enberg
9bd053033a serverless: Fix Connection.run() implementation 2025-07-30 21:42:45 +03:00