Span creation in debug mode is very slow and impacts our ability to run
the Simulator faster.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2146
If we use `Assumption` here, the simulator just goes to the next
property instead of halting here.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2147
This PR updates to Rust version 1.88.0 ([Release
notes](https://releases.rs/docs/1.88.0/)) and fixes all the clippy
errors that come with the new Rust version.
This is possible in the latest Rust version:
```rust
if let Some(foo) = bar && foo.is_cool() {
...
}
```
There are three complications in the migration (so far):
- A BUNCH of Clippy warnings (mostly fixed in
  https://github.com/tursodatabase/limbo/pull/1827)
- Windows cross compilation failed; linking `advapi32` on Windows fixes
  it
  - Since Rust 1.87.0, `advapi32` is not linked by default anymore
    ([Release notes](https://github.com/rust-lang/rust/blob/master/RELEASES.md#compatibility-notes-1),
    [PR](https://github.com/rust-lang/rust/pull/138233))
- Rust is more strict with FFIs and aligning pointers now. CI checks
  failed with the error below
  - Fixed in https://github.com/tursodatabase/turso/pull/2064
```
thread 'main' panicked at
core/ext/vtab_xconnect.rs:64:25:
misaligned pointer dereference: address must be
a multiple of 0x8 but is 0x7ffd9d901554
```
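For readers unfamiliar with this class of panic, here is a minimal sketch of one common way to read a value from a pointer that may not be aligned. It is not necessarily what the linked fix does; `read_u64_at` is a hypothetical helper made up for illustration.

```rust
use std::ptr;

/// Hypothetical helper: reads a `u64` starting at an arbitrary byte offset.
/// A plain `*(p as *const u64)` dereference at such an offset can trigger the
/// "misaligned pointer dereference" panic in debug builds.
fn read_u64_at(buf: &[u8], offset: usize) -> u64 {
    assert!(offset + 8 <= buf.len(), "read out of bounds");
    // SAFETY: the range [offset, offset + 8) is in bounds (checked above),
    // and `read_unaligned` places no alignment requirement on the pointer.
    unsafe { ptr::read_unaligned(buf.as_ptr().add(offset).cast::<u64>()) }
}
```

A fully safe alternative with the same effect is `u64::from_ne_bytes` on an 8-byte subslice converted with `try_into()`.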
Closes #1807
We were passing the table columns' collations (all of them) in order,
instead of the index column collations. Two issues:
1. This is wrong
2. There's now an assertion in the Sorter that actually panics if the
length of sort order and collations is not the same
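The difference can be sketched roughly as follows. All types and the `sorter_args` function are hypothetical, and the sketch assumes each index column simply inherits the collation of the table column it points at.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Collation { Binary, NoCase, Rtrim }

#[derive(Debug, Clone, Copy, PartialEq)]
enum SortOrder { Asc, Desc }

// Hypothetical, simplified schema types.
struct TableColumn { name: &'static str, collation: Collation }
struct IndexColumn { pos_in_table: usize, order: SortOrder }

/// Builds the sorter inputs from the index columns only, so that both
/// vectors have one entry per index column, in index order.
fn sorter_args(
    table: &[TableColumn],
    index: &[IndexColumn],
) -> (Vec<SortOrder>, Vec<Collation>) {
    let orders: Vec<SortOrder> = index.iter().map(|c| c.order).collect();
    // Look up the collation through the index column, not by table position.
    let collations: Vec<Collation> = index
        .iter()
        .map(|c| table[c.pos_in_table].collation)
        .collect();
    // Mirrors the kind of length check described above.
    assert_eq!(orders.len(), collations.len());
    (orders, collations)
}
```

Passing every table column's collation instead would yield a vector that is both too long and in the wrong order whenever the index covers a subset or a permutation of the table columns.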
Closes #2140
> "But it ain't about how hard ya hit. It's about how hard you can get
hit and keep moving forward."
> -- Rocky
Fix ~obviously~ wrong (my bad) Linux build.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #2143
Closes #1998. Now I am queuing IO to be run at some later point in time.
Also, latency is for some reason slowing the simulator down a lot on some
runs.
This PR also adds a StateMachine variant in Balance as now `free_pages`
is correctly an asynchronous function. With this change, we now need a
state machine in the `Pager` so that `free_pages` can be reentrant.
Lastly, I removed a timeout in `checkpoint_shutdown` as it was
triggering constantly due to the slightly increased latency.
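To illustrate what "reentrant" means here, below is a minimal sketch of a function that saves its progress in a state enum and can be called again after pending IO completes. Every name in it (`IOResult`, `FreePageState`, the fields of `Pager`) is an assumption made for illustration and does not reflect the actual code.

```rust
/// Hypothetical result type: either finished, or "call me again after IO".
#[derive(Debug, PartialEq)]
enum IOResult<T> {
    Done(T),
    IO,
}

/// Hypothetical states; each one records where to resume.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FreePageState {
    Start,
    WaitForPageLoad { page_id: u64 },
    UpdateFreelist { page_id: u64 },
}

struct Pager {
    state: FreePageState,
    /// Stands in for "the queued page read has completed".
    page_loaded: bool,
    freelist: Vec<u64>,
}

impl Pager {
    fn new() -> Self {
        Pager { state: FreePageState::Start, page_loaded: false, freelist: Vec::new() }
    }

    /// Safe to call repeatedly: each call resumes from the saved state
    /// instead of starting over.
    fn free_page(&mut self, page_id: u64) -> IOResult<()> {
        loop {
            match self.state {
                FreePageState::Start => {
                    // A real implementation would queue a page read here.
                    self.state = FreePageState::WaitForPageLoad { page_id };
                }
                FreePageState::WaitForPageLoad { page_id } => {
                    if !self.page_loaded {
                        // IO still pending: yield to the caller, keep state.
                        return IOResult::IO;
                    }
                    self.state = FreePageState::UpdateFreelist { page_id };
                }
                FreePageState::UpdateFreelist { page_id } => {
                    self.freelist.push(page_id);
                    self.state = FreePageState::Start;
                    return IOResult::Done(());
                }
            }
        }
    }
}
```

The caller keeps invoking `free_page` until it returns `IOResult::Done`, running queued IO in between.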
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #1943
## What was wrong
While running simulations for #1988, I ran into a post-balance
validation error where the correct divider cell could not be found from
the parent.
This was caused by divider cell insertion happening this way:
- First divider cell caused overflow
- Second technically had space to fit, so we didn't add it to overflow
cells
- During balance validation, we were not able to find the divider in the
expected slot.
## First fix attempt
I looked at SQLite source, and it seems SQLite always adds the cell to
overflow cells if there are existing overflow cells, and doesn't allow
normal insertion even if the cell payload would fit:
```c
if( pPage->nOverflow || sz+2>pPage->nFree ){
...add to overflow cells...
}
```
So, I changed our implementation to do the same, which fixed the balance
validation issue.
## The sequel
However, then I ran into another issue:
A cell inserted during balancing in the `edit_page()` stage was added to
overflow cells, which should not happen. The reason for this was the
changed logic in `insert_into_page()`, outlined above. Since the page
being balanced contained not-yet-cleared overflow cells, any insert to
it ended up being shoved into the overflow cells vector too.
It looks like - unlike us - SQLite doesn't use the equivalent of
`insert_into_cell()` in its implementation of `page_insert_array()`
which explains this.
## Second fix
For simplicity, I made a second version of `insert_into_cell()` called
`insert_into_cell_during_balance()` which allows regular cell insertion
despite existing overflow cells, since the existing overflow cells are
what caused the balance to happen in the first place and will be cleared
as soon as `edit_page()` is done.
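The two insertion rules can be sketched side by side on a heavily simplified page model. The `Page` struct, its fields, and the behavior when a cell does not fit during balancing are assumptions for illustration only.

```rust
/// Each cell also needs a 2-byte cell pointer, hence `sz + 2` in the check.
const CELL_POINTER_SIZE: usize = 2;

/// Hypothetical, heavily simplified page model.
struct Page {
    free_bytes: usize,
    cells: Vec<Vec<u8>>,
    /// (intended cell index, payload) pairs that did not go onto the page.
    overflow_cells: Vec<(usize, Vec<u8>)>,
}

impl Page {
    fn fits(&self, payload: &[u8]) -> bool {
        payload.len() + CELL_POINTER_SIZE <= self.free_bytes
    }

    fn insert_regular(&mut self, idx: usize, payload: Vec<u8>) {
        self.free_bytes -= payload.len() + CELL_POINTER_SIZE;
        self.cells.insert(idx, payload);
    }

    /// Normal path: once any overflow cell exists, every further insert also
    /// goes to overflow, even if the payload would fit.
    fn insert_into_cell(&mut self, idx: usize, payload: Vec<u8>) {
        if !self.overflow_cells.is_empty() || !self.fits(&payload) {
            self.overflow_cells.push((idx, payload));
        } else {
            self.insert_regular(idx, payload);
        }
    }

    /// Balancing path: ignores existing overflow cells, since they are about
    /// to be cleared. Assumption: the cell is expected to fit at this point.
    fn insert_into_cell_during_balance(&mut self, idx: usize, payload: Vec<u8>) {
        assert!(self.fits(&payload), "cell must fit during balance");
        self.insert_regular(idx, payload);
    }
}
```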
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #2138
Doing e.g. `RUST_LOG=debug target/debug/tursodb foo.db 'SELECT * FROM
bar' &> output.txt` didn't generate traces because the tracer was
initialized after `app.first_run()`.
Closes #2114
This unblocks proper testing in the simulator where, especially with
indexes enabled, by far the most common reason for sim failure is the
cache being full.
Reviewed-by: Pekka Enberg <penberg@iki.fi>
Closes #2135
These are nearly always used together in some form, so it makes sense to
colocate them, and it also makes many code paths simpler, as we don't
separately pass `collations` and `key_sort_order` around.
As a side effect, since the bitfield-based `IndexKeySortOrder` is removed,
the arbitrary 64-column restriction for indexes is gone too. See e.g.
this sim failure, which fails due to 64+ index columns (not sure why it
uses an index if they are disabled):
https://github.com/tursodatabase/turso/actions/runs/16339391964/job/46158045158
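To show where a 64-column limit comes from, here is a sketch contrasting a `u64` bitfield with a per-column struct. `BitfieldSortOrder`, `IndexKeyInfo`, and the `Collation` variants are hypothetical stand-ins, not the actual types.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SortOrder { Asc, Desc }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Collation { Binary, NoCase, Rtrim }

/// Bitfield encoding: bit `i` set means column `i` sorts descending.
/// A `u64` has 64 bits, so at most 64 columns can be represented.
struct BitfieldSortOrder(u64);

impl BitfieldSortOrder {
    fn from_orders(orders: &[SortOrder]) -> Option<Self> {
        if orders.len() > 64 {
            return None; // the arbitrary limit
        }
        let mut bits = 0u64;
        for (i, order) in orders.iter().enumerate() {
            if *order == SortOrder::Desc {
                bits |= 1u64 << i;
            }
        }
        Some(BitfieldSortOrder(bits))
    }

    fn get(&self, i: usize) -> SortOrder {
        if (self.0 >> i) & 1 == 1 { SortOrder::Desc } else { SortOrder::Asc }
    }
}

/// Colocated alternative: one entry per index column, held in a `Vec`,
/// so the column count is bounded only by memory.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IndexKeyInfo {
    sort_order: SortOrder,
    collation: Collation,
}
```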
Closes #2131
<img height="400" alt="image" src="https://github.com/user-attachments/assets/bdd5c0a8-1bbb-4199-9026-57f0e5202d73" />
<img height="400" alt="image" src="https://github.com/user-attachments/assets/7ea63e58-2ab7-4132-b29e-b20597c7093f" />
We were copying the schema preemptively on each `Database::connect`. Now
the schema is shared through a single `Arc` until a change needs to be
made, at which point it is mutated via `Arc::make_mut`. This is faster and
reduces memory usage.
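A minimal sketch of this copy-on-write pattern with `Arc::make_mut` is shown below. The `Schema`, `Database`, and `Connection` structs are stripped-down stand-ins made up for the example, not the real types.

```rust
use std::sync::Arc;

/// Stand-in schema; `Clone` is required by `Arc::make_mut`.
#[derive(Clone, Debug, Default)]
struct Schema {
    tables: Vec<String>,
}

struct Database {
    schema: Arc<Schema>,
}

struct Connection {
    schema: Arc<Schema>,
}

impl Database {
    /// Connecting only bumps a reference count; nothing is copied.
    fn connect(&self) -> Connection {
        Connection { schema: Arc::clone(&self.schema) }
    }
}

impl Connection {
    /// `Arc::make_mut` clones the schema only if it is currently shared,
    /// then hands back a mutable reference to this connection's copy.
    fn add_table(&mut self, name: &str) {
        Arc::make_mut(&mut self.schema).tables.push(name.to_string());
    }
}
```

Connections that never change the schema keep pointing at the same allocation, so the cost of a copy is paid only by the connection that writes.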
Closes #2022
### Async IO performance, part 0
Relatively small and focused PR that mainly does two things. I will also
add a .md document describing the proposed/planned improvements to the
io_uring module to fully revamp our async IO.
1. **Registration of file descriptors.**
At startup, by calling `io_uring_register_files_sparse` we can allocate
an array in shared kernel/user space with each slot initialized to `-1`.
When we open a file, we call `io_uring_register_files_update`, providing
an index into this array and the `fd`.
Then for the IO submission, we can reference the index into this array
instead of the fd, saving the kernel the work of looking up the fd in
the process file table, incrementing the reference count, doing the
operation, then finally decrementing the refcount. Instead the kernel
can just index into the array and do the operation.
This is especially beneficial for cases like ours, where files stay open
for long periods of time and the kernel performs many operations on
them.
The eventual goal of this is to use Fixed read/write operations, where
both the file descriptor and the underlying buffer are registered with
the kernel. There is another branch continuing this work that
introduces a buffer pool which memlocks one large 32MB mmap'd arena and
tries to use that wherever possible.
These Fixed operations are essentially the "holy grail" of io_uring
performance (for file operations).
2. **!Vectored IO**
This is kind of backwards, because the goal is indeed to implement
proper vectored IO and I'm removing some of the plumbing in this PR, but
so far we have been using `Writev`/`Readv` while never submitting more
than one iovec at a time.
Writes to the WAL, especially, would benefit immensely from vectored IO,
as it is append-only and therefore all writes are contiguous. Regular
checkpointing/cache flushing to disk can also be adapted to aggregate
these writes and submit many in a single system call/opcode.
Until this is implemented, the bookkeeping and iovecs are unnecessary
noise/overhead, so let's temporarily remove them and revert to normal
`read`/`write` until they are needed and it can be designed from
scratch.
3. **Flags**
`setup_single_issuer` hints to the kernel that `io_uring_enter` calls
will all be sent from a single thread, and `setup_coop_taskrun` removes
some unnecessary kernel interrupts for delivering CQEs, which most
single-threaded applications do not need. Both flags show a modest
performance improvement.
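As a rough illustration of the vectored IO idea from item 2 (not of the io_uring code itself), the sketch below uses the standard library's `Write::write_vectored` to hand several contiguous buffers to the writer in one call. `append_frames` is a hypothetical helper, and a `Vec<u8>` stands in for the WAL file.

```rust
use std::io::{self, IoSlice, Write};

/// Hypothetical helper: submits all frames with a single vectored write
/// instead of issuing one `write` per frame.
/// Note: `write_vectored` may perform a short write; real code has to check
/// the returned byte count and retry with the remaining data.
fn append_frames<W: Write>(out: &mut W, frames: &[Vec<u8>]) -> io::Result<usize> {
    // One IoSlice per frame; the frame data itself is not copied here.
    let slices: Vec<IoSlice<'_>> = frames.iter().map(|f| IoSlice::new(f)).collect();
    out.write_vectored(&slices)
}
```

With an append-only target, the frames land back to back in submission order, which is why contiguous WAL writes are a natural fit for this kind of batching.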
Closes #2127
Enables formatting `Expr::Column` by adding the context to `ToTokens`
instead of creating a new unparsing implementation for each node.
`ToTokens` implemented for:
- [x] `UpdatePlan`
- [x] `Plan`
- [x] `JoinedTable`
- [x] `SelectPlan`
- [x] `DeletePlan`
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes #1949