This PR introduces two methods to pager. Very much inspired by
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` will give to the closure a shared and unique
reference respectively that are transmuted references from the `PageRef`
buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
let buf = self.as_ptr();
buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but also can be used on its own when there are multiple reads and writes
to the header, where putting everything in a closure would add too much
nesting.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2234
When traversing, we are only interested the following things:
- Is the page a leaf or not
- Is the page an index or table page
- If not a leaf, what is the left child page
This means we don't have to read the entire cell, just the left child
page.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2317
After significant digging into what was causing (particularly writes) to
be so much slower for io_uring back-end, it was determined that
particularly checkpointing was incredibly slow, for several reasons. One
is that we essentially end up calling `submit_and_wait` for every page.
This PR (of course, heavily conflicts with my other open PR) attempts to
remedy this: addding `pwritev` to the File trait for IO back-ends that
want to support it, and aggregates contiguous writes into a series of
`pwritev` calls instead of individually
### Performance:
`make bench-vfs SQL="insert into products (name,price) values
(randomblob(4096), randomblob(2048));" N=1000`
# Update:
**main**
<img width="505" height="194" alt="image" src="https://github.com/user-
attachments/assets/8e4a27af-0bb6-4e01-8725-00bc9f8a82d6" />
**this branch**
<img width="555" height="197" alt="image" src="https://github.com/user-
attachments/assets/fad1f685-3cb0-4e06-aa9d-f797a0db8c63" />
The same test (any test with writes) on this updated branch is now
roughly as fast as syscall IO back-end, often runs will be faster.
Illustrating a checkpoint. Every `count=N` where N > 1 is M syscalls
saved, where M = N - 1.
(roughly ~850 syscalls saved)
<img width="590" height="534" alt="image" src="https://github.com/user-
attachments/assets/a6171ac9-1192-4d3e-a6bf-eeda3f43af07" />
(if you are wondering about why it didn't add 12000-399 and 12400-417,
it's because there is a `512` page batch limit that was hit to prevent
hitting `IOV_MAX`, in the rare case that it's lower than 1024 and the
entire checkpoint is a single run)
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2278
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.
Instead, there is a single Cast opcode, that takes the affinity as a
parameter.
Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.
Reviewed-by: Preston Thorpe (@PThorpe92)
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2352
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.
Instead, there is a single Cast opcode, that takes the affinity as a
parameter.
Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.
closes: #1790
Updates test_vector_distance to treat invalid inputs as non-errors,
skipping them instead. These cases aren't considered real errors, so no
explicit error handling is needed in the test.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2341
closes: #2101
Refactors exec_concat_ws to skip null and blob arguments instead of
inserting separators for them. Also adds a fuzz test.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2338
When traversing, we are only interested the following things:
- Is the page a leaf or not
- Is the page an index or table page
- If not a leaf, what is the left child page
This means we don't have to read the entire cell, just the left child
page.
closes#1893
Adds some fairly extensive tests but I'll continue to add some python
tests on top of the unit tests.
## Restart:
tested ✅
- open new DB
- create table and do a bunch of inserts
- `pragma wal_checkpoint(RESTART);`
- close db file
- re-open and verify we can read the wal/repopulate the frame cache
- verify min|max frame
tested ✅
- open same DB
- add more inserts
- `pragma wal_checkpoint(RESTART);`
- do _more_ inserts
- close
- re-open
- verify checksums/max_frame are valid
- verify row count
## Truncate
tested ✅
- open new db
- create table and add inserts
- `pragma wal_checkpoint(truncate);`
- close file
- verify WAL file is empty (32 bytes, header only)
- re-open file
- verify content/row count
tested ✅
- open db
- create table and insert many rows
- `pragma wal_checkpoint(truncate);`
- insert _more_ rows
- close db file
- verify WAL file is valid
- re-open file
- verify we can read entire file/repopulate the frame cache
<img width="541" height="315" alt="image" src="https://github.com/user-
attachments/assets/0470c795-5116-4866-b913-78c07b06b68c" />
```
# header
magic=0x377f0682
version=3007000
page_size=4096
seq=2
salt=ec475ff2-7ea94342
checksum=c9464aff-c571cc22
```
Closes#2179
Closes: #2323
This PR adds support for two new vector functions:
* vector_concat(x, y) – Concatenates two vectors of the same type.
* vector_slice(x, start_index, end_index) – Extracts a subvector from
the input vector.
Notes:
* Negative start_index or end_index is not supported
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes#2336