We make read_record faster by not allocating Vec if not needed. This is
why I introduced a simple `SmallVec<T>` that will have a stack allocated
list for the simplest workloads, and a heap allocated if we were to
require more stuff.
Both read_varint and read_value, at least in my mac m4, were not
inlined. Since these functions are called so many times it made sense to
inline them to avoid call overhead. With this I saw something like 20%
improvement over previous commit in my m4.
Improve allocation usage from ImmutableRecords by reusing them.
ImmutableRecord is basically a contigous piece of memory that holds the
current record. If we move to some other record we usually deallocate
the previous one and allocate a new one -- obviously this is wasteful.
With this commit we will reuse the ImmutableRecord to allow payload to
be extended if needed or reused if we can, making it faster to iterate
records basically.
Closes#1206Closes#447Closes#1117
Issues:
1. Rust floating point math fucntions are non deterministic.
2. SQLite have complex floating point display rules.
I changed rust functions to libm for math ops and implemented Display
trait for OwnedValue:Float. A lot of float formatting SQLite probably
inherits from C have to be handcrafted.
Closes#1208
This PR has two parts:
## 1. Make reading cells faster
My benchmark was simple, running `test_sequential_write` test and see
how long it takes. It is a nice benchmark because it will write a new
row and then `select * from t` for every write to check the contents.
## From:
```bash
cargo test test_sequential_write --release -- --nocapture --test-threads=1 7.21s user 0.34s system 88% cpu 8.527 total
```
### To:
```bash
cargo test test_sequential_write --release -- --nocapture --test-threads=1 3.14s user 0.31s system 82% cpu 4.161 total
```
## 2. Fix reading overflow pages.
The code to read overflow pages was wrong, `read_payload` would try to
read as many overflow pages as possible but if there weren't in cache
they would return `IO` and not read more pages after that. This PR makes
every read record to check if it needs to read from overflow pages and
if not, it will simply read from the current page.
Closes#1141
### The problem:
I often need to copy the output of an `Explain` statement to my
clipboard. Currently this is not possible because it currently will only
write to stdout.
All other limbo output, I am able to run `.output file` in the CLI, then
enter my query and in another tmux pane I simply `cat file | xclip -in
-selection clipboard`.
### The solution:
Expose a `statement.explain()` method that returns the query explanation
as a string. If the user uses something like `execute` instead of
prepare, it will default to `stdout` as expected, but this allows the
user to access the query plan on the prepared statement and do with it
what they please.
Closes#1166
Currently we can create two guards and unlock one of them and still have
two mutable references to the same data.
By being able to unlock only via guard we prevent unsafe scenarios.
Currently, the test below will pass and shows that we can hold two
mutable references to the same data. After this fix, this would result
in a deadlock (have to remove `lock.unlock()`)
```rust
#[test]
fn two_mutable_reference() {
let lock = SpinLock::new(42);
let guard = lock.lock();
lock.unlock();
let guard2 = lock.lock();
// two mutable references to same data
*guard = 10;
*guard2 = 20;
assert_eq!(*guard, 20);
assert_eq!(*guard2, 20);
}
```
Note: The javascript action failure is unrelated to this PR.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1194
OwnedValue has become a powerhouse of madness, mainly because I decided
to do it like that when I first introduced AggContext. I decided it was
enough and I introduced a `Register` struct that contains `OwnedValue`,
`Record` and `Aggregation`, this way we don't use `OwnedValue` for
everything make everyone's life harder.
This is the next step towards making ImmutableRecords the default
because I want to remove unnecessary allocations. Right now we clone
OwnedValues when we generate a record more than needed.
Closes#1188
OwnedValue has become a powerhouse of madness, mainly because I decided
to do it like that when I first introduced AggContext. I decided it was
enough and I introduced a `Register` struct that contains `OwnedValue`,
`Record` and `Aggregation`, this way we don't use `OwnedValue` for
everything make everyone's life harder.
This is the next step towards making ImmutableRecords the default
because I want to remove unnecessary allocations. Right now we clone
OwnedValues when we generate a record more than needed.
Fix memory hogging in MVCC scan cursor #1104
The current scan cursor loads all rowids at once, which blows up memory
on big tables.
Added BucketScanCursor that loads rowids in configurable batches -
better memory usage while keeping decent performance. It's a drop-in
replacement with the same interface.
Also included LazyScanCursor as an alternative that fetches one rowid at
a time, though it's less efficient due to log(n) skipmap lookups for
each row.
BucketScanCursor is the recommended approach for most use cases. WDYT?
Closes#1178
Currently we have a Record, which is a dumb vector of cloned values.
This is incredibly bad for performance as we do not want to clone
objects unless needed. Therefore, let's start by introducing this type
so that any record that has already been serialized will be returned
from btree in the format of a simple payload with reference to payload.
Closes#1176
This PR introduces structured fuzzing with
[libFuzzer](https://llvm.org/docs/LibFuzzer.html). The expression target
implementation is not complete, but already found a compatibility issue.
More fuzzing targets should be moved from `tests/fuzz` to `fuzz` and
benefit from more advanced fuzzing techniques.
- [x] Add fuzzing guide to `README.md`
- Install `cargo-fuzz`.
- Use the nightly version of cargo with the `fuzz` dev shell or use
rustup to switch versions.
- Run `cargo fuzz run ...`
- [x] Add all binary operations.
# 🐞 Bugs
Compatibility issue found when trying to `select ?` with a `NaN` value.
Sqlite returns `NULL`, while Limbo returns `NaN` (reasonable, but
incompatible).
```
thread '<unnamed>' panicked at fuzz_targets/expression.rs:130:5:
assertion `left == right` failed
left: Null
right: Float(NaN)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==59288== ERROR: libFuzzer: deadly signal
#0 0x00010564c0f0 in __sanitizer_print_stack_trace+0x28 (librustc-nightly_rt.asan.dylib:arm64+0x5c0f0)
#1 0x0001024e7b64 in fuzzer::PrintStackTrace()+0x30 (expression:arm64+0x101c53b64)
#2 0x0001024da650 in fuzzer::Fuzzer::CrashCallback()+0x60 (expression:arm64+0x101c46650)
#3 0x000195fa6de0 in _sigtramp+0x34 (libsystem_platform.dylib:arm64+0x3de0)
#4 0x000195f6ff6c in pthread_kill+0x11c (libsystem_pthread.dylib:arm64+0x6f6c)
#5 0x000195e7c904 in abort+0x7c (libsystem_c.dylib:arm64+0x79904)
#6 0x000102580990 in std::sys::pal::unix::abort_internal::hd275d720c474f43c+0x8 (expression:arm64+0x101cec990)
#7 0x000102621604 in std::process::abort::h62d9ecef2f17e944+0x8 (expression:arm64+0x101d8d604)
#8 0x0001024d93bc in libfuzzer_sys::initialize::_$u7b$$u7b$closure$u7d$$u7d$::h3b4b43a8f9432830+0xb8 (expression:arm64+0x101c453bc)
#9 0x000102577de0 in std::panicking::rust_panic_with_hook::h19683f6fd94fb24c+0x2b8 (expression:arm64+0x101ce3de0)
#10 0x000102577970 in std::panicking::begin_panic_handler::_$u7b$$u7b$closure$u7d$$u7d$::h4e98e5e8777eac5e+0x8c (expression:arm64+0x101ce3970)
#11 0x0001025754e0 in std::sys::backtrace::__rust_end_short_backtrace::h12a2d70ebc9128b2+0x8 (expression:arm64+0x101ce14e0)
#12 0x000102577628 in rust_begin_unwind+0x1c (expression:arm64+0x101ce3628)
#13 0x000102623340 in core::panicking::panic_fmt::h8c4d74b8e5179d60+0x1c (expression:arm64+0x101d8f340)
#14 0x0001026236cc in core::panicking::assert_failed_inner::he8fd1f85d57f866a+0x104 (expression:arm64+0x101d8f6cc)
#15 0x0001025c73dc in core::panicking::assert_failed::h3e7590b91d46bff9 panicking.rs:364
#16 0x000100930910 in expression::do_fuzz::hfcf5c5e5fde1a31c expression.rs:130
#17 0x0001009373fc in rust_fuzzer_test_input lib.rs:359
#18 0x0001024d2f34 in std::panicking::try::do_call::hce6ebc856827ae8b+0xc4 (expression:arm64+0x101c3ef34)
#19 0x0001024d9624 in __rust_try+0x18 (expression:arm64+0x101c45624)
#20 0x0001024d896c in LLVMFuzzerTestOneInput+0x16c (expression:arm64+0x101c4496c)
#21 0x0001024dc3cc in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long)+0x148 (expression:arm64+0x101c483cc)
#22 0x0001024db8dc in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*)+0x58 (expression:arm64+0x101c478dc)
#23 0x0001024dd920 in fuzzer::Fuzzer::MutateAndTestOne()+0x258 (expression:arm64+0x101c49920)
#24 0x0001024de908 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, std::__1::allocator<fuzzer::SizedFile>>&)+0x38c (expression:arm64+0x101c4a908)
#25 0x0001024fd120 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long))+0x1bac (expression:arm64+0x101c69120)
#26 0x00010250d884 in main+0x34 (expression:arm64+0x101c79884)
#27 0x000195bf0270 (<unknown module>)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 3 CopyPart-ShuffleBytes-CrossOver-; base unit: a36928cfe783d55be82d526168a2da57372fdfdc
0xff,0xfd,0xff,0x3f,0x87,0x0,0x6e,0x6f,0x77,0x48,0x48,0x48,0xff,0x48,0xff,0xff,0x5b,0xff,0x5b,
\377\375\377?\207\000nowHHH\377H\377\377[\377[
artifact_prefix='/Users/levy/Documents/limbo/fuzz/artifacts/expression/'; Test unit written to /Users/levy/Documents/limbo/fuzz/artifacts/expression/crash-63bfc8813b82bd8b97c557650289a6bc2c055ca5
Base64: //3/P4cAbm93SEhI/0j//1v/Ww==
────────────────────────────────────────────────────────────────────────────────
Failing input:
artifacts/expression/crash-63bfc8813b82bd8b97c557650289a6bc2c055ca5
Output of `std::fmt::Debug`:
Value(
Real(
NaN,
),
)
Reproduce with:
cargo fuzz run expression artifacts/expression/crash-63bfc8813b82bd8b97c557650289a6bc2c055ca5
Minimize test case with:
cargo fuzz tmin expression artifacts/expression/crash-63bfc8813b82bd8b97c557650289a6bc2c055ca5
────────────────────────────────────────────────────────────────────────────────
```
Reviewed-by: Preston Thorpe (@PThorpe92)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1116