This PR reworks the unix I/O backend, removing runtime reference
counting/borrow checking and optimizing away the hashmap in favor of a
static array, with an unlikely fallback vec.
The only reason the fallback vec is there is because unlike the
`io_uring` module, we cannot simply index into the array with the fd as
the OS could theoretically give us a fd up to I believe 1024 so keeping
an array of that size for a few elements is unnecessary.
Closes#940
In order [experimentally
compile](https://github.com/DougAnderson444/wit-limbo) `limbo_core` to a
[wasm component](https://component-model.bytecodealliance.org/), limbo
needed to have no reliance on `js`, `js-sys`, `wasm-bindgen`, et al.
(for those who aren't familiar, there are many `wasm` runtimes and not
all of them play nice with `wasm-bindgen`)
This PR simply cleans up the dependencies, and puts them behind optional
flags and whatnot in order to enable this. Both `log` and `tracing` were
being used, so I reduced this only to `tracing`.
End result is limbo can be used like this:
https://github.com/DougAnderson444/wit-limbo
We can open a discussion on the possibilities that running limbo as a
wasm component can offer, including potentially using composable
components to implement the sqlite runtime extensions, as well as giving
us a clean interface for PlatformIO operations -- define them once,
implement many ways on various platforms. I'm new to limbo, but it looks
like current extension are Rust based deps and features flags, whereas
sqlite is runtime, right? What if limbo was runtime extensible too?
The WIT interface is largely sync (though I believe wasmtime has an
async feature), but in my limited exposure to limbo so far a lot of the
wasm seems sync already anyway. Again, topic for further discussion.
Suffice to say, aligning these deps in this way paves the road for
further experiments and possibilities.
Related: https://github.com/neilg63/julian_day_converter/pull/2
Related: https://github.com/tursodatabase/limbo/issues/950
Closes: https://github.com/tursodatabase/limbo/issues/950Closes#983
chrono has default features that are incompatible with some targets, such as non-js WebAssembly.
Removing the unused defaults allows limbo to compile to these targets
This PR introduce simple fuzz test for BTree insertion algorithm and
fixes few bugs found by fuzzer
- BTree algorithm returned early although there were overflow pages on
stack and more rebalances were needed
- BTree balancing algorithm worked under assumption that single page
will be enough for rebalance - although this is not always true (if page
were tightly packed with relatively big cells, insertion of new very big
cell can require 3 split pages to distribute the content between them)
- `overflow_cells` wasn't cleared properly during rebalancing
- insertions of dividers to the parent node were implemented incorrectly
- `defragment_page` didn't reset
`PAGE_HEADER_OFFSET_FRAGMENTED_BYTES_COUNT` field which can lead to
suboptimal usage of pages
Closes#951
Use knowledge of query plan to inform how much memory to initially
allocate for `ProgramBuilder` vectors
Some of them are exact, some are semi-random estimates
```sql
Prepare `SELECT 1`/Limbo/SELECT 1
time: [756.93 ns 758.11 ns 759.59 ns]
change: [-4.5974% -4.3153% -4.0393%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
time: [1.4739 µs 1.4769 µs 1.4800 µs]
change: [-7.9364% -7.7171% -7.4979%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...`
time: [3.7440 µs 3.7520 µs 3.7596 µs]
change: [-5.4627% -5.1578% -4.8445%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
```
Closes#899
This PR adds support for `json_set`.
There are three helper functions added:
1. `json_path_from_owned_value`, this function turns an `OwnedValue`
into a `JsonPath`.
2. `find_or_create_target`, this function is similar to `find_target`
with the added bonus of creating the target if it doesn't exist. There
is a caveat with this function and that is that it will create
objects/arrays as it goes, meaning if you send `{}` into it and try
getting the path `$.some.nested.array[123].field`, it will return
`{"some":{"nested":array:[]}}` since creation of `some`, `nested` and
`array` will succeed, but accessing element `123` will fail.
3. `create_and_mutate_json_by_path`, this function is very similar to
`mutate_json_by_path` but calls `find_or_create_target` instead of
`find_target`
Related to #127Closes#878
We really need to make the WAL lock less expensive, but switching to
`parking_lot` is anyway something we should do.
Before:
```
Execute `SELECT 1`/Limbo
time: [56.230 ns 56.463 ns 56.688 ns]
```
After:
```
Execute `SELECT 1`/Limbo
time: [52.003 ns 52.132 ns 52.287 ns]
```
I was baffled previously, because any time that `free` was called on a
type from an extension, it would hang even when I knew it wasn't in use
any longer, and hadn't been double free'd.
After #737 was merged, I tried it again and noticed that it would no
longer hang... but only for extensions that were staticly linked.
Then I realized that we are using a global allocator, that likely wasn't
getting used in the shared library that is built separately that won't
inherit from our global allocator in core, causing some symbol mismatch
and the subsequent hanging on calls to `free`.
This PR adds the global allocator to extensions behind a feature flag in
the macro that will prevent it from being used in `wasm` and staticly
linked environments where it would conflict with limbos normal global
allocator. This allows us to properly free the memory from returning
extension functions over FFI.
This PR also changes the Extension type to a union field so we can store
int + float values inline without boxing them.
any additional tips or thoughts anyone else has on improving this would
be appreciated 👍Closes#803
This patch adds some libSQL vector extension functions such as
`vector()` and `vector_distance_cos()`, which can be used for exact
nearest neighbor search as follows:
```
limbo> SELECT embedding, vector_distance_cos(embedding, '[9, 9, 9]')
...> FROM movies ORDER BY vector_distance_cos(embedding, '[9, 9, 9]');
[4, 5, 6]|0.013072490692138672
[1, 2, 3]|0.07417994737625122
```
The current status of the PR is halfway. The new framing of simulation
runner where `setup_simulation` is separated from `run_simulation`
allows for injecting custom plans easily. The PR is currently missing
the functionality to update the `SimulatorEnv` ad hoc from the plan, as
the environment tables were typically created during the planning phase.
The next steps will be to implement a function `fn
mk_env(InteractionPlan, SimulatorEnv) -> SimulatorEnv`, add `--load`
flag to the CLI for loading a serialized plan file, making a
corresponding environment and running the simulation.
We can optionally combine this with a `--save` option, in which we keep
a seed-vault as part of limbo simulator, corresponding each seed with
its generated plan and save the time to regenerate existing seeds by
just loading them into memory. I am curious to hear thoughts on this?
Would the maintainers be open to adding such a seed-vault? Do you think
the saved time would be worth the complexity of the approach?
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#720