Commit Graph

126 Commits

Author SHA1 Message Date
Nikita Sivukhin
91fcb67b06 rewrite grammar generator and add fuzz test for arithmetic expressions 2025-02-02 18:39:24 +04:00
Glauber Costa
a7cc367c1f implement pragma pragma_list
List all available pragmas (Except pragma_list)
2025-01-31 06:44:56 -05:00
Pekka Enberg
3a4cb34606 Merge 'Fix memory leaks, make extension types more efficient' from Preston Thorpe
I was baffled previously, because any time that `free` was called on a
type from an extension, it would hang even when I knew it wasn't in use
any longer, and hadn't been double free'd.
After #737 was merged, I tried it again and noticed that it would no
longer hang... but only for extensions that were staticly linked.
Then I realized that we are using a global allocator, that likely wasn't
getting used in the shared library that is built separately that won't
inherit from our global allocator in core, causing some symbol mismatch
and the subsequent hanging on calls to `free`.
This PR adds the global allocator to extensions behind a feature flag in
the macro that will prevent it from being used in `wasm` and staticly
linked environments where it would conflict with limbos normal global
allocator. This allows us to properly free the memory from returning
extension functions over FFI.
This PR also changes the Extension type to a union field so we can store
int + float values inline without boxing them.
any additional tips or thoughts anyone else has on improving this would
be appreciated 👍

Closes #803
2025-01-30 13:31:17 +02:00
Pekka Enberg
cfc585813b Merge 'implement sqlite_source_id function' from Glauber Costa
Closes #811
2025-01-29 09:45:00 +02:00
Glauber Costa
8f24d18ad8 implement sqlite_source_id function 2025-01-28 14:55:38 -05:00
Jussi Saurio
e01555467f Add quickcheck property tests for vector extension 2025-01-28 15:53:11 +02:00
Pekka Enberg
ee05ad172b core: Bundle vector extension by default 2025-01-28 14:24:09 +02:00
Pekka Enberg
9462426685 Vector extension functions
This patch adds some libSQL vector extension functions such as
`vector()` and `vector_distance_cos()`, which can be used for exact
nearest neighbor search as follows:

```
limbo> SELECT embedding, vector_distance_cos(embedding, '[9, 9, 9]')
   ...> FROM movies ORDER BY vector_distance_cos(embedding, '[9, 9, 9]');
[4, 5, 6]|0.013072490692138672
[1, 2, 3]|0.07417994737625122
```
2025-01-28 14:24:09 +02:00
PThorpe92
793cdf8bad Fix memory issues, make extension types more efficient 2025-01-27 22:30:31 -05:00
Pekka Enberg
e8600fa2a1 Merge branch 'main' into static 2025-01-27 09:49:34 +02:00
Pekka Enberg
0918fc40d4 bindings/go: Rename to Limbo
...we'll likely call this Turso eventually, but right now, let's keep
the code consistent.
2025-01-26 20:58:10 +02:00
Pekka Enberg
0d0906dce4 Merge 'simulator: implement --load and --watch flags' from Alperen Keleş
The current status of the PR is halfway. The new framing of simulation
runner where `setup_simulation` is separated from `run_simulation`
allows for injecting custom plans easily. The PR is currently missing
the functionality to update the `SimulatorEnv` ad hoc from the plan, as
the environment tables were typically created during the planning phase.
The next steps will be to implement a function `fn
mk_env(InteractionPlan, SimulatorEnv) -> SimulatorEnv`, add `--load`
flag to the CLI for loading a serialized plan file, making a
corresponding environment and running the simulation.
We can optionally combine this with a `--save` option, in which we keep
a seed-vault as part of limbo simulator, corresponding each seed with
its generated plan and save the time to regenerate existing seeds by
just loading them into memory. I am curious to hear thoughts on this?
Would the maintainers be open to adding such a seed-vault? Do you think
the saved time would be worth the complexity of the approach?

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #720
2025-01-26 08:52:58 +02:00
PThorpe92
4be1f9c3cc Begin work on Go bindings (purego) 2025-01-25 11:37:12 -05:00
PThorpe92
c5e60d8e08 Enable only uuid by default, change tests back to account for this 2025-01-21 10:20:01 -05:00
PThorpe92
f13d035965 Enable wasm to static link extensions 2025-01-21 09:36:49 -05:00
PThorpe92
3d188eba0f Enable staticly linking with builtin extensions 2025-01-21 09:32:43 -05:00
sonhmai
a090fb927a centralize Rust integration and regression tests 2025-01-21 15:41:09 +07:00
sonhmai
cb631dafdc feat: wire checkpoint to bytecode execution 2025-01-20 08:34:13 +07:00
sonhmai
6243ffbab4 add dev dependencies for testing wal_checkpoint 2025-01-20 08:34:13 +07:00
sonhmai
66d6291f32 add scaffolding for supporting wal checkpoint 2025-01-20 08:34:13 +07:00
Pekka Enberg
0abb917604 Limbo 0.0.13 2025-01-19 13:30:56 +02:00
alpaylan
e476b9f697 implement watch mode
- add `--watch` flag
- start saving seeds in persistent storage
- make a separate version of execution functions that use `vector of interaction` instead of `InteractionPlan`
2025-01-18 23:54:03 +03:00
Jussi Saurio
af039ffa6e Merge 'Initial support for aggregate functions in extensions' from Preston Thorpe
#708
This PR adds basic support for the following API for defining
Aggregates, and changes the existing API for scalars.
```rust
register_extension! {
    scalars: { Double },
    aggregates: { MedianState },
}

#[derive(ScalarDerive)]
struct Double;

impl Scalar for Double {
    fn name(&self) -> &'static str {
        "double"
    }
    fn call(&self, args: &[Value]) -> Value {
        if let Some(arg) = args.first() {
            match arg.value_type() {
                ValueType::Float => {
                    let val = arg.to_float().unwrap();
                    Value::from_float(val * 2.0)
                }
                ValueType::Integer => {
                    let val = arg.to_integer().unwrap();
                    Value::from_integer(val * 2)
                }
                _ => {
                    println!("arg: {:?}", arg);
                    Value::null()
                }
            }
        } else {
            Value::null()
        }
    }
}

#[derive(AggregateDerive)]
struct MedianState;

impl AggFunc for MedianState {
    type State = Vec<f64>;

    fn name(&self) -> &'static str {
        "median"
    }
    fn args(&self) -> i32 { 1 }

    fn step(state: &mut Self::State, args: &[Value]) {
        if let Some(val) = args.first().and_then(Value::to_float) {
            state.push(val);
        }
    }

    fn finalize(state: Self::State) -> Value {
        if state.is_empty() {
            return Value::null();
        }
        let mut sorted = state;
        sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
        let len = sorted.len();
        if len % 2 == 1 {
            Value::from_float(sorted[len / 2])
        } else {
            let mid1 = sorted[len / 2 - 1];
            let mid2 = sorted[len / 2];
            Value::from_float((mid1 + mid2) / 2.0)
        }
    }
}

```
I know it's a bit more verbose than the previous version, but I think in
the long run this will be more robust, and it matches the aggregate API
of implementing a trait on a struct that you derive the relevant trait
on.
Also this allows for better registration of functions, I think passing
in the struct identifiers just feels much better than the `"func_name"
=> function_ptr`

Closes #721
2025-01-18 11:07:06 +02:00
PThorpe92
fc82461eff Complete percentile extension, enable col+delimeter args 2025-01-17 21:15:09 -05:00
김선우
eaa8743c36 Nit 2025-01-18 09:16:09 +09:00
김선우
fcadc2f825 Add connect function for creating connections from limbo db 2025-01-18 09:09:36 +09:00
PThorpe92
5dfc3b8787 Create simple extension for testing aggregate functions, add tests 2025-01-17 14:30:12 -05:00
alpaylan
28cde537a8 this commit;
- makes interaction plans serializable
- fixes the shadowing bug where non-created tables were assumed to be created in the shadow tables map
- makes small changes to make clippy happy
- reorganizes simulation running flow to remove unnecessary plan regenerations while shrinking and double checking
2025-01-17 01:30:46 +03:00
psvri
e43271f53b Implement regexp extension 2025-01-16 23:15:59 +05:30
Pekka Enberg
e8420a3cb7 Update Cargo.lock 2025-01-16 14:42:07 +02:00
Pekka Enberg
93903555aa Rename limbo_extension crate to limbo_ext 2025-01-16 14:40:52 +02:00
PThorpe92
343ccb3f72 Replace declare_scalar_functions in extension API with proc macro 2025-01-14 11:49:57 -05:00
PThorpe92
9c208dc866 Add tests for first extension 2025-01-14 07:27:35 -05:00
PThorpe92
c565fba195 Adjust types in extension API 2025-01-14 07:20:49 -05:00
PThorpe92
3412a3d4c2 Rough design for extension api/draft extension 2025-01-14 07:20:48 -05:00
PThorpe92
0a10d893d9 Sketch out runtime extension loading 2025-01-14 07:18:07 -05:00
Jorge López
a16282ea62 core: remove nix as a dependency 2025-01-14 11:06:13 +01:00
Pekka Enberg
b6ae8990e3 Limbo 0.0.12 2025-01-14 11:39:17 +02:00
Pekka Enberg
d223c72d03 Revert "core: Previous commits didn't actually remove nix as dependency, so do that here"
This reverts commit cca3846f95, we need to
bring it back unfortunately.
2025-01-14 10:25:11 +02:00
Pekka Enberg
945a91dee5 Merge 'core: Consolidate libc implementations' from Jorge López Tello
This shaves off `nix` as a dependency in core, which was only being used
in the io_uring backend for the `fcntl` advisory record locking. Since a
previous PR made `rustix` a dep not only for Mac but for any Unix, and
`rustix` also has `fcntl`, `nix` becomes redundant.
Furthermore, it reduces `libc` usage across core. The only remaining
uses are:
```rust
io_uring::opcode::Readv::new(fd, iovec as *const iovec as *const libc::iovec, 1)
io_uring::opcode::Writev::new(fd, iovec as *const iovec as *const libc::iovec, 1)
```
These two are a little ugly, but sadly the `io_uring` crate requires
both opcodes to take a `libc::iovec` while it doesn't export the type.
See tokio-rs/io-uring#259 for a request to use `std::io::IoSlice` or to
export the type directly.
Apart from those two, there are no other usages of libc, so if those are
resolved, we could also drop the dependency on libc.

Closes #668
2025-01-14 10:07:20 +02:00
Pekka Enberg
af020c27d6 Initial take on Rust bindings
This implements libSQL compatible Rust API on top of Limbo's core. The
purpose of this is to allow libraries and apps that build on libSQL to
use Limbo.
2025-01-14 09:16:46 +02:00
Jorge López
cca3846f95 core: Previous commits didn't actually remove nix as dependency, so do that here 2025-01-13 21:15:36 +01:00
Pekka Enberg
a369329988 Merge 'Add OPFS support to Wasm bindings' from Elijah Morgan
Lots of cleanup still left to do. Draft PR for adding support for OPFS
for WASM build (add support for limbo in browser).
Overall explanation of the architecture: this follows the sqlite wasm
architecture for OPFS.
main <> (limbo-worker.js) limbo (VFS - opfs.js) <> opfs-sync-proxy
The main thread loads limbo-worker.js which bootstraps the opfs.js and
opfs-sync-proxy.js and then launches the limbo-wasm.js.
At that point it can be used with worker.postmessage and
worker.onmessage interactions from the main thread.
The VFS provided by opfs.js provides a sync API by offloading async
operations to opfs-sync-proxy.js. This is done through SharedArrayBuffer
and Atomic.wait() to make the actual async operations appear synchronous
for limbo.
resolves #531

Closes #594
2025-01-07 10:36:18 +02:00
Samyak S Sarnayak
c09a0bcbf3 Nicer parse errors using miette
I noticed that the parse errors were a bit hard to read - only the nearest token and the line/col offsets were printed.

I made a first attempt at improving the errors using [miette](https://github.com/zkat/miette).
- Added derive for `miette::Diagnostic` to both the parser's error type and LimboError.
- Added miette dependency to both sqlite3_parser and core. The `fancy` feature is only enabled for CLI.

Some future improvements that can be made further:
- Add spans to AST nodes so that errors can better point to the correct token. See upstream issue: https://github.com/gwenn/lemon-rs/issues/33
- Construct more errors with offset information. I noticed that most parser errors are constructed with `None` as the offset.

Comparisons.
Before:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T11:22:55Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error
Parse error: near "selet": syntax error at (1, 6)
```

After:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T12:25:52Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error

  × near "selet": syntax error at (1, 6)
   ╭────
 1 │ selet * from a
   ·     ▲
   ·     ╰── syntax error
   ╰────

```
2025-01-05 17:56:59 +05:30
김선우
370e1ca5c2 Add support Java bindings
This add support for Java bindings in the bindings/java directory.
2025-01-05 10:28:05 +02:00
psvri
1f21cf6a71 Feat: Import csv support 2025-01-03 15:20:22 +05:30
Elijah Morgan
058ca89561 feat add basic opfs support and tests 2025-01-01 10:31:26 -05:00
Pekka Enberg
c4b0eb398c Limbo 0.0.11 2024-12-31 10:43:24 +02:00
Jussi Saurio
8c9bd0deb9 chore: commit cargo lock changes from #553 2024-12-29 08:58:26 +02:00
Pekka Enberg
37e1f35df8 Fix Cargo.toml in macros crate 2024-12-25 11:54:16 +02:00