Commit Graph

1683 Commits

Author SHA1 Message Date
Pekka Enberg
2cd3c47691 Merge 'Nicer parse errors using miette' from Samyak Sarnayak
I noticed that the parse errors were a bit hard to read - only the
nearest token and the line/col offsets were printed.
I made a first attempt at improving the errors using
[miette](https://github.com/zkat/miette).
- Added derive for `miette::Diagnostic` to both the parser's error type
and LimboError.
- Added miette dependency to both sqlite3_parser and core. The `fancy`
feature is only enabled for the CLI. So the overhead on the libraries
(core, parser) should be minimal.
Some further improvements that can be made:
- Add spans to AST nodes so that errors can better point to the correct
token. See upstream issue: https://github.com/gwenn/lemon-rs/issues/33
- Construct more errors with offset information. I noticed that most
parser errors are constructed with `None` as the offset.
- The messages are a bit redundant (example "syntax error at (1, 6)").
This can be improved.
Comparisons.
Before:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T11:22:55Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error
Parse error: near "selet": syntax error at (1, 6)
```
<img width="969" alt="image" src="https://github.com/user-
attachments/assets/82651a77-f5ac-4eee-b208-88c6ea7fc9b7" />
After:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T12:25:52Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error

  × near "selet": syntax error at (1, 6)
   ╭────
 1 │ selet * from a
   ·     ▲
   ·     ╰── syntax error
   ╰────

```
<img width="980" alt="image" src="https://github.com/user-
attachments/assets/747a90e5-5085-41f9-b0fe-25864179ca35" />
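To make the "offset information" point concrete, here is a minimal std-only sketch of turning a byte offset into a 1-based (line, column) pair and drawing a caret under it. This is not miette's API and not the code from this PR; `line_col` and `render_error` are hypothetical names used only for illustration, and the plain `^` rendering is a simplification of the fancy output shown above.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut col = 1;
    for (idx, ch) in source.char_indices() {
        if idx >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

/// Render the offending line with a caret under the reported column.
/// Assumes ASCII input so that one character occupies one terminal cell.
fn render_error(source: &str, offset: usize, message: &str) -> String {
    let (line, col) = line_col(source, offset);
    let text = source.lines().nth(line - 1).unwrap_or("");
    let gutter = format!("{line} | ");
    let pad = " ".repeat(gutter.len() + col - 1);
    format!("error: {message} at ({line}, {col})\n{gutter}{text}\n{pad}^ {message}")
}
```

Calling `render_error("selet * from a", 5, "syntax error")` reports position (1, 6) and places the caret under the character at byte offset 5. An error constructed with `None` as the offset has nothing to feed into a function like this, which is why more offsets are needed.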

Closes #618
2025-01-05 21:09:52 +02:00
Pekka Enberg
ec9031dcf8 Improve JavaScript benchmarks 2025-01-05 20:43:55 +02:00
Pekka Enberg
daee7f8458 s/RowResult/StepResult/ 2025-01-05 20:24:26 +02:00
Pekka Enberg
5681603750 Merge 'Make iterate() lazily evaluated on wasm' from Diego Reis
#514
Introduces a new feature for lazy evaluation in the
`Statement.raw().iterate()` method and includes related changes in both
the test and implementation files. The most important changes include
adding a test case for lazy evaluation, creating a `RowIterator` struct,
and modifying the `iterate` method to use this new struct.
Everything seems to work fine, but suggestions on code improvements and
test use cases are welcome.
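The idea behind a lazily evaluated `iterate()` can be sketched in plain Rust as follows. This is an illustration under stated assumptions, not the wasm binding code: `FakeStatement` and its `step` method are hypothetical stand-ins for a prepared statement, and only the `RowIterator` name comes from the description above.

```rust
/// Hypothetical stand-in for a prepared statement: each call to `step`
/// produces at most one row (here just a row id) and counts invocations.
struct FakeStatement {
    next_id: u32,
    last_id: u32,
    steps: u32,
}

impl FakeStatement {
    fn new(rows: u32) -> Self {
        FakeStatement { next_id: 1, last_id: rows, steps: 0 }
    }

    fn step(&mut self) -> Option<u32> {
        self.steps += 1;
        if self.next_id > self.last_id {
            return None;
        }
        let id = self.next_id;
        self.next_id += 1;
        Some(id)
    }
}

/// Pulls one row from the statement per `next()` call instead of
/// collecting every row up front.
struct RowIterator<'a> {
    stmt: &'a mut FakeStatement,
}

impl<'a> Iterator for RowIterator<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.stmt.step()
    }
}
```

Taking two rows from an iterator over a 1000-row statement, for example `RowIterator { stmt: &mut stmt }.take(2).collect::<Vec<_>>()`, steps the statement only twice. That property is what a lazy-evaluation test case would want to check.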

Closes #527
2025-01-05 20:23:06 +02:00
Pekka Enberg
ba28999d05 Merge 'Add partial support for datetime() function' from Preston Thorpe
This PR adds the `datetime` function, with all the modifier support that
date/time currently have, and the `julianday` function, as well as
some additional modifiers for date/time/datetime.
There are a couple of considerations here. I left a couple of comments, but
essentially more work will be needed to track
the state of the expression while modifiers are applied, in order to
handle a bunch of edge cases: re-applying the same timezone modifier
to itself, converting an integer that is automatically assumed to be
a julianday into an epoch, or `ceiling`/`floor`, which will determine
the relative addition of time in cases like
```
2024-01-31 +1 month = 2024-03-02
```
which was painful enough to get working to begin with.
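As an illustration of why `2024-01-31 +1 month` lands on `2024-03-02`, here is a minimal sketch of the "add the months, then roll the overflowing day forward" rule. It is not the code from this PR; `add_months_overflowing` and its helpers are hypothetical names, and the sketch only handles the date part.

```rust
fn is_leap(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        _ => if is_leap(year) { 29 } else { 28 },
    }
}

/// Add `months` to a calendar date. If the resulting day does not exist
/// in the target month (for example February 31), carry the excess days
/// into the following month instead of clamping.
fn add_months_overflowing(year: i32, month: u32, day: u32, months: i32) -> (i32, u32, u32) {
    // Work in "months since year 0" so negative offsets also behave.
    let total = year * 12 + (month as i32 - 1) + months;
    let mut y = total.div_euclid(12);
    let mut m = total.rem_euclid(12) as u32 + 1;
    let mut d = day;
    while d > days_in_month(y, m) {
        d -= days_in_month(y, m);
        m += 1;
        if m > 12 {
            m = 1;
            y += 1;
        }
    }
    (y, m, d)
}
```

With this rule, `add_months_overflowing(2024, 1, 31, 1)` first becomes the nonexistent February 31, and since February 2024 has 29 days the two excess days carry over to give March 2.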
I couldn't get the `julianday_converter` library to produce the exact same
float precision as sqlite, so a function is included that matches their
output; for some reason, floating point math + `.floor()` gives the
correct result. They seem to 'round' to 8 decimal places, and I was able
to get this to work with the same output as sqlite, except in cases like
`2234.5`, in which case we return `2234.5000000` because of the `fmt`
precision:
```rust
pub fn exec_julianday(time_value: &OwnedValue) -> Result<String> {
    let dt = parse_naive_date_time(time_value);
    match dt {
        // if we did something heinous like: parse::<f64>().unwrap().to_string()
        // that would solve the precision issue, but dear lord...
        Some(dt) => Ok(format!("{:.1$}", to_julian_day_exact(&dt), 8)),
        None => Ok(String::new()),
    }
}
```
Suggestions would be appreciated on the float precision issue.
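One possible direction for the precision question, offered only as a sketch: keep the Julian day as an integer number of milliseconds and convert to a float once, at formatting time. The code below uses a common integer formulation for Gregorian dates with positive years. It is an assumption that this fits the PR, it is not the `to_julian_day_exact` function referenced above, and `julian_day_millis` and `julian_day_string` are hypothetical names.

```rust
/// Julian day expressed as whole milliseconds for a Gregorian UTC
/// timestamp (positive years only). Integer arithmetic keeps the date
/// part exact.
fn julian_day_millis(year: i64, month: i64, day: i64, hour: i64, minute: i64, second: i64) -> i64 {
    // Treat January and February as months 13 and 14 of the previous year.
    let (y, m) = if month <= 2 { (year - 1, month + 12) } else { (year, month) };
    let a = y / 100;
    let b = 2 - a + a / 4; // Gregorian century correction
    let x1 = 36525 * (y + 4716) / 100;
    let x2 = 306001 * (m + 1) / 10000;
    // (x1 + x2 + day + b - 1524.5) days, doubled to stay in integers.
    let half_days = 2 * (x1 + x2 + day + b) - 3049;
    half_days * 43_200_000 + (hour * 3600 + minute * 60 + second) * 1000
}

/// Format a millisecond Julian day with 8 decimal places.
fn julian_day_string(ms: i64) -> String {
    format!("{:.8}", ms as f64 / 86_400_000.0)
}
```

For 2000-01-01 12:00:00 this yields `2451545.00000000`. Note that it always prints all 8 decimals, so it does not by itself resolve the trailing-zero difference described above.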

Reviewed-by: Sonny <14060682+sonhmai@users.noreply.github.com>

Closes #600
2025-01-05 20:13:13 +02:00
Pekka Enberg
651442d008 Merge 'Add skeleton code for implementing java bindings in jdbc style ' from Kim Seon Woo
## Purpose of the PR
- Add minimal template code that provides Limbo features.
## Changes
- Added `DB` which is an interface to DB.
- Added `LimboDB` which is a thin wrapper around native methods provided
using jni.
## TODO
- Incrementally update the code to support jdbc. Refer to
[sqlite-jdbc](https://github.com/xerial/sqlite-jdbc).
## Reference
- https://github.com/tursodatabase/limbo/issues/615

Closes #619
2025-01-05 20:12:31 +02:00
Pekka Enberg
80de7f35e7 Merge 'Improve CONTRIBUTING on pull requests' from Pekka Enberg
These are not hard rules in the project, but describe what the ideal
pull request looks like.

Closes #617
2025-01-05 20:11:58 +02:00
Pekka Enberg
4a5b6b43bd Merge 'Fix quote escape in literals' from Vrishabh
Previously we were not escaping the quotes properly in string literals,
as shown below. This PR fixes that.
limbo output without this PR
```
limbo> select '''a';
''a
limbo>  explain select '''a';
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     4     0                    0   Start at 4
1     String8            0     1     0     ''a            0   r[1]='''a'
2     ResultRow          1     1     0                    0   output=r[1]
3     Halt               0     0     0                    0
4     Transaction        0     0     0                    0
5     Goto               0     1     0                    0
```
sqlite3 output
```
sqlite> select '''a';
'a
sqlite> explain select '''a';
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     4     0                    0
1     String8        0     1     0     'a             0
2     ResultRow      1     1     0                    0
3     Halt           0     0     0                    0
4     Goto           0     1     0                    0
```
limbo output with this PR
```
limbo> select '''a';
'a
limbo> explain select '''a';
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     4     0                    0   Start at 4
1     String8            0     1     0     'a             0   r[1]=''a'
2     ResultRow          1     1     0                    0   output=r[1]
3     Halt               0     0     0                    0
4     Transaction        0     0     0                    0
5     Goto               0     1     0                    0
```
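The rule being fixed is that inside a single-quoted SQL string literal, a doubled quote `''` stands for one literal quote. A minimal sketch of that rule, assuming the input is the raw literal text including its outer quotes (`unescape_sql_literal` is a hypothetical name, not the function added in this PR, and it does not validate the literal):

```rust
/// Strip the outer single quotes from a SQL string literal and collapse
/// each doubled quote inside it into one quote.
/// Returns `None` if the text is not wrapped in single quotes.
fn unescape_sql_literal(literal: &str) -> Option<String> {
    let inner = literal.strip_prefix('\'')?.strip_suffix('\'')?;
    Some(inner.replace("''", "'"))
}
```

For the literal `'''a'` from the transcripts above, this returns `'a`, matching the sqlite3 output.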

Closes #616
2025-01-05 20:11:48 +02:00
Pekka Enberg
92ef68d627 Merge 'refactor: simplify database header write logic' from Ziyak Jehangir
- Removed unnecessary clones
- Got rid of unneeded scoping in read_complete callback
- Use more rusty syntax e.g. std::cell::RefCell::borrow_mut(&buffer) ->
buffer.borrow_mut()
- Changed write_page.unwrap() to write_page()? to match run_once and
read_page()
No functional change.

Reviewed-by: Jussi Saurio <kacperoza@gmail.com>

Closes #608
2025-01-05 20:11:37 +02:00
Pekka Enberg
745c32e335 Merge 'Sqlite3 parser perf improvements' from Jussi Saurio
Manually vendored in some changes from
[lemon-rs](https://github.com/gwenn/lemon-rs), including a merged change from
@krishvishal and [an unmerged PR](https://github.com/gwenn/lemon-rs/pull/81)
from user ignatz that boxes Limit. I also boxed `OneSelect`
because it also improved perf in the benchmarks. These changes give 40-50%
more throughput on our existing, admittedly simple, benchmarks. Added a
new, more complex prepare benchmark that includes group by and having as
well, which is also 42% faster with the new code.
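One plausible reason boxing helps, sketched under assumptions: an enum is as large as its largest variant, so boxing a big, rarely used variant shrinks every value of that enum that gets moved around. The types below are hypothetical stand-ins, not the real AST definitions, and the commit itself does not state this as the cause of the speedup.

```rust
use std::mem::size_of;

/// Hypothetical large payload standing in for a big AST node.
struct Limit {
    expr: [u64; 8],
    offset: [u64; 8],
}

/// Every value is as large as the biggest variant.
enum InlineNode {
    Literal(u64),
    Limit(Limit),
}

/// The big variant lives on the heap; the enum holds only a pointer.
enum BoxedNode {
    Literal(u64),
    Limit(Box<Limit>),
}

/// Returns (size of the inline enum, size of the boxed enum) in bytes.
fn node_sizes() -> (usize, usize) {
    (size_of::<InlineNode>(), size_of::<BoxedNode>())
}
```

On a typical 64-bit target the inline enum is well over 128 bytes while the boxed one is about the size of two machine words, at the cost of one heap allocation when the boxed variant is constructed.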
**Runs on my local machine:**
```
main:

limbo/Prepare statement: 'SELECT 1'
                        time:   [1.2902 µs 1.2927 µs 1.2958 µs]
                        thrpt:  [771.73 Kelem/s 773.56 Kelem/s 775.07 Kelem/s]
                 change:
                        time:   [+0.2770% +0.6013% +0.9243%] (p = 0.00 < 0.05)
                        thrpt:  [-0.9158% -0.5977% -0.2762%]
limbo/Prepare statement: 'SELECT * FROM users LIMIT 1'
                        time:   [2.4885 µs 2.4927 µs 2.4971 µs]
                        thrpt:  [400.47 Kelem/s 401.18 Kelem/s 401.84 Kelem/s]
                 change:
                        time:   [+1.2859% +1.6970% +2.0993%] (p = 0.00 < 0.05)
                        thrpt:  [-2.0561% -1.6687% -1.2696%]
limbo/Prepare statement: 'SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1)...
                        time:   [5.6867 µs 5.6994 µs 5.7164 µs]
                        thrpt:  [174.93 Kelem/s 175.46 Kelem/s 175.85 Kelem/s]
                 change:
                        time:   [+16.921% +17.332% +17.765%] (p = 0.00 < 0.05)
                        thrpt:  [-15.085% -14.772% -14.472%]

this branch:

limbo/Prepare statement: 'SELECT 1'
                        time:   [861.48 ns 862.60 ns 863.79 ns]
                        thrpt:  [1.1577 Melem/s 1.1593 Melem/s 1.1608 Melem/s]
                 change:
                        time:   [-33.293% -33.042% -32.754%] (p = 0.00 < 0.05)
                        thrpt:  [+48.709% +49.347% +49.909%]
                        Performance has improved.
limbo/Prepare statement: 'SELECT * FROM users LIMIT 1'
                        time:   [1.6080 µs 1.6106 µs 1.6140 µs]
                        thrpt:  [619.58 Kelem/s 620.87 Kelem/s 621.88 Kelem/s]
                 change:
                        time:   [-35.838% -35.611% -35.380%] (p = 0.00 < 0.05)
                        thrpt:  [+54.750% +55.305% +55.857%]
                        Performance has improved.
Benchmarking limbo/Prepare statement: 'SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1)...: Collecting 100 samples in estimated 5.0125 s (1.
limbo/Prepare statement: 'SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1)...
                        time:   [4.0161 µs 4.0301 µs 4.0473 µs]
                        thrpt:  [247.08 Kelem/s 248.13 Kelem/s 249.00 Kelem/s]
                 change:
                        time:   [-29.791% -29.596% -29.399%] (p = 0.00 < 0.05)
                        thrpt:  [+41.642% +42.038% +42.431%]
                        Performance has improved.
```
**Runs in CI:**
```
most recent commit on main:

limbo/Prepare statement: 'SELECT 1'
                        time:   [2.7085 µs 2.7113 µs 2.7153 µs]
                        thrpt:  [368.28 Kelem/s 368.83 Kelem/s 369.21 Kelem/s]
limbo/Prepare statement: 'SELECT * FROM users LIMIT 1'
                        time:   [4.8688 µs 4.8713 µs 4.8741 µs]
                        thrpt:  [205.17 Kelem/s 205.29 Kelem/s 205.39 Kelem/s]

this branch:

limbo/Prepare statement: 'SELECT 1'
                        time:   [1.9278 µs 1.9329 µs 1.9405 µs]
                        thrpt:  [515.33 Kelem/s 517.35 Kelem/s 518.73 Kelem/s]
limbo/Prepare statement: 'SELECT * FROM users LIMIT 1'
                        time:   [3.5708 µs 3.5 µs 3.5794 µs]
                        thrpt:  [279.38 Kelem/s 279.75 Kelem/s 280.05 Kelem/s]
```
**Discussion:**
Generally I think we should probably just, philosophically, hard fork
this vendored code and start making whatever modifications we want to
it... thoughts?
Also I guess there's a way to add a co-authored by XXX to these commits
so that they don't show up under my name only, because I didn't write
most of it.

Closes #620
2025-01-05 18:03:58 +02:00
Samyak S Sarnayak
c09a0bcbf3 Nicer parse errors using miette
I noticed that the parse errors were a bit hard to read - only the nearest token and the line/col offsets were printed.

I made a first attempt at improving the errors using [miette](https://github.com/zkat/miette).
- Added derive for `miette::Diagnostic` to both the parser's error type and LimboError.
- Added miette dependency to both sqlite3_parser and core. The `fancy` feature is only enabled for CLI.

Some further improvements that can be made:
- Add spans to AST nodes so that errors can better point to the correct token. See upstream issue: https://github.com/gwenn/lemon-rs/issues/33
- Construct more errors with offset information. I noticed that most parser errors are constructed with `None` as the offset.

Comparisons.
Before:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T11:22:55Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error
Parse error: near "selet": syntax error at (1, 6)
```

After:
```
❯ cargo run --package limbo --bin limbo database.db --output-mode pretty
...
limbo> selet * from a;
[2025-01-05T12:25:52Z ERROR sqlite3Parser] near "Token([115, 101, 108, 101, 116])": syntax error

  × near "selet": syntax error at (1, 6)
   ╭────
 1 │ selet * from a
   ·     ▲
   ·     ╰── syntax error
   ╰────

```
2025-01-05 17:56:59 +05:30
Jussi Saurio
f0b3bac435 add new more complex benchmark entry for preparing statement 2025-01-05 13:51:56 +02:00
Jussi Saurio
f434b24e63 Fix limbo/core to work with new boxed ast types 2025-01-05 13:51:34 +02:00
Jussi Saurio
97eae13d0a boxed limit (by ignatz) 2025-01-05 13:51:02 +02:00
Jussi Saurio
f5540e9602 boxed select and selectbody (by gwenn and jussisaurio) 2025-01-05 13:50:32 +02:00
Jussi Saurio
d35eadb22c preallocate capacity for yystack (by krishvishal) 2025-01-05 13:46:30 +02:00
psvri
9340c8f0b1 Change sanitize_string comments to doc string 2025-01-05 17:16:18 +05:30
김선우
038ea16d75 Add TODO comments for deprecation. 2025-01-05 20:26:21 +09:00
김선우
8e110da9c9 Add wrapper classes around native methods that Limbo will provide 2025-01-05 20:21:35 +09:00
psvri
a11f4b2b10 Refactor escape string literal logic to a function 2025-01-05 14:21:11 +05:30
Pekka Enberg
fdbf62d5b3 Merge 'Add support for Java bindings' from Kim Seon Woo
Purpose of this PR

- Add support for Java (as Java has an extensive community)
- Enable Limbo to be provided as a Java library in the future.

Changes

- Added `bindings/java` directory.
- Created `src` package for Java (Gradle) and `rs_src` for Rust JNI
code.
- Implemented basic functionality to gather feedback and proceed with
further development.
  - Some features just print out the result for testing purposes.

Future Work

- Integrate CI to publish the library.
- Enhance error handling mechanisms.
- Implement additional features and functionality.
- Add test code after we decide on which features to provide

Issue

https://github.com/tursodatabase/limbo/issues/615

Closes #613
2025-01-05 10:28:35 +02:00
김선우
370e1ca5c2 Add support Java bindings
This adds support for Java bindings in the bindings/java directory.
2025-01-05 10:28:05 +02:00
Pekka Enberg
1c2e074c93 Merge 'Add SQLite installation instructions for Windows to contrib' from dkaluza
Closes #614
2025-01-05 10:04:10 +02:00
Pekka Enberg
8617b1f866 Improve CONTRIBUTING on pull requests
These are not hard rules in the project, but describe what the ideal
pull request looks like.
2025-01-05 10:01:07 +02:00
psvri
2d84956fda Fix quote escape in literals 2025-01-05 01:35:29 +05:30
Daniel Kaluza
75c89ed08e Add SQLite installation instructions for Windows to contrib 2025-01-04 19:10:41 +01:00
Pekka Enberg
9f3e064bcf Merge 'Cleanup emitter some more' from Jussi Saurio
No functional changes, just move almost everything out of `emitter.rs`
into smaller modules with more distinct responsibilities. Also, from
`expr.rs`, move `translate_aggregation` into `aggregation.rs` and
`translate_aggregation_groupby` into `group_by.rs`

Closes #610
2025-01-04 17:48:35 +02:00
Pekka Enberg
81b2e18520 Merge 'Add tests prerequisites to README and CONTRIBUTING' from dkaluza
* add description of workaround for running coverage with Rust 1.83

Reviewed-by: Jussi Saurio <kacperoza@gmail.com>

Closes #611
2025-01-04 17:47:25 +02:00
Pekka Enberg
5c52c8b1e9 Merge 'Fix integer overflow output to be same as sqlite3' from Vrishabh
In sqlite3, before an arithmetic operation is performed, it first checks
whether the operation would overflow. If it would not, the integer
operation is done directly; if it would, the arguments are converted to
floats and the operation is done on those, as
[per code](https://github.com/sqlite/sqlite/blob/ded37f337b7b2e916657a83732aaec40eb146282/src/vdbe.c#L1875).
I have implemented the same behaviour for limbo.
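The check-then-fall-back behaviour can be sketched with Rust's checked integer arithmetic. This is a minimal illustration, not the code from this PR; the `Value` enum and the `add`/`mul` functions are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Integer(i64),
    Float(f64),
}

/// Integer addition that falls back to floating point on overflow.
fn add(lhs: i64, rhs: i64) -> Value {
    match lhs.checked_add(rhs) {
        Some(sum) => Value::Integer(sum),
        None => Value::Float(lhs as f64 + rhs as f64),
    }
}

/// Integer multiplication with the same fallback.
fn mul(lhs: i64, rhs: i64) -> Value {
    match lhs.checked_mul(rhs) {
        Some(product) => Value::Integer(product),
        None => Value::Float(lhs as f64 * rhs as f64),
    }
}
```

For example, `add(1, 2)` stays an integer, while `add(i64::MAX, 1)` cannot be represented as an `i64` and is returned as the float 9223372036854775808.0.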

Closes #612
2025-01-04 17:46:48 +02:00
PThorpe92
ca428b3dda Julianday function and additional tests/comments 2025-01-04 10:42:34 -05:00
PThorpe92
9a635be7b8 Add tests for new modifiers and datetime func 2025-01-04 08:41:05 -05:00
PThorpe92
7c4a780cc2 Add DateTime func and support more modifiers 2025-01-04 08:41:03 -05:00
Daniel Kaluza
4c1e8038b3 Add tests prerequisites to README and CONTRIBUTING 2025-01-04 13:59:47 +01:00
Jussi Saurio
9a8156753e core/translate: break emitter.rs into smaller modules 2025-01-04 14:52:46 +02:00
psvri
18137c932e Fix integer overflow output to be same result as sqlite3 2025-01-04 18:14:09 +05:30
Pekka Enberg
b18abba118 Merge 'Auto-create index in CREATE TABLE when necessary' from Jussi Saurio
Closes #448
Adds support for:
- Automatically creating index on the PRIMARY KEY if the pk is not a
rowid alias
- Parsing the automatically created index into memory, for use in
queries
    * `testing/testing_norowidalias.db` now uses the PK indexes and some
tests were failing -- looks like taking the index into use revealed some
bugs in our codegen :) I fixed those in later commits.
Does not add support for:
- Inserting to the index during writes to the table

Closes #588
2025-01-04 14:10:44 +02:00
Jussi Saurio
1b61749c0f feat/core/translate: create automatic index in CREATE TABLE when necessary 2025-01-04 13:54:44 +02:00
Pekka Enberg
fc60e544af Merge 'Fix arithmetic operations for text values' from Vrishabh
We had not implemented arithmetic operations for text values. This PR
implements them and aligns the behavior with sqlite3.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #605
2025-01-04 13:40:03 +02:00
Pekka Enberg
bae452f6b0 Merge 'sqlite3: Add in-memory support to sqlite3_open()' from Pekka Enberg
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #606
2025-01-04 13:39:58 +02:00
Pekka Enberg
0238eaf95d Merge 'Emitter cleanup part 2' from Jussi Saurio
Changes:
---
Instead of passing around:
    1. `SymbolTable` (for resolving functions dynamically provided by
extensions), and
    2. `precomputed_exprs_to_registers` (for resolving registers
containing results from already-computed expressions),
add `Resolver` struct to `TranslateCtx` that encapsulates both.
This introduces some lifetime annotation spam unfortunately, but maybe
we can also migrate to using references instead of cloning elsewhere as
well, since we generally do a lot of copying of expressions right now
for convenience.
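A rough sketch of the shape this describes, to show where the lifetime annotations come from. Only the names `SymbolTable`, `precomputed_exprs_to_registers`, `Resolver` and `TranslateCtx` are taken from the description above; the fields, methods and the use of strings as expression keys are simplifying assumptions, not the real definitions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in: maps extension function names to their arity.
struct SymbolTable {
    functions: HashMap<String, usize>,
}

/// Bundles the two lookups that were previously passed around separately.
struct Resolver<'a> {
    symbol_table: &'a SymbolTable,
    /// (expression text, register holding its already-computed result)
    precomputed_exprs_to_registers: Vec<(&'a str, usize)>,
}

impl<'a> Resolver<'a> {
    fn resolve_function(&self, name: &str) -> Option<usize> {
        self.symbol_table.functions.get(name).copied()
    }

    fn resolve_cached_expr_reg(&self, expr: &str) -> Option<usize> {
        self.precomputed_exprs_to_registers
            .iter()
            .find(|(e, _)| *e == expr)
            .map(|(_, reg)| *reg)
    }
}

/// The translation context owns the resolver, so the borrow's lifetime
/// has to appear on the context type as well.
struct TranslateCtx<'a> {
    resolver: Resolver<'a>,
    next_register: usize,
}
```

Because `Resolver` borrows rather than clones, every type that embeds it carries the `'a` parameter, which is the annotation spam mentioned above.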
---
Use way fewer arguments to functions - mainly just passing `program`,
`t_ctx` and `plan` around.

Closes #609
2025-01-04 13:27:38 +02:00
Jussi Saurio
e31317fbb5 emitter.rs: use way less arguments to functions 2025-01-04 12:59:30 +02:00
Jussi Saurio
d2b73e8492 emitter.rs: make t_ctx always be the second argument to any functions 2025-01-04 12:32:50 +02:00
Jussi Saurio
d1f74fa3cb Emitter cleanup part 2: add Resolver 2025-01-04 12:23:19 +02:00
Pekka Enberg
08844b80bb Merge 'Code cleanups to emitter' from Jussi Saurio
I plan to do a series of cleanups to `emitter.rs` which is a chonker
module right now. This first one is just making some variable namings
more consistent and refactoring some metadata fields to function-local
variables

Closes #604
2025-01-04 11:03:52 +02:00
Ziyak Jehangir
4f119f4b95 refactor: simplify database header write logic 2025-01-04 14:31:23 +05:30
Pekka Enberg
276819369c sqlite3: Add in-memory support to sqlite3_open() 2025-01-04 10:58:51 +02:00
Jussi Saurio
23f1858239 translatectx: consistent naming 2025-01-04 10:39:32 +02:00
Jussi Saurio
1a01487872 left join metadata: consistent naming 2025-01-04 10:39:32 +02:00
Jussi Saurio
2f129402e8 sorter data register: consistent naming 2025-01-04 10:39:32 +02:00
Jussi Saurio
0c572cda3c more consistent function naming 2025-01-04 10:39:32 +02:00