Hey! I've rebuilt the JSON path parser to make it easier to maintain and
catch more edge cases.
Main stuff:
- Removed pest dependency in favor of hand-written parser
- Better memory usage (pre-allocates space, use Cow)
- Handles quoted keys properly now
- Added tests for weird edge cases
- Added PPState (Shoutout ThePrimeagen)
Closes#970
Adds initial limited support for CTEs.
- No MATERIALIZED
- No RECURSIVE
- No named CTE columns
- Only SELECT statements supported inside CTE
Basically this kind of WITH clause can just be rewritten as a subquery,
so this PR adds some plumbing to rewrite them using the existing
subquery machinery.
It also introduces the concept of a `Scope` where a child query can
refer to its parent, useful for CTEs like:
```
do_execsql_test nested-subquery-cte {
with nested_sub as (
select concat(name, '!!!') as loud_hat
from products where name = 'hat'
),
sub as (
select upper(nested_sub.loud_hat) as loudest_hat from nested_sub
)
select sub.loudest_hat from sub;
} {HAT!!!}
```
I think we need to expand the use of `Scope` to all of our identifier
resolutions (currently we don't explicitly have logic for determining
what a given query can see), but I didn't want to bloat the PR too much.
Hence, this implementation is probably full of all sorts of bugs, but
I've added equivalent tests for ALL the existing subquery tests,
rewritten in CTE form.
Closes#920
I am on a bit of a mission to revisit a lot of the ref counting, this
was an easy first win.
It seems to be a linear path of function calls or hashmaps which can own
the completions directly, no cloning needed.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#912
This is an attempt to move towards #881. I am not sure this is the
direction you want to take. In any case, I thought I would take a crack
at converting `values` from `Record` to private and see how bad it would
be.
In the end, as you can see, it is not so bad. I think performance-wise
it shouldn't be a bad hit with Rust's zero-cost abstraction. Also,
during the process I noticed a couple improvements that could be made
here and there but I honestly wanted to start with something small
enough that wouldn't be too hard to review.
Anyway, let me know if this is really how you would like to proceed.
Closes#962
This PR aims to simplify text creation and to reduce allocation overhead
of creating a new OwnedValue::Text. Instead of creating Rc<String> every
time you need to create a text, you just pass the string slice and the
Rc<String> is created at the end. This change, at least on my machine,
has removed a lot of variability in the benchmarking performance, while
maintaining roughly the same performance.
Closes#961
Add fuzz test for string functions and fix 2 bugs in implementation of
`LTRIM/RTRIM/TRIM` and `QUOTE`:
- `QUOTE` needs to escape internal quote(`'`) symbols
- `LTRIM`/`RTRIM`/`TRIM` needs to check if they have additional argument
Closes#958
Fix codegen for `CAST` expression and also adjust implementation of
`CAST: Real -> Int` because `SQLite` use "closest integer between the
REAL value and zero" as a CAST result.
Closes#956
Add `COALESCE` function in fuzz test and fix bug (found by fuzzer with
`COALESCE`) related to the resolution of labels which needs to be
resolved to next emitted instruction.
Before, code assumed that no two labels will need to be resolved to next
emitted instruction. But this assumption is wrong (at least in current
codegen logic) for example for following expression `COALESCE(0,
COALESCE(0, 0))`. Here, both `COALESCE` functions will create label and
resolve it to next emitted instruction in the end of generation. So, in
this case one of the labels will be actually "orphaned" and never be
assigned any position in the emitted code.
Closes#955
(after fixing complicated b-tree bugs - I want to end my day with the
joy of fixing simple bugs)
This PR fix bug in emit (which was introduced after switch from
`HashMap` to `Vec`) and also align `CASE` codegen in case of `NULL`
result from `WHEN` clause with SQLite behaviour.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#953
This PR introduce simple fuzz test for BTree insertion algorithm and
fixes few bugs found by fuzzer
- BTree algorithm returned early although there were overflow pages on
stack and more rebalances were needed
- BTree balancing algorithm worked under assumption that single page
will be enough for rebalance - although this is not always true (if page
were tightly packed with relatively big cells, insertion of new very big
cell can require 3 split pages to distribute the content between them)
- `overflow_cells` wasn't cleared properly during rebalancing
- insertions of dividers to the parent node were implemented incorrectly
- `defragment_page` didn't reset
`PAGE_HEADER_OFFSET_FRAGMENTED_BYTES_COUNT` field which can lead to
suboptimal usage of pages
Closes#951
We expose Wal trait as public, but there are three types in its
signature that are private.
Notice I chose to expose limbo_core::result instead of
limbo_core::result::LimboResult, since LimboResult is the only type in
that module.
Closes#939
- right-nested expression can generate multiple labels which needs to be
resolved to next generated instruction
- for example, COALESCE(0, COALESCE(0, 1))
the boxings will continue until morale improves
<img width="621" alt="Screenshot 2025-02-09 at 14 16 21"
src="https://github.com/user-
attachments/assets/887d218d-3db7-4a64-ad81-b7431adb23bb" />
Closes#947