`sqlite3-parser` spends a lot of time in `yy_reduce` assigning different
`enum YYMINORTYPE` members, and it suffers from bad performance because
the stack size of the enum is very large, and the size of individual
members varies wildly. This PR reduces the `YYMINORTYPE` stack size from
496 to 264 bytes and has the following effect:
Preparing statement:
```sql
Prepare `SELECT 1`/Limbo/SELECT 1
time: [620.37 ns 621.08 ns 621.79 ns]
change: [-17.811% -17.582% -17.380%] (p = 0.00 < 0.05)
Performance has improved.
Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
time: [1.2215 µs 1.2231 µs 1.2248 µs]
change: [-12.926% -12.627% -12.272%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...: Collecting 100 samples in estimated 5.0152 s (
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
time: [3.1056 µs 3.1096 µs 3.1138 µs]
change: [-13.279% -12.995% -12.712%] (p = 0.00 < 0.05)
Performance has improved.
```
Execute (mainly to check for regressions):
```sql
Execute `SELECT * FROM users LIMIT ?`/Limbo/1
time: [402.19 ns 402.75 ns 403.36 ns]
change: [-3.4845% -2.4003% -1.6539%] (p = 0.00 < 0.05)
Performance has improved.
Execute `SELECT * FROM users LIMIT ?`/Limbo/10
time: [2.7920 µs 2.7977 µs 2.8036 µs]
change: [-1.3135% -1.0123% -0.7132%] (p = 0.00 < 0.05)
Change within noise threshold.
Execute `SELECT * FROM users LIMIT ?`/Limbo/50
time: [13.577 µs 13.633 µs 13.690 µs]
change: [-1.7709% -1.3575% -0.9563%] (p = 0.00 < 0.05)
Change within noise threshold.
```
Closes#936
Since we are already round robin'ing our `iovecs` in the same way, and
the map will never be larger than MAX_IOVECS, we can remove the cost of
hashing and keep o(1) lookups.
Closes#922
Hi! This is my first PR on the project, so I apologize if I did not
follow a convention from the project.
#127
This PR implements json_quote as specified in their source:
https://www.sqlite.org/json1.html#jquote. It follows the internal doc
guidelines for implementing functions. Most tests were added from sqlite
test suite for json_quote, while some others were added by me. Sqlite
test suite for json_quote depends on json_valid to test for correct
escape control characters, so that specific test at the moment cannot be
done the same way.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Sonny (@sonhmai)
Closes#763
**Core delete tasks**:
- [x] Implement free page functionality
- [x] Clear overflow pages if any before deleting their cells using free
page.
- [x] Implement delete for leaf page
- [ ] Balance after delete properly
**Auxiliary tasks to make delete work properly**:
- [x] Implement block coalescing in `free_cell_range` to reduce
fragmentation.
- [x] Track page fragmentation in `free_cell_range`.
- [x] Update page offsets in `drop_cell` and update cell pointer array
after dropping a cell.
- [x] Add TCL tasks
Closes#455
--------
I will add support for balancing after delete once `balance_nonroot` is
extended. In the current state of `balance_nonroot` balancing won't work
after delete and corrupts page.
But delete itself is functional now.
Closes#785
Adds `quickcheck` property based tests to the recently merged
`generate_series` implementation and fixes various issues in it:
- `0` for `step` should be treated as `1`
- missing `step` should be treated as `1`
- a string like `a` for `step` should be treated as `1`
- saturating overflow
- return an empty series for a variety of invalid inputs, instead of
erroring
For the future:
We should have e2e/fuzz tests comparing sqlite implementation to limbo
implementation
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#910
Move result row to `ProgramState` to mimic what SQLite does where `Vdbe`
struct has a `pResultRow` member. This makes it easier to deal with result
lifetime, but more importantly, eventually lazily parse values at the edges of
the API.
To make implementation of DiskANN in limbo easier, I'm moving `vector`
from `extensions` to core.
Now `vector` related function are exposed via `Function` op code.
I've defined a new enum called `VectorFunc` to group the vector related
functions.
The `vector.test` TCL test runs fine.
```sql
limbo> SELECT vector_extract(vector('[]'));
[]
limbo> SELECT vector_extract(vector(' [ 1 , 2 , 3 ] '));
[1,2,3]
limbo> SELECT vector_extract(vector('[-1000000000000000000]'));
[-1000000000000000000]
limbo> SELECT vector_distance_cos(vector('[1,2,3]'), vector('[3,2,1]'));
0.2857142686843872
```
Closes#902