Commit Graph

1463 Commits

Author SHA1 Message Date
PThorpe92
ae88d51e6f Remove TableReferenceType enum to clean up planner 2025-02-06 09:15:39 -05:00
PThorpe92
a8ae957162 Add tests for series extension, finish initial vtable impl 2025-02-06 09:15:39 -05:00
PThorpe92
ad30ccdc0e Add docs in extension README for vtable modules 2025-02-06 09:15:39 -05:00
PThorpe92
d4c06545e1 Refactor vtable impl and remove Rc Refcell from module 2025-02-06 09:15:39 -05:00
PThorpe92
661c74e338 Apply new planner structure to virtual table impl 2025-02-06 09:15:28 -05:00
Jussi Saurio
f5f77c0bd1 Initial virtual table implementation 2025-02-06 07:51:50 -05:00
Pekka Enberg
f4a574e6bc core: Move strftime to functions module 2025-02-06 13:53:36 +02:00
Pekka Enberg
ee8eabf167 core: Move datetime to functions module 2025-02-06 13:52:25 +02:00
Pekka Enberg
7513f859df core: Move printf to functions module 2025-02-06 13:50:05 +02:00
Pekka Enberg
238fb9c977 Merge 'Sqlean Crypto extension' from Diego Reis
Introduces a new `crypto` extension, compatible with the Sqlean [crypto
extension](https://github.com/nalgeon/sqlean/blob/main/docs/crypto.md).

Closes #903
2025-02-06 13:46:01 +02:00
Pekka Enberg
f3902ef9b6 core: Rename OwnedRecord to Record
We only have one record type so let's call it `Record`.
2025-02-06 13:40:34 +02:00
Pekka Enberg
f9828e0e6f core: Parse UTF-8 strings lazily 2025-02-06 13:27:52 +02:00
Pekka Enberg
c210821100 core: Move result row to ProgramState
Move result row to `ProgramState` to mimic what SQLite does, where the `Vdbe`
struct has a `pResultRow` member. This makes it easier to deal with result
lifetime, but more importantly, it eventually lets us parse values lazily at
the edges of the API.
2025-02-06 11:52:26 +02:00
Pekka Enberg
0012e9d556 cargo fmt 2025-02-06 10:44:37 +02:00
Pekka Enberg
f769d1aa2a s/LimboText/Text/g 2025-02-06 10:44:02 +02:00
Pekka Enberg
2546413d40 Merge 'Move vector into core from extensions' from Krishna Vishal
To make implementation of DiskANN in limbo easier, I'm moving `vector`
from `extensions` to core.
Now `vector` related functions are exposed via the `Function` op code.
I've defined a new enum called `VectorFunc` to group the vector related
functions.
The `vector.test` TCL test runs fine.
```sql
limbo>   SELECT vector_extract(vector('[]'));
[]
limbo> SELECT vector_extract(vector('  [  1  ,  2  ,  3  ]  '));
[1,2,3]
limbo> SELECT vector_extract(vector('[-1000000000000000000]'));
[-1000000000000000000]
limbo> SELECT vector_distance_cos(vector('[1,2,3]'), vector('[3,2,1]'));
0.2857142686843872
```
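
For reference, the cosine distance shown in the last query can be sketched in plain Rust. This is a minimal sketch, not Limbo's implementation: the function name `cosine_distance`, the `f64` accumulation, and the `None` result for invalid input are all assumptions made for illustration.

```rust
/// Cosine distance between two equal-length vectors: 1 - (a . b) / (|a| * |b|).
/// Hypothetical helper; returns None for mismatched lengths, empty input,
/// or a zero-length vector (where the distance is undefined).
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b) {
        // Accumulate in f64 to limit rounding error.
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a.sqrt() * norm_b.sqrt()))
}
```

For `[1,2,3]` and `[3,2,1]` the dot product is 10 and both squared norms are 14, so the distance is 1 - 10/14 = 2/7, which agrees with the value printed above to about seven significant digits.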

Closes #902
2025-02-06 07:41:28 +02:00
Diego Reis
05057a04ac completes crypto extension
It aims to be compatible with https://github.com/nalgeon/sqlean/blob/main/docs/crypto.md
2025-02-06 01:42:47 -03:00
Diego Reis
dd58be3b60 Add basic structure for crypto extension 2025-02-05 23:09:26 -03:00
krishvishal
32080aba5d Make vector function accessible through Function op code. 2025-02-06 07:01:50 +05:30
krishvishal
d516821e27 Add vector to core and make necessary changes to types.rs. 2025-02-06 07:00:51 +05:30
krishvishal
a3d0e1e974 Remove vector extension from different Cargo.toml files and add quickcheck, quickcheck_macros
and `rand` crates to core's Cargo.toml file
2025-02-06 06:58:41 +05:30
Pekka Enberg
0d318d810e core: Add Text::from_str() helper 2025-02-05 20:02:57 +02:00
Pekka Enberg
5abf49a0be core: Rename LimboText to Text 2025-02-05 20:02:27 +02:00
Pekka Enberg
6ea7fa06d2 Merge 'prepare perf: make ProgramBuilder aware of plan to count/estimate required memory' from Jussi Saurio
Use knowledge of the query plan to inform how much memory to initially
allocate for `ProgramBuilder` vectors.
Some of the counts are exact, some are semi-random estimates.
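
The idea can be sketched with `Vec::with_capacity`. This is only an illustration under assumptions: `PlanStats`, `Builder`, their fields, and the constants in the estimate are hypothetical and do not reflect the actual `ProgramBuilder`.

```rust
/// Hypothetical summary of a query plan (fields are assumptions).
struct PlanStats {
    result_columns: usize,
    table_references: usize,
}

/// Hypothetical stand-in for a program builder that owns growable vectors.
struct Builder {
    instructions: Vec<u32>,
    cursors: Vec<String>,
}

impl Builder {
    fn with_plan(stats: &PlanStats) -> Self {
        // Exact count (assumed here): one cursor per table reference.
        let cursors = Vec::with_capacity(stats.table_references);
        // Rough estimate: a fixed prologue/epilogue plus a guessed number of
        // instructions per result column and per table. The constants are
        // made up for this sketch.
        let estimated = 8 + 2 * stats.result_columns + 6 * stats.table_references;
        Builder {
            instructions: Vec::with_capacity(estimated),
            cursors,
        }
    }
}
```

Sizing the vectors up front avoids repeated reallocation while the program is emitted; an estimate that is too low still works, it just falls back to normal `Vec` growth.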
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [756.93 ns 758.11 ns 759.59 ns]
                        change: [-4.5974% -4.3153% -4.0393%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.4739 µs 1.4769 µs 1.4800 µs]
                        change: [-7.9364% -7.7171% -7.4979%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...`
                        time:   [3.7440 µs 3.7520 µs 3.7596 µs]
                        change: [-5.4627% -5.1578% -4.8445%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
```

Closes #899
2025-02-05 18:24:16 +02:00
Pekka Enberg
b5f5e40986 Merge 'prepare perf: dont eagerly allocate result column name strings' from Jussi Saurio
- Remove eagerly allocated `name` from `ResultSetColumn`
- `ResultSetColumn` can calculate `name()` on demand:
    - if it has an alias (`foo as bar`), use that
    - if it is a column reference, use that
    - otherwise return none, and callers can assign it a placeholder
name (like `column_1`)
- move the `plan.result_columns` and `plan.table_references` to
`Program` after statement preparation is done, so that column names can be
returned upon request
- make `name` in `Column` optional; it is not needed for pseudo tables and
sorters, which avoids an extra string allocation
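
The on-demand naming rules listed above can be sketched as follows. This is a simplified illustration, not the actual planner code: the `Expr` enum, the struct fields, and the `display_name` helper are assumptions, and a real column reference would presumably be resolved through the table references rather than stored inline.

```rust
/// Simplified stand-in for a planner expression (hypothetical).
enum Expr {
    Column { name: String },
    Literal(i64),
}

struct ResultSetColumn {
    expr: Expr,
    alias: Option<String>,
}

impl ResultSetColumn {
    /// Borrow a name on demand instead of storing an eagerly allocated String.
    fn name(&self) -> Option<&str> {
        // 1. An alias (`foo as bar`) wins.
        if let Some(alias) = &self.alias {
            return Some(alias.as_str());
        }
        // 2. Otherwise use the referenced column's name, if any.
        match &self.expr {
            Expr::Column { name } => Some(name.as_str()),
            // 3. Anything else has no natural name.
            _ => None,
        }
    }
}

/// Caller-side fallback that assigns a placeholder such as `column_1`.
fn display_name(col: &ResultSetColumn, index: usize) -> String {
    match col.name() {
        Some(name) => name.to_string(),
        None => format!("column_{}", index + 1),
    }
}
```

Because `name()` returns a borrowed `&str`, no string is allocated unless a caller actually asks for an owned name.
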
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [756.80 ns 758.27 ns 760.04 ns]
                        change: [-3.3257% -3.0252% -2.7035%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low severe
  3 (3.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.4646 µs 1.4669 µs 1.4696 µs]
                        change: [-6.4769% -6.2021% -5.9137%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  3 (3.00%) high severe

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...`
                        time:   [3.7256 µs 3.7311 µs 3.7376 µs]
                        change: [-4.5195% -4.2192% -3.9309%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high mild
```

Closes #898
2025-02-05 18:20:01 +02:00
Jussi Saurio
795576b2ec dont eagerly allocate result column name strings 2025-02-05 17:53:23 +02:00
Jussi Saurio
f599b5a752 Make programbuilder aware of plan to count/estimate required memory 2025-02-05 14:22:42 +02:00
Pekka Enberg
f772fc83e1 core/mvcc: Disable test_overlapping_concurrent_inserts_read_your_writes test
...it fails sporadically
2025-02-05 14:18:56 +02:00
Pekka Enberg
56d401fb67 Merge 'Implement json_set' from Marcus Nilsson
This PR adds support for `json_set`.
There are three helper functions added:
1. `json_path_from_owned_value`, this function turns an `OwnedValue`
into a `JsonPath`.
2. `find_or_create_target`, this function is similar to `find_target`
with the added bonus of creating the target if it doesn't exist. There
is a caveat with this function and that is that it will create
objects/arrays as it goes, meaning if you send `{}` into it and try
getting the path `$.some.nested.array[123].field`, it will return
`{"some":{"nested":{"array":[]}}}` since creation of `some`, `nested` and
`array` will succeed, but accessing element `123` will fail.
3. `create_and_mutate_json_by_path`, this function is very similar to
`mutate_json_by_path` but calls `find_or_create_target` instead of
`find_target`
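
The caveat in item 2 can be reproduced with a small self-contained sketch. It is not the code from this PR: the `Json` and `Step` types and the `find_or_create` function are hypothetical simplifications that only model objects, arrays, and null.

```rust
use std::collections::BTreeMap;

/// Minimal JSON model for illustration (hypothetical, not the real type).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// One element of a parsed path such as `$.some.nested.array[123].field`.
enum Step<'a> {
    Key(&'a str),
    Index(usize),
}

/// Walk `path` from `root`, creating missing object members as it goes.
/// Returns None when a step cannot be followed; containers created
/// before the failure are left in place.
fn find_or_create<'j>(root: &'j mut Json, path: &[Step<'_>]) -> Option<&'j mut Json> {
    let mut current = root;
    for (i, step) in path.iter().enumerate() {
        current = match step {
            Step::Key(key) => match current {
                Json::Object(map) => {
                    // Pick the container to create based on the next step.
                    let fresh = match path.get(i + 1) {
                        Some(Step::Index(_)) => Json::Array(Vec::new()),
                        Some(Step::Key(_)) => Json::Object(BTreeMap::new()),
                        None => Json::Null,
                    };
                    map.entry((*key).to_string()).or_insert(fresh)
                }
                _ => return None,
            },
            Step::Index(idx) => match current {
                // Out-of-range indexes are not created in this sketch.
                Json::Array(items) => items.get_mut(*idx)?,
                _ => return None,
            },
        };
    }
    Some(current)
}
```

Calling this on an empty object with the steps `some`, `nested`, `array`, `123`, `field` returns `None`, yet the root is left as `{"some":{"nested":{"array":[]}}}`, which mirrors the behavior described above.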

Related to #127

Closes #878
2025-02-05 14:15:02 +02:00
Pekka Enberg
acb98f56d5 core/mvcc: Thanks Clippy... 2025-02-05 13:44:55 +02:00
Pekka Enberg
36b487d281 core/mvcc: Make Clippy happy 2025-02-05 13:41:20 +02:00
Pekka Enberg
5870c92e9e core/mvcc: Fix MVCC benchmark SIGKILL
The `begin_tx` benchmark makes no sense because it just fills up memory with
transaction metadata, eventually killing the process...
2025-02-05 13:33:38 +02:00
Pekka Enberg
44ca85e121 core: Enable MVCC benchmark 2025-02-05 13:26:05 +02:00
Pekka Enberg
fad479ac59 core/mvcc: Move source code to module 2025-02-05 13:25:16 +02:00
Pekka Enberg
a585b81148 mvcc/core: Kill S3 persistent storage 2025-02-05 12:51:58 +02:00
Pekka Enberg
e923a2352e core/mvcc: Kill mvcc-rs crate
We'll just integrate everything in the core.
2025-02-05 12:50:46 +02:00
Pekka Enberg
9f0b33a8ef core/mvcc: Remove README.md 2025-02-05 12:50:46 +02:00
Pekka Enberg
5c9bb4bddd core/mvcc: Remove duplicate Cargo workspace config 2025-02-05 12:42:39 +02:00
Pekka Enberg
5e282c00bc Remove duplicate MIT license 2025-02-05 12:42:15 +02:00
Pekka Enberg
7d99894269 Move MVCC docs to top-level docs directory 2025-02-05 12:41:55 +02:00
Pekka Enberg
df20213a4b core/mvcc: Remove C bindings
We won't need them because we just use the Rust APIs in the core.
2025-02-05 12:40:28 +02:00
Pekka Enberg
fcb4c7e46a core/mvcc: Remove Git metadata files 2025-02-05 12:40:06 +02:00
Pekka Enberg
b9568b74af Merge "Hekaton MVCC implementation" from Pekka and others
This imports the full history of the following Git repository into
`core/mvcc` directory as-is:

https://github.com/penberg/tihku/tree/main
2025-02-05 12:38:35 +02:00
Pekka Enberg
9fdf54de2b Merge 'Small perf optimizations to statement preparation' from Jussi Saurio
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [765.94 ns 768.26 ns 771.03 ns]
                        change: [-7.8340% -7.4887% -7.1406%] (p = 0.00 < 0.05)
                        Performance has improved.

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.5673 µs 1.5699 µs 1.5731 µs]
                        change: [-10.810% -9.7122% -8.4951%] (p = 0.00 < 0.05)
                        Performance has improved.

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [4.1331 µs 4.1421 µs 4.1513 µs]
                        change: [-9.3157% -9.0255% -8.7372%] (p = 0.00 < 0.05)
                        Performance has improved.
```
flamegraph for prepare `SELECT 1`:
<img width="1718" alt="Screenshot 2025-02-03 at 10 34 14" src="https://github.com/user-attachments/assets/ba67fe2f-78b2-4796-9a09-837d8e79fe62" />

Closes #872
2025-02-05 10:46:57 +02:00
Pekka Enberg
0b0681c9f8 core/vdbe: Lazy cursor borrowing
This saves a few more nanoseconds:

```
Execute `SELECT 1`/Limbo
                        time:   [44.964 ns 45.064 ns 45.160 ns]
                        change: [-14.371% -13.724% -13.214%] (p = 0.00 < 0.05)
                        Performance has improved.
```
2025-02-05 09:47:17 +02:00
Pekka Enberg
23cd8b10c3 core: Unify StepResult structs
...also simplify Statement::step() to get some performance back.

Before:

```
Execute `SELECT 1`/Limbo
                        time:   [49.128 ns 50.425 ns 52.604 ns]
```

After:

```
Execute `SELECT 1`/Limbo
                        time:   [49.128 ns 50.425 ns 52.604 ns]
```
2025-02-05 09:09:32 +02:00
Pekka Enberg
7573fc62e6 core: Unify Row and Record structs
They're exactly the same thing.
2025-02-05 09:04:52 +02:00
Marcus Nilsson
01492cf46f add support for json_set
Test cases are included.
Related to #127
2025-02-04 19:09:58 +01:00
Marcus Nilsson
3478352b18 move extraction of JsonPath from OwnedValue to separate function 2025-02-04 17:49:49 +01:00
Pekka Enberg
e4d7474372 core: Switch to parking_lot for RwLock
We really need to make the WAL lock less expensive, but switching to
`parking_lot` is something we should do anyway.

Before:

```
Execute `SELECT 1`/Limbo
                        time:   [56.230 ns 56.463 ns 56.688 ns]
```

After:

```
Execute `SELECT 1`/Limbo
                        time:   [52.003 ns 52.132 ns 52.287 ns]
```
2025-02-04 18:38:33 +02:00