503 Commits

Author SHA1 Message Date
Pekka Enberg
4142f4f4cb Merge 'Organize extension library and feature gate VFS' from Preston Thorpe
I keep having 3+ PRs open at the same time and always have to deal with crazy
conflicts because everything in the `ext` library is together in one
file.
This PR moves each category of extension into its own file, and
separates the `vfs` functionality in Core into the `ext/dynamic` module,
so that it can be more easily separated from wasm (or non feature =
"fs") targets to prevent build issues.
The only semantic change made in this PR is the feature gating of vfs;
the rest is simply organizing and cleaning up imports.
I was unsure whether `vfs` should be a feature on the `core` side too, or
whether to just enable it with the `fs` feature, which seemed reasonable, as
that was already the current behavior. But let me know if we want it entirely
behind its own feature.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1124
2025-03-19 19:08:13 +02:00
Pekka Enberg
d4db5eb4c1 Merge 'Various JSON and JSONB function improvements' from Ihor Andrianov
Added jsonb_remove, jsonb_replace, json_replace.
Updated json_remove to use jsonb under the hood.
Fixed json function big numbers serialization.
Add tests for new functions.

Closes #1140
2025-03-19 17:21:44 +02:00
PThorpe92
57d4aa7216 Reorganize ext library and feature gate vfs to more easily prevent wasm build issues 2025-03-19 10:17:11 -04:00
Ihor Andrianov
32ea972151 make tests pass 2025-03-19 11:29:46 +02:00
Ihor Andrianov
b5e86a9e36 remove and replace functions definitions 2025-03-18 21:43:48 +02:00
Pekka Enberg
a2c6831f30 Merge 'Implement FastLock for DatabaseHeader' from Pere Diaz Bou
The motivation behind implementing our own lock is to avoid relying on an
external dependency, as we should moving forward. This is an experiment for
now, as a single test obviously is not enough, but I believe this is the right
direction to go in.
## benchmark tl;dr
Execute benchmarks show a performance improvement of around [1.78%, 7.5%],
which seems okay, as it was expected from removing
`pthread_mutex` calls.
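A lock of the kind described above can be sketched as a small spin lock whose guard implements `Deref`. This is only an illustration under assumptions: the names `SpinLock` and `SpinLockGuard` are hypothetical, and this is not the `FastLock` code from the PR.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical spin lock for illustration; not the PR's FastLock.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safety: access to `value` is serialized by `locked`.
// The `T: Send + Sync` bound is deliberately conservative.
unsafe impl<T: Send + Sync> Sync for SpinLock<T> {}

/// Guard that releases the lock when dropped.
pub struct SpinLockGuard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> SpinLock<T> {
    pub fn new(value: T) -> Self {
        Self {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Spins until the flag flips from false to true.
    /// Acquire here pairs with the Release store in `Drop`.
    pub fn lock(&self) -> SpinLockGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        SpinLockGuard { lock: self }
    }
}

impl<T> Deref for SpinLockGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Safety: the guard exists only while the lock is held.
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for SpinLockGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // Safety: the guard exists only while the lock is held.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for SpinLockGuard<'_, T> {
    fn drop(&mut self) {
        // Release publishes writes made under the lock to the next owner.
        self.lock.locked.store(false, Ordering::Release);
    }
}
```

With this sketch, `*lock.lock() += 1` takes the lock, mutates the value through `DerefMut`, and unlocks at the end of the statement. A spin lock avoids calls into the OS mutex but burns CPU under contention, so it only suits very short critical sections.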
## benchmarks before
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [575.63 ns 577.33 ns 580.07 ns]
                        change: [-1.3304% -0.8881% -0.4675%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.2070 µs 1.2114 µs 1.2166 µs]
                        change: [-0.8670% -0.4084% -0.0252%] (p = 0.06 > 0.05)
                        No change in performance detected.

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [2.9845 µs 2.9895 µs 2.9951 µs]
                        change: [-3.0470% -2.6038% -2.1301%] (p = 0.00 < 0.05)
                        Performance has improved.
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou... #2
                        time:   [1.6015 µs 1.6084 µs 1.6157 µs]
                        change: [-0.0676% +0.3850% +0.8704%] (p = 0.11 > 0.05)
                        No change in performance detected.
Execute `SELECT * FROM users LIMIT ?`/Limbo/1
                        time:   [442.46 ns 446.72 ns 454.13 ns]
                        change: [+3.9744% +4.5337% +5.3357%] (p = 0.00 < 0.05)
                        Performance has regressed.
Execute `SELECT * FROM users LIMIT ?`/Limbo/10
                        time:   [3.1722 µs 3.1850 µs 3.1980 µs]
                        change: [+7.1994% +7.7452% +8.2856%] (p = 0.00 < 0.05)
                        Performance has regressed.
Execute `SELECT * FROM users LIMIT ?`/Limbo/50
                        time:   [14.976 µs 15.024 µs 15.078 µs]
                        change: [+5.7879% +6.2419% +6.7139%] (p = 0.00 < 0.05)
                        Performance has regressed.
Execute `SELECT * FROM users LIMIT ?`/Limbo/100
                        time:   [29.834 µs 29.925 µs 30.024 µs]
                        change: [+4.6519% +5.0384% +5.4491%] (p = 0.00 < 0.05)
                        Performance has regressed.
Execute `SELECT 1`/Limbo
                        time:   [45.135 ns 45.439 ns 45.763 ns]
                        change: [-0.4703% -0.0496% +0.3622%] (p = 0.81 > 0.05)
                        No change in performance detected.
```
## benchmarks after
```
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [585.61 ns 590.92 ns 596.49 ns]
                        change: [+0.5902% +1.1505% +1.7012%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.2061 µs 1.2090 µs 1.2119 µs]
                        change: [-0.2364% +0.0977% +0.4252%] (p = 0.57 > 0.05)
                        No change in performance detected.

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [2.9854 µs 2.9893 µs 2.9936 µs]
                        change: [-0.5752% -0.2529% +0.0167%] (p = 0.09 > 0.05)
                        No change in performance detected.
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou... #2
                        time:   [1.5853 µs 1.5983 µs 1.6108 µs]
                        change: [-2.3810% -1.7986% -1.2748%] (p = 0.00 < 0.05)
                        Performance has improved.
Execute `SELECT * FROM users LIMIT ?`/Limbo/1
                        time:   [429.84 ns 431.34 ns 433.07 ns]
                        change: [-2.7721% -1.8504% -0.8738%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Execute `SELECT * FROM users LIMIT ?`/Limbo/10
                        time:   [2.9184 µs 2.9254 µs 2.9323 µs]
                        change: [-8.2377% -7.7816% -7.3373%] (p = 0.00 < 0.05)
                        Performance has improved.
Execute `SELECT * FROM users LIMIT ?`/Limbo/50
                        time:   [14.190 µs 14.229 µs 14.271 µs]
                        change: [-6.2034% -5.7858% -5.3552%] (p = 0.00 < 0.05)
                        Performance has improved.
Execute `SELECT * FROM users LIMIT ?`/Limbo/100
                        time:   [28.734 µs 28.856 µs 28.979 µs]
                        change: [-4.3640% -3.9462% -3.5492%] (p = 0.00 < 0.05)
                        Performance has improved.

Execute `SELECT 1`/Limbo
                        time:   [43.144 ns 43.237 ns 43.326 ns]
                        change: [-4.9417% -4.5554% -4.2030%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Closes #1120
2025-03-18 13:44:16 +02:00
Pekka Enberg
f9d7834874 Merge 'Jsonb extract' from Ihor Andrianov
Implemented jsonb traversal by JSON path.
Changed some ordinary json functions to use jsonb under the hood, so the
behavior of our json module is now more like SQLite's.
Found and fixed some bugs along the way.

Closes #1135
2025-03-17 18:25:28 +02:00
Diego Reis
16396c57c7 Removes unnecessary clone 2025-03-17 10:06:14 -03:00
Diego Reis
2314e7f906 Improve explain output for Transaction bytecode.
It isn't SQLite compliant, but it helps a lot, especially when the user doesn't know what each register means.
2025-03-17 09:50:22 -03:00
Pere Diaz Bou
00ab3d1c0c Fix ordering and implement Deref 2025-03-17 10:22:42 +01:00
Pere Diaz Bou
20f5ade95e Experiment with a custom Lock for database header 2025-03-17 10:21:34 +01:00
Diego Reis
590f90ad9a Fix AutoCommit handling of an ongoing halt checkpoint 2025-03-16 15:35:49 -03:00
Ihor Andrianov
23d7d82b6c add jsonb_extract function 2025-03-16 15:14:29 +02:00
Ihor Andrianov
0b22fbd566 Add jsonb to json_valid 2025-03-16 03:26:08 +02:00
Pekka Enberg
731fbaf3c7 Merge 'Jsonb implementation' from Ihor Andrianov
This PR implements a complete JSONB parser and serializer, as the current PR
draft looks stale.
Sorry for the huge PR.
I chose a recursive parsing approach because:
1. It's simpler to understand and maintain
2. It follows SQLite's implementation pattern, ensuring compatibility
3. It naturally maps to JSON's hierarchical structure
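To illustrate how a recursive approach maps onto nested structure, here is a minimal sketch of a recursive-descent parser. It is an assumption-laden toy, not the PR's parser: it handles only integers and arrays, and the names `Node`, `parse`, `parse_value`, `parse_array`, and `parse_int` are hypothetical.

```rust
/// Toy value tree: only integers and arrays (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
pub enum Node {
    Int(i64),
    Array(Vec<Node>),
}

/// Parses a whole input string and rejects trailing characters.
pub fn parse(input: &str) -> Result<Node, String> {
    let bytes = input.as_bytes();
    let mut pos = 0;
    let node = parse_value(bytes, &mut pos)?;
    skip_ws(bytes, &mut pos);
    if pos != bytes.len() {
        return Err(format!("trailing input at byte {}", pos));
    }
    Ok(node)
}

fn skip_ws(bytes: &[u8], pos: &mut usize) {
    while *pos < bytes.len() && bytes[*pos].is_ascii_whitespace() {
        *pos += 1;
    }
}

/// Dispatches on the first byte of the next value.
fn parse_value(bytes: &[u8], pos: &mut usize) -> Result<Node, String> {
    skip_ws(bytes, pos);
    match bytes.get(*pos) {
        Some(b'[') => parse_array(bytes, pos),
        Some(b'-') | Some(b'0'..=b'9') => parse_int(bytes, pos),
        _ => Err(format!("unexpected input at byte {}", *pos)),
    }
}

/// Each element is parsed by calling back into `parse_value`,
/// so nesting in the input becomes nesting in the call stack.
fn parse_array(bytes: &[u8], pos: &mut usize) -> Result<Node, String> {
    *pos += 1; // consume '['
    let mut items = Vec::new();
    skip_ws(bytes, pos);
    if bytes.get(*pos) == Some(&b']') {
        *pos += 1;
        return Ok(Node::Array(items));
    }
    loop {
        items.push(parse_value(bytes, pos)?);
        skip_ws(bytes, pos);
        match bytes.get(*pos) {
            Some(b',') => *pos += 1,
            Some(b']') => {
                *pos += 1;
                return Ok(Node::Array(items));
            }
            _ => return Err(format!("expected ',' or ']' at byte {}", *pos)),
        }
    }
}

fn parse_int(bytes: &[u8], pos: &mut usize) -> Result<Node, String> {
    let start = *pos;
    if bytes[*pos] == b'-' {
        *pos += 1;
    }
    while *pos < bytes.len() && bytes[*pos].is_ascii_digit() {
        *pos += 1;
    }
    let text = std::str::from_utf8(&bytes[start..*pos]).map_err(|e| e.to_string())?;
    text.parse::<i64>().map(Node::Int).map_err(|e| e.to_string())
}
```

For example, `parse("[1, [2, -3], []]")` yields an outer `Node::Array` whose second element is itself a `Node::Array`. One trade-off of plain recursion is that very deeply nested input can exhaust the stack, so a real parser would likely cap the nesting depth.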
The implementation includes comprehensive test coverage for standard
JSON features and JSON5 extensions. All test cases pass successfully,
handling edge cases like nested structures, escape sequences, and
various number formats.
While the code is ready for review, I believe it would benefit from fuzz
testing in the future to identify any edge cases not covered by the
current tests.
Ready for review, proposals and feedback.

Closes #1114
2025-03-13 21:17:52 +02:00
Pere Diaz Bou
cc320a74ca few checkpoint result cleanup in vdbe 2025-03-12 15:48:22 +01:00
Pere Diaz Bou
be3badc1f3 modify a few btree log levels and add end_write_txn after checkpoint 2025-03-12 15:48:22 +01:00
Ihor Andrianov
04f69220b7 add jsonb function implementation and json now understands blobs 2025-03-12 15:03:40 +02:00
Pekka Enberg
b0636e4494 Merge 'Adds Drop Table' from Zaid Humayun
This PR adds support for `DROP TABLE` and addresses issue
https://github.com/tursodatabase/limbo/issues/894
It depends on https://github.com/tursodatabase/limbo/pull/785 being
merged in because it requires the implementation of `free_page`.
EDIT: The PR above has been merged.
It adds the following:
* an implementation for the `DropTable` AST instruction via a method
called `translate_drop_table`
* a couple of new instructions - `Destroy` and `DropTable`. The former
is to modify physical b-tree pages and the latter is to modify in-memory
structures like the schema hash table.
* `btree_destroy` on `BTreeCursor` to walk the tree of pages for this
table and place them on the free list.
* state machine traversal for both `btree_destroy` and
`clear_overflow_pages` to ensure performant, correct code.
* unit & tcl tests
* modifies the `Null` instruction to follow SQLite semantics and accept
a second register. It will set all registers in this range to null. This
is required for `DROP TABLE`.
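The range behavior of `Null` described in the last bullet can be sketched as follows. This is a simplified illustration, not Limbo's VDBE code: the `Value` enum and the `op_null` function are hypothetical names, and the register file is modeled as a plain slice.

```rust
/// Hypothetical register value type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    Text(String),
}

/// Sets register `dest` to NULL. If `dest_end` is given and is greater
/// than `dest`, every register in `dest..=dest_end` is set to NULL;
/// otherwise only `dest` is touched.
/// Panics if the range is outside the register file.
pub fn op_null(registers: &mut [Value], dest: usize, dest_end: Option<usize>) {
    let end = match dest_end {
        Some(end) if end > dest => end,
        _ => dest,
    };
    for reg in &mut registers[dest..=end] {
        *reg = Value::Null;
    }
}
```

Calling `op_null(&mut regs, 1, Some(2))` clears registers 1 and 2 and leaves the others untouched, while `op_null(&mut regs, 3, None)` clears only register 3.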
The screenshots below show a comparison of the bytecode generated by
SQLite and Limbo.
Limbo has the same instruction set except for the subroutines which
involve opening an ephemeral table, copying over the triggers from the
`sqlite_schema` table and then re-inserting them back into the
`sqlite_schema` table.
This is because `OpenEphemeral` is still a WIP and is being tracked at
https://github.com/tursodatabase/limbo/pull/768
![Screenshot 2025-02-09 at 7 05 03 PM](https://github.com/user-attachments/assets/1d597001-a60c-4a76-89fd-8b90881c77c9)
![Screenshot 2025-02-09 at 7 05 35 PM](https://github.com/user-attachments/assets/ecfd2a7a-2edc-49cd-a8d1-7b4db8657444)

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #897
2025-03-06 18:27:41 +02:00
Pekka Enberg
2c9d30cef4 core/vdbe: Don't commit MVCC on Halt if no autocommit
Spotted by Pere.
2025-03-06 12:52:03 +02:00
Pekka Enberg
d6c514c8d1 core: Integrate MVCC to B-Tree cursor 2025-03-06 10:16:42 +02:00
Pekka Enberg
bf3163c7fe core: Fix parse_schema() to use existing MVCC TX 2025-03-06 10:16:42 +02:00
Pekka Enberg
ef32a82941 core/vdbe: Integrate MVCC transactions 2025-03-06 10:16:42 +02:00
Pere Diaz Bou
aa7391da50 fix halt return 2025-03-05 22:32:59 +01:00
Pere Diaz Bou
b555561aeb make Program::halt reentrant 2025-03-05 22:32:59 +01:00
Pere Diaz Bou
feeb398e73 finish transaction and reset transaction state 2025-03-05 22:32:59 +01:00
Pere Diaz Bou
262c4de548 add line number and thread id to tracing logs 2025-03-05 15:36:47 +01:00
Pere Diaz Bou
e20dd59353 Make schema a RWLock
This makes it work like in SQLite, where only one schema writer is permitted and readers return an error while preparing a statement if the schema is changing.
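The reader and writer behavior described above can be sketched with `std::sync::RwLock`, using `try_read` so that preparing a statement fails instead of blocking. All names here (`Database`, `Schema`, `PrepareError`, `prepare_column_count`, `begin_schema_change`) are hypothetical and only illustrate the idea; they are not Limbo's API.

```rust
use std::collections::HashMap;
use std::sync::{RwLock, RwLockWriteGuard, TryLockError};

#[derive(Debug, PartialEq)]
pub enum PrepareError {
    /// A schema change is in progress.
    SchemaLocked,
    NoSuchTable(String),
}

/// Hypothetical schema: table name -> column names.
#[derive(Default)]
pub struct Schema {
    tables: HashMap<String, Vec<String>>,
}

pub struct Database {
    schema: RwLock<Schema>,
}

impl Database {
    pub fn new() -> Self {
        Self { schema: RwLock::new(Schema::default()) }
    }

    /// Takes the exclusive write lock; only one schema writer at a time.
    pub fn begin_schema_change(&self) -> RwLockWriteGuard<'_, Schema> {
        self.schema.write().unwrap()
    }

    pub fn create_table(&self, name: &str, columns: &[&str]) {
        let mut schema = self.begin_schema_change();
        let cols = columns.iter().map(|c| c.to_string()).collect();
        schema.tables.insert(name.to_string(), cols);
    }

    /// Stand-in for statement preparation: returns an error instead of
    /// blocking when a writer currently holds the schema lock.
    pub fn prepare_column_count(&self, table: &str) -> Result<usize, PrepareError> {
        let schema = match self.schema.try_read() {
            Ok(guard) => guard,
            Err(TryLockError::WouldBlock) => return Err(PrepareError::SchemaLocked),
            Err(TryLockError::Poisoned(poisoned)) => poisoned.into_inner(),
        };
        let count = schema.tables.get(table).map(|cols| cols.len());
        count.ok_or_else(|| PrepareError::NoSuchTable(table.to_string()))
    }
}
```

While a guard returned by `begin_schema_change` is alive, `prepare_column_count` returns `Err(PrepareError::SchemaLocked)`; once the guard is dropped, preparation succeeds again.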
2025-03-05 14:07:48 +01:00
Pere Diaz Bou
e4a8ee5402 move load extensions to Connection
Extensions are loaded per connection and not per database as per SQLite
behaviour. This also helps with removing locks.
2025-03-05 14:07:48 +01:00
Pere Diaz Bou
8daf7666d1 Make database Sync + Send 2025-03-05 14:07:48 +01:00
Pekka Enberg
f57d2b32af core: Clean up B-Tree creation code
Move page allocation to pager so that we don't need to instantiate a
cursor to create a B-Tree.
2025-03-04 18:38:06 +02:00
Pekka Enberg
f3ee86d784 core/vdbe: Replace get_btree_{table,index}_cursor() calls with get_cursor() 2025-03-04 15:17:57 +02:00
Pekka Enberg
cdcaebb878 core/vdbe: Unify B-Tree cursors 2025-03-04 14:35:40 +02:00
Pekka Enberg
1c0d9c3b46 core/vdbe: Replace get_pseudo_cursor() calls with get_cursor() 2025-03-04 14:18:52 +02:00
Pekka Enberg
c12f2aeca4 core/vdbe: Replace get_sorter() calls with get_cursor() 2025-03-04 13:51:05 +02:00
Pekka Enberg
45539a4fe5 core/vdbe: Replace get_vtab_cursor() calls with get_cursor() 2025-03-04 13:43:49 +02:00
Pekka Enberg
085f93ce79 core/vdbe: Add ProgramState::get_cursor() helper 2025-03-04 12:23:35 +02:00
Pekka Enberg
3aeb11b673 core/vdbe: Add ProgramState::get_btree_{table,index}_cursor() helpers 2025-03-04 11:40:43 +02:00
Pekka Enberg
222808ab6c core/vdbe: Add ProgramState::get_pseudo_cursor() helper 2025-03-04 11:21:24 +02:00
Pekka Enberg
06446b768b core/vdbe: Add ProgramState::get_sorter() helper 2025-03-04 11:18:09 +02:00
Pekka Enberg
e4ebb6d9e1 core/vdbe: Add ProgramState::get_vtab_cursor() helper 2025-03-04 11:16:29 +02:00
Pekka Enberg
dc525dd7d1 core/vdbe: Kill call_external_function macro
The call_external_function macro has exactly one call-site and,
therefore, only makes the code harder to read.
2025-03-04 11:01:09 +02:00
Pekka Enberg
ddb188132c Merge 'Clean up extension types API, introduce json text subtype' from Preston Thorpe
This PR cleans up some comments in the extension API and prevents
extensions themselves from calling 'free' on Value types that are
exposed to the user-facing traits. It also changes the `from_ffi`
method for OwnedValues to take ownership and automatically free the
values to prevent memory leaks.
This PR also finds the name of the `args: &[Value]` argument for scalar
functions in extensions, and uses that in the proc macro, instead of
relying on documentation to communicate that the parameter must be named
`args`.

Closes #1054
2025-03-04 10:24:19 +02:00
Pekka Enberg
2e4c18dca2 Merge 'Escape character is ignored in LIKE function' from lgualtieri75
Fixes #1051

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1074
2025-03-04 10:23:09 +02:00
PThorpe92
588e43c5aa Minor improvements and cleanups in btree 2025-03-01 15:48:42 -05:00
PThorpe92
5b8efd92a4 Update extension ownership cleanups for new vtab module 2025-03-01 14:27:33 -05:00
PThorpe92
e7713e87ec Prevent extensions from accidentally freeing value types, fix comments 2025-03-01 14:27:33 -05:00
l.gualtieri
6449c79e93 Escape character is ignored in LIKE function #1051 2025-03-01 18:32:09 +01:00
Zaid Humayun
23a904f38d Merge branch 'main' of https://github.com/tursodatabase/limbo 2025-03-01 01:18:45 +05:30
Pekka Enberg
b4e8afa3c7 Merge 'Implement SQLite balancing algorithm' from Pere Diaz Bou
Beep boop.
What happened, you ask? I removed the dumb balancing algorithm I
implemented in favor of SQLite's implementation based on B*Tree[1], where
a page is 2/3 full instead of 1/2. It also tries to balance a page by
taking a maximum of 3 pages and distributing cells evenly between them.
I've made some changes that are somewhat related:
* Moved most operations on pages out of BTreeCursor because those
operations are based on a page, not on a cursor, and it makes it easier
to test.
* Fixed `write_u16` and `read_u16` cases that didn't need an implicit
offset calculation. Added: `write_u16_no_offset` and
`read_u16_no_offset` to counter this.
* Added some tests with fuzz testing too.
* Fixed some important actions like: `compute_free_space`,
`defragment_page` and `drop_cell`.
[1] https://dl.acm.org/doi/10.1145/356770.356776
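The idea of distributing cells evenly between sibling pages can be sketched as below. This is a heavily simplified illustration, not the PR's code and not SQLite's actual routine: it ignores divider cells, page headers, and overflow, and the function name `distribute_cells` is hypothetical. It first packs cells left to right, then shifts cells rightward while that keeps the right page no larger than its left neighbor.

```rust
/// Given the byte sizes of all cells gathered from the sibling pages
/// (in key order) and the usable bytes per page, returns how many cells
/// each output page receives. Returns None if a cell cannot fit at all.
pub fn distribute_cells(cell_sizes: &[usize], capacity: usize) -> Option<Vec<usize>> {
    if cell_sizes.iter().any(|&s| s > capacity) {
        return None;
    }

    // Step 1: greedy left-to-right packing.
    // ends[i] is the exclusive end index of page i; bytes[i] its used bytes.
    let mut ends: Vec<usize> = Vec::new();
    let mut bytes: Vec<usize> = Vec::new();
    let mut used = 0;
    for (i, &size) in cell_sizes.iter().enumerate() {
        if used + size > capacity {
            ends.push(i);
            bytes.push(used);
            used = 0;
        }
        used += size;
    }
    if !cell_sizes.is_empty() {
        ends.push(cell_sizes.len());
        bytes.push(used);
    }

    // Step 2: walk right to left, moving the last cell of the left page
    // to the right page while the right page stays no larger than the left.
    for i in (1..ends.len()).rev() {
        loop {
            let left_start = if i >= 2 { ends[i - 2] } else { 0 };
            let left_count = ends[i - 1] - left_start;
            if left_count <= 1 {
                break;
            }
            let moved = cell_sizes[ends[i - 1] - 1];
            if bytes[i] + moved > bytes[i - 1] - moved {
                break;
            }
            bytes[i] += moved;
            bytes[i - 1] -= moved;
            ends[i - 1] -= 1;
        }
    }

    // Convert boundaries into per-page cell counts.
    let mut counts = Vec::with_capacity(ends.len());
    let mut start = 0;
    for &end in &ends {
        counts.push(end - start);
        start = end;
    }
    Some(counts)
}
```

With seven 30-byte cells and 100 usable bytes per page, greedy packing alone would give pages of 3, 3, and 1 cells; the second step evens this out to 3, 2, and 2.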

Closes #968
2025-02-28 19:10:52 +02:00