Commit Graph

1017 Commits

Author SHA1 Message Date
Pekka Enberg
895f2acbfb Merge 'Fix concat_ws to match sqlite behavior' from bit-aloo
closes: #2101
Refactors exec_concat_ws to skip null and blob arguments instead of
inserting separators for them. Also adds a fuzz test.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2338
2025-07-30 21:31:58 +03:00
Jussi Saurio
7caef278a5 Merge 'Rewrite the WAL' from Preston Thorpe
closes #1893
Adds some fairly extensive tests but I'll continue to add some python
tests on top of the unit tests.
## Restart:
tested 
- open new DB
- create table and do a bunch of inserts
- `pragma wal_checkpoint(RESTART);`
- close db file
- re-open and verify we can read the wal/repopulate the frame cache
- verify min|max frame
tested 
- open same DB
- add more inserts
- `pragma wal_checkpoint(RESTART);`
- do _more_ inserts
- close
- re-open
- verify checksums/max_frame are valid
- verify row count
## Truncate
tested 
- open new db
- create table and add inserts
- `pragma wal_checkpoint(truncate);`
- close file
- verify WAL file is empty (32 bytes, header only)
- re-open file
- verify content/row count
tested 
- open db
- create table and insert many rows
- `pragma wal_checkpoint(truncate);`
- insert _more_ rows
- close db file
- verify WAL file is valid
- re-open file
- verify we can read entire file/repopulate the frame cache
<img width="541" height="315" alt="image" src="https://github.com/user-
attachments/assets/0470c795-5116-4866-b913-78c07b06b68c" />
```
# header
magic=0x377f0682
version=3007000
page_size=4096
seq=2
salt=ec475ff2-7ea94342
checksum=c9464aff-c571cc22
```

Closes #2179
2025-07-30 18:50:49 +03:00
Jussi Saurio
2813a7a5de clippy 2025-07-30 17:25:30 +03:00
Jussi Saurio
7b1f04dc5e pager: only ROLLBACK your own transaction, not if someone else is writing 2025-07-30 17:00:38 +03:00
Pekka Enberg
951bebfac3 Merge 'Add vector_concat and vector_slice support' from bit-aloo
Closes: #2323
This PR adds support for two new vector functions:
* vector_concat(x, y) – Concatenates two vectors of the same type.
* vector_slice(x, start_index, end_index) – Extracts a subvector from
the input vector.
Notes:
* Negative start_index or end_index is not supported

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2336
2025-07-30 16:58:38 +03:00
Jussi Saurio
b1aa13375d call pager.end_tx() everywhere instead of pager.rollback() 2025-07-30 16:39:38 +03:00
Jussi Saurio
fd5e73f038 op_transaction: read tx must be ended in all cases if begin_write_tx fails 2025-07-30 14:58:03 +03:00
Jussi Saurio
9a63425b43 clippy 2025-07-30 14:58:03 +03:00
bit-aloo
1d83bb48c5 fix exec_concat_ws implementation 2025-07-30 17:27:15 +05:30
PThorpe92
4dc15492d8 Integrate changes from tx isolation commits from @jussisaurio 2025-07-30 14:10:12 +03:00
PThorpe92
2c3a9fe5ef Finish wal transaction handling and add more wal and chkpt testing 2025-07-30 14:10:10 +03:00
PThorpe92
eaa6f99fa8 Hold and ensure release of proper locks if we trunc the db file post-checkpoint 2025-07-30 14:08:33 +03:00
PThorpe92
7643ef97a6 Pass checkpoint mode from sqlite3 c api argument 2025-07-30 14:08:33 +03:00
bit-aloo
96a99ca48a rename subvector to vector_slice 2025-07-30 13:34:49 +05:30
bit-aloo
a5dce2b50b add subvector execution flow 2025-07-30 09:51:08 +05:30
bit-aloo
e4d79a6516 add vec_concat execution flow 2025-07-30 06:07:03 +05:30
Diego Reis
e0b099f5ad refactor: Implement conversion between InsnFunctionStepResult and
StepResult
2025-07-29 15:02:09 -03:00
Pekka Enberg
8adc807cd7 Merge 'Change function signatures to return IO Completions' from Pedro Muniz
Changes a couple of function signatures to return `Completion`. Also, I
changed `Completion` to be internally `Arc` to abstract the `Arc`
implementation detail, and to be able to attach a `#[must_use]` to the
`Completion` struct, so that cargo check can show us where we are not
tracking completions in the code. I also attached a `#[must_use]` to
`IOResult` so that we can see the places that we are not propagating or
waiting for I/O, demonstrating locations where functions should be
reentrant and are not.
Also, while we are with this refactor in progress I want to relax the
Clippy CI lint on unused_variables.

Closes #2309
2025-07-29 12:41:14 +03:00
pedrocarlo
3831e0db39 convert must_use compile warnings to unused_variables to track locations where we need to refactor in the future 2025-07-28 16:09:26 -03:00
pedrocarlo
d30c7d54c8 change all Arc<Completion> to Completion 2025-07-28 15:32:45 -03:00
Diego Reis
bab10909c3 Disable extension loading for wasm
We should enable it later when wasm become more mature
2025-07-28 14:49:07 -03:00
Jussi Saurio
d8a133a1a5 Merge 'VDBE/op_column: use references to cursor payload instead of cloning' from Jussi Saurio
instead use RefValue to refer to record payload directly and then copy
to register as necessary
my local:
```sql
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1: Warming u
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1: Collectin
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [491.64 ns 492.54 ns 493.64 ns]
                        change: [-3.6642% -3.3050% -2.9558%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10: Warming
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10: Collecti
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10: Analyzin
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [2.7923 µs 2.8001 µs 2.8114 µs]
                        change: [-14.643% -14.282% -13.878%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild
  4 (4.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50: Warming
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50: Collecti
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50: Analyzin
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [13.452 µs 13.496 µs 13.550 µs]
                        change: [-15.768% -15.471% -15.182%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100: Warming
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100: Collect
Benchmarking Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100: Analyzi
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [27.110 µs 27.162 µs 27.226 µs]
                        change: [-15.878% -15.604% -15.336%] (p = 0.00 < 0.05)
                        Performance has improved.
```
ci, main:
```
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [70.671 µs 71.741 µs 72.910 µs]
```
ci, branch:
```
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [53.969 µs 54.013 µs 54.061 µs]
```

Reviewed-by: bit-aloo (@Shourya742)

Closes #2205
2025-07-28 14:13:54 +03:00
Pere Diaz Bou
752a876f9a change every Rc to Arc in schema internals 2025-07-28 10:51:17 +02:00
Jussi Saurio
111c0032ae Always extend texts and blobs 2025-07-28 11:06:16 +03:00
Jussi Saurio
ae5470f1d0 use default directly 2025-07-28 11:01:26 +03:00
Jussi Saurio
12d8b266a1 Define some helper traits to reduce duplication 2025-07-28 11:01:26 +03:00
Jussi Saurio
7eb52c65d3 Add missing program counter increment 2025-07-28 11:01:26 +03:00
Jussi Saurio
b14124ad3b VDBE/op_column: avoid first cloning text/blob and then copying it again
instead use RefValue to refer to record payload directly and then copy
to register as necessary
2025-07-28 11:01:26 +03:00
Pekka Enberg
6bf6cc28e4 Merge 'Implement the Returning statement for inserts and updates' from Glauber Costa
They are very similar. DELETE is very different, so that one we'll do it
later.

Closes #2276
2025-07-27 09:11:16 +03:00
Glauber Costa
5d8d08d1b6 Implement the Returning statement for inserts and updates
They are very similar. DELETE is very different, so that one we'll
do it later.
2025-07-26 09:01:09 -05:00
FHaggs
54edfa09d5 Replicate the sqlite Kahan-Babaska-Neumaier algorithm 2025-07-25 15:25:29 -03:00
FHaggs
d5049a46c2 Add kahan sum logic 2025-07-25 13:24:19 -03:00
Pekka Enberg
e0e3c52535 Merge 'Simplify sum() aggregation logic' from bit-aloo
This refactors AggContext::Sum by removing the extra bool flag and
simplifying type handling during aggregation:

Closes #2265
2025-07-25 17:57:58 +03:00
bit-aloo
4f8027990d detach the sum and total logic from using has_non_numeric flag 2025-07-25 17:59:19 +05:30
Pekka Enberg
c6c0db19e9 Merge 'Fix schema reparse logic' from Nikita Sivukhin
`maybe_reparse_schema` function introduced in the #2246 was incorrect as
it didn't update `schema_version` for internal schema representation and
basically updated only schema for connection which called
`maybe_reparse_schema`.
This PR fixes this issue by reading schema and cookie value within a
single transaction and updating both schema content and its version for
internal representation.

Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #2259
2025-07-25 13:24:23 +03:00
Pekka Enberg
669b231714 Merge 'parser: Distinguish quoted identifiers and unify Id into Name enum' from bit-aloo
Closes: #1947
This PR replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).
In the process, the separate `Id` wrapper type has been coalesced into
the `Name` enum, simplifying the AST and reducing duplication in
identifier handling logic.
While this increases the size of some AST nodes (notably
`yyStackEntry`).
cc: @levydsa

Reviewed-by: Levy A. (@levydsa)
Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2251
2025-07-25 12:08:54 +03:00
Glauber Costa
988b16f962 Support ATTACH (read only)
Support for attaching databases. The main difference from SQLite is that
we support an arbitrary number of attached databases, and we are not
bound to just 100ish.

We for now only support read-only databases. We open them as read-only,
but also, to keep things simple, we don't patch any of the insert
machinery to resolve foreign tables.  So if an insert is tried on an
attached database, it will just fail with a "no such table" error - this
is perfect for now.

The code in core/translate/attach.rs is written by Claude, who also
played a key part in the boilerplate for stuff like the .databases
command and extending the pragma database_list, and also aided me in
the test cases.
2025-07-24 19:19:48 -05:00
Nikita Sivukhin
8b0e5b151e simplify parse_schema_rows signature 2025-07-24 22:43:31 +04:00
meteorgan
c48a5ef538 we don't need read_tx return IOResult anymore 2025-07-24 23:19:33 +08:00
meteorgan
08f1803a6b end read tx in op_transaction when write transaction return io 2025-07-24 23:18:29 +08:00
meteorgan
7ef50e0690 fix page_count pragma 2025-07-24 23:18:26 +08:00
Pere Diaz Bou
ce598b772e clippy i hate you so much 2025-07-24 15:29:21 +02:00
Pere Diaz Bou
b07e57d9d1 review fixes 2025-07-24 15:29:21 +02:00
Pere Diaz Bou
75f9c23ed3 end txn on vdbe failures 2025-07-24 15:29:21 +02:00
Pere Diaz Bou
14de7c55af set connection state to None in vdbe rollback 2025-07-24 15:29:21 +02:00
bit-aloo
9a54ef214e parser: Distinguish quoted identifiers and unify Id into Name enum
This commit replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).

In the process, the separate `Id` wrapper type has been coalesced into the
`Name` enum, simplifying the AST and reducing duplication in identifier
handling logic.

While this increases the size of some AST nodes (notably `yyStackEntry`),
it improves correctness and makes source structure more explicit for
later phases.
2025-07-24 14:40:19 +05:30
Jussi Saurio
92a10f94d8 Merge 'Bail early for read-only virtual tables' from Preston Thorpe
This PR adds a const associated value on the VTabModule trait,
`READONLY` defaulted to `true`, so we can bail early when a write
operation is done on an invalid vtable.
This prevents extensions from having to implement `insert`,`update`,
`delete` just to return `Error::ReadOnly`, and prevents us from having
to step through `VUpdate` just to error out.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2247
2025-07-24 10:12:07 +03:00
Jussi Saurio
49b2bf4fdb Merge 'Deserialize keys only once when sorting immutable records' from Iaroslav Zeigerman
Before this update, the entire immutable record was **fully**
deserialized **every** time it was compared in the sorter.
This PR extends the sorter with incremental deserialization of record
keys, only when needed and only if they weren’t already deserialized in
a previous iteration.
I hate that we panic on failed deserialization in `cmp`, but
unfortunately, I can’t return `Result` as part of this interface.
Looking for feedback around a better way to handle this.
Alternatively, I could store the deserialization error as part of
`SortableImmutableRecord` and check it before returning the record in
`next`, thereby deferring the error handling. The downside of this
approach is that it complicates debugging, since the error will be
completely decoupled from the place where it occurs.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2207
2025-07-24 10:08:16 +03:00
Jussi Saurio
52b4c22be9 Merge 'fix: SUM returns correct float for mixed numeric/non-numeric types & return value on empty set' from Axel Tobieson Rova
# Fix SUM aggregate function for mixed types
Fixes #2133
The SUM aggregate function was returning incorrect results when
processing tables with mixed numeric and non-numeric values. According
to SQLite documentation:
> "If any input to sum() is neither an integer nor a NULL, then sum()
returns a floating point value"
[*](https://sqlite.org/lang_aggfunc.html)
Now both SQLite and Turso yield the same output of 44.0.
--
I modified `Sum` to increment only for numeric values, skipping non-
numeric values. However, if we have mixed numeric values or non-numeric
values, we return a float output. Added a flag to keep track of it.
as pointed out by @FHaggs , If there are no non-NULL input rows then
sum() returns NULL but total() returns 0.0. I decided to include it in
this PR as well. Empty was such a natural test case.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2182
2025-07-24 10:08:01 +03:00
PThorpe92
0871a8c7f3 Bail early when we detect a readonly virtual table 2025-07-23 16:57:30 -04:00