### Follow SUM [spec](https://sqlite.org/lang_aggfunc.html)
This PR updates the `SUM` aggregation logic to follow the
[Kahan–Babushka–Neumaier summation
algorithm](https://en.wikipedia.org/wiki/Kahan_summation_algorithm),
consistent with SQLite’s implementation. It improves the numerical
stability of floating-point summation.This fixes issue #2252 . I added a
fuzz test to ensure the compatibility of the implementations
I also fixed the return types for `SUM` to match SQLite’s documented
behavior. This was previously discussed in
[#2182](https://github.com/tursodatabase/turso/pull/2182), but part of
the logic was later unintentionally overwritten by
[#2265](https://github.com/tursodatabase/turso/pull/2265).
I introduced two helper functions, `apply_kbn_step` and
`apply_kbn_step_int`, in `vbde/execute.rs` to handle floating-point and
integer accumulation respectively. However, I’m new to this codebase and
would welcome constructive feedback on whether there’s a better place
for these helpers.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2270
`maybe_reparse_schema` function introduced in the #2246 was incorrect as
it didn't update `schema_version` for internal schema representation and
basically updated only schema for connection which called
`maybe_reparse_schema`.
This PR fixes this issue by reading schema and cookie value within a
single transaction and updating both schema content and its version for
internal representation.
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes#2259
Closes: #1947
This PR replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).
In the process, the separate `Id` wrapper type has been coalesced into
the `Name` enum, simplifying the AST and reducing duplication in
identifier handling logic.
While this increases the size of some AST nodes (notably
`yyStackEntry`).
cc: @levydsa
Reviewed-by: Levy A. (@levydsa)
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#2251
Support for attaching databases. The main difference from SQLite is that
we support an arbitrary number of attached databases, and we are not
bound to just 100ish.
We for now only support read-only databases. We open them as read-only,
but also, to keep things simple, we don't patch any of the insert
machinery to resolve foreign tables. So if an insert is tried on an
attached database, it will just fail with a "no such table" error - this
is perfect for now.
The code in core/translate/attach.rs is written by Claude, who also
played a key part in the boilerplate for stuff like the .databases
command and extending the pragma database_list, and also aided me in
the test cases.
Closes: #1415
### What this PR does
1. Removes database initialization from the `read_tx` function.
2. Adds checks for database initialization when executing `.schema`,
`.indexes`, `.tables` and `.import` commands, as they rely on
`sqlite_schema` table.
### About the second issue
I think we have another solution for the second issue: create the
`sqlite_schema` table in `Schema` only during page1 initialization,
rather than during `Schema` initialization.
#### Pros
This approach has the advantage of unifying the logic for the
`sqlite_schema` table with other user tables when running `select`
statements
#### Cons
- we still need to check error codes for commands like `.schema`.
- this approach may increase the complexity of the `pager`
implementation.
I'd like to hear your thoughts and feedback.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2099
Closes#2227 , enables fixing #2225
## What
Although we cleared overflow pages on DELETE, we never did it for
INSERT/UPDATE, which means any overflow pages were left dangling and not
added to freelist.
## Why is this a problem
This means that we are not able to reuse these pages to solve #2225,
causing massive bloat in the DB when UPDATEs are executed.
## Fix
Clear overflow pages when `BTreeCursor::insert()` overwrites a cell.
Needed a new state machine for `overwrite_cell` + new `WriteState`
variants
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2230
Closes#2241
## What
When an index interior cell is deleted, it steals the leaf cell with the
largest key in its left subtree, deletes the old interior cell and then
replaces it with the stolen cell. This ensures the binary-search-tree
aspect of the btree remains correct. However, this can cause a situation
where both are true:
1. The leaf page is now UNDERFULL and must be rebalanced
2. The leaf's IMMEDIATE parent page is now OVERFULL and must be
rebalanced
## Why is this a problem
We simply didn't support the case where:
- Leaf page P is unbalanced and rebalancing starts on it
- Its immediate parent is ALSO unbalanced and _overflows_.
We had an assertion against this happening (see #2241)
## The fix
Allow exactly 1 overflow cell in the parent under very particular
conditions:
1. The parent page must be an index interior page
2. The parent must be positioned exactly at the divider cell whose left
child page underflows
This is the _only_ case where the immediate parent of a page about to
undergo rebalancing can have overflow cells.
## Implementation details
The parent overflow cell is folded into `cell_array` fairly early on and
`parent.overflow_cells` is cleared. However we need to be careful with
`cell_idx` for dividers other than the overflow cell because they get
shifted left on the page in `drop_cell()`. I've added a long comment
about this.
## Testing
Adds fuzz test that does inserts and deletes on an index btree and
asserts that all the expected keys are found at the end in the right
order. This test runs into this case quite frequently so I was able to
verify it.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#2243
Let's make sure we don't end up in a weird situation by appending frames
one by one and we can later think of optimizations.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#2034