3628 Commits

Author SHA1 Message Date
Pekka Enberg
aca6ffa042 Merge 'io/unix: wrap file with Mutex' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2301
2025-07-28 12:53:38 +03:00
Pere Diaz Bou
f458f622a5 io/unix: wrap file with Mutex 2025-07-28 11:33:57 +02:00
Pere Diaz Bou
752a876f9a change every Rc to Arc in schema internals 2025-07-28 10:51:17 +02:00
Pere Diaz Bou
d273de483f comment clone for schema 2025-07-28 10:50:50 +02:00
Pere Diaz Bou
6ec80b3364 clone everything in schema 2025-07-28 10:27:45 +02:00
Pekka Enberg
fd2a7f9098 core: Switch to unreachable for invalid enum variants
The parser unfortunately outputs Stmt, which has some enum variants that
we never actually encounter in some parts of the core. Switch to
unreachable instead of todo.
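A minimal sketch of the distinction, assuming a hypothetical three-variant `Stmt` enum and a hypothetical `translate_select` function (neither is the parser's or the core's real definition):

```rust
/// Hypothetical stand-in for the parser's `Stmt` type; the real enum has
/// many more variants.
#[allow(dead_code)]
#[derive(Debug)]
enum Stmt {
    Select(String),
    CreateTable(String),
    Pragma(String),
}

/// A code path whose caller only ever hands it SELECT statements.
fn translate_select(stmt: &Stmt) -> String {
    match stmt {
        Stmt::Select(body) => format!("SELECT {body}"),
        // `todo!()` would imply work that is still planned. These variants
        // can never arrive here, so `unreachable!()` documents the invariant.
        other => unreachable!("unexpected statement in select path: {other:?}"),
    }
}
```

Both macros panic if hit; the difference is what they communicate to the reader.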
2025-07-28 09:52:20 +03:00
Pekka Enberg
a02a590f88 Merge 'core/translate: Handle Expr::Id in CREATE INDEX' from Kristofer
I am running into issues when creating indexes and made this PR with a
possible fix.
`Error: cannot use expressions in CREATE INDEX`
In my setup, running on `wasm32-unknown-unknown` (not in the browser), I
can reproduce the issue like this. First, creating a table:
```rust
conn.execute(
    r#"
    CREATE TABLE IF NOT EXISTS users (
        name TEXT,
        created DATETIME DEFAULT CURRENT_TIMESTAMP
    )
    "#,
    (),
)
.await
.unwrap();
```
Here, creating an index for that table:
```rust
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_users_name ON users(name)",
    (),
)
.await
.unwrap();
```
## Findings
I had a closer look at `resolve_sorted_columns`. In this bit, it checks
the expression of the sorted column.
https://github.com/tursodatabase/turso/blob/a2a31a520ff6e228a00e785026dae19b5b2cced7/core/translate/index.rs#L252-L257
```rust
let ident = normalize_ident(match &sc.expr {
    // SQLite supports indexes on arbitrary expressions, but we don't (yet).
    // See "How to use indexes on expressions" in https://www.sqlite.org/expridx.html
    Expr::Name(ast::Name::Ident(col_name)) | Expr::Name(ast::Name::Quoted(col_name)) => {
        col_name
    }
    _ => crate::bail_parse_error!("Error: cannot use expressions in CREATE INDEX"),
});
```
If it is not an `Expr::Name`, the function fails.
However, the `sc.expr` I am getting is `Expr::Id`, not `Expr::Name`.
That appears to be expected: reading up on the `sqlite3_parser` AST,
both `Name` and `Id` can occur here.
Adding `Expr::Id` to the check fixes the issue.
```rust
let ident = normalize_ident(match &sc.expr {
    // SQLite supports indexes on arbitrary expressions, but we don't (yet).
    // See "How to use indexes on expressions" in https://www.sqlite.org/expridx.html
    Expr::Id(ast::Name::Ident(col_name))
    | Expr::Id(ast::Name::Quoted(col_name))
    | Expr::Name(ast::Name::Ident(col_name))
    | Expr::Name(ast::Name::Quoted(col_name)) => col_name,
    _ => crate::bail_parse_error!("Error: cannot use expressions in CREATE INDEX"),
});
```

Closes #2294
2025-07-28 08:54:45 +03:00
Pekka Enberg
d92ebd6d37 Merge 'Fix writing wal header for async IO' from Preston Thorpe
We were previously creating another inline completion inside io_uring.rs.
I thought this would no longer be needed because of the Arc that now
wraps the RefCell<Buffer>, but the WAL header is not pinned to a page in
the cache, so nothing keeps the buffer alive and we write a corrupt WAL
header.
```rust
        #[allow(clippy::arc_with_non_send_sync)]
        Arc::new(RefCell::new(buffer))
    };

    let write_complete = move |bytes_written: i32| {
        turso_assert!(
            bytes_written == WAL_HEADER_SIZE as i32,
            "wal header wrote({bytes_written}) != expected({WAL_HEADER_SIZE})"
        );
    };
// buffer is never referenced again, this works for sync IO but io_uring writes junk bytes
```
<img width="881" height="134" alt="image" src="https://github.com/user-attachments/assets/0ff06ad5-411a-43d2-abac-caf9e23ceaeb" />
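One way to picture the lifetime problem, as a sketch only: if the completion callback owns a clone of the `Arc`, the buffer stays allocated until the callback itself is dropped. The `Buffer` alias and `make_write_complete` below are hypothetical and are not the change made in this PR; `WAL_HEADER_SIZE` is set to 32, the size of the SQLite WAL header.

```rust
use std::cell::RefCell;
use std::sync::Arc;

const WAL_HEADER_SIZE: usize = 32;

/// Hypothetical stand-in for the real buffer type.
type Buffer = Arc<RefCell<Vec<u8>>>;

/// Builds a completion callback that takes ownership of one reference to
/// the buffer, so the bytes outlive the caller's own handle.
fn make_write_complete(keep_alive: Buffer) -> impl Fn(i32) -> usize {
    move |bytes_written: i32| {
        assert_eq!(
            bytes_written, WAL_HEADER_SIZE as i32,
            "wal header wrote({bytes_written}) != expected({WAL_HEADER_SIZE})"
        );
        // Using the buffer here is what makes the closure capture it.
        keep_alive.borrow().len()
    }
}
```

A caller would pass `Arc::clone(&buffer)` in, submit the write, and can then drop its own handle without freeing the bytes an asynchronous backend is still reading.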

Closes #2297
2025-07-28 08:47:12 +03:00
PThorpe92
b08c465450 Fix writing wal header for async IO 2025-07-27 21:52:13 -04:00
Levy A.
1f57ab02cf feat: instrument WindowsIO functions 2025-07-27 20:39:49 -03:00
Levy A.
c95c6b67ee fix: thread-safe WindowsFile 2025-07-27 20:39:49 -03:00
Kristofer Lund
cbd5a26cf7 Adding Expr::Id as an allowed Expr when
creating an index.
2025-07-27 22:54:20 +02:00
Pekka Enberg
6bf6cc28e4 Merge 'Implement the Returning statement for inserts and updates' from Glauber Costa
They are very similar. DELETE is very different, so we'll do that one
later.

Closes #2276
2025-07-27 09:11:16 +03:00
Pekka Enberg
86c97fca6d Merge 'Fix sum() to follow the SQLite semantics' from FamHaggs
### Follow SUM [spec](https://sqlite.org/lang_aggfunc.html)
This PR updates the `SUM` aggregation logic to follow the
[Kahan–Babuška–Neumaier summation
algorithm](https://en.wikipedia.org/wiki/Kahan_summation_algorithm),
consistent with SQLite’s implementation. It improves the numerical
stability of floating-point summation. This fixes issue #2252. I added a
fuzz test to check that the two implementations stay compatible.
I also fixed the return types for `SUM` to match SQLite’s documented
behavior. This was previously discussed in
[#2182](https://github.com/tursodatabase/turso/pull/2182), but part of
the logic was later unintentionally overwritten by
[#2265](https://github.com/tursodatabase/turso/pull/2265).
I introduced two helper functions, `apply_kbn_step` and
`apply_kbn_step_int`, in `vdbe/execute.rs` to handle floating-point and
integer accumulation respectively. However, I’m new to this codebase and
would welcome constructive feedback on whether there’s a better place
for these helpers.
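For readers unfamiliar with the algorithm, here is a minimal sketch of compensated summation in the Neumaier style. The `KbnSum` type and `kbn_sum` function are hypothetical names for illustration; they are not the `apply_kbn_step` helpers from this PR, and integer handling is left out.

```rust
/// Running state for Kahan-Babuska-Neumaier summation.
#[derive(Debug, Default, Clone, Copy)]
struct KbnSum {
    sum: f64,
    /// Accumulated low-order bits that were lost when updating `sum`.
    compensation: f64,
}

impl KbnSum {
    /// Adds one value, recording the rounding error of the addition.
    fn step(&mut self, value: f64) {
        let t = self.sum + value;
        if self.sum.abs() >= value.abs() {
            // Low-order digits of `value` were lost.
            self.compensation += (self.sum - t) + value;
        } else {
            // Low-order digits of `sum` were lost.
            self.compensation += (value - t) + self.sum;
        }
        self.sum = t;
    }

    /// The corrected total: the running sum plus the recovered error.
    fn total(&self) -> f64 {
        self.sum + self.compensation
    }
}

/// Sums a slice with compensation applied at every step.
fn kbn_sum(values: &[f64]) -> f64 {
    let mut acc = KbnSum::default();
    for &v in values {
        acc.step(v);
    }
    acc.total()
}
```

With the input `[1.0, 1e100, 1.0, -1e100]`, a plain left-to-right `f64` sum yields `0.0`, while `kbn_sum` yields `2.0`.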

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2270
2025-07-27 09:08:34 +03:00
Pekka Enberg
ab39ea54c7 Merge 'Fix error handling when binding column references while translating the UPDATE statement' from Iaroslav Zeigerman
Closes #1968

Reviewed-by: bit-aloo (@Shourya742)

Closes #2273
2025-07-27 09:05:17 +03:00
Pekka Enberg
6d88c6851b Merge 'io_uring: use Arc pointer for user data of entries' from Preston Thorpe
Trying to pull bite-sized adjustments out of other open PRs.

Closes #2281
2025-07-27 09:04:35 +03:00
PThorpe92
e6737d923d Return correct value for pragma checkpoint 2025-07-26 23:09:40 -04:00
PThorpe92
fb611390c0 Update test to use realistic expectations for should_checkpoint in cacheflush 2025-07-26 23:03:51 -04:00
PThorpe92
7c027fed8c Keep should_checkpoint logic for now until greater checkpointing is fixed 2025-07-26 23:03:51 -04:00
PThorpe92
6644036be4 Stop checkpointing after every write when wal frame size > threshold 2025-07-26 23:03:47 -04:00
Glauber Costa
b8ee38868d implement the pragma encoding
Do not allow setting it. That ship has sailed around 2005.
2025-07-26 19:37:39 -05:00
PThorpe92
735026b502 Use Arc pointer for user data and save indirection when processing sqe/cqes 2025-07-26 16:35:40 -04:00
Glauber Costa
5d8d08d1b6 Implement the Returning statement for inserts and updates
They are very similar. DELETE is very different, so we'll do that one
later.
2025-07-26 09:01:09 -05:00
Iaroslav Zeigerman
6f63327320 fix overlooked tests 2025-07-26 04:51:44 -07:00
Iaroslav Zeigerman
f13b9105b9 Fix error handling when binding column references while translating the UPDATE statement 2025-07-26 04:51:42 -07:00
Pekka Enberg
cc5d4dc3ba Merge 'support doubly qualified identifiers' from Glauber Costa
Closes #2271
2025-07-26 11:31:42 +03:00
Glauber Costa
b5927dcfd5 support doubly qualified identifiers 2025-07-25 14:52:45 -05:00
FHaggs
54edfa09d5 Replicate the sqlite Kahan-Babuska-Neumaier algorithm 2025-07-25 15:25:29 -03:00
FHaggs
f0ffff3c8e Modify AggContext to support the kahan algorithm 2025-07-25 13:25:25 -03:00
FHaggs
d5049a46c2 Add kahan sum logic 2025-07-25 13:24:19 -03:00
meteorgan
b5a18d7dc9 fix get_column_name() when column name doesn't exist 2025-07-25 23:49:31 +08:00
Pekka Enberg
e0e3c52535 Merge 'Simplify sum() aggregation logic' from bit-aloo
This refactors AggContext::Sum by removing the extra bool flag and
simplifying type handling during aggregation.

Closes #2265
2025-07-25 17:57:58 +03:00
Pere Diaz Bou
805bcfe633 Merge 'Ignore WAL frames after bad checksum' from Pere Diaz Bou
SQLite basically ignores bad frames instead of panicking, let's try to
do the same.
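A rough sketch of the idea, assuming a hypothetical `Frame` type and a toy cumulative checksum (this is NOT the real WAL checksum algorithm): replay stops at the first frame whose checksum does not match, and everything after it is ignored rather than treated as a fatal error.

```rust
/// Hypothetical, simplified WAL frame.
struct Frame {
    payload: Vec<u8>,
    checksum: u32,
}

/// Toy cumulative checksum for illustration only; each frame's checksum
/// depends on the previous one, as WAL checksums do.
fn toy_checksum(prev: u32, payload: &[u8]) -> u32 {
    payload
        .iter()
        .fold(prev, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

/// Returns how many leading frames are valid. The first mismatch ends the
/// usable part of the log; later frames are skipped instead of panicking.
fn valid_prefix_len(frames: &[Frame]) -> usize {
    let mut running = 0u32;
    for (i, frame) in frames.iter().enumerate() {
        let expected = toy_checksum(running, &frame.payload);
        if frame.checksum != expected {
            return i;
        }
        running = expected;
    }
    frames.len()
}
```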

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1956
2025-07-25 15:31:12 +02:00
bit-aloo
4f8027990d detach the sum and total logic from using has_non_numeric flag 2025-07-25 17:59:19 +05:30
bit-aloo
f389c31ac9 remove bool from sum variant in AggContext 2025-07-25 17:55:53 +05:30
Pekka Enberg
c6c0db19e9 Merge 'Fix schema reparse logic' from Nikita Sivukhin
The `maybe_reparse_schema` function introduced in #2246 was incorrect: it
did not update `schema_version` in the internal schema representation and
effectively updated the schema only for the connection that called
`maybe_reparse_schema`.
This PR fixes this issue by reading schema and cookie value within a
single transaction and updating both schema content and its version for
internal representation.

Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #2259
2025-07-25 13:24:23 +03:00
Nikita Sivukhin
020d567e78 fix clippy 2025-07-25 13:55:37 +04:00
Pekka Enberg
669b231714 Merge 'parser: Distinguish quoted identifiers and unify Id into Name enum' from bit-aloo
Closes: #1947
This PR replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).
In the process, the separate `Id` wrapper type has been coalesced into
the `Name` enum, simplifying the AST and reducing duplication in
identifier handling logic.
Note that this increases the size of some AST nodes (notably
`yyStackEntry`).
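As a sketch of the shape described above, assuming each variant carries the name as a `String` (the real definition may hold different data, and the `as_str` and `is_quoted` helpers are hypothetical):

```rust
/// Sketch of a name that remembers how it was written in the SQL source.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Name {
    /// Unquoted identifier, e.g. `users`.
    Ident(String),
    /// Quoted identifier, e.g. `"my table"`.
    Quoted(String),
}

impl Name {
    /// Returns the stored text regardless of how the name was written.
    fn as_str(&self) -> &str {
        match self {
            Name::Ident(s) | Name::Quoted(s) => s,
        }
    }

    /// True when the name appeared quoted in the source.
    fn is_quoted(&self) -> bool {
        matches!(self, Name::Quoted(_))
    }
}
```

Callers that only need the text can match both variants in one arm, while code that cares about quoting can branch on the variant.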
cc: @levydsa

Reviewed-by: Levy A. (@levydsa)
Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2251
2025-07-25 12:08:54 +03:00
Pekka Enberg
c6d4a5c3ed Rename DatabaseIndexer to DatabaseCatalog
Avoid confusion with actual database indexes.
2025-07-25 10:36:33 +03:00
Glauber Costa
988b16f962 Support ATTACH (read only)
Support for attaching databases. The main difference from SQLite is that
we support an arbitrary number of attached databases, and we are not
bound to just 100ish.

We for now only support read-only databases. We open them as read-only,
but also, to keep things simple, we don't patch any of the insert
machinery to resolve foreign tables.  So if an insert is tried on an
attached database, it will just fail with a "no such table" error - this
is perfect for now.

The code in core/translate/attach.rs is written by Claude, who also
played a key part in the boilerplate for stuff like the .databases
command and extending the pragma database_list, and also aided me in
the test cases.
2025-07-24 19:19:48 -05:00
Nikita Sivukhin
e05660133b update schema version for internal schema representation in maybe_reparse_schema 2025-07-24 22:43:31 +04:00
Nikita Sivukhin
8b0e5b151e simplify parse_schema_rows signature 2025-07-24 22:43:31 +04:00
Pekka Enberg
2141293017 Merge 'Fix page_count pragma' from meteorgan
Closes: #1415
### What this PR does
1. Removes database initialization from the `read_tx` function.
2. Adds checks for database initialization when executing `.schema`,
`.indexes`, `.tables` and `.import` commands, as they rely on
`sqlite_schema` table.
### About the second issue
I think we have another solution for the second issue: create the
`sqlite_schema` table in `Schema` only during page1 initialization,
rather than during `Schema` initialization.
#### Pros
This approach has the advantage of unifying the logic for the
`sqlite_schema` table with other user tables when running `select`
statements
#### Cons
- we still need to check error codes for commands like `.schema`.
- this approach may increase the complexity of the `pager`
implementation.
I'd like to hear your thoughts and feedback.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2099
2025-07-24 19:21:35 +03:00
Jussi Saurio
b33527c3c4 Merge 'btree: clear overflow pages when insert overwrites a cell (= UPDATE)' from Jussi Saurio
Closes #2227, enables fixing #2225
## What
Although we cleared overflow pages on DELETE, we never did it for
INSERT/UPDATE, which means any overflow pages were left dangling and not
added to freelist.
## Why is this a problem
This means that we are not able to reuse these pages to solve #2225,
causing massive bloat in the DB when UPDATEs are executed.
## Fix
Clear overflow pages when `BTreeCursor::insert()` overwrites a cell.
Needed a new state machine for `overwrite_cell` + new `WriteState`
variants
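To illustrate what "clearing" means here, a toy model (not the real pager or `BTreeCursor` API): an overflow chain is a linked list of pages, each pointing at the next, with 0 marking the end. Clearing walks the chain and hands every page to the freelist. `ToyPager` and `clear_overflow_chain` are hypothetical names.

```rust
use std::collections::HashMap;

/// Toy pager: maps an overflow page number to the next page in its chain
/// (0 ends the chain) and tracks pages that are free for reuse.
struct ToyPager {
    next_overflow: HashMap<u32, u32>,
    freelist: Vec<u32>,
}

impl ToyPager {
    /// Walks the chain starting at `first_page` and moves every page in it
    /// onto the freelist, so a later insert can reuse the space.
    fn clear_overflow_chain(&mut self, first_page: u32) {
        let mut page = first_page;
        while page != 0 {
            // Removing the entry also guarantees termination on bad input.
            let next = self.next_overflow.remove(&page).unwrap_or(0);
            self.freelist.push(page);
            page = next;
        }
    }
}
```

The real implementation is a state machine because each page read may have to wait on IO; the sketch ignores that.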

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2230
2025-07-24 18:59:15 +03:00
Jussi Saurio
0b627ed331 Merge 'btree/balance: support case where immediate parent page of unbalanced child page also overflows' from Jussi Saurio
Closes #2241
## What
When an index interior cell is deleted, it steals the leaf cell with the
largest key in its left subtree, deletes the old interior cell and then
replaces it with the stolen cell. This ensures the binary-search-tree
aspect of the btree remains correct. However, this can cause a situation
where both are true:
1. The leaf page is now UNDERFULL and must be rebalanced
2. The leaf's IMMEDIATE parent page is now OVERFULL and must be
rebalanced
## Why is this a problem
We simply didn't support the case where:
- Leaf page P is unbalanced and rebalancing starts on it
- Its immediate parent is ALSO unbalanced and _overflows_.
We had an assertion against this happening (see #2241)
## The fix
Allow exactly 1 overflow cell in the parent under very particular
conditions:
1. The parent page must be an index interior page
2. The parent must be positioned exactly at the divider cell whose left
child page underflows
This is the _only_ case where the immediate parent of a page about to
undergo rebalancing can have overflow cells.
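The two conditions can be read as a single predicate. The sketch below uses hypothetical types and field names (`PageKind`, `ParentState`, `parent_overflow_is_allowed`), not the structures in the btree code:

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PageKind {
    IndexInterior,
    IndexLeaf,
    TableInterior,
    TableLeaf,
}

/// Hypothetical snapshot of the parent page at the start of rebalancing.
struct ParentState {
    kind: PageKind,
    /// Number of cells that did not fit on the parent page.
    overflow_cells: usize,
    /// Cell index the parent is currently positioned at.
    cursor_cell_idx: usize,
    /// Index of the divider cell whose left child underflows.
    underflowing_divider_idx: usize,
}

/// Returns true when the parent's overflow state is acceptable for
/// rebalancing one of its children.
fn parent_overflow_is_allowed(parent: &ParentState) -> bool {
    match parent.overflow_cells {
        0 => true,
        // Exactly one overflow cell, and only in the special case above.
        1 => {
            parent.kind == PageKind::IndexInterior
                && parent.cursor_cell_idx == parent.underflowing_divider_idx
        }
        _ => false,
    }
}
```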
## Implementation details
The parent overflow cell is folded into `cell_array` fairly early on and
`parent.overflow_cells` is cleared. However we need to be careful with
`cell_idx` for dividers other than the overflow cell because they get
shifted left on the page in `drop_cell()`. I've added a long comment
about this.
## Testing
Adds fuzz test that does inserts and deletes on an index btree and
asserts that all the expected keys are found at the end in the right
order. This test runs into this case quite frequently so I was able to
verify it.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2243
2025-07-24 18:48:36 +03:00
Jussi Saurio
7968be9d71 btree/insert: cell can also underflow after overwrite 2025-07-24 18:43:02 +03:00
Jussi Saurio
a4535684b3 btree: WriteState: remove CheckNeedsBalancing variant 2025-07-24 18:40:49 +03:00
Jussi Saurio
b0edd3b716 btree: WriteState: add comments 2025-07-24 18:36:07 +03:00
Pere Diaz Bou
8150a72550 check frame number is not 0
clippy

fmt

fix after rebase

clippy
2025-07-24 17:30:17 +02:00
meteorgan
c48a5ef538 we don't need read_tx to return IOResult anymore 2025-07-24 23:19:33 +08:00