Commit Graph

4607 Commits

Author SHA1 Message Date
Preston Thorpe
caaf60a7ea Merge 'Unify resolution of aggregate functions' from Piotr Rżysko
This PR unifies the logic for resolving aggregate functions. Previously,
bare aggregates (e.g. `SELECT max(a) FROM t1`) and aggregates wrapped in
expressions (e.g. `SELECT max(a) + 1 FROM t1`) were handled differently,
which led to duplicated code. Now both cases are resolved consistently.
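A minimal sketch of the idea, using a toy expression type rather than the real planner AST. `Expr`, `collect_aggregates`, and the hard-coded list of aggregate names are illustrative assumptions, not identifiers from the codebase. A single walker handles both `max(a)` and `max(a) + 1`:

```rust
/// Toy expression tree; hypothetical, not the planner's real AST.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Binary(Box<Expr>, char, Box<Expr>),
    Call { name: String, args: Vec<Expr> },
}

/// Assumed list of aggregate names, for illustration only.
fn is_aggregate(name: &str) -> bool {
    ["count", "sum", "min", "max", "avg"]
        .iter()
        .any(|agg| name.eq_ignore_ascii_case(agg))
}

/// Walk `expr` and append each distinct aggregate call to `out`.
/// The same code path covers bare and wrapped aggregates.
pub fn collect_aggregates(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::Call { name, .. } if is_aggregate(name) => {
            // Look for duplicates only once an aggregate is actually found.
            if !out.contains(expr) {
                out.push(expr.clone());
            }
            // The arguments of an aggregate are deliberately not walked.
        }
        Expr::Call { args, .. } => {
            for arg in args {
                collect_aggregates(arg, out);
            }
        }
        Expr::Binary(lhs, _, rhs) => {
            collect_aggregates(lhs, out);
            collect_aggregates(rhs, out);
        }
        Expr::Column(_) | Expr::Literal(_) => {}
    }
}
```

Calling `collect_aggregates` on the bare call and on the call wrapped in `+ 1` yields the same single-element list.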
The added benchmark shows a small improvement:
```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
                        time:   [59.791 µs 59.898 µs 60.006 µs]
                        change: [-7.7090% -7.2760% -6.8242%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe
```
For an existing benchmark, no change:
```
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [11.895 µs 11.913 µs 11.931 µs]
                        change: [-0.2545% +0.2426% +0.6960%] (p = 0.34 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  2 (2.00%) high mild
  5 (5.00%) high severe
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2884
2025-09-03 19:46:04 -04:00
Preston Thorpe
c55c7d76c3 Merge 'replace some matches with match_ignore_ascii_case macro' from Lâm Hoàng Phúc
Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2903
2025-09-03 17:03:19 -04:00
PThorpe92
c5b6df4249 Use mutex in place of spinlock for io_uring 2025-09-03 11:12:33 -04:00
PThorpe92
30454336a6 Make io_uring sound for connections across multiple threads 2025-09-03 10:54:42 -04:00
TcMits
b6fca2718f fmt 2025-09-03 13:41:23 +07:00
TcMits
b0f4dd49d5 use match_ignore_ascii_case macro 2025-09-03 12:01:52 +07:00
Pekka Enberg
1de647758f Merge 'refactor parser fmt' from Lâm Hoàng Phúc
@penberg this PR tries to clean up `turso_parser`'s `fmt` code.
- `get_table_name` and `get_column_name` should return None when
table/column does not exist.
```rust
/// Context to be used in ToSqlString
pub trait ToSqlContext {
    /// Given an id, get the table name
    /// First Option indicates whether the table exists
    ///
    /// Currently not considering aliases
    fn get_table_name(&self, _id: TableInternalId) -> Option<&str> {
        None
    }

    /// Given a table id and a column index, get the column name
    /// First Option indicates whether the column exists
    /// Second Option indicates whether the column has a name
    fn get_column_name(&self, _table_id: TableInternalId, _col_idx: usize) -> Option<Option<&str>> {
        None
    }

    // helper function to handle missing table/column names
    fn get_table_and_column_names(
        &self,
        table_id: TableInternalId,
        col_idx: usize,
    ) -> (String, String) {
        let table_name = self
            .get_table_name(table_id)
            .map(|s| s.to_owned())
            .unwrap_or_else(|| format!("t{}", table_id.0));

        let column_name = self
            .get_column_name(table_id, col_idx)
            .map(|opt| {
                opt.map(|s| s.to_owned())
                    .unwrap_or_else(|| format!("c{col_idx}"))
            })
            .unwrap_or_else(|| format!("c{col_idx}"));

        (table_name, column_name)
    }
}
```
- remove `FmtTokenStream` because it is the same as `WriteTokenStream`
- remove useless functions and simplify `ToTokens`
```rust
/// Generate token(s) from AST node
/// Also implements Display to make sure devs won't forget Display
pub trait ToTokens: Display {
    /// Send token(s) to the specified stream with context
    fn to_tokens<S: TokenStream + ?Sized, C: ToSqlContext>(
        &self,
        s: &mut S,
        context: &C,
    ) -> Result<(), S::Error>;

    // Return displayer representation with context
    fn displayer<'a, 'b, C: ToSqlContext>(&'b self, ctx: &'a C) -> SqlDisplayer<'a, 'b, C, Self>
    where
        Self: Sized,
    {
        SqlDisplayer::new(ctx, self)
    }
}
```
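A usage sketch of the fallback naming, assuming a simplified copy of the trait. `TableInternalId` is modelled as a tuple struct around `usize` (inferred from `table_id.0`), and `OneTableContext` is a hypothetical implementor invented for this example:

```rust
/// Stand-in for the real id type; assumed to be a tuple struct because the
/// original code reads `table_id.0`.
#[derive(Clone, Copy, PartialEq)]
pub struct TableInternalId(pub usize);

/// Simplified copy of the trait from the PR description.
pub trait ToSqlContext {
    fn get_table_name(&self, _id: TableInternalId) -> Option<&str> {
        None
    }

    fn get_column_name(&self, _table_id: TableInternalId, _col_idx: usize) -> Option<Option<&str>> {
        None
    }

    fn get_table_and_column_names(&self, table_id: TableInternalId, col_idx: usize) -> (String, String) {
        let table_name = self
            .get_table_name(table_id)
            .map(str::to_owned)
            .unwrap_or_else(|| format!("t{}", table_id.0));
        // "No such column" and "unnamed column" both fall back to `c{idx}`.
        let column_name = self
            .get_column_name(table_id, col_idx)
            .flatten()
            .map(str::to_owned)
            .unwrap_or_else(|| format!("c{col_idx}"));
        (table_name, column_name)
    }
}

/// Hypothetical context that knows a single table, `users`, with id 1.
pub struct OneTableContext {
    pub columns: Vec<Option<String>>,
}

impl ToSqlContext for OneTableContext {
    fn get_table_name(&self, id: TableInternalId) -> Option<&str> {
        (id == TableInternalId(1)).then_some("users")
    }

    fn get_column_name(&self, table_id: TableInternalId, col_idx: usize) -> Option<Option<&str>> {
        if table_id != TableInternalId(1) {
            return None;
        }
        // Outer Option: does the column exist? Inner Option: does it have a name?
        self.columns.get(col_idx).map(|name| name.as_deref())
    }
}
```

With `columns: vec![Some("id".into()), None]`, column 0 of table 1 resolves to `("users", "id")`, column 1 to `("users", "c1")`, and an unknown table 7 with column 3 to `("t7", "c3")`.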

Closes #2748
2025-09-02 18:35:43 +03:00
Pekka Enberg
2addeb5a9f Merge 'introduce eq/contains/starts_with/ends_with_ignore_ascii_case macros' from Lâm Hoàng Phúc
Depends on #2865
```sh
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
                        time:   [10.100 ms 10.191 ms 10.283 ms]
                        change: [-16.770% -15.559% -14.417%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

`ALTER TABLE _ RENAME COLUMN _ TO _`/limbo_rename_column/
                        time:   [7.4829 ms 7.5492 ms 7.6128 ms]
                        change: [-19.397% -18.093% -16.789%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild

`ALTER TABLE _ ADD COLUMN _`/limbo_add_column/
                        time:   [5.3255 ms 5.3713 ms 5.4183 ms]
                        change: [-24.002% -22.612% -21.195%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 39 outliers among 100 measurements (39.00%)
  17 (17.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  20 (20.00%) high severe

`ALTER TABLE _ DROP COLUMN _`/limbo_drop_column/
                        time:   [5.8858 ms 5.9183 ms 5.9510 ms]
                        change: [-16.233% -14.679% -13.083%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 25 outliers among 100 measurements (25.00%)
  8 (8.00%) low severe
  11 (11.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

Prepare `SELECT 1`/limbo_parse_query/SELECT 1
                        time:   [590.28 ns 591.31 ns 592.35 ns]
                        change: [-3.7810% -3.5059% -3.2444%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild

Prepare `SELECT * FROM users LIMIT 1`/limbo_parse_query/SELECT * FROM users LIMIT 1
                        time:   [1.2569 µs 1.2582 µs 1.2596 µs]
                        change: [-5.0125% -4.7516% -4.4933%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low severe
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [3.7180 µs 3.7227 µs 3.7274 µs]
                        change: [-3.0557% -2.7642% -2.4761%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild

Execute `SELECT 1`/limbo_execute_select_1
                        time:   [27.455 ns 27.477 ns 27.499 ns]
                        change: [-2.9461% -2.7493% -2.5589%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [410.53 ns 411.05 ns 411.54 ns]
                        change: [-15.364% -15.133% -14.912%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) low mild
  1 (1.00%) high mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [2.1100 µs 2.1122 µs 2.1145 µs]
                        change: [-11.517% -11.065% -10.662%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [9.5156 µs 9.5268 µs 9.5383 µs]
                        change: [-10.284% -10.086% -9.8833%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [18.669 µs 18.698 µs 18.731 µs]
                        change: [-9.5949% -9.3407% -9.1140%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild

Execute `SELECT count() FROM users`/limbo_execute_select_count
                        time:   [7.1027 µs 7.1098 µs 7.1170 µs]
                        change: [-43.739% -43.596% -43.469%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe

```
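The macros themselves are not shown in this message. As a point of reference, the same four comparisons can be written as plain functions on top of the standard library's `eq_ignore_ascii_case`. The function names mirror the macro names, but the bodies are an assumption and not the PR's implementation:

```rust
/// ASCII case-insensitive equality, delegating to the standard library.
pub fn eq_ignore_ascii_case(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}

/// True if `haystack` begins with `prefix`, ignoring ASCII case.
pub fn starts_with_ignore_ascii_case(haystack: &str, prefix: &str) -> bool {
    let (h, p) = (haystack.as_bytes(), prefix.as_bytes());
    h.len() >= p.len() && h[..p.len()].eq_ignore_ascii_case(p)
}

/// True if `haystack` ends with `suffix`, ignoring ASCII case.
pub fn ends_with_ignore_ascii_case(haystack: &str, suffix: &str) -> bool {
    let (h, s) = (haystack.as_bytes(), suffix.as_bytes());
    h.len() >= s.len() && h[h.len() - s.len()..].eq_ignore_ascii_case(s)
}

/// True if `needle` occurs anywhere in `haystack`, ignoring ASCII case.
pub fn contains_ignore_ascii_case(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    // `windows(0)` panics, so the empty needle is handled first.
    n.is_empty() || h.windows(n.len()).any(|w| w.eq_ignore_ascii_case(n))
}
```

Working on bytes avoids allocating a lowercased copy of either string, which is presumably part of why this style of comparison helps in hot parsing paths.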

Closes #2866
2025-09-02 18:35:14 +03:00
Pekka Enberg
d77b76e75a Merge 'string sometimes used as identifier quoting' from Lâm Hoàng Phúc
fix https://github.com/tursodatabase/turso/issues/2886#issuecomment-3244885481

Closes #2894
2025-09-02 18:34:43 +03:00
Pekka Enberg
52ef7dd675 Merge 'Fix memory leak in page cache during balancing' from Preston Thorpe
`Pager::update_dirty_loaded_page_in_cache` does exactly what you would
expect, but when `DumbLruPageCache::_insert` is called with
`ignore_existing` set to true, it ignores the previous entry entirely
and leaks its memory.
I really want to get #2885 finished and merged because of the perf, but
I ran into this while reviewing it for correctness.

Closes #2892
2025-09-02 18:32:56 +03:00
TcMits
635402fc6f string sometimes used as identifier quoting 2025-09-02 21:35:37 +07:00
PThorpe92
cfadc4f579 Fix memory leak in page cache during balancing 2025-09-02 10:35:04 -04:00
Pekka Enberg
12cf4d2e72 core: Make strict schema support experimental
It's not tested properly so let's mark it as experimental for now.

Fixes #2775
2025-09-02 16:40:02 +03:00
TcMits
40adf3fcfd Merge branch 'perf-3' into perf-4 2025-09-02 18:47:05 +07:00
TcMits
bfff05faba merge main 2025-09-02 18:25:20 +07:00
TcMits
33a04fbaf7 resolve conflict 2025-09-02 17:30:10 +07:00
Pekka Enberg
6591b66c3d Merge 'Simulate I/O in memory' from Pedro Muniz
Revives the `MemorySim` PR and fixes a page cache issue where we could
have an unlocked and unloaded page in the page cache after a FaultyQuery.
The page would remain in the cache and could affect other queries, as
the `page_cache` is at the `Connection` level.
Depends on #2785

Closes #2693
2025-09-02 13:28:48 +03:00
Pekka Enberg
15d45e3f68 Merge 'Refactor encryption to manage authentication tag internally' from bit-aloo
This PR updates the internal encryption framework to handle
authentication tags explicitly rather than relying on the underlying
cipher libraries to append/verify them automatically.
closes: #2850

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #2858
2025-09-02 09:44:22 +03:00
Piotr Rzysko
e97cc64ad0 Remove duplicated code for resolving aggregates
This also gave a small performance boost.

Local run results:

```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
                        time:   [59.791 µs 59.898 µs 60.006 µs]
                        change: [-7.7090% -7.2760% -6.8242%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe
```
2025-09-02 08:22:37 +02:00
Piotr Rzysko
517f23013a Delay deduplication of aggregate expressions
It is not necessary to iterate over existing aggregates for every
traversed expression. Instead, do so only when an aggregate function
is found.
2025-09-02 08:22:37 +02:00
Piotr Rzysko
569e41cb1e Skip traversing children of aggregate functions
Aggregate functions cannot be nested, and this is validated during the
translation of aggregate function arguments. Therefore, traversing their
child expressions is unnecessary.
2025-09-02 08:22:37 +02:00
Piotr Rzysko
9b742a64c2 Handle functions with star argument wrapped in expressions
Handled in the same way as in `prepare_one_select_plan` for bare
function calls.
2025-09-02 08:22:36 +02:00
Piotr Rzysko
f3cbc382ce Support external aggregate functions wrapped in expressions
Handled in the same way as in `prepare_one_select_plan` for bare
function calls. In `prepare_one_select_plan`, however, resolving
external scalar functions is performed unnecessarily twice.
2025-09-02 08:22:36 +02:00
Piotr Rzysko
d361734819 Remove unnecessary recursion in resolve_aggregates
The walk_expr method already traverses arguments, so there is no need
to do this explicitly.
2025-09-02 08:22:36 +02:00
Piotr Rzysko
ab0f673f44 Add benchmark for result column expression handling
The new query combines multiple aggregate functions, plain columns,
arithmetic expressions, and aggregates wrapped in additional expressions.

Local run results:
```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
                        time:   [64.535 µs 64.623 µs 64.713 µs]
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
```
2025-09-02 08:22:36 +02:00
Pekka Enberg
7189e98455 Merge 'Unify handling of grouped and ungrouped aggregations' from Piotr Rżysko
The initial commits fix issues and plug gaps between ungrouped and
grouped aggregations.
The final commit consolidates the code that emits `AggStep` to prevent
future disparities between the two.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2867
2025-09-02 09:11:40 +03:00
Pekka Enberg
0868af29df Merge 'core/printf: support for more basic substitution types' from Luiz Gustavo
Some progress working on `printf` support.
(relevant issue https://github.com/tursodatabase/turso/issues/885)
Implements the basic substitution types listed in the `TODO` comment at
the beginning of the file (%i, %x, %X, %o, %e, %E, %c). There are some
others in the sqlite spec which I will implement in a future PR.
I tried to pay attention to the specific behaviors from sqlite as much
as possible while testing this, but if there's something I missed please
tell me.
Also, I can see this code already needs to be reorganized. I'm still
thinking about the best way to do that without hurting the ergonomics
of adding new implementations; I'm still learning Rust, so this is not
obvious to me yet. I'm open to suggestions about it.
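For orientation, a small sketch of how Rust's built-in formatter relates to some of these conversions. The helper names are made up for this example and this is not the PR's code. The radix forms map directly onto `{:x}`, `{:X}`, and `{:o}`, while `{:e}` differs from C-style `%e`, which prints six fractional digits and a signed exponent of at least two digits:

```rust
/// Hex (lower), hex (upper), and octal renderings of one integer.
pub fn radix_forms(n: u32) -> (String, String, String) {
    (format!("{n:x}"), format!("{n:X}"), format!("{n:o}"))
}

/// Rust's native exponent notation: shortest mantissa, bare exponent.
pub fn rust_exponent(x: f64) -> String {
    format!("{x:e}")
}

/// Rework Rust's output into the C-style `%e` shape for finite values.
/// Non-finite values are passed through as-is; how they should print is
/// outside the scope of this sketch.
pub fn c_style_exponent(x: f64) -> String {
    if !x.is_finite() {
        return x.to_string();
    }
    let rust = format!("{x:.6e}");
    let (mantissa, exp) = rust.split_once('e').expect("LowerExp output contains 'e'");
    let exp: i32 = exp.parse().expect("exponent is an integer");
    let sign = if exp < 0 { '-' } else { '+' };
    format!("{mantissa}e{sign}{:02}", exp.abs())
}
```

For `1234.5`, `rust_exponent` gives `1.2345e3` and `c_style_exponent` gives `1.234500e+03`, so the exponent conversions need custom handling rather than a straight pass-through to `format!`.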

Closes #2868
2025-09-02 09:10:03 +03:00
Pekka Enberg
87d3f74e6e Merge 'Evict page from cache if page is unlocked and unloaded' from Pedro Muniz
Because we can abort a `read_page` completion, a page can be in the
cache while unloaded and unlocked. If we do not evict that page from the
page cache, a later lookup returns an unloaded page and trips
assertions. This is worsened by the fact that the page cache is not per
`Statement`: aborting a completion in one Statement can trigger an error
in the next one unless we evict the page in these circumstances.
Also, to propagate IO errors we need to return the Error from
IOCompletions on step.
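A toy illustration of the eviction rule, assuming a cache keyed by page id that stores only two flags per page. `ToyPageCache` and `PageState` are hypothetical and far simpler than the real page cache:

```rust
use std::collections::HashMap;

/// Minimal per-page state for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageState {
    pub loaded: bool,
    pub locked: bool,
}

/// Hypothetical cache; the real one tracks far more than two flags.
#[derive(Default)]
pub struct ToyPageCache {
    pages: HashMap<usize, PageState>,
}

impl ToyPageCache {
    pub fn insert(&mut self, id: usize, state: PageState) {
        self.pages.insert(id, state);
    }

    pub fn contains(&self, id: usize) -> bool {
        self.pages.contains_key(&id)
    }

    /// Look up a page. A page that is neither loaded nor locked (for
    /// example after an aborted read) is evicted and reported as a miss,
    /// so the caller reads it again instead of using stale state.
    pub fn get(&mut self, id: usize) -> Option<PageState> {
        let state = *self.pages.get(&id)?;
        if !state.loaded && !state.locked {
            self.pages.remove(&id);
            return None;
        }
        Some(state)
    }
}
```

A page that is unloaded but still locked is left alone here, on the assumption that a read for it is still in flight.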

Closes #2785
2025-09-02 09:08:12 +03:00
Pekka Enberg
d959319b42 Merge 'Use u64 for file offsets in I/O and calculate such offsets in u64' from Preston Thorpe
Using `usize` to compute file offsets caps us at 4 GiB on 32-bit
systems. For example, with 4 KiB pages we can only address up to 1048576
pages; attempting the next page overflows a 32-bit `usize` and can wrap
the write offset, corrupting data. Switching our I/O APIs and offset
math to `u64` avoids this overflow on 32-bit targets.
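The arithmetic can be checked directly. This sketch uses `u32` to stand in for a 32-bit `usize` and assumes a zero-based page index; the function names are illustrative only:

```rust
/// Page offset in 32-bit arithmetic, as `usize` math behaves on a
/// 32-bit target. Returns `None` when the product does not fit.
pub fn offset_u32(page_idx: u32, page_size: u32) -> Option<u32> {
    page_idx.checked_mul(page_size)
}

/// What an unchecked release-mode multiply would produce instead.
pub fn offset_u32_wrapped(page_idx: u32, page_size: u32) -> u32 {
    page_idx.wrapping_mul(page_size)
}

/// The same computation widened to `u64` before multiplying.
pub fn offset_u64(page_idx: u32, page_size: u32) -> u64 {
    u64::from(page_idx) * u64::from(page_size)
}
```

With 4096-byte pages, index 1048575 still fits (offset 4294963200), index 1048576 overflows `u32` and wraps to offset 0, and the `u64` version yields 4294967296.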

Closes #2791
2025-09-02 09:06:49 +03:00
Pekka Enberg
cfaba4ab10 Merge 'Implement libSQL's ALTER COLUMN extension' from Levy A.
Implement `ALTER COLUMN` as described here:
https://github.com/tursodatabase/libsql/blob/main/libsql-sqlite3/doc/libsql_extensions.md#altering-columns
- [x] Add `ALTER COLUMN` to parser
- [x] Implement `Insn::AlterColumn`
- [x] Add tests

Closes #2814
2025-09-02 09:06:03 +03:00
PThorpe92
e9b50b63fb Return sqlite_version() without being initialized 2025-09-01 13:36:41 -04:00
pedrocarlo
be855a8059 IOCompletions: abort other remaining completions if previous one errors 2025-09-01 14:12:11 -03:00
pedrocarlo
c01449e71b add parking_lot to simulator 2025-09-01 11:11:25 -03:00
pedrocarlo
bc707fd9be cleanup + comments 2025-09-01 11:10:40 -03:00
pedrocarlo
53cfae1db4 return Error from step if IO failed 2025-09-01 11:10:39 -03:00
pedrocarlo
6f1eed7aca clippy 2025-09-01 11:10:39 -03:00
pedrocarlo
4618df9d1a because we can abort a read_page completion, this means that the page can be in the cache but be unloaded and unlocked. However, if we do not evict that page from the page cache, we will return an unloaded page later 2025-09-01 11:10:39 -03:00
pedrocarlo
be3f944c4f impl Error for CacheError and propagate it into LimboError 2025-09-01 11:10:39 -03:00
bit-aloo
c70fe79eb8 adjust test cfg and cleanup 2025-09-01 16:21:03 +05:30
bit-aloo
27a6dc95c4 simplify Cipher enum to store wrapper types
- Replace boxed `Aes256Gcm` and `Aegis256Cipher` with direct wrapper types:
  - `Cipher::Aes256Gcm(Aes256GcmCipher)`
  - `Cipher::Aegis256(Aegis256Cipher)`
- Add `as_aead()` method to unify access via `AeadCipher` trait.
- Refactor `decrypt_raw` and `encrypt_raw`.
- Add `decrypt_raw_detached` and `encrypt_raw_detached`.
2025-09-01 16:19:37 +05:30
bit-aloo
7f3c886154 add Aes256GcmCipher implementing AeadCipher
- Create new `Aes256GcmCipher` wrapper around AES-256-GCM.
- Implement `AeadCipher` trait with both combined and detached modes.
2025-09-01 16:18:49 +05:30
bit-aloo
f11e90c94d refactor Aegis256Cipher to implement AeadCipher 2025-09-01 16:18:22 +05:30
bit-aloo
c685c4e735 Add AeadCipher trait abstraction
- Define a common trait `AeadCipher` for encryption/decryption.
- Provide methods for both "combined" and "detached" encryption modes:
  - encrypt / decrypt
  - encrypt_detached / decrypt_detached
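A rough sketch of how combined and detached modes can relate. The signatures here are assumptions (the real trait presumably also takes nonces and associated data and returns errors), `TAG_SIZE` is a made-up constant, and `ToyCipher` is deliberately insecure and exists only to exercise the trait:

```rust
/// Assumed tag length for the toy cipher; not a constant from the codebase.
pub const TAG_SIZE: usize = 4;

/// Hypothetical shape of an AEAD abstraction with both modes.
pub trait AeadCipher {
    /// Encrypt and return `(ciphertext, tag)` separately.
    fn encrypt_detached(&self, plaintext: &[u8]) -> (Vec<u8>, Vec<u8>);

    /// Verify `tag`, then decrypt; `None` on tag mismatch.
    fn decrypt_detached(&self, ciphertext: &[u8], tag: &[u8]) -> Option<Vec<u8>>;

    /// Combined mode: the ciphertext followed by the tag.
    fn encrypt(&self, plaintext: &[u8]) -> Vec<u8> {
        let (mut out, tag) = self.encrypt_detached(plaintext);
        out.extend_from_slice(&tag);
        out
    }

    /// Combined mode: split off the trailing tag, then verify and decrypt.
    fn decrypt(&self, combined: &[u8]) -> Option<Vec<u8>> {
        let split = combined.len().checked_sub(TAG_SIZE)?;
        let (ciphertext, tag) = combined.split_at(split);
        self.decrypt_detached(ciphertext, tag)
    }
}

/// NOT secure: single-byte XOR with a keyed checksum as the "tag".
pub struct ToyCipher {
    pub key: u8,
}

impl ToyCipher {
    fn tag(&self, ciphertext: &[u8]) -> Vec<u8> {
        let sum = ciphertext.iter().fold(u32::from(self.key), |acc, b| {
            acc.wrapping_mul(31).wrapping_add(u32::from(*b))
        });
        sum.to_be_bytes().to_vec()
    }
}

impl AeadCipher for ToyCipher {
    fn encrypt_detached(&self, plaintext: &[u8]) -> (Vec<u8>, Vec<u8>) {
        let ciphertext: Vec<u8> = plaintext.iter().map(|b| b ^ self.key).collect();
        let tag = self.tag(&ciphertext);
        (ciphertext, tag)
    }

    fn decrypt_detached(&self, ciphertext: &[u8], tag: &[u8]) -> Option<Vec<u8>> {
        if self.tag(ciphertext).as_slice() != tag {
            return None;
        }
        Some(ciphertext.iter().map(|b| b ^ self.key).collect())
    }
}
```

Expressing the combined methods in terms of the detached ones keeps tag placement in one spot, so a caller that stores the tag elsewhere (for example at a fixed position in a page) can use the detached pair directly.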
2025-09-01 16:16:41 +05:30
bit-aloo
3a9b5cc6fa simplify aes-gcm imports and add tag size constants 2025-09-01 16:15:57 +05:30
TcMits
6e87b08d64 faster type_from_name 2025-09-01 14:38:38 +07:00
Gaurav Sarma
453cbd3201 Decrypt WAL page while reading raw frames 2025-09-01 15:29:01 +08:00
Pekka Enberg
9d06e0bf8e Merge 'Support encryption for non-4k page size' from
Closes #2734.

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #2860
2025-09-01 10:05:52 +03:00
TcMits
ed1fb4cabc remove unnecessary check 2025-09-01 11:51:51 +07:00
Piotr Rzysko
6f1cd17fcf Consolidate methods emitting AggStep 2025-08-31 13:29:10 +02:00
Piotr Rzysko
cdba1f1b87 Generalize GroupByAggArgumentSource
This is primarily a mechanical change: the enum was moved between files,
renamed, and its comments updated so it is no longer strictly tied to
GROUP BY aggregations.

This prepares the enum for reuse with ungrouped aggregations.
2025-08-31 13:23:12 +02:00