This PR unifies the logic for resolving aggregate functions. Previously,
bare aggregates (e.g. `SELECT max(a) FROM t1`) and aggregates wrapped in
expressions (e.g. `SELECT max(a) + 1 FROM t1`) were handled differently,
which led to duplicated code. Now both cases are resolved consistently.
The added benchmark shows a small improvement:
```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
time: [59.791 µs 59.898 µs 60.006 µs]
change: [-7.7090% -7.2760% -6.8242%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
8 (8.00%) high mild
2 (2.00%) high severe
```
For an existing benchmark, no change:
```
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
time: [11.895 µs 11.913 µs 11.931 µs]
change: [-0.2545% +0.2426% +0.6960%] (p = 0.34 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
2 (2.00%) high mild
5 (5.00%) high severe
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2884
@penberg this PR try to clean up `turso_parser`'s`fmt` code.
- `get_table_name` and `get_column_name` should return None when
table/column does not exist.
```rust
/// Context to be used in ToSqlString
pub trait ToSqlContext {
/// Given an id, get the table name
/// First Option indicates whether the table exists
///
/// Currently not considering aliases
fn get_table_name(&self, _id: TableInternalId) -> Option<&str> {
None
}
/// Given a table id and a column index, get the column name
/// First Option indicates whether the column exists
/// Second Option indicates whether the column has a name
fn get_column_name(&self, _table_id: TableInternalId, _col_idx: usize) -> Option<Option<&str>> {
None
}
// help function to handle missing table/column names
fn get_table_and_column_names(
&self,
table_id: TableInternalId,
col_idx: usize,
) -> (String, String) {
let table_name = self
.get_table_name(table_id)
.map(|s| s.to_owned())
.unwrap_or_else(|| format!("t{}", table_id.0));
let column_name = self
.get_column_name(table_id, col_idx)
.map(|opt| {
opt.map(|s| s.to_owned())
.unwrap_or_else(|| format!("c{col_idx}"))
})
.unwrap_or_else(|| format!("c{col_idx}"));
(table_name, column_name)
}
}
```
- remove `FmtTokenStream` because it is same as `WriteTokenStream `
- remove useless functions and simplify `ToTokens`
```rust
/// Generate token(s) from AST node
/// Also implements Display to make sure devs won't forget Display
pub trait ToTokens: Display {
/// Send token(s) to the specified stream with context
fn to_tokens<S: TokenStream + ?Sized, C: ToSqlContext>(
&self,
s: &mut S,
context: &C,
) -> Result<(), S::Error>;
// Return displayer representation with context
fn displayer<'a, 'b, C: ToSqlContext>(&'b self, ctx: &'a C) -> SqlDisplayer<'a, 'b, C, Self>
where
Self: Sized,
{
SqlDisplayer::new(ctx, self)
}
}
```
Closes#2748
This also gave a small performance boost.
Local run results:
```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
time: [59.791 µs 59.898 µs 60.006 µs]
change: [-7.7090% -7.2760% -6.8242%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
8 (8.00%) high mild
2 (2.00%) high severe
```
Aggregate functions cannot be nested, and this is validated during the
translation of aggregate function arguments. Therefore, traversing their
child expressions is unnecessary.
Handled in the same way as in `prepare_one_select_plan` for bare
function calls. In `prepare_one_select_plan`, however, resolving
external scalar functions is performed unnecessarily twice.
The initial commits fix issues and plug gaps between ungrouped and
grouped aggregations.
The final commit consolidates the code that emits `AggStep` to prevent
future disparities between the two.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2867
This is primarily a mechanical change: the enum was moved between files,
renamed, and its comments updated so it is no longer strictly tied to
GROUP BY aggregations.
This prepares the enum for reuse with ungrouped aggregations.
This bug occurred when arguments were read for the GROUP BY sorter — all
arguments were incorrectly resolved to the first column. Added tests
confirm that aggregates now work correctly both with and without the
sorter.
This commit consolidates the creation of the Aggregate struct, which was
previously handled differently in `prepare_one_select_plan` and
`resolve_aggregates`. That discrepancy caused inconsistent handling of
zero-argument aggregates.
The queries added in the new tests would previously trigger a panic.
Instead of using static elements, use a dynamically generated DBSP-
circuit to keep views.
The DBSP circuit is generated from the logical plan, which only supports
enough for us to generate the DBSP circuit at the moment.
The state of the view is still kept inside the IncrementalView, instead
of materialized at the operator level. As a consequence, this still
depends on us always populating the view at startup. Fixing this is the
next step.
Closes#2815
```
Benchmarking Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...: Collecting 100 samples in estimated 5.008
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
time: [4.0081 µs 4.0223 µs 4.0364 µs]
change: [-2.9298% -2.2538% -1.6786%] (p = 0.00 < 0.05)
Performance has improved.
```
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#2847
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
Closes#2833
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
The removed comment no longer matches the current code. The
OrderByRemapping struct and the surrounding comments are sufficient to
explain deduplication and remapping.