Commit Graph

1432 Commits

Author SHA1 Message Date
Pekka Enberg
1de647758f Merge 'refactor parser fmt' from Lâm Hoàng Phúc
@penberg this PR tries to clean up the `fmt` code in `turso_parser`.
- `get_table_name` and `get_column_name` should return None when
table/column does not exist.
```rust
/// Context to be used in ToSqlString
pub trait ToSqlContext {
    /// Given an id, get the table name
    /// First Option indicates whether the table exists
    ///
    /// Currently not considering aliases
    fn get_table_name(&self, _id: TableInternalId) -> Option<&str> {
        None
    }

    /// Given a table id and a column index, get the column name
    /// First Option indicates whether the column exists
    /// Second Option indicates whether the column has a name
    fn get_column_name(&self, _table_id: TableInternalId, _col_idx: usize) -> Option<Option<&str>> {
        None
    }

    // Helper that falls back to generated names when a table/column name is missing
    fn get_table_and_column_names(
        &self,
        table_id: TableInternalId,
        col_idx: usize,
    ) -> (String, String) {
        let table_name = self
            .get_table_name(table_id)
            .map(|s| s.to_owned())
            .unwrap_or_else(|| format!("t{}", table_id.0));

        let column_name = self
            .get_column_name(table_id, col_idx)
            .map(|opt| {
                opt.map(|s| s.to_owned())
                    .unwrap_or_else(|| format!("c{col_idx}"))
            })
            .unwrap_or_else(|| format!("c{col_idx}"));

        (table_name, column_name)
    }
}
```
- remove `FmtTokenStream` because it is the same as `WriteTokenStream`
- remove useless functions and simplify `ToTokens`
```rust
/// Generate token(s) from AST node
/// Also implements Display to make sure devs won't forget Display
pub trait ToTokens: Display {
    /// Send token(s) to the specified stream with context
    fn to_tokens<S: TokenStream + ?Sized, C: ToSqlContext>(
        &self,
        s: &mut S,
        context: &C,
    ) -> Result<(), S::Error>;

    // Return displayer representation with context
    fn displayer<'a, 'b, C: ToSqlContext>(&'b self, ctx: &'a C) -> SqlDisplayer<'a, 'b, C, Self>
    where
        Self: Sized,
    {
        SqlDisplayer::new(ctx, self)
    }
}
```

Closes #2748
2025-09-02 18:35:43 +03:00
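The fallback naming in `get_table_and_column_names` from the entry above can be exercised with a small stand-alone sketch. The `TableInternalId` definition and the two context types (`EmptyContext`, `OneTable`) are hypothetical stand-ins written for this illustration, not taken from `turso_parser`, and the helper uses `Option::flatten` as an equivalent shorthand for the nested `map`/`unwrap_or_else` shown in the PR description.

```rust
/// Hypothetical stand-in for the parser's table id type (assumed to wrap an integer).
#[derive(Clone, Copy)]
pub struct TableInternalId(pub usize);

/// Trimmed copy of the trait quoted in the PR description.
pub trait ToSqlContext {
    fn get_table_name(&self, _id: TableInternalId) -> Option<&str> {
        None
    }

    fn get_column_name(&self, _table_id: TableInternalId, _col_idx: usize) -> Option<Option<&str>> {
        None
    }

    fn get_table_and_column_names(
        &self,
        table_id: TableInternalId,
        col_idx: usize,
    ) -> (String, String) {
        let table_name = self
            .get_table_name(table_id)
            .map(|s| s.to_owned())
            .unwrap_or_else(|| format!("t{}", table_id.0));

        // `flatten` merges "column does not exist" and "column has no name"
        // into a single None, so both cases share the same `c{idx}` fallback.
        let column_name = self
            .get_column_name(table_id, col_idx)
            .flatten()
            .map(|s| s.to_owned())
            .unwrap_or_else(|| format!("c{col_idx}"));

        (table_name, column_name)
    }
}

/// Hypothetical context that knows no tables: every lookup uses the defaults.
pub struct EmptyContext;
impl ToSqlContext for EmptyContext {}

/// Hypothetical context with one table (id 1) whose second column is unnamed.
pub struct OneTable;
impl ToSqlContext for OneTable {
    fn get_table_name(&self, id: TableInternalId) -> Option<&str> {
        (id.0 == 1).then_some("users")
    }

    fn get_column_name(&self, table_id: TableInternalId, col_idx: usize) -> Option<Option<&str>> {
        if table_id.0 != 1 {
            return None; // unknown table, so the column cannot exist
        }
        match col_idx {
            0 => Some(Some("id")), // column exists and has a name
            1 => Some(None),       // column exists but is unnamed
            _ => None,             // column does not exist
        }
    }
}
```

With these stand-ins, a lookup on `EmptyContext` for table id 7 and column 2 yields the generated names `t7` and `c2`, while `OneTable` resolves real names where it has them and falls back to `c1` for the unnamed column.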
Pekka Enberg
2addeb5a9f Merge 'introduce eq/contains/starts_with/ends_with_ignore_ascii_case macros' from Lâm Hoàng Phúc
Depends on #2865.
```sh
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
                        time:   [10.100 ms 10.191 ms 10.283 ms]
                        change: [-16.770% -15.559% -14.417%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

`ALTER TABLE _ RENAME COLUMN _ TO _`/limbo_rename_column/
                        time:   [7.4829 ms 7.5492 ms 7.6128 ms]
                        change: [-19.397% -18.093% -16.789%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild

`ALTER TABLE _ ADD COLUMN _`/limbo_add_column/
                        time:   [5.3255 ms 5.3713 ms 5.4183 ms]
                        change: [-24.002% -22.612% -21.195%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 39 outliers among 100 measurements (39.00%)
  17 (17.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  20 (20.00%) high severe

`ALTER TABLE _ DROP COLUMN _`/limbo_drop_column/
                        time:   [5.8858 ms 5.9183 ms 5.9510 ms]
                        change: [-16.233% -14.679% -13.083%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 25 outliers among 100 measurements (25.00%)
  8 (8.00%) low severe
  11 (11.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

Prepare `SELECT 1`/limbo_parse_query/SELECT 1
                        time:   [590.28 ns 591.31 ns 592.35 ns]
                        change: [-3.7810% -3.5059% -3.2444%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild

Prepare `SELECT * FROM users LIMIT 1`/limbo_parse_query/SELECT * FROM users LIMIT 1
                        time:   [1.2569 µs 1.2582 µs 1.2596 µs]
                        change: [-5.0125% -4.7516% -4.4933%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low severe
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [3.7180 µs 3.7227 µs 3.7274 µs]
                        change: [-3.0557% -2.7642% -2.4761%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild

Execute `SELECT 1`/limbo_execute_select_1
                        time:   [27.455 ns 27.477 ns 27.499 ns]
                        change: [-2.9461% -2.7493% -2.5589%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [410.53 ns 411.05 ns 411.54 ns]
                        change: [-15.364% -15.133% -14.912%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) low mild
  1 (1.00%) high mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [2.1100 µs 2.1122 µs 2.1145 µs]
                        change: [-11.517% -11.065% -10.662%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [9.5156 µs 9.5268 µs 9.5383 µs]
                        change: [-10.284% -10.086% -9.8833%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild

Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [18.669 µs 18.698 µs 18.731 µs]
                        change: [-9.5949% -9.3407% -9.1140%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild

Execute `SELECT count() FROM users`/limbo_execute_select_count
                        time:   [7.1027 µs 7.1098 µs 7.1170 µs]
                        change: [-43.739% -43.596% -43.469%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe

```

Closes #2866
2025-09-02 18:35:14 +03:00
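The log does not show the macros themselves, so the following is only a sketch of the general technique their names suggest: comparing ASCII text case-insensitively on byte slices, without allocating a lowercased copy first. The function names mirror the macro names but are written here as plain functions, which is an assumption of this illustration; only `eq_ignore_ascii_case` on `str` and `[u8]` is a standard-library call.

```rust
/// ASCII case-insensitive prefix test that allocates nothing.
pub fn starts_with_ignore_ascii_case(haystack: &str, prefix: &str) -> bool {
    let (h, p) = (haystack.as_bytes(), prefix.as_bytes());
    h.len() >= p.len() && h[..p.len()].eq_ignore_ascii_case(p)
}

/// ASCII case-insensitive suffix test that allocates nothing.
pub fn ends_with_ignore_ascii_case(haystack: &str, suffix: &str) -> bool {
    let (h, s) = (haystack.as_bytes(), suffix.as_bytes());
    h.len() >= s.len() && h[h.len() - s.len()..].eq_ignore_ascii_case(s)
}

/// ASCII case-insensitive substring test using a sliding window over the bytes.
pub fn contains_ignore_ascii_case(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    if n.is_empty() {
        // Matches `str::contains("")`; also avoids `windows(0)`, which panics.
        return true;
    }
    h.windows(n.len()).any(|w| w.eq_ignore_ascii_case(n))
}

/// Whole-string comparison is already provided by the standard library.
pub fn eq_ignore_ascii_case(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}
```

Avoiding a temporary `to_lowercase()` string on every keyword or name comparison is one plausible source of the speedups reported in the benchmark output above, though the log itself does not state the cause.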
Pekka Enberg
12cf4d2e72 core: Make strict schema support experimental
It's not tested properly so let's mark it as experimental for now.

Fixes #2775
2025-09-02 16:40:02 +03:00
TcMits
bfff05faba merge main 2025-09-02 18:25:20 +07:00
TcMits
33a04fbaf7 resolve conflict 2025-09-02 17:30:10 +07:00
Pekka Enberg
7189e98455 Merge 'Unify handling of grouped and ungrouped aggregations' from Piotr Rżysko
The initial commits fix issues and plug gaps between ungrouped and
grouped aggregations.
The final commit consolidates the code that emits `AggStep` to prevent
future disparities between the two.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2867
2025-09-02 09:11:40 +03:00
Piotr Rzysko
6f1cd17fcf Consolidate methods emitting AggStep 2025-08-31 13:29:10 +02:00
Piotr Rzysko
cdba1f1b87 Generalize GroupByAggArgumentSource
This is primarily a mechanical change: the enum was moved between files,
renamed, and its comments updated so it is no longer strictly tied to
GROUP BY aggregations.

This prepares the enum for reuse with ungrouped aggregations.
2025-08-31 13:23:12 +02:00
Piotr Rzysko
0a85883ee2 Support external aggregate functions in GROUP BY 2025-08-31 12:02:11 +02:00
Piotr Rzysko
7d179bd9fe Fix handling of multiple arguments in aggregate functions
This bug occurred when arguments were read for the GROUP BY sorter — all
arguments were incorrectly resolved to the first column. Added tests
confirm that aggregates now work correctly both with and without the
sorter.
2025-08-31 12:02:11 +02:00
Piotr Rzysko
3ad4016080 Fix handling of zero-argument grouped aggregations
This commit consolidates the creation of the Aggregate struct, which was
previously handled differently in `prepare_one_select_plan` and
`resolve_aggregates`. That discrepancy caused inconsistent handling of
zero-argument aggregates.

The queries added in the new tests would previously trigger a panic.
2025-08-31 12:02:09 +02:00
TcMits
37f33dc45f add eq/contains/starts_with/ends_with_ignore_ascii_case 2025-08-31 16:18:42 +07:00
Piotr Rzysko
978a78b79a Handle COLLATE clause in grouped aggregations
Previously, it was only applied to ungrouped aggregations.
2025-08-31 06:51:26 +02:00
Levy A.
293865c2d6 feat+fix: add tests and restrict altering some constraints 2025-08-30 03:43:31 -03:00
Levy A.
ad639b2b23 fix: reintroduce rename
We don't store the parsed column, so we can't replace just the name. This
will be refactored later with a more general approach.
2025-08-30 03:10:39 -03:00
Levy A.
5b378e3730 feat: add AlterColumn instruction
also refactor `RenameColumn` to reuse the logic from `AlterColumn`
2025-08-30 03:10:39 -03:00
Levy A.
678ca8d33b feat(parser): add ALTER COLUMN 2025-08-30 03:10:39 -03:00
Pekka Enberg
e1b5f2d948 Merge 'Implement UPSERT' from Preston Thorpe
This PR closes #2019
Implements https://sqlite.org/lang_upsert.html

Closes #2853
2025-08-30 08:54:35 +03:00
Pekka Enberg
ca7f1002b4 Merge 'Change views to use DBSP circuits' from Glauber Costa
Instead of using static elements, use a dynamically generated DBSP
circuit to keep views.
The DBSP circuit is generated from the logical plan, which at the moment
supports only what we need to generate the circuit.
The state of the view is still kept inside the IncrementalView, instead
of materialized at the operator level. As a consequence, this still
depends on us always populating the view at startup. Fixing this is the
next step.

Closes #2815
2025-08-30 08:44:06 +03:00
PThorpe92
8531560899 Combine rewriting expressions in UPSERT into a single walk of the ast 2025-08-29 22:12:46 -04:00
Preston Thorpe
18a9a38c8e Merge ' core/translate: parse_table remove unnecessary clone of table name ' from Pere Diaz Bou
```

Benchmarking Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...: Collecting 100 samples in estimated 5.008
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [4.0081 µs 4.0223 µs 4.0364 µs]
                        change: [-2.9298% -2.2538% -1.6786%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2847
2025-08-29 21:45:58 -04:00
PThorpe92
0fc603830b Use consistent imports of ast::Expr in upsert 2025-08-29 21:13:03 -04:00
PThorpe92
e175516319 Add more doc comments to upsert.rs 2025-08-29 20:59:02 -04:00
PThorpe92
e4a0a57227 Change get_column_mapping to return an Option now that we support excluded.col in upsert 2025-08-29 20:58:44 -04:00
PThorpe92
2beb8e4725 Add documentation and comments to translate.rs for upsert 2025-08-29 20:58:44 -04:00
PThorpe92
30137145a9 Add documentation and comments to upsert.rs 2025-08-29 20:58:44 -04:00
PThorpe92
1120d73931 Add a bunch of UPSERT tests 2025-08-29 20:58:43 -04:00
PThorpe92
ae6f60b603 initial pass at upsert, integrate upsert into insert translation 2025-08-29 20:58:43 -04:00
PThorpe92
efd15721b1 initial pass at upsert, add upsert.rs 2025-08-29 20:58:43 -04:00
PThorpe92
1c4d1a2f28 Add upsert module to core/translate 2025-08-29 20:58:43 -04:00
Preston Thorpe
0899711439 Merge 'core/translate: remove unnecessary agg clones' from Pere Diaz Bou
```
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [3.9978 µs 4.0085 µs 4.0193 µs]
                        change: [-5.0734% -4.6271% -4.1644%] (p = 0.00 < 0.05)
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2846
2025-08-29 11:40:53 -04:00
Pere Diaz Bou
d72be206f2 core/translate: parse_table remove unnecessary clone of table name 2025-08-29 16:42:46 +02:00
Pere Diaz Bou
167459389b core/translate: remove unnecessary agg clones 2025-08-29 16:23:44 +02:00
Preston Thorpe
eb0f2b7029 Merge 'translate: with_capacity insns' from Pere Diaz Bou
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2831
2025-08-29 10:23:09 -04:00
Pekka Enberg
13e62ce435 Merge 'core: Initial pass on synchronous pragma' from Pekka Enberg
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.

Closes #2833
2025-08-29 07:27:12 +03:00
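As a rough illustration of the entry above, the sketch below parses a `PRAGMA synchronous` value into one of the two supported modes and rejects the two that are listed as future work. The type and function names are hypothetical, and both the acceptance of the numeric forms (which SQLite documents as 0 = OFF, 1 = NORMAL, 2 = FULL, 3 = EXTRA) and the choice to return an error for unsupported modes are assumptions of this sketch, not behavior confirmed by the log.

```rust
/// Hypothetical representation of the two modes the commit says are supported.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SyncMode {
    Off,
    /// The commit message names FULL as the default.
    #[default]
    Full,
}

/// Parses a pragma value; the error strings are illustrative only.
pub fn parse_sync_mode(value: &str) -> Result<SyncMode, String> {
    let v = value.trim();
    if v.eq_ignore_ascii_case("OFF") || v == "0" {
        Ok(SyncMode::Off)
    } else if v.eq_ignore_ascii_case("FULL") || v == "2" {
        Ok(SyncMode::Full)
    } else if ["NORMAL", "EXTRA", "1", "3"]
        .iter()
        .any(|m| v.eq_ignore_ascii_case(m))
    {
        // Valid in SQLite, but listed as future work in the commit message.
        Err(format!("synchronous mode {v} is not supported yet"))
    } else {
        Err(format!("unknown synchronous mode: {v}"))
    }
}
```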
Pekka Enberg
3952dbb445 Merge 'Fix sorter column deduplication' from Piotr Rżysko
Previously, the added test case failed because the last result column
was missing - a nonexistent column in the sorter was referenced.

Closes #2824
2025-08-28 18:29:44 +03:00
Pekka Enberg
44ed4d562f core: Initial pass on synchronous pragma
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
2025-08-28 16:02:41 +03:00
Pekka Enberg
878147b931 Merge 'translate/insert: Improve string format performance' from Pere Diaz Bou
Rust's `format!` is slow even for the simplest of cases, so let's just
create strings with a known size and skip all the fmt machinery.

Closes #2832
2025-08-28 14:36:09 +03:00
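The idea in the entry above, building a string of known size by hand instead of going through the formatting machinery, can be sketched as follows. The function name and the "table.column" shape are hypothetical examples chosen for this illustration; the log does not show which strings the PR actually builds.

```rust
/// Builds "<table>.<column>" with one allocation of exactly the right size,
/// instead of `format!("{table}.{column}")`.
pub fn qualified_name(table: &str, column: &str) -> String {
    // Reserve the final length up front so no reallocation happens while pushing.
    let mut out = String::with_capacity(table.len() + 1 + column.len());
    out.push_str(table);
    out.push('.');
    out.push_str(column);
    out
}
```

The result is identical to the `format!` version; the difference is that the pieces are copied directly rather than being routed through `std::fmt`'s argument and trait-object handling.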
Pere Diaz Bou
964422375e translate/insert: string fmt perf improvements 2025-08-28 13:22:54 +02:00
Pere Diaz Bou
c7230f4ab0 translate: with_capacity insns 2025-08-28 13:13:19 +02:00
Pere Diaz Bou
082f18c073 core/translate: sanize_string fast path improvement 2025-08-28 12:57:28 +02:00
Piotr Rzysko
c383d9f16e Remove outdated comment in order_by.rs
The removed comment no longer matches the current code. The
OrderByRemapping struct and the surrounding comments are sufficient to
explain deduplication and remapping.
2025-08-28 09:49:55 +02:00
Piotr Rzysko
e33c2e0f0b Fix sorter column deduplication
Previously, the added test case failed because the last result column
was missing - a nonexistent column in the sorter was referenced.
2025-08-28 09:49:55 +02:00
themixednuts
79a9f4743e fix: planner alias and table name 2025-08-27 18:13:03 -05:00
Glauber Costa
29b93e3e58 add DBSP circuit compiler
The next step is to adapt the view code to use circuits instead of
listing the operators manually.
2025-08-27 14:21:32 -05:00
Glauber Costa
c776e4eefb First implementation of Logical plan
This is a first pass on logical plans. The idea is that the DBSP
compiler will have an easier time operating on a logical plan, that
exposes linear algebra operators, than on SQL expr.

To keep this simple, we only support filters, aggregates and projections
for now, and will add more later as we agree on the core of the
implementation.

To make sure that the implementation is reasonable, I tried my best to
generate a couple of logical plans using Datafusion and see if we
were generating something similar.

Our plans are not the same as Datafusion's, though. There are two
important differences:

* SQLite is weird, and it allows columns that are not part of the GROUP
  BY clause to appear in aggregated queries. For example:
  select a, count(b) from table group by c; <== that "a" is usually not
  permitted and datafusion will reject it. SQLite will be happy to
  accept it.

* Datafusion will not generate a projection on queries like this:
  select sum(hex(a)) from table, and just keep the complex expression
  hex(a) inside the aggregation. For DBSP to work well, we'll need an
  explicit aggregation there.

Because there are no users yet, I am marking this as `#[cfg(test)]`, but
I wanted to put this out there ASAP.
2025-08-27 11:18:54 -05:00
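A minimal sketch of what a logical plan limited to filters, projections, and aggregates might look like is given below. All type, variant, and function names are hypothetical and not taken from the commit. The sketch also reads the second bullet above as wanting a complex aggregate argument such as `hex(a)` computed in its own step beneath the aggregate; that reading is an assumption.

```rust
/// Hypothetical, heavily simplified expression type.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Call { func: String, args: Vec<Expr> },
}

/// Hypothetical logical plan covering only the operators the commit mentions.
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    Scan { table: String },
    Filter { input: Box<LogicalPlan>, predicate: Expr },
    /// Each projected expression is paired with its output column name.
    Projection { input: Box<LogicalPlan>, exprs: Vec<(Expr, String)> },
    Aggregate { input: Box<LogicalPlan>, group_by: Vec<Expr>, aggregates: Vec<Expr> },
}

/// Plans `SELECT <agg>(<arg>) FROM <table>`. A non-column argument is hoisted
/// into a projection so that the aggregate only ever sees plain columns.
pub fn plan_simple_aggregate(table: &str, agg: &str, arg: Expr) -> LogicalPlan {
    let scan = LogicalPlan::Scan { table: table.to_owned() };
    let (input, agg_arg) = match arg {
        // A bare column needs no extra step.
        col @ Expr::Column(_) => (scan, col),
        // Anything else is computed first and referenced by a generated alias.
        complex => {
            let alias = "__agg_arg_0".to_owned();
            let projection = LogicalPlan::Projection {
                input: Box::new(scan),
                exprs: vec![(complex, alias.clone())],
            };
            (projection, Expr::Column(alias))
        }
    };
    LogicalPlan::Aggregate {
        input: Box::new(input),
        group_by: Vec::new(),
        aggregates: vec![Expr::Call { func: agg.to_owned(), args: vec![agg_arg] }],
    }
}
```

For `select sum(hex(a)) from table`, this produces an `Aggregate` over a `Projection` over a `Scan`, whereas `select sum(a) from table` aggregates directly over the scan.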
TcMits
1b048b2628 clippy+fmt 2025-08-27 15:08:32 +07:00
TcMits
4ddfdb2a62 finish 2025-08-27 14:58:35 +07:00
TcMits
50bdaec6d0 merge main 2025-08-27 13:36:54 +07:00
Pekka Enberg
26ba09c45f Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz"
This reverts commit 71c1b357e4, reversing
changes made to 6bc568ff69 because it
actually makes things slower.
2025-08-26 14:58:21 +03:00