Commit Graph

1407 Commits

Author SHA1 Message Date
Pekka Enberg
e1b5f2d948 Merge 'Implement UPSERT' from Preston Thorpe
This PR closes #2019
Implements https://sqlite.org/lang_upsert.html

Closes #2853
2025-08-30 08:54:35 +03:00
Pekka Enberg
ca7f1002b4 Merge 'Change views to use DBSP circuits' from Glauber Costa
Instead of using static elements, use a dynamically generated DBSP-
circuit to keep views.
The DBSP circuit is generated from the logical plan, which only supports
enough for us to generate the DBSP circuit at the moment.
The state of the view is still kept inside the IncrementalView, instead
of materialized at the operator level. As a consequence, this still
depends on us always populating the view at startup. Fixing this is the
next step.

Closes #2815
2025-08-30 08:44:06 +03:00
PThorpe92
8531560899 Combine rewriting expressions in UPSERT into a single walk of the ast 2025-08-29 22:12:46 -04:00
Preston Thorpe
18a9a38c8e Merge ' core/translate: parse_table remove unnecessary clone of table name ' from Pere Diaz Bou
```

Benchmarking Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...: Collecting 100 samples in estimated 5.008
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [4.0081 µs 4.0223 µs 4.0364 µs]
                        change: [-2.9298% -2.2538% -1.6786%] (p = 0.00 < 0.05)
                        Performance has improved.
                        ```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2847
2025-08-29 21:45:58 -04:00
PThorpe92
0fc603830b Use consistent imports of ast::Expr in upsert 2025-08-29 21:13:03 -04:00
PThorpe92
e175516319 Add more doc comments to upsert.rs 2025-08-29 20:59:02 -04:00
PThorpe92
e4a0a57227 Change get_column_mapping to return an Option now that we support excluded.col in upsert 2025-08-29 20:58:44 -04:00
PThorpe92
2beb8e4725 Add documentation and comments to translate.rs for upsert 2025-08-29 20:58:44 -04:00
PThorpe92
30137145a9 Add documentation and comments to upsert.rs 2025-08-29 20:58:44 -04:00
PThorpe92
1120d73931 Add a bunch of UPSERT tests 2025-08-29 20:58:43 -04:00
PThorpe92
ae6f60b603 initial pass at upsert, integrate upsert into insert translation 2025-08-29 20:58:43 -04:00
PThorpe92
efd15721b1 initial pass at upsert, add upsert.rs 2025-08-29 20:58:43 -04:00
PThorpe92
1c4d1a2f28 Add upsert module to core/translate 2025-08-29 20:58:43 -04:00
Preston Thorpe
0899711439 Merge 'core/translate: remove unneessary agg clones' from Pere Diaz Bou
```
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
                        time:   [3.9978 µs 4.0085 µs 4.0193 µs]
                        change: [-5.0734% -4.6271% -4.1644%] (p = 0.00 < 0.05)
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2846
2025-08-29 11:40:53 -04:00
Pere Diaz Bou
d72be206f2 core/translate: parse_table remove unnecessary clone of table name 2025-08-29 16:42:46 +02:00
Pere Diaz Bou
167459389b core/translate: remove unneessary agg clones 2025-08-29 16:23:44 +02:00
Preston Thorpe
eb0f2b7029 Merge 'translate: with_capacity insns' from Pere Diaz Bou
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2831
2025-08-29 10:23:09 -04:00
Pekka Enberg
13e62ce435 Merge 'core: Initial pass on synchronous pragma' from Pekka Enberg
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.

Closes #2833
2025-08-29 07:27:12 +03:00
Pekka Enberg
3952dbb445 Merge 'Fix sorter column deduplication' from Piotr Rżysko
Previously, the added test case failed because the last result column
was missing - a nonexistent column in the sorter was referenced.

Closes #2824
2025-08-28 18:29:44 +03:00
Pekka Enberg
44ed4d562f core: Initial pass on synchronous pragma
This adds support for "OFF" and "FULL" (default) synchronous modes. As
future work, we need to add NORMAL and EXTRA as well because
applications expect them.
2025-08-28 16:02:41 +03:00
Pekka Enberg
878147b931 Merge 'translate/insert: Improve string format performance' from Pere Diaz Bou
Rust's `fmt!` is slow af for the simplest of cases, let's just create
strings with a known size and skip all the fmt stuff.

Closes #2832
2025-08-28 14:36:09 +03:00
Pere Diaz Bou
964422375e translate/insert: string fmt perf improvmenets 2025-08-28 13:22:54 +02:00
Pere Diaz Bou
c7230f4ab0 translate: with_capacity insns 2025-08-28 13:13:19 +02:00
Pere Diaz Bou
082f18c073 core/translate: sanize_string fast path improvement 2025-08-28 12:57:28 +02:00
Piotr Rzysko
c383d9f16e Remove outdated comment in order_by.rs
The removed comment no longer matches the current code. The
OrderByRemapping struct and the surrounding comments are sufficient to
explain deduplication and remapping.
2025-08-28 09:49:55 +02:00
Piotr Rzysko
e33c2e0f0b Fix sorter column deduplication
Previously, the added test case failed because the last result column
was missing - a nonexistent column in the sorter was referenced.
2025-08-28 09:49:55 +02:00
themixednuts
79a9f4743e fix: planner alias and table name 2025-08-27 18:13:03 -05:00
Glauber Costa
29b93e3e58 add DBSP circuit compiler
The next step is to adapt the view code to use circuits instead of
listing the operators manually.
2025-08-27 14:21:32 -05:00
Glauber Costa
c776e4eefb First implementation of Logical plan
This is a first pass on logical plans. The idea is that the DBSP
compiler will have an easier time operating on a logical plan, that
exposes linear algebra operators, than on SQL expr.

To keep this simple, we only support filters, aggregates and projections
for now, and will add more later as we agree on the core of the
implementation.

To make sure that the implementations is reasonable, I tried my best to
generate a couple of logical plans using Datafusion and seeing if we
were generating something similar.

Our plans are not the same as Datafusion's, though. There are two
important differences:

* SQLite is weird, and it allows columns that are not part of the group
  by statement to appear in aggregated statements. For example:
  select a, count(b) from table group by c; <== that "a" is usually not
  permitted and datafusion will reject it. SQLite will be happy to
  accept it

* Datafusion will not generate a projection on queries like this:
  select sum(hex(a)) from table, and just keep the complex expression
  hex(a) inside the aggregation. For DBSP to work well, we'll need an
  explicit aggregation there.

Because there are no users yet, I am marking this as [cfg(test)], but
I wanted to put this out there ASAP.
2025-08-27 11:18:54 -05:00
Pekka Enberg
26ba09c45f Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz"
This reverts commit 71c1b357e4, reversing
changes made to 6bc568ff69 because it
actually makes things slower.
2025-08-26 14:58:21 +03:00
Pekka Enberg
8f11311473 Merge 'Improve encryption API' from Avinash Sajjanshetty
This patch brings a bunch of quality of life improvements to encryption:
1. Previously, we just let any string to be used as a key. I have
updated the `PRAGMA hexkey=''` to get the key in hex. I have also
renamed from `key`, because that will be used to get passphrase
2. Added `PRAGMA cipher` so that now users can select which cipher they
want to use (for now, either `aegis256` or `aes256gcm`)
3. We now set the encryption context when both cipher and key are set
I also updated tests to reflect this.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2779
2025-08-26 08:32:29 +03:00
pedrocarlo
d3240844ec refactor Core to remove the double indirection 2025-08-25 22:59:31 -03:00
Glauber Costa
097510216e implement the projector operator for DBSP
My goal with this patch is to be able to implement the ProjectOperator
for DBSP circuits using VDBE for expression evaluation.

*not* doing so is dangerous for the following reason: we will end up
with different, subtle, and incompatible behavior between SQLite
expressions if they are used in views versus outside of views.

In fact, even in our prototype had them: our projection tests, which
used to pass, were actually wrong =) (sqlite would return something
different if those functions were executed outside the view context)

For optimization reasons, we single out trivial expressions: they don't
have go through VDBE. Trivial expressions are expressions that only
involve Columns, Literals, and simple operators on elements of the same
type. Even type coercion takes this out of the realm of trivial.

Everything that is not trivial, is then translated with translate_expr -
in the same way SQLite will, and then compiled with VDBE.

We can, over time, make this process much better. There are essentially
infinite opportunities for optimization here. But for now, the main
warts are:
* VDBE execution needs a connection
* There is no good way in VDBE to pass parameters to a program.
* It is almost trivial to pollute the original connection. For example,
  we need to issue HALT for the program to stop, but seeing that halt
  will usually cause the program to try and halt the original program.

Subprograms, like the ones we use in triggers are a possible solution,
but they are much more expensive to execute, especially given that our
execution would essentially have to have a program with no other role
than to wrap the subprogram.

Therefore, what I am doing is:
* There is an in-memory database inside the projection operator (an
  obvious optimization is to share it with *all* projection operators).
* We obtain a connection to that database when the operator is created
* We use that connection to execute our VDBE, which offers a clean, safe
  and isolated way to execute the expression.
* We feed the values to the program manually by editing the registers
  directly.
2025-08-25 17:48:17 +03:00
Glauber Costa
38def26704 Add expr_compiler
To be used in DBSP-based projections. This will compile an expression
to VDBE bytecode and execute it.

To do that we need to add a new type of Expression, which we call a
Register.

This is a way for us to pass parameters to a DBSP program which will be
not columns or literals, but inputs from the DBSP deltas.
2025-08-25 17:48:17 +03:00
Pekka Enberg
5d3780f25d core/translate: Add CREATE INDEX IF NOT EXISTS support
Fixes #2263
2025-08-25 11:12:41 +03:00
Avinash Sajjanshetty
328c5edf4d Add PRAGMA cipher to allow setting cipher algo 2025-08-25 02:17:53 +05:30
Alex Miller
370da9fa59 ANALYZE creates sqlite_stat1 if it doesn't exist
This change replaces a bail_parse_error!() when sqlite_stat1 doesn't
exist with the appropriate codegen to create the table, and handle both
cases of the table existing or not existing.

SQLite's codegen looks like:

sqlite> create table stat_test(a,b,c);
sqlite> explain analyze stat_test;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     40    0                    0   Start at 40
1     ReadCookie     0     3     2                    0
2     If             3     5     0                    0
3     SetCookie      0     2     4                    0
4     SetCookie      0     5     1                    0
5     CreateBtree    0     2     1                    0   r[2]=root iDb=0 flags=1
6     OpenWrite      0     1     0     5              0   root=1 iDb=0
7     NewRowid       0     1     0                    0   r[1]=rowid
8     Blob           6     3     0                   0   r[3]= (len=6)
9     Insert         0     3     1                    8   intkey=r[1] data=r[3]
10    Close          0     0     0                    0
11    Close          0     0     0                    0
12    Null           0     4     5                    0   r[4..5]=NULL
13    Noop           4     0     4                    0
14    OpenWrite      3     1     0     5              0   root=1 iDb=0; sqlite_master
15    SeekRowid      3     17    1                    0   intkey=r[1]
16    Rowid          3     5     0                    0   r[5]= rowid of 3
17    IsNull         5     26    0                    0   if r[5]==NULL goto 26
18    String8        0     6     0     table          0   r[6]='table'
19    String8        0     7     0     sqlite_stat1   0   r[7]='sqlite_stat1'
20    String8        0     8     0     sqlite_stat1   0   r[8]='sqlite_stat1'
21    Copy           2     9     0                    0   r[9]=r[2]
22    String8        0     10    0     CREATE TABLE sqlite_stat1(tbl,idx,stat) 0   r[10]='CREATE TABLE sqlite_stat1(tbl,idx,stat)'
23    MakeRecord     6     5     4     BBBDB          0   r[4]=mkrec(r[6..10])
24    Delete         3     68    5                    0
25    Insert         3     4     5                    0   intkey=r[5] data=r[4]
26    SetCookie      0     1     2                    0
27    ParseSchema    0     0     0     tbl_name='sqlite_stat1' AND type!='trigger' 0
28    OpenWrite      0     2     0     3              16  root=2 iDb=0; sqlite_stat1
29    OpenRead       5     2     0     3              0   root=2 iDb=0; stat_test
30    String8        0     18    0     stat_test      0   r[18]='stat_test'; stat_test
31    Count          5     20    0                    0   r[20]=count()
32    IfNot          20    37    0                    0
33    Null           0     19    0                    0   r[19]=NULL
34    MakeRecord     18    3     16    BBB            0   r[16]=mkrec(r[18..20])
35    NewRowid       0     12    0                    0   r[12]=rowid
36    Insert         0     16    12                   8   intkey=r[12] data=r[16]
37    LoadAnalysis   0     0     0                    0
38    Expire         0     0     0                    0
39    Halt           0     0     0                    0
40    Transaction    0     1     1     0              1   usesStmtJournal=1
41    Goto           0     1     0                    0

And now Turso's looks like:

turso> create table stat_test(a,b,c);
turso> explain analyze stat_test;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     23    0                    0   Start at 23
1     Null               0     1     0                    0   r[1]=NULL
2     CreateBtree        0     2     1                    0   r[2]=root iDb=0 flags=1
3     OpenWrite          0     1     0                    0   root=1; iDb=0
4     NewRowid           0     3     0                    0   r[3]=rowid
5     String8            0     4     0     table          0   r[4]='table'
6     String8            0     5     0     sqlite_stat1   0   r[5]='sqlite_stat1'
7     String8            0     6     0     sqlite_stat1   0   r[6]='sqlite_stat1'
8     Copy               2     7     0                    0   r[7]=r[2]
9     String8            0     8     0     CREATE TABLE sqlite_stat1(tbl,idx,stat)  0   r[8]='CREATE TABLE sqlite_stat1(tbl,idx,stat)'
10    MakeRecord         4     5     9                    0   r[9]=mkrec(r[4..8])
11    Insert             0     9     3     sqlite_stat1   0   intkey=r[3] data=r[9]
12    ParseSchema        0     0     0     tbl_name = 'sqlite_stat1' AND type != 'trigger'  0   tbl_name = 'sqlite_stat1' AND type != 'trigger'
13    OpenWrite          1     2     0                    0   root=2; iDb=0
14    OpenRead           2     2     0                    0   =stat_test, root=2, iDb=0
15    String8            0     12    0     stat_test      0   r[12]='stat_test'
16    Count              2     14    0                    0
17    IfNot              14    22    0                    0   if !r[14] goto 22
18    Null               0     13    0                    0   r[13]=NULL
19    MakeRecord         12    3     11                   0   r[11]=mkrec(r[12..14])
20    NewRowid           1     10    0                    0   r[10]=rowid
21    Insert             1     11    10    sqlite_stat1   0   intkey=r[10] data=r[11]
22    Halt               0     0     0                    0
23    Goto               0     1     0                    0

The notable difference in size is following the same codegen difference
in CREATE TABLE, where sqlite's odd dance of adding a placeholder entry
which is immediately replaced is instead done in tursodb as just
inserting the correct row in the first place. Aside from lines 6-13 of
sqlite's vdbe being missing, there's still the lack of LoadAnalysis,
Expire, and Cookie management.
2025-08-24 13:35:39 -07:00
Avinash Sajjanshetty
0308374d3a Use proper hexadecimal key for encryption
Added `from_hex_string` which gets us `EncryptionKey` from a
hex string. Now we can use securely generated keys, like from openssl

$ openssl rand -hex 32
2025-08-25 01:36:05 +05:30
Pekka Enberg
1b89273f10 Merge 'refactor encryption module and make it configurable' from Avinash Sajjanshetty
Previously, the encryption module had hardcoded a lot of things. This
refactor makes it slightly nice and makes it configurable.
Right now cipher algorithm is assumed and hardcoded, I will make that
configurable in the upcoming PR

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2722
2025-08-24 08:16:28 +03:00
themixednuts
80eca66be9 fix: normalize quotes in update
fixes: #2744
2025-08-23 03:17:03 -05:00
Alex Miller
4619890ffc Add basic support for ANALYZE statement.
This permits only `ANALYZE <table_name>` to work, and all other forms
fail with a parse error (as documented in the tests).

On SQLite, ANALYZE generates:

sqlite> CREATE TABLE sqlite_stat1(tbl,idx,stat);
sqlite> CREATE TABLE iiftest(a int, b int, c int);
sqlite> EXPLAIN ANALYZE iiftest;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     21    0                    0   Start at 21
1     Null           0     1     0                    0   r[1]=NULL
2     OpenWrite      3     4     0     3              0   root=4 iDb=0; sqlite_stat1
3     Rewind         3     9     0                    0
4       Column         3     0     2                    0   r[2]= cursor 3 column 0
5       Ne             3     8     2     BINARY-8       81  if r[2]!=r[3] goto 8
6       Rowid          3     4     0                    0   r[4]=sqlite_stat1.rowid
7       Delete         3     0     0     sqlite_stat1   2
8     Next           3     4     0                    1
9     OpenWrite      0     4     0     3              0   root=4 iDb=0; sqlite_stat1
10    OpenRead       4     2     0     3              0   root=2 iDb=0; iiftest
11    String8        0     11    0     iiftest        0   r[11]='iiftest'; iiftest
12    Count          4     13    0                    0   r[13]=count()
13    IfNot          13    18    0                    0
14    Null           0     12    0                    0   r[12]=NULL
15    MakeRecord     11    3     9     BBB            0   r[9]=mkrec(r[11..13])
16    NewRowid       0     5     0                    0   r[5]=rowid
17    Insert         0     9     5                    8   intkey=r[5] data=r[9]
18    LoadAnalysis   0     0     0                    0
19    Expire         0     0     0                    0
20    Halt           0     0     0                    0
21    Transaction    0     1     9     0              1   usesStmtJournal=0
22    String8        0     3     0     iiftest        0   r[3]='iiftest'
23    Goto           0     1     0                    0

Turso can now generate:

turso> create table sqlite_stat1(tbl,idx,stat);
turso> create table iiftest(a int, b int, c int);
turso> explain analyze iiftest;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     19    0                    0   Start at 19
1     Null               0     1     0                    0   r[1]=NULL
2     OpenWrite          0     2     0                    0   root=2; iDb=0
3     Rewind             0     9     0                    0   Rewind  sqlite_stat1
4       Column           0     0     2                    0   r[2]=sqlite_stat1.tbl
5       Ne               2     3     9                    0   if r[2]!=r[3] goto 9
6       RowId            0     4     0                    0   r[4]=sqlite_stat1.rowid
7       Delete           0     0     0     sqlite_stat1   0
8     Next               0     4     0                    0
9     OpenWrite          1     2     0                    0   root=2; iDb=0
10    OpenRead           2     3     0                    0   =iiftest, root=3, iDb=0
11    String8            0     7     0     iiftest        0   r[7]='iiftest'
12    Count              2     9     0                    0
13    IfNot              9     18    0                    0   if !r[9] goto 18
14    Null               0     8     0                    0   r[8]=NULL
15    MakeRecord         7     3     6                    0   r[6]=mkrec(r[7..9])
16    NewRowid           1     5     0                    0   r[5]=rowid
17    Insert             1     6     5     sqlite_stat1   0   intkey=r[5] data=r[6]
18    Halt               0     0     0                    0
19    String8            0     3     0     iiftest        0   r[3]='iiftest'
20    Goto               0     1     0                    0

Note the missing support for LoadAnalysis and Expire, but there's no
optimizer work done yet to leverage any gathered statistics yet anyway.
2025-08-22 23:18:53 -07:00
Levy A.
4ba1304fb9 complete parser integration 2025-08-21 15:23:59 -03:00
Levy A.
186e2f5d8e switch to new parser 2025-08-21 15:19:16 -03:00
Avinash Sajjanshetty
3090545167 use encryption ctx instead of encryption key 2025-08-21 22:36:32 +05:30
Jussi Saurio
cc28b8833e Fix condition that checks table.cols against number of provided values 2025-08-21 16:40:10 +03:00
Jussi Saurio
b5bd31a47b Remove old unused data structures and functions 2025-08-21 16:40:10 +03:00
Jussi Saurio
ac56d5bb67 Use new datastructures and functions in translate_insert 2025-08-21 16:40:10 +03:00
Jussi Saurio
88c4eae63e Add functions for constructing and translating Insertions 2025-08-21 16:40:10 +03:00
Jussi Saurio
630441e270 Add new Insertion datastructures 2025-08-21 16:40:10 +03:00
Jussi Saurio
dd2e0ea596 Fix: always emit rowid when column is rowid alias
SQLite does not store the rowid alias column in the record at all
when it is a rowid alias, because the rowid is always stored anyway
in the record header.
2025-08-21 16:40:10 +03:00