Commit Graph

2452 Commits

Author SHA1 Message Date
Jussi Saurio
c8f5bd3f4f rename 2025-05-14 09:42:26 +03:00
Jussi Saurio
630a6093aa refactor join_lhs_tables_to_rhs_table 2025-05-14 09:42:26 +03:00
Jussi Saurio
62d2ee8eb6 rename 2025-05-14 09:42:26 +03:00
Jussi Saurio
5f9ebe26a0 as_binary_components() helper 2025-05-14 09:42:26 +03:00
Jussi Saurio
a92d94270a Get rid of useless ScanCost struct 2025-05-14 09:42:26 +03:00
Jussi Saurio
de9e8442e8 fix ephemeral 2025-05-14 09:42:25 +03:00
Jussi Saurio
3b1aef4a9e Do Less Work (tm) - everything works except ephemeral 2025-05-14 09:42:01 +03:00
Jussi Saurio
87850e5706 simplify 2025-05-14 09:41:14 +03:00
Jussi Saurio
77f11ba004 simplify AccessMethodKind 2025-05-14 09:41:14 +03:00
Jussi Saurio
5f724d6b2e Add more comments to join ordering logic 2025-05-14 09:41:14 +03:00
Jussi Saurio
c02d3f8bcd Do groupby/orderby sort elimination based on optimizer decision 2025-05-14 09:41:13 +03:00
Jussi Saurio
1e46f1d9de Feature: join reordering optimizer 2025-05-14 09:40:48 +03:00
Jussi Saurio
c8c83fc6e6 OPTIMIZER.MD docs 2025-05-14 09:39:47 +03:00
Jussi Saurio
67a080bfa0 dont mutate where clause during individual index selection phase 2025-05-14 09:39:47 +03:00
Jussi Saurio
1b71f58bbf Merge 'Redesign parameter binding in query translator' from Preston Thorpe
closes #1467
## Example:
Previously as explained in #1449, our parameter binding wasn't working
properly because we would essentially
assign the first index of whatever was translated first
```console
limbo> create table t (id integer primary key, name text, age integer);
limbo> explain select * from t where name = ? and id > ? and age between ? and ?;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     20    0                    0   Start at 20
1     OpenRead           0     2     0                    0   table=t, root=2
2     Variable           1     4     0                    0   r[4]=parameter(1) # always 1
3     IsNull             4     19    0                    0   if (r[4]==NULL) goto 19
4     SeekGT             0     19    4                    0   key=[4..4]
5       Column           0     1     5                    0   r[5]=t.name
6       Variable         2     6     0                    0   r[6]=parameter(2) # always 2
7       Ne               5     6     18                   0   if r[5]!=r[6] goto 18
8       Variable         3     7     0                    0   r[7]=parameter(3) # etc...
9       Column           0     2     8                    0   r[8]=t.age
10      Gt               7     8     18                   0   if r[7]>r[8] goto 18
11      Column           0     2     9                    0   r[9]=t.age
12      Variable         4     10    0                    0   r[10]=parameter(4)
13      Gt               9     10    18                   0   if r[9]>r[10] goto 18
14      RowId            0     1     0                    0   r[1]=t.rowid
15      Column           0     1     2                    0   r[2]=t.name
16      Column           0     2     3                    0   r[3]=t.age
17      ResultRow        1     3     0                    0   output=r[1..3]
18    Next               0     5     0                    0
19    Halt               0     0     0                    0
20    Transaction        0     0     0                    0   write=false
21    Goto               0     1     0                    0
```
## Solution:
`rewrite_expr` currently is used to transform `true|false` to `1|0`, so
it has been adapted to transform anonymous `Expr::Variable`s to named
variables, inserting the appropriate index of the parameter by passing
in a counter.
```rust
        ast::Expr::Variable(var) => {
            if var.is_empty() {
                // rewrite anonymous variables only, ensure that the `param_idx` starts at 1 and
                // all the expressions are rewritten in the order they come in the statement
                *expr = ast::Expr::Variable(format!("{}{param_idx}", PARAM_PREFIX));
                *param_idx += 1;
            }
            Ok(())
        }
```
# Corrected output: (notice the seek)
```console
limbo> explain select * from t where name = ? and id > ? and age between ? and ?;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     20    0                    0   Start at 20
1     OpenRead           0     2     0                    0   table=t, root=2
2     Variable           2     4     0                    0   r[4]=parameter(2)
3     IsNull             4     19    0                    0   if (r[4]==NULL) goto 19
4     SeekGT             0     19    4                    0   key=[4..4]
5       Column           0     1     5                    0   r[5]=t.name
6       Variable         1     6     0                    0   r[6]=parameter(1)
7       Ne               5     6     18                   0   if r[5]!=r[6] goto 18
8       Variable         3     7     0                    0   r[7]=parameter(3)
9       Column           0     2     8                    0   r[8]=t.age
10      Gt               7     8     18                   0   if r[7]>r[8] goto 18
11      Column           0     2     9                    0   r[9]=t.age
12      Variable         4     10    0                    0   r[10]=parameter(4)
13      Gt               9     10    18                   0   if r[9]>r[10] goto 18
14      RowId            0     1     0                    0   r[1]=t.rowid
15      Column           0     1     2                    0   r[2]=t.name
16      Column           0     2     3                    0   r[3]=t.age
17      ResultRow        1     3     0                    0   output=r[1..3]
18    Next               0     5     0                    0
19    Halt               0     0     0                    0
20    Transaction        0     0     0                    0   write=false
21    Goto               0     1     0                    0
```
## And a `Delete`:
```console
limbo> explain delete from t where name = ? and age > ? and id > ?;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     15    0                    0   Start at 15
1     OpenWrite          0     2     0                    0
2     Variable           3     1     0                    0   r[1]=parameter(3)
3     IsNull             1     14    0                    0   if (r[1]==NULL) goto 14
4     SeekGT             0     14    1                    0   key=[1..1]
5       Column           0     1     2                    0   r[2]=t.name
6       Variable         1     3     0                    0   r[3]=parameter(1)
7       Ne               2     3     13                   0   if r[2]!=r[3] goto 13
8       Column           0     2     4                    0   r[4]=t.age
9       Variable         2     5     0                    0   r[5]=parameter(2)
10      Le               4     5     13                   0   if r[4]<=r[5] goto 13
11      RowId            0     6     0                    0   r[6]=t.rowid
12      Delete           0     0     0                    0
13    Next               0     5     0                    0
14    Halt               0     0     0                    0
15    Transaction        0     1     0                    0   write=true
16    Goto               0     1     0                    0
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1475
2025-05-14 09:26:06 +03:00
Jussi Saurio
a0f973cb34 Merge 'Fix infinite loop when inserting multiple rows' from Jussi Saurio
Due to constant instruction reshuffling introduced in #1359, it is
advisable not to do either of the following:
1. Use raw offsets as jump targets
2. Use `program.resolve_label(my_label, program.offset())` when it is
uncertain what the next instruction will be
Instead, if you want a label to point to "whatever instruction follows
the last one", you should use
`program.preassign_label_to_next_insn(label)`, which will work correctly
even with instruction rerdering

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1474
2025-05-14 09:24:47 +03:00
PThorpe92
a0b2b6e85d Consolidate match case in parameters push to handle all anonymous params in one case 2025-05-13 14:42:12 -04:00
PThorpe92
2f255524bd Remove unused import and unnecessary mut annotations in insert.rs 2025-05-13 14:34:22 -04:00
PThorpe92
94aa9cd99d Add cases to rewrite_expr in the optimizer 2025-05-13 14:33:45 -04:00
PThorpe92
16ac6ab918 Fix parameter push method to re-convert anonymous parameters 2025-05-13 14:33:11 -04:00
PThorpe92
0593a99f0e Remove insertCtx from parameters and replace fix with expr rewriting 2025-05-13 12:49:16 -04:00
Jussi Saurio
a2e577ad01 Merge 'Fix handling of empty strings in prepared statements' from Diego Reis
When `prepare()` is called with an empty string it should throw an error
(e.g `ApiMisuse` in rusqlite). I'm testing only in JS but it should
throw to any bind.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1473
2025-05-13 10:12:09 +03:00
Jussi Saurio
957fe1b446 Fix infinite loop when inserting multiple rows 2025-05-13 08:54:25 +03:00
Diego Reis
07bfeadd56 core: Simplify error handling of malformed strings for prepared statements 2025-05-12 13:25:11 -03:00
Diego Reis
f7ab8b11d6 cargo fmt 2025-05-12 10:56:53 -03:00
Jussi Saurio
2b6b09d435 Merge 'btree: Coalesce free blocks in page_free_array()' from Mohamed Hossam
Coalesce adjacent free blocks during `page_free_array()` in
`core/storage/btree`.
Instead of immediately passing free cells to `free_cell_range()`, buffer
up to 10 free cells and try to merge adjacent free blocks. Break on the
first merge to avoid time complexity, `free_cell_range()` coalesces
blocks afterwards anyways. This follows SQLite's [`pageFreeArray()`](htt
ps://github.com/sqlite/sqlite/blob/d7324103b196c572a98724a5658970b4000b8
c39/src/btree.c#L7729) implementation.
Removed this TODO:
```rust
fn page_free_array( . . )
    .
    .
    // TODO: implement fancy smart free block coalescing procedure instead of dumb free to
    // then defragment
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1448
2025-05-12 16:52:11 +03:00
Diego Reis
c4e7be04f8 core: Handles prepared statement with empty SQL 2025-05-12 10:38:58 -03:00
m0hossam
70b9438f80 Add comments 2025-05-12 15:58:36 +03:00
Jussi Saurio
9b96e2bcc3 Merge 'CREATE VIRTUAL TABLE fixes' from Piotr Rżysko
This PR fixes two bugs in `CREATE VIRTUAL TABLE` (see individual commits
for details).
I added a test in `extensions.py` instead of the TCL test suite because
it's currently more convenient - there’s no existing framework for
loading extensions in TCL tests. However, I believe the test should
eventually be moved to TCL, as it verifies behavior expected to be
compatible with SQLite and is independent of any specific extension.
I've seen a PR proposing to migrate TCL tests to Rust, so please let me
know if moving this test to TCL would still be valuable.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1471
2025-05-12 10:47:51 +03:00
Jussi Saurio
501e95637a Merge 'Support isnull and notnull expr' from meteorgan
Limbo `isnull` output:
```
limbo> explain select 1 isnull;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     3     0                    0   Start at 3
1     ResultRow          1     1     0                    0   output=r[1]
2     Halt               0     0     0                    0
3     Integer            1     2     0                    0   r[2]=1
4     Integer            1     1     0                    0   r[1]=1
5     IsNull             2     7     0                    0   if (r[2]==NULL) goto 7
6     Integer            0     1     0                    0   r[1]=0
7     Goto               0     1     0                    0
```
Sqlite `isnull` output:
```
sqlite>  explain select 1 isnull;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     6     0                    0   Start at 6
1     Integer        1     1     0                    0   r[1]=1
2     IsNull         2     4     0                    0   if r[2]==NULL goto 4
3     Integer        0     1     0                    0   r[1]=0
4     ResultRow      1     1     0                    0   output=r[1]
5     Halt           0     0     0                    0
6     Integer        1     2     0                    0   r[2]=1
7     Goto           0     1     0                    0
```
------------------------------------------------------------------------
-------------------
Limbo `notnull` output:
```
limbo> explain select 1 notnull;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     3     0                    0   Start at 3
1     ResultRow          1     1     0                    0   output=r[1]
2     Halt               0     0     0                    0
3     Integer            1     2     0                    0   r[2]=1
4     Integer            1     1     0                    0   r[1]=1
5     NotNull            2     7     0                    0   r[2]!=NULL -> goto 7
6     Integer            0     1     0                    0   r[1]=0
7     Goto               0     1     0                    0
```
Sqlite `notnull` output:
```
sqlite> explain select 1 notnull;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     6     0                    0   Start at 6
1     Integer        1     1     0                    0   r[1]=1
2     NotNull        2     4     0                    0   if r[2]!=NULL goto 4
3     Integer        0     1     0                    0   r[1]=0
4     ResultRow      1     1     0                    0   output=r[1]
5     Halt           0     0     0                    0
6     Integer        1     2     0                    0   r[2]=1
7     Goto           0     1     0                    0
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1468
2025-05-12 10:06:35 +03:00
Jussi Saurio
07415dac07 Merge 'Count optimization' from Pedro Muniz
After reading #1123, I wanted to see what optimizations I could do.
Sqlite optimizes `count` aggregation for the following case: `SELECT
count() FROM <tbl>`. This is so widely used, that they made an
optimization just for it in the form of the `COUNT` opcode.
This PR thus implements this optimization by creating the `COUNT`
opcode, and checking in the select emitter if we the query is a Simple
Count Query. If it is, we just emit the Opcode instead of going through
a Rewind loop, saving on execution time.
The screenshots below show a huge decrease in execution time.
- **Main**
<img width="383" alt="image" src="https://github.com/user-
attachments/assets/99a9dec4-e7c5-41db-ba67-4eafa80dd2e6" />
- **Count Optimization**
<img width="435" alt="image" src="https://github.com/user-
attachments/assets/e93b3233-92e6-4736-aa60-b52b2477179f" />

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1443
2025-05-12 10:01:50 +03:00
Piotr Rzysko
d5984445a9 Fix panic on CREATE VIRTUAL TABLE IF NOT EXISTS by halting VM properly
Fixes a runtime panic caused by failing to halt the virtual machine
after executing CREATE VIRTUAL TABLE IF NOT EXISTS.

Previously resulted in:
thread 'main' panicked at core/vdbe/mod.rs:408:52:
index out of bounds: the len is 3 but the index is 3
2025-05-11 21:21:18 +02:00
Piotr Rzysko
fdffbc9534 Ensure virtual table name uniqueness 2025-05-11 21:21:18 +02:00
PThorpe92
ab23f2a24f Add comments and reorganize fix of ordering parameters for insert statements 2025-05-11 14:20:57 -04:00
meteorgan
5185f4bf9e Support isnull and notnull expr 2025-05-11 23:47:30 +08:00
pedrocarlo
9f726dbe62 simplify simple count detection 2025-05-10 22:36:43 -03:00
pedrocarlo
977d09fd36 small fixes 2025-05-10 22:23:01 -03:00
pedrocarlo
7508043b62 add bench for select count 2025-05-10 22:23:01 -03:00
pedrocarlo
e9b1631d3c fix is_simple_count detection 2025-05-10 22:23:01 -03:00
pedrocarlo
342bf51c88 remove state machine for count 2025-05-10 22:23:01 -03:00
pedrocarlo
655ceeca45 correct count implementation 2025-05-10 22:23:01 -03:00
pedrocarlo
b69debb8c0 create count opcode 2025-05-10 22:23:01 -03:00
PThorpe92
1c7a50de96 Update comments and correct vtab insert behavior 2025-05-10 10:03:00 -04:00
PThorpe92
e9458de0a4 Use correct math to get value indicies for nth row on multiple insert 2025-05-10 07:46:30 -04:00
PThorpe92
0d73fe0fe7 Fix parameter position on insert by handling before vdbe layer 2025-05-10 07:46:29 -04:00
PThorpe92
50f2621c12 Add several more rust tests for parameter binding 2025-05-10 07:46:29 -04:00
PThorpe92
c4aee50b58 Fix unclear comments in translator 2025-05-10 07:46:29 -04:00
PThorpe92
7a5422ee30 Clean up api for remap parameters and consoidate code 2025-05-10 07:46:29 -04:00
PThorpe92
3691779408 Add tracing for remapping parameters 2025-05-10 07:46:29 -04:00
PThorpe92
d412e7c682 Improve naming of parameter remapping methods 2025-05-10 07:46:28 -04:00