## Problem:
- We have cases where we are evaluating expressions in a hot loop that
could only be evaluated once. For example: `CAST('2025-01-01' as
DATETIME)` -- the value of this never changes, so we should only run it
once.
- We have no robust way of doing this right now for entire _expressions_
-- the only existing facility we have is
`program.mark_last_insn_constant()`, which has no concept of how many
instructions translating a given _expression_ spends, and breaks very
easily for this reason.
## Main ideas of this PR:
- Add `expr.is_constant()` determining whether the expression is
compile-time constant. Tries to be conservative and not deem something
compile-time constant if there is no certainty.
- Whenever we think a compile-time constant expression is about to be
translated into bytecode in `translate_expr()`, start a so called
`constant span`, which means a range of instructions that are part of a
compile-time constant expression.
- At the end of translating the program, all `constant spans` are
hoisted outside of any table loops so they only get evaluated once.
- The target offsets of any jump instructions (e.g. `Goto`) are moved to
the correct place, taking into account all instructions whose offsets
were shifted due to moving the compile-time constant expressions around.
- An escape hatch wrapper `translate_expr_no_constant_opt()` is added
for cases where we should not hoist constants even if we otherwise
could. Right now the only example of this is cases where we are reusing
the same register(s) in multiple iterations of some kind of loop, e.g.
`VALUES(...)` or in the `coalesce()` function implementation.
## Performance effects
Here is an example of a modified/simplified TPC-H query where the
`CAST()` calls were previously run millions of times in a hot loop, but
now they are optimized out of the loop.
**BYTECODE PLAN BEFORE:**
```sql
limbo> explain select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 26 0 0 Start at 26
1 OpenRead 0 10 0 0 table=lineitem, root=10
2 OpenRead 1 9 0 0 table=orders, root=9
3 OpenRead 2 8 0 0 table=customer, root=8
4 Rewind 0 25 0 0 Rewind lineitem
5 Column 0 10 5 0 r[5]=lineitem.l_shipdate
6 String8 0 7 0 1995-03-29 0 r[7]='1995-03-29'
7 Function 0 7 6 cast 0 r[6]=func(r[7..8]) <-- CAST() executed millions of times
8 Le 5 6 24 0 if r[5]<=r[6] goto 24
9 Column 0 0 9 0 r[9]=lineitem.l_orderkey
10 SeekRowid 1 9 24 0 if (r[9]!=orders.rowid) goto 24
11 Column 1 4 10 0 r[10]=orders.o_orderdate
12 String8 0 12 0 1995-03-29 0 r[12]='1995-03-29'
13 Function 0 12 11 cast 0 r[11]=func(r[12..13])
14 Ge 10 11 24 0 if r[10]>=r[11] goto 24
15 Column 1 1 14 0 r[14]=orders.o_custkey
16 SeekRowid 2 14 24 0 if (r[14]!=customer.rowid) goto 24
17 Column 2 6 15 0 r[15]=customer.c_mktsegment
18 Ne 15 16 24 0 if r[15]!=r[16] goto 24
19 Column 0 0 1 0 r[1]=lineitem.l_orderkey
20 Integer 3 2 0 0 r[2]=3
21 Column 1 4 3 0 r[3]=orders.o_orderdate
22 Column 1 7 4 0 r[4]=orders.o_shippriority
23 ResultRow 1 4 0 0 output=r[1..4]
24 Next 0 5 0 0
25 Halt 0 0 0 0
26 Transaction 0 0 0 0 write=false
27 String8 0 8 0 DATETIME 0 r[8]='DATETIME'
28 String8 0 13 0 DATETIME 0 r[13]='DATETIME'
29 String8 0 16 0 FURNITURE 0 r[16]='FURNITURE'
30 Goto 0 1 0
```
**BYTECODE PLAN AFTER**:
```sql
limbo> explain select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 21 0 0 Start at 21
1 OpenRead 0 10 0 0 table=lineitem, root=10
2 OpenRead 1 9 0 0 table=orders, root=9
3 OpenRead 2 8 0 0 table=customer, root=8
4 Rewind 0 20 0 0 Rewind lineitem
5 Column 0 10 5 0 r[5]=lineitem.l_shipdate
6 Le 5 6 19 0 if r[5]<=r[6] goto 19
7 Column 0 0 9 0 r[9]=lineitem.l_orderkey
8 SeekRowid 1 9 19 0 if (r[9]!=orders.rowid) goto 19
9 Column 1 4 10 0 r[10]=orders.o_orderdate
10 Ge 10 11 19 0 if r[10]>=r[11] goto 19
11 Column 1 1 14 0 r[14]=orders.o_custkey
12 SeekRowid 2 14 19 0 if (r[14]!=customer.rowid) goto 19
13 Column 2 6 15 0 r[15]=customer.c_mktsegment
14 Ne 15 16 19 0 if r[15]!=r[16] goto 19
15 Column 0 0 1 0 r[1]=lineitem.l_orderkey
16 Column 1 4 3 0 r[3]=orders.o_orderdate
17 Column 1 7 4 0 r[4]=orders.o_shippriority
18 ResultRow 1 4 0 0 output=r[1..4]
19 Next 0 5 0 0
20 Halt 0 0 0 0
21 Transaction 0 0 0 0 write=false
22 String8 0 7 0 1995-03-29 0 r[7]='1995-03-29'
23 String8 0 8 0 DATETIME 0 r[8]='DATETIME'
24 Function 1 7 6 cast 0 r[6]=func(r[7..8]) <-- CAST() executed twice
25 String8 0 12 0 1995-03-29 0 r[12]='1995-03-29'
26 String8 0 13 0 DATETIME 0 r[13]='DATETIME'
27 Function 1 12 11 cast 0 r[11]=func(r[12..13])
28 String8 0 16 0 FURNITURE 0 r[16]='FURNITURE'
29 Integer 3 2 0 0 r[2]=3
30 Goto 0 1 0 0
```
**EXECUTION RUNTIME BEFORE:**
```sql
limbo> select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 3.633396667 s (this includes parsing/coloring of cli app)
```
**EXECUTION RUNTIME AFTER:**
```sql
limbo> select
l_orderkey,
3 as revenue,
o_orderdate,
o_shippriority
from
lineitem,
orders,
customer
where
c_mktsegment = 'FURNITURE'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < cast('1995-03-29' as datetime)
and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 2.0923475 s (this includes parsing/coloring of cli app)
````
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#1359
Closes#1384 . This PR implements Primary Key constraint for inserts. As
can be seen in the issue, if you created an Index with a Primary Key
constraint, it could trigger `Unique Constraint` error, but still insert
the record. Sqlite uses the opcode `NoConflict` to check if the record
already exists in the Btree. As we did not have this Opcode yet, I
implemented it. It is very similar to `NotFound` with the difference
that if any value in the Record is Null, it will immediately jump to the
offset. The added benefit of implementing this, is that now we fully
support Composite Primary Keys. Also, I think with the current
implementation, it will be trivial to implement the Unique opcode for
Insert. To support Updates, I need to understand more of the plan
optimizer to and find where we are Making the Record and opening the
autoindex.
For testing, I have written a test generator to generate many different
tables that can have a varying numbers of Primary Keys.
```sql
limbo> CREATE TABLE users (id INT, username TEXT, PRIMARY KEY (id, username));
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> explain INSERT INTO users VALUES (1, 'alice');
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 16 0 0 Start at 16
1 OpenWrite 0 2 0 0
2 Integer 1 2 0 0 r[2]=1
3 String8 0 3 0 alice 0 r[3]='alice'
4 OpenWrite 1 3 0 0
5 NewRowId 0 1 0 0
6 Copy 2 5 0 0 r[5]=r[2]
7 Copy 3 6 0 0 r[6]=r[3]
8 Copy 1 7 0 0 r[7]=r[1]
9 MakeRecord 5 3 8 0 r[8]=mkrec(r[5..7])
10 NoConflict 1 12 5 2 0 key=r[5]
11 Halt 1555 0 0 users.id, users.username 0
12 IdxInsert 1 8 5 0 key=r[8]
13 MakeRecord 2 2 4 0 r[4]=mkrec(r[2..3])
14 Insert 0 4 1 0
15 Halt 0 0 0 0
16 Transaction 0 1 0 0 write=true
17 Goto 0 1 0 0
limbo> INSERT INTO users VALUES (1, 'alice');
× Runtime error: UNIQUE constraint failed: users.id, users.username (19)
limbo> INSERT INTO users VALUES (1, 'bob');
limbo> INSERT INTO users VALUES (1, 'bob');
× Runtime error: UNIQUE constraint failed: users.id, users.username (19)
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1393
1) Fix a bug where cli pretty mode would not print pragma results;
2) Add ability to read page_size using PRAGMA page_size;
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1394
This avoids redundant `IsNull` instructions during index seeks if the
seek key columns are primary keys of other tables, which they often are.
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes#1388
I was reading through the `translate_expr` function and `COMPAT.md` to
see what was not implemented yet. I saw that `Expr::Between` was marked
as a `todo!` so I set trying to implement it only to find that it was
being rewritten in the optimizer haha. This PR just adjusts the docs and
add an `unreachable` in the appropriate locations.
Closes#1378
This PR adds the statement.columns() function, inspired from Rusqlite: h
ttps://docs.rs/rusqlite/latest/rusqlite/struct.Statement.html#method.col
umns
Note that the rusqlite documentation says
> If associated DB schema can be altered concurrently, you should make
sure that current statement has already been stepped once before calling
this method.
Do we have this requirement as well?
The first commit is just the rust binding. The second commit implements
the column name for the rowid column.
Closes#1376
The previous version of `julian_day-converter` had precision issues,
potentially causing loss of precision when converting between
`julianday` and `datetime`

Reviewed-by: Diego Reis (@diegoreis42)
Closes#1344