Jussi Saurio
029e5eddde
Fix existing resolve_label() calls to work with new system
2025-04-24 11:05:21 +03:00
Jussi Saurio
e557503091
expr.rs: use constant spans to optimize constant expressions
2025-04-24 11:05:21 +03:00
Jussi Saurio
0f5c791784
vdbe: refactor label resolution to account for insn offsets changing
2025-04-24 11:05:21 +03:00
Jussi Saurio
b4b38bdb3c
vdbe: resolve labels for InitCoroutine::start_offset
2025-04-24 11:05:21 +03:00
Jussi Saurio
47f3f3bda3
vdbe: replace constant_insns with constant_spans
2025-04-24 11:05:21 +03:00
Jussi Saurio
e5bab63522
add expr.is_constant()
2025-04-24 11:05:21 +03:00
Jussi Saurio
5bed331505
add Func::is_deterministic()
2025-04-24 11:05:21 +03:00
Jussi Saurio
b36c898842
rename check_constant() to less confusing name
2025-04-24 11:05:21 +03:00
Jussi Saurio
6ff5ff49b7
Merge 'perf/btree: use binary search for Index seek operations' from Jussi Saurio
...
## Beef
Follow-up to #1357, which applied the same treatment to table btrees only.
After this PR, all of our seeks use binary search for both interior and
leaf pages.
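As a rough illustration of the idea (not the code from this PR), a binary search with SeekGE semantics over the sorted keys of a single page could look like the sketch below. The `IndexKey` alias and the `seek_ge` function are hypothetical names, and keys are modeled as plain integer tuples rather than real index records.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the cells of one index page: each cell is a
/// multi-column key, and cells are stored in ascending key order.
type IndexKey = Vec<i64>;

/// Returns the position of the first cell whose key is >= `target`
/// (SeekGE semantics), or `None` if every cell compares less than `target`.
fn seek_ge(cells: &[IndexKey], target: &[i64]) -> Option<usize> {
    let (mut lo, mut hi) = (0, cells.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        // Slices compare lexicographically, like a multi-column index key.
        match cells[mid].as_slice().cmp(target) {
            Ordering::Less => lo = mid + 1, // answer lies to the right of mid
            _ => hi = mid,                  // mid itself may be the answer
        }
    }
    (lo < cells.len()).then_some(lo)
}
```

Each probe halves the remaining range, so a page with n cells needs about log2(n) key comparisons instead of up to n for a linear scan.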
## Perf comparison
using TPC-H 1GB db for this query:
```sql
limbo> explain select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     18    0                    0   Start at 18
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     7     0                    0   table=sqlite_autoindex_partsupp_1, root=7
3     Rewind             0     14    0                    0   Rewind lineitem
4     Column             0     1     2                    0   r[2]=lineitem.l_partkey
5     IsNull             2     13    0                    0   if (r[2]==NULL) goto 13
6     Column             0     2     3                    0   r[3]=lineitem.l_suppkey
7     IsNull             3     13    0                    0   if (r[3]==NULL) goto 13
8     SeekGE             1     13    2                    0   key=[2..3] <-- index seek here, for every row in lineitem
9     IdxGT              1     13    2                    0   key=[2..3]
10    Integer            1     5     0                    0   r[5]=1
11    AggStep            0     5     4     count          0   accum=r[4] step(r[5])
12    Next               1     9     0                    0
13    Next               0     4     0                    0
14    AggFinal           0     4     0     count          0   accum=r[4]
15    Copy               4     1     0                    0   r[1]=r[4]
16    ResultRow          1     1     0                    0   output=r[1]
17    Halt               0     0     0                    0
18    Transaction        0     0     0                    0   write=false
19    Goto               0     1     0                    0
```
main:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│ 6001215 │
└───────────┘
Command stats:
----------------------------
total: 40.292102375 s (this includes parsing/coloring of cli app)
```
PR:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│ 6001215 │
└───────────┘
Command stats:
----------------------------
total: 14.021689916 s (this includes parsing/coloring of cli app)
```
almost 3x faster. buzzkill: still 3x slower than sqlite :)
Closes #1387
2025-04-24 10:53:35 +03:00
Jussi Saurio
c88c579154
Merge 'expr.is_nonnull(): return true if col.primary_key || col.notnull' from Jussi Saurio
...
This avoids redundant `IsNull` instructions during index seeks if the
seek key columns are primary keys of other tables, which they often are.
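A minimal sketch of the check described in the title, assuming simplified stand-in types: the `Column` and `Expr` definitions below are hypothetical, and only the `primary_key` and `notnull` field names come from this PR.

```rust
/// Hypothetical, simplified column metadata. Only the two flags named in
/// the PR title are modeled.
struct Column {
    primary_key: bool,
    notnull: bool,
}

/// Hypothetical expression type with just enough variants for the example.
enum Expr<'a> {
    Column(&'a Column),
    /// `None` models a NULL literal.
    Literal(Option<i64>),
}

impl Expr<'_> {
    /// True when the expression can never evaluate to NULL, in which case
    /// the planner may skip emitting an `IsNull` guard before a seek.
    fn is_nonnull(&self) -> bool {
        match self {
            Expr::Column(col) => col.primary_key || col.notnull,
            Expr::Literal(value) => value.is_some(),
        }
    }
}
```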
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes #1388
2025-04-24 10:32:00 +03:00
Jussi Saurio
a7488496d5
expr.is_nonnull(): return true if col.primary_key || col.notnull
2025-04-23 18:10:33 +03:00
Jussi Saurio
af703110f8
btree: remove extra iter_dir argument that can be derived from seek_op
2025-04-23 17:38:48 +03:00
Jussi Saurio
044339efc7
btree: rename tablebtree_move_to_binsearch -> tablebtree_move_to
2025-04-23 17:35:22 +03:00
Jussi Saurio
8c338438dd
btree: use binary search for index interior cell seek
2025-04-23 17:34:32 +03:00
Jussi Saurio
7a133f422f
btree: use binary search for index leaves
2025-04-23 17:34:32 +03:00
Jussi Saurio
8743dcd0da
btree: extract indexbtree_seek() into a function like tablebtree_seek()
2025-04-23 17:34:32 +03:00
Levy A.
613a332e99
doc: add doc for DoubleDouble
2025-04-23 10:13:32 -03:00
Levy A.
2cbb59e3f9
refactor: renaming and better types
2025-04-23 09:53:37 -03:00
Levy A.
f1ee92bf2d
numeric types overhaul
2025-04-23 08:34:58 -03:00
Pekka Enberg
beaccae664
Merge 'Create an automatic ephemeral index when a nested table scan would otherwise be selected' from Jussi Saurio
...
Closes #747
- Creates an automatic ephemeral (in-memory) index on the right-side
table of a join if otherwise a nested table scan would be selected.
- This behavior is not hardcoded; instead this PR introduces a (quite
dumb) cost estimator that naturally disincentivizes building ephemeral
indexes where they don't make sense (e.g. the outermost table). I will
probably make this estimator smarter in the future when working
on join reordering optimizations.
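To illustrate why a cost comparison rules out ephemeral indexes on the outermost table without any special casing, here is a toy model. The formulas and function names are assumptions made for this example and are not the estimator from this PR.

```rust
/// Toy cost of a nested scan: every outer row rescans the whole inner table.
/// (Hypothetical formula, for illustration only.)
fn nested_scan_cost(outer_rows: f64, inner_rows: f64) -> f64 {
    outer_rows * inner_rows
}

/// Toy cost of an ephemeral index: pay once to build it, then pay one
/// logarithmic probe per outer row. (Hypothetical formula.)
fn ephemeral_index_cost(outer_rows: f64, inner_rows: f64) -> f64 {
    let probe = inner_rows.max(2.0).log2();
    let build = inner_rows * probe;
    build + outer_rows * probe
}

/// Build the index only when it is estimated to be cheaper than scanning.
fn should_build_ephemeral_index(outer_rows: f64, inner_rows: f64) -> bool {
    ephemeral_index_cost(outer_rows, inner_rows) < nested_scan_cost(outer_rows, inner_rows)
}
```

For the outermost table the loop runs once (`outer_rows = 1.0`), so the build cost alone exceeds a single scan and the index is rejected. For an inner table probed many times, the build cost is amortized and the index wins.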
### Example bytecode plans and runtimes (note that this is debug mode)
Example query with no persistent indexes to choose from. Without
ephemeral index it's a nested scan:
```sql
limbo> explain select * from t1 natural join t2;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     OpenRead           0     2     0                    0   table=t1, root=2
2     OpenRead           1     3     0                    0   table=t2, root=3
3     Rewind             0     12    0                    0   Rewind t1
4     Rewind             1     11    0                    0   Rewind t2
5     Column             0     0     2                    0   r[2]=t1.a
6     Column             1     0     3                    0   r[3]=t2.a
7     Ne                 2     3     10                   0   if r[2]!=r[3] goto 10
8     Column             0     0     1                    0   r[1]=t1.a
9     ResultRow          1     1     0                    0   output=r[1]
10    Next               1     5     0                    0
11    Next               0     4     0                    0
12    Halt               0     0     0                    0
13    Transaction        0     0     0                    0   write=false
14    Goto               0     1     0                    0
limbo> .timer on
limbo> select * from t1 natural join t2;
┌───┐
│ a │
├───┤
└───┘
Command stats:
----------------------------
total: 953 ms (this includes parsing/coloring of cli app)
```
Same query with autoindexing enabled:
```sql
limbo> explain select * from t1 natural join t2;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     22    0                    0   Start at 22
1     OpenRead           0     2     0                    0   table=t1, root=2
2     OpenRead           1     3     0                    0   table=t2, root=3
3     Rewind             0     21    0                    0   Rewind t1
4     Once               12    0     0                    0   goto 12 # execute block 5-11 only once, on subsequent iters jump straight to 12
5     OpenAutoindex      3     0     0                    0   cursor=3
6     Rewind             1     12    0                    0   Rewind t2 # open source table for ephemeral index
7     Column             1     0     2                    0   r[2]=t2.a
8     RowId              1     3     0                    0   r[3]=t2.rowid
9     MakeRecord         2     2     4                    0   r[4]=mkrec(r[2..3])
10    IdxInsert          3     4     2                    0   key=r[4] # insert stuff to ephemeral index
11    Next               1     7     0                    0
12    Column             0     0     5                    0   r[5]=t1.a
13    IsNull             5     20    0                    0   if (r[5]==NULL) goto 20
14    SeekGE             3     20    5                    0   key=[5..5] # perform seek on ephemeral index
15    IdxGT              3     20    5                    0   key=[5..5]
16    DeferredSeek       3     1     0                    0
17    Column             0     0     1                    0   r[1]=t1.a
18    ResultRow          1     1     0                    0   output=r[1]
19    Next               2     15    0                    0
20    Next               0     4     0                    0
21    Halt               0     0     0                    0
22    Transaction        0     0     0                    0   write=false
23    Goto               0     1     0                    0
limbo> .timer on
limbo> select * from t1 natural join t2;
┌───┐
│ a │
├───┤
└───┘
Command stats:
----------------------------
total: 220 ms (this includes parsing/coloring of cli app)
```
Closes #1356
2025-04-22 13:00:06 +03:00
Pekka Enberg
d92fb75262
Merge 'Fix incorrect between expression documentation' from Pedro Muniz
...
I was reading through the `translate_expr` function and `COMPAT.md` to
see what was not implemented yet. I saw that `Expr::Between` was marked
as a `todo!`, so I set about implementing it, only to find that it was
being rewritten in the optimizer haha. This PR just adjusts the docs and
adds an `unreachable` in the appropriate locations.
Closes #1378
2025-04-22 11:56:01 +03:00
Pekka Enberg
e41bf3993a
Merge 'bindings/rust: Add Statement.columns() support' from Timo Kösters
...
This PR adds the statement.columns() function, inspired by Rusqlite:
https://docs.rs/rusqlite/latest/rusqlite/struct.Statement.html#method.columns
Note that the rusqlite documentation says
> If associated DB schema can be altered concurrently, you should make
sure that current statement has already been stepped once before calling
this method.
Do we have this requirement as well?
The first commit is just the rust binding. The second commit implements
the column name for the rowid column.
Closes #1376
2025-04-22 10:52:25 +03:00
Pekka Enberg
7308f6d6e8
Merge 'Bump julian_day_converter to 0.4.5' from meteorgan
...
The previous version of `julian_day_converter` could lose precision
when converting between `julianday` and `datetime`.

Reviewed-by: Diego Reis (@diegoreis42)
Closes #1344
2025-04-22 10:48:36 +03:00
Timo Kösters
68d8b86bb7
fix: get name of rowid column
2025-04-22 08:46:37 +02:00
pedrocarlo
1928dcfa10
Correct docs regarding between
2025-04-21 23:05:01 -03:00
Jussi Saurio
f256fb46fd
remove print spam from index insert
2025-04-21 14:59:13 +03:00
Jussi Saurio
3b44b269a3
optimizer: try to build ephemeral index to avoid nested table scan
2025-04-21 14:59:13 +03:00
Jussi Saurio
6924424f11
optimizer: add highly unintelligent heuristics-based cost estimation
2025-04-21 14:59:13 +03:00
Jussi Saurio
a50fa03d24
optimizer: allow calling try_extract_index... without any persistent indexes
2025-04-21 14:59:13 +03:00
Jussi Saurio
af21f60887
translate/main_loop: create autoindex when index.ephemeral=true
2025-04-21 14:59:13 +03:00
Jussi Saurio
c1b2dfc32b
TableReference: add method column_is_used()
2025-04-21 14:59:13 +03:00
Jussi Saurio
09ad6d8f01
vdbe: resolve labels for Insn::Once
2025-04-21 14:59:13 +03:00
Jussi Saurio
d0da7307be
Index: add new field ephemeral: bool
2025-04-21 14:59:13 +03:00
Pere Diaz Bou
fc4deb2b7b
Merge 'btree: avoid reading entire cell when only rowid needed' from Jussi Saurio
...
This PR is based on #1357 and further improves performance:
```sql
limbo> select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 3.728050958 s (this includes parsing/coloring of cli app)
```
Reviewed-by: Preston Thorpe (@PThorpe92)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com >
Closes #1358
2025-04-21 12:14:21 +02:00
Jussi Saurio
83c509a613
Fix bug: left join null flag not being cleared
...
In left joins, even if the join condition is not matched, the system
must emit a row for every row of the outer table:
    -- this must return t1.count() rows, with NULLs for all columns of t2
    SELECT * FROM t1 LEFT JOIN t2 ON FALSE;
Our logic for clearing the null flag was to do it in Next/Prev. However,
this is problematic for a few reasons:
- If the inner table of the left join is using SeekRowid, then Next/Prev
is never called on its cursor, so the null flag doesn't get cleared.
- If the inner table of the left join is using a non-covering index seek,
i.e. it iterates its rows using an index, but seeks to the main table
to fetch data, then Next/Prev is never called on the main table, and the
main table's null flag doesn't get cleared.
As a result, NULL values are incorrectly emitted for the inner table
after the first correct NULL row, because the null flag is correctly
set to true but never cleared.
This PR fixes the issue by clearing the null flag whenever seek() is
invoked on the cursor. Hence, the null flag is now cleared on:
- next()
- prev()
- seek()
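The fix can be pictured with a toy cursor. This is a sketch under simplifying assumptions, not the actual btree cursor: the `Cursor` struct, its fields, and its methods are hypothetical, and only `seek()` is modeled (`next()` and `prev()` would clear the flag the same way).

```rust
/// Hypothetical, heavily simplified cursor over sorted rowids.
struct Cursor {
    rowids: Vec<i64>,
    position: Option<usize>,
    /// When set, every read yields NULL (used for unmatched LEFT JOIN rows).
    null_flag: bool,
}

impl Cursor {
    fn new(rowids: Vec<i64>) -> Self {
        Cursor { rowids, position: None, null_flag: false }
    }

    fn set_null_flag(&mut self, flag: bool) {
        self.null_flag = flag;
    }

    /// Positions the cursor on `rowid` and reports whether it was found.
    fn seek(&mut self, rowid: i64) -> bool {
        self.null_flag = false; // the fix: a stale flag must not outlive the seek
        self.position = self.rowids.binary_search(&rowid).ok();
        self.position.is_some()
    }

    /// Reads the current rowid, or `None` (NULL) when the null flag is set.
    fn rowid(&self) -> Option<i64> {
        if self.null_flag {
            return None;
        }
        self.position.map(|i| self.rowids[i])
    }
}
```

Without the reset inside `seek()`, a cursor that had once produced a NULL row would keep returning NULL for later outer rows that do have a match.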
2025-04-19 13:56:52 +03:00
Jussi Saurio
017cdb9568
btree: avoid reading entire cell when only rowid needed
2025-04-18 16:52:05 +03:00
Jussi Saurio
3dab59201d
Separate both table&index move_to impls into different funcs
2025-04-18 16:27:50 +03:00
Jussi Saurio
0974ba6e71
default to using tablebtree_move_to in all calls to move_to with rowids
2025-04-18 16:11:36 +03:00
Jussi Saurio
12e689b9fc
btree: use binary search on table leaf pages too
2025-04-18 16:11:36 +03:00
Jussi Saurio
3f9bdbdf14
btree: use binary search in move_to() for table btrees
2025-04-18 16:11:36 +03:00
Jussi Saurio
1ccc321030
Merge 'Feat: Covering indexes' from Jussi Saurio
...
Closes #364
A covering index lets a query read all the data it needs from the
index itself, without touching the underlying table at all. This PR adds
that functionality.
This PR can be reviewed commit-by-commit, as the first commits are
enablers for the actual covering index usage functionality.
Example of a scan where a covering index can be used:
```sql
limbo> .schema
CREATE TABLE t(a,b,c,d,e);
CREATE INDEX abc ON t (a,b,c);
limbo> explain select b+1,concat(a, c) from t;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     12    0                    0   Start at 12
1     OpenRead           0     3     0                    0   table=abc, root=3
2     Rewind             0     11    0                    0   Rewind abc
3     Column             0     1     3                    0   r[3]=abc.b
4     Integer            1     4     0                    0   r[4]=1
5     Add                3     4     1                    0   r[1]=r[3]+r[4]
6     Column             0     0     5                    0   r[5]=abc.a
7     Column             0     2     6                    0   r[6]=abc.c
8     Function           0     5     2     concat         0   r[2]=func(r[5..6])
9     ResultRow          1     2     0                    0   output=r[1..2]
10    Next               0     3     0                    0
11    Halt               0     0     0                    0
12    Transaction        0     0     0                    0   write=false
13    Goto               0     1     0                    0
```
Example of a scan where it can't be used:
```sql
limbo> .schema
CREATE TABLE t(a,b,c,d,e);
CREATE INDEX abc ON t (a,b,c);
limbo> explain select a,b,c,d from t limit 5;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     11    0                    0   Start at 11
1     OpenRead           0     2     0                    0   table=t, root=2
2     Rewind             0     10    0                    0   Rewind t
3     Column             0     0     4                    0   r[4]=t.a
4     Column             0     1     5                    0   r[5]=t.b
5     Column             0     2     6                    0   r[6]=t.c
6     Column             0     3     7                    0   r[7]=t.d
7     ResultRow          4     4     0                    0   output=r[4..7]
8     DecrJumpZero       1     10    0                    0   if (--r[1]==0) goto 10
9     Next               0     3     0                    0
10    Halt               0     0     0                    0
11    Transaction        0     0     0                    0   write=false
12    Integer            5     1     0                    0   r[1]=5
13    Integer            0     2     0                    0   r[2]=0
14    OffsetLimit        1     3     2                    0   if r[1]>0 then r[3]=r[1]+max(0,r[2]) else r[3]=(-1)
15    Goto               0     1     0                    0
```
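The difference between the two plans above comes down to one check: does the index contain every column the query references? A minimal sketch of that check follows, assuming columns are identified by name. The function below is a hypothetical free function, not the `TableReference::index_is_covering()` method added in this PR.

```rust
/// Returns true when every column the query uses is present in the index,
/// so the table itself never has to be opened.
/// (Hypothetical helper; columns are identified by name for simplicity.)
fn index_is_covering(index_columns: &[&str], used_columns: &[&str]) -> bool {
    used_columns
        .iter()
        .all(|column| index_columns.contains(column))
}
```

With index `abc` on `(a, b, c)`, a query using only `a`, `b` and `c` is covered, while one that also reads `d` is not and falls back to scanning table `t`.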
Closes #1351
2025-04-18 15:27:27 +03:00
Jussi Saurio
9d553c50cc
Merge 'allow index entry delete' from Pere Diaz Bou
...
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com >
Closes #1341
2025-04-18 15:26:05 +03:00
Jussi Saurio
bf2e198a57
Merge 'Fix out of bounds access on parse_numeric_str' from Levy A.
...
Fixes #1361 .
Closes #1362
2025-04-18 15:24:37 +03:00
Jussi Saurio
6c73db6fd3
feat: use covering indexes whenever possible
2025-04-18 15:13:09 +03:00
Jussi Saurio
5b71d3a3da
eliminate_unnecessary_orderby: add edge case handling
2025-04-18 15:12:06 +03:00
Jussi Saurio
40d880c3b0
TableReference: add resolve_cursors() method
2025-04-18 15:12:06 +03:00
Jussi Saurio
d5a6553e63
TableReference: add open_cursors()
2025-04-18 15:12:06 +03:00
Jussi Saurio
4ab4a3f6c3
TableReference: add index_is_covering() and utilizes_covering_index()
2025-04-18 15:12:06 +03:00
Levy A.
5fd2ed0bae
fix: handle empty case
2025-04-17 20:20:57 -03:00
Levy A.
32d59b8c78
refactor+fix: using a more robust pattern matching approach
2025-04-17 20:08:05 -03:00