TcMits
ee660187dc
fix negative free space after balance-shallower
2025-04-14 14:25:18 +07:00
TcMits
b3c2593980
btree balance-shallower
2025-04-14 12:49:30 +07:00
TcMits
a4a4879f3b
fix cargo fmt check
2025-04-11 14:53:10 +07:00
TcMits
9d7a779757
Fix drop empty page in balancing
2025-04-11 14:41:56 +07:00
Pekka Enberg
e3a4400329
Merge 'Multi column indexes + index seek refactor' from Jussi Saurio
...
# Multi column indexes + index seek refactor
## PR reader guide
I would say mostly you should just focus on the content of
`optimizer.rs` and `plan.rs` because the rest is just small type
changes, or in the case of `main_loop.rs`, a bunch of logic was just
moved out of there and rewritten.
## New feature - multi column index seeks
This PR adds support for utilizing multi-column indexes properly, i.e.
using as many columns in the seek key as possible. Previously, we used
at most one column per index. I've modified the existing compound index
seek fuzz test to use this functionality.
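To illustrate what "as many columns in the seek key as possible" can mean, here is a minimal Rust sketch. It assumes the common rule that a seek key takes the leading index columns that have equality constraints, plus at most one trailing range constraint. That rule, and the names `Constraint` and `usable_seek_prefix`, are assumptions for illustration and not the actual types in `optimizer.rs` or `plan.rs`.

```rust
/// Constraint on a single column, as it might appear in a WHERE clause.
/// Hypothetical type for illustration; not the PR's actual data model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Constraint {
    Eq(i64),
    Lt(i64),
    Gt(i64),
}

/// Returns how many leading index columns can take part in the seek key:
/// every leading column with an equality constraint, plus at most one
/// trailing range constraint. A column without a constraint ends the prefix.
fn usable_seek_prefix(index_columns: &[&str], constraints: &[(&str, Constraint)]) -> usize {
    let mut used = 0;
    for col in index_columns {
        match constraints.iter().find(|(name, _)| *name == *col) {
            // Equality: the column is pinned, so the next column may be used too.
            Some((_, Constraint::Eq(_))) => used += 1,
            // Range: usable, but nothing after it can narrow the seek further.
            Some(_) => {
                used += 1;
                break;
            }
            // No constraint on this column: the prefix stops here.
            None => break,
        }
    }
    used
}
```

With index `abc` on `(a, b, c)`, the query `a = 5 and b = 6 and c < 7` would use all three columns under this rule, while `a = 5 and c = 7` would use only `a`.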
## Refactoring of index seek related logic
This PR moves a lot of index seek related logic out of `main_loop.rs`
into `optimizer.rs` and `plan.rs` and introduces a bunch of helper
structures to model finding and using an index to perform a seek + scan.
## Examples
Here are some examples of multi-column seeks:
### Example table setup:
```sql
sqlite> CREATE TABLE t(a,b,c,d,e);
sqlite> CREATE INDEX abc ON t (a,b,c);
-- create 10000 rows with random values between 0-9 for all columns
sqlite> INSERT INTO t SELECT ABS(RANDOM() % 10),ABS(RANDOM() % 10),ABS(RANDOM() % 10),ABS(RANDOM() % 10),ABS(RANDOM() % 10) FROM generate_series(1,10000,1);
```
### Example bytecode plans, results and timings vs main branch:
```sql
limbo> EXPLAIN SELECT * FROM t WHERE a = 5 and b = 6 and c = 7;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     20    0                    0   Start at 20
1     OpenReadAsync      0     2     0                    0   table=t, root=2
2     OpenReadAwait      0     0     0                    0
3     OpenReadAsync      1     3     0                    0   table=abc, root=3
4     OpenReadAwait      0     0     0                    0
5     Integer            5     6     0                    0   r[6]=5
6     Integer            6     7     0                    0   r[7]=6
7     Integer            7     8     0                    0   r[8]=7
8     SeekGE             1     19    6                    0   key=[6..8]
9     IdxGT              1     19    6                    0   key=[6..8]
10    DeferredSeek       1     0     0                    0
11    Column             0     0     1                    0   r[1]=t.a
12    Column             0     1     2                    0   r[2]=t.b
13    Column             0     2     3                    0   r[3]=t.c
14    Column             0     3     4                    0   r[4]=t.d
15    Column             0     4     5                    0   r[5]=t.e
16    ResultRow          1     5     0                    0   output=r[1..5]
17    NextAsync          1     0     0                    0
18    NextAwait          1     9     0                    0
19    Halt               0     0     0                    0
20    Transaction        0     0     0                    0   write=false
21    Goto               0     1     0                    0
limbo> SELECT * FROM t WHERE a = 5 and b = 6 and c = 7;
5|6|7|9|9
5|6|7|4|7
5|6|7|3|2
5|6|7|3|7
5|6|7|5|2
5|6|7|5|3
5|6|7|9|7
runtime (debug build, this branch): total: 2 ms (this includes parsing/coloring of cli app)
runtime (debug build, main branch): total: 67 ms (this includes parsing/coloring of cli app)
```
```sql
limbo> EXPLAIN SELECT * FROM t WHERE a = 5 and b = 6 and c < 7;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     21    0                    0   Start at 21
1     OpenReadAsync      0     2     0                    0   table=t, root=2
2     OpenReadAwait      0     0     0                    0
3     OpenReadAsync      1     3     0                    0   table=abc, root=3
4     OpenReadAwait      0     0     0                    0
5     Integer            5     6     0                    0   r[6]=5
6     Integer            6     7     0                    0   r[7]=6
7     Null               0     8     0                    0   r[8]=NULL
8     SeekGT             1     20    6                    0   key=[6..8]
9     Integer            7     8     0                    0   r[8]=7
10    IdxGE              1     20    6                    0   key=[6..8]
11    DeferredSeek       1     0     0                    0
12    Column             0     0     1                    0   r[1]=t.a
13    Column             0     1     2                    0   r[2]=t.b
14    Column             0     2     3                    0   r[3]=t.c
15    Column             0     3     4                    0   r[4]=t.d
16    Column             0     4     5                    0   r[5]=t.e
17    ResultRow          1     5     0                    0   output=r[1..5]
18    NextAsync          1     0     0                    0
19    NextAwait          1     10    0                    0
20    Halt               0     0     0                    0
21    Transaction        0     0     0                    0   write=false
22    Goto               0     1     0                    0
limbo> SELECT * FROM t WHERE a = 5 and b = 6 and c < 7;
5|6|0|0|3
5|6|0|5|1
5|6|0|3|1
5|6|0|6|3
5|6|0|8|1
5|6|0|2|7
5|6|0|9|9
5|6|0|5|3
5|6|0|4|2
5|6|0|4|2
5|6|0|0|2
5|6|0|7|2
5|6|1|8|5
5|6|1|7|5
5|6|1|7|2
5|6|1|1|2
5|6|1|6|5
5|6|1|1|5
5|6|1|5|7
5|6|1|1|9
5|6|1|4|3
5|6|1|1|2
5|6|1|2|2
5|6|1|4|4
5|6|1|9|6
5|6|1|2|5
5|6|1|2|4
5|6|1|7|1
5|6|2|0|9
5|6|2|6|9
5|6|2|4|5
5|6|2|9|3
5|6|2|5|2
5|6|2|9|0
5|6|2|7|1
5|6|3|6|5
5|6|3|8|5
5|6|3|5|4
5|6|3|5|2
5|6|3|1|1
5|6|3|2|0
5|6|3|9|3
5|6|3|6|9
5|6|3|7|6
5|6|3|3|5
5|6|3|0|8
5|6|3|6|4
5|6|4|1|1
5|6|4|9|8
5|6|4|3|7
5|6|4|1|3
5|6|4|8|9
5|6|4|9|7
5|6|4|7|9
5|6|4|8|8
5|6|4|3|1
5|6|4|2|6
5|6|4|5|7
5|6|4|2|6
5|6|4|4|3
5|6|5|2|4
5|6|5|6|7
5|6|5|3|8
5|6|5|7|8
5|6|5|9|6
5|6|5|2|7
5|6|5|1|7
5|6|5|0|6
5|6|6|2|4
5|6|6|9|4
5|6|6|4|9
5|6|6|5|6
5|6|6|2|2
5|6|6|0|6
runtime (debug build, this branch): total: 9 ms (this includes parsing/coloring of cli app)
runtime (debug build, main branch): total: 71 ms (this includes parsing/coloring of cli app)
```
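One way to read the plan above for `a = 5 and b = 6 and c < 7`: seek past `(5, 6, NULL)` with `SeekGT`, then scan forward until `IdxGE` sees an entry at or beyond `(5, 6, 7)`. The Rust sketch below mimics that over a sorted slice instead of a B-tree. The `Key` alias and `seek_and_scan` are hypothetical names, and using `None` for NULL relies on Rust ordering `None` before every `Some(_)`; this is my reading of the bytecode, not code from the PR.

```rust
/// One entry of index `abc`; `None` stands in for NULL in column c.
/// Hypothetical representation for illustration only.
type Key = (i64, i64, Option<i64>);

/// Rows matching `a = a_val AND b = b_val AND c < c_max`, found with one
/// seek followed by a forward scan. `index` must already be sorted.
fn seek_and_scan(index: &[Key], a_val: i64, b_val: i64, c_max: i64) -> Vec<Key> {
    // Like SeekGT with key (a, b, NULL): position on the first entry that is
    // strictly greater than the key, which also skips rows where c IS NULL.
    let seek_key: Key = (a_val, b_val, None);
    let start = index.partition_point(|k| *k <= seek_key);

    // Like IdxGE with key (a, b, c_max): stop at the first entry >= the key.
    let stop_key: Key = (a_val, b_val, Some(c_max));
    index[start..]
        .iter()
        .take_while(|k| **k < stop_key)
        .copied()
        .collect()
}
```

The point of the sketch is that both bounds are plain key comparisons on the index order, so no per-row filter on `a`, `b`, or `c` is needed inside the loop.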
```sql
limbo> EXPLAIN SELECT * FROM t WHERE a = 5 and b = 6 and c < 7 ORDER BY a desc, b desc, c desc;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     20    0                    0   Start at 20
1     OpenReadAsync      0     2     0                    0   table=t, root=2
2     OpenReadAwait      0     0     0                    0
3     OpenReadAsync      1     3     0                    0   table=abc, root=3
4     OpenReadAwait      0     0     0                    0
5     Integer            5     6     0                    0   r[6]=5
6     Integer            6     7     0                    0   r[7]=6
7     Integer            7     8     0                    0   r[8]=7
8     SeekLT             1     19    6                    0   key=[6..8]
9     IdxLT              1     19    6                    0   key=[6..7]
10    DeferredSeek       1     0     0                    0
11    Column             0     0     1                    0   r[1]=t.a
12    Column             0     1     2                    0   r[2]=t.b
13    Column             0     2     3                    0   r[3]=t.c
14    Column             0     3     4                    0   r[4]=t.d
15    Column             0     4     5                    0   r[5]=t.e
16    ResultRow          1     5     0                    0   output=r[1..5]
17    PrevAsync          1     0     0                    0
18    PrevAwait          1     0     0                    0
19    Halt               0     0     0                    0
20    Transaction        0     0     0                    0   write=false
21    Goto               0     1     0                    0
limbo> SELECT * FROM t WHERE a = 5 and b = 6 and c < 7 ORDER BY a desc, b desc, c desc;
5|6|6|0|6
5|6|6|2|2
5|6|6|5|6
5|6|6|4|9
5|6|6|9|4
5|6|6|2|4
5|6|5|0|6
5|6|5|1|7
5|6|5|2|7
5|6|5|9|6
5|6|5|7|8
5|6|5|3|8
5|6|5|6|7
5|6|5|2|4
5|6|4|4|3
5|6|4|2|6
5|6|4|5|7
5|6|4|2|6
5|6|4|3|1
5|6|4|8|8
5|6|4|7|9
5|6|4|9|7
5|6|4|8|9
5|6|4|1|3
5|6|4|3|7
5|6|4|9|8
5|6|4|1|1
5|6|3|6|4
5|6|3|0|8
5|6|3|3|5
5|6|3|7|6
5|6|3|6|9
5|6|3|9|3
5|6|3|2|0
5|6|3|1|1
5|6|3|5|2
5|6|3|5|4
5|6|3|8|5
5|6|3|6|5
5|6|2|7|1
5|6|2|9|0
5|6|2|5|2
5|6|2|9|3
5|6|2|4|5
5|6|2|6|9
5|6|2|0|9
5|6|1|7|1
5|6|1|2|4
5|6|1|2|5
5|6|1|9|6
5|6|1|4|4
5|6|1|2|2
5|6|1|1|2
5|6|1|4|3
5|6|1|1|9
5|6|1|5|7
5|6|1|1|5
5|6|1|6|5
5|6|1|1|2
5|6|1|7|2
5|6|1|7|5
5|6|1|8|5
5|6|0|7|2
5|6|0|0|2
5|6|0|4|2
5|6|0|4|2
5|6|0|5|3
5|6|0|9|9
5|6|0|2|7
5|6|0|8|1
5|6|0|6|3
5|6|0|3|1
5|6|0|5|1
5|6|0|0|3
runtime (debug build, this branch): total: 9 ms (this includes parsing/coloring of cli app)
runtime (debug build, main branch): total: 71 ms (this includes parsing/coloring of cli app)
```
Closes #1288
2025-04-11 09:36:25 +03:00
Pekka Enberg
2752c77cc2
Merge 'simulator: Add Bug Database(BugBase)' from Alperen Keleş
...
Previously, the simulator used `tempfile` for storing the resulting
interaction plans, database file, seeds, and all relevant information.
This made the information ephemeral, so we were not able to properly
use the results of previous runs for optimizing future runs. This PR
removes the CLI option `output_dir` and bases the storage
infrastructure on the `BugBase` interface.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com >
Closes #1276
2025-04-11 09:35:09 +03:00
Pekka Enberg
d67e1b604b
Merge 'Added 'likelihood' scalar function' from Sachin Kumar Singh
...
The `likelihood(X,Y)` function returns argument X unchanged. The value Y
in likelihood(X,Y) must be a floating point constant between 0.0 and
1.0, inclusive.
```
sqlite> explain SELECT likelihood(42, 0.0);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     6     0                    0   Start at 6
1     Once           0     3     0                    0
2     Integer        42    2     0                    0   r[2]=42
3     Copy           2     1     0                    0   r[1]=r[2]
4     ResultRow      1     1     0                    0   output=r[1]
5     Halt           0     0     0                    0
6     Goto           0     1     0                    0
```
```
limbo> explain SELECT likelihood(42, 0.0);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     4     0                    0   Start at 4
1     Copy               2     1     0                    0   r[1]=r[2]
2     ResultRow          1     1     0                    0   output=r[1]
3     Halt               0     0     0                    0
4     Integer            42    2     0                    0   r[2]=42
5     Goto               0     1     0                    0
```
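The semantics described above fit in a few lines of Rust. This is only a sketch of the rule "return X unchanged, require Y to be a floating point constant in [0.0, 1.0]"; the `Value` enum, the function signature, and the error text are hypothetical and do not reflect how this PR wires the function into the translator.

```rust
/// Hypothetical SQL value type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Float(f64),
    Text(String),
}

/// Returns `x` unchanged when `y` is a float in 0.0..=1.0, otherwise an error.
/// Integers (for example `1`) and NaN are rejected because only a floating
/// point constant inside the range is accepted.
fn likelihood(x: Value, y: &Value) -> Result<Value, String> {
    match y {
        Value::Float(p) if (0.0..=1.0).contains(p) => Ok(x),
        // Illustrative message, not necessarily the wording Limbo emits.
        _ => Err("second argument to likelihood() must be a constant between 0.0 and 1.0".to_string()),
    }
}
```

Because the result is just X, both plans above reduce to loading `42` and copying it into the output register.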
Closes #1303
2025-04-11 09:34:36 +03:00
Pekka Enberg
13516fd53d
Merge 'feat: Add timediff data and time function' from Sachin Kumar Singh
...
This PR implements the `timediff(A,B)` function, which returns a string
that describes the amount of time that must be added to B in order to
reach time A. I used sqlite's timediff function for format reference:
https://github.com/sqlite/sqlite/blob/master/src/date.c#L1694
Opcodes seem to be in order:
```
limbo> explain SELECT timediff('12:30:45.123', '12:30:44.987');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     6     0                    0   Start at 6
1     String8            0     2     0     12:30:45.123   0   r[2]='12:30:45.123'
2     String8            0     3     0     12:30:44.987   0   r[3]='12:30:44.987'
3     Function           0     2     1     timediff       0   r[1]=func(r[2..3])
4     ResultRow          1     1     0                    0   output=r[1]
5     Halt               0     0     0                    0
6     Goto               0     1     0                    0
```
```
sqlite> explain SELECT timediff('12:30:45.123', '12:30:44.987');
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     8     0                    0   Start at 8
1     Once           0     5     0                    0
2     String8        0     3     0     12:30:45.123   0   r[3]='12:30:45.123'
3     String8        0     4     0     12:30:44.987   0   r[4]='12:30:44.987'
4     Function       3     3     2     timediff(2)    0   r[2]=func(r[3..4])
5     Copy           2     1     0                    0   r[1]=r[2]
6     ResultRow      1     1     0                    0   output=r[1]
7     Halt           0     0     0                    0
8     Goto           0     1     0                    0
```
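For a feel of the output format, here is a deliberately narrow Rust sketch that handles only two `HH:MM:SS[.SSS]` values on the same day and prints the difference in SQLite's `(+|-)YYYY-MM-DD HH:MM:SS.SSS` layout. The helper names `parse_time_ms` and `timediff_same_day` are hypothetical, and the real function in this PR also has to deal with full dates and calendar-aware year/month differences, which this sketch ignores.

```rust
/// Parses "HH:MM:SS" or "HH:MM:SS.SSS" into milliseconds since midnight.
fn parse_time_ms(s: &str) -> Option<i64> {
    let (hms, frac) = s.split_once('.').unwrap_or((s, "0"));
    let mut parts = hms.split(':');
    let h: i64 = parts.next()?.parse().ok()?;
    let m: i64 = parts.next()?.parse().ok()?;
    let sec: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    if !(0..=23).contains(&h) || !(0..=59).contains(&m) || !(0..=59).contains(&sec) {
        return None;
    }
    if !frac.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    // Keep exactly three fractional digits, right-padding with zeros.
    let digits: String = frac.chars().chain("000".chars()).take(3).collect();
    let ms: i64 = digits.parse().ok()?;
    Some(((h * 60 + m) * 60 + sec) * 1000 + ms)
}

/// Time that must be added to `b` to reach `a`, for same-day times only,
/// so the date part of the result is always zero.
fn timediff_same_day(a: &str, b: &str) -> Option<String> {
    let diff = parse_time_ms(a)? - parse_time_ms(b)?;
    let sign = if diff < 0 { '-' } else { '+' };
    let d = diff.abs();
    Some(format!(
        "{}0000-00-00 {:02}:{:02}:{:02}.{:03}",
        sign,
        d / 3_600_000,
        d / 60_000 % 60,
        d / 1000 % 60,
        d % 1000
    ))
}
```

For the query in the plans above, `timediff_same_day("12:30:45.123", "12:30:44.987")` yields `+0000-00-00 00:00:00.136`; swapping the arguments flips the sign.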
My first PR, I just followed the
[contributing guides](https://github.com/tursodatabase/limbo/blob/main/CONTRIBUTING.md)
and started.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com >
Closes #1302
2025-04-11 09:34:04 +03:00
Sachin Singh
23ab387143
handle formatting issues
2025-04-11 09:59:27 +05:30
Pekka Enberg
2f428b7dcc
Merge 'Fix overwrite cell with size less than cell size' from Pere Diaz Bou
...
We cannot simply paste a smaller new payload into an existing cell,
because we would need to track fragmentation + free blocks. Let's keep
it simple by only overwriting if the size is the same.
Btw I feel like update is not re-entrant.
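The "only overwrite if the size is the same" rule can be sketched as below. This is a simplified illustration over raw byte slices; `overwrite_if_same_size` is a hypothetical name, and the real B-tree code works on page contents and cell pointers rather than standalone buffers.

```rust
/// Overwrites `cell` with `payload` in place only when both have the same
/// length. Returns false otherwise, leaving `cell` untouched, so the caller
/// can fall back to a path that keeps fragmentation and free blocks correct.
fn overwrite_if_same_size(cell: &mut [u8], payload: &[u8]) -> bool {
    if cell.len() != payload.len() {
        return false;
    }
    cell.copy_from_slice(payload);
    true
}
```

Refusing the shorter-payload case trades a little performance for not having to account for the leftover bytes as a fragment or free block.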
Reviewed-by: Preston Thorpe (@PThorpe92)
Closes #1301
2025-04-11 07:17:53 +03:00
Sachin Singh
01fa02364d
correctly handle edge cases
2025-04-11 08:34:29 +05:30
Sachin Singh
5ffdd42f12
Additional tests
2025-04-11 06:02:07 +05:30
Sachin Singh
482e93bfd0
feat: add likelihood scalar function
2025-04-11 05:54:23 +05:30
Sachin Singh
05b4b7b9f1
edit compat.md
2025-04-11 04:41:59 +05:30
Sachin Singh
ded308ccfa
additional tests
2025-04-11 04:40:09 +05:30
Sachin Singh
b7acfa490c
feat: add timediff data and time function
2025-04-11 04:30:57 +05:30
Pere Diaz Bou
745c2b92d0
unnecessary dirty set on overwrite
2025-04-10 22:24:15 +02:00
Pere Diaz Bou
038d78f096
overwrite when payload is equal size as current cell only
...
Previously we would overwrite even when the new size was less than the
cell size. This was wrong because it didn't update any fragment size or
free blocks. To be safe, let's just overwrite only if the local size is
exactly the same.
2025-04-10 22:24:15 +02:00
Pere Diaz Bou
506c1a236c
find_free_cell fix use of no_offset writes
2025-04-10 22:24:15 +02:00
Pekka Enberg
17b206297e
Merge 'Emit ANSI codes only when tracing is outputting to terminal' from Preston Thorpe
...
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com >
Closes #1289
2025-04-10 20:54:21 +03:00
Pekka Enberg
ef893da6c7
Merge 'core/btree: Add PageContent::new() helper' from Pekka Enberg
...
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com >
Closes #1294
2025-04-10 20:53:41 +03:00
Pekka Enberg
a27126cd05
Merge 'B-Tree code cleanups' from Pekka Enberg
...
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com >
Closes #1290
2025-04-10 20:53:33 +03:00
Jussi Saurio
4daad0a858
Fix bug: accidentally skipped index selection for other tables except first found
2025-04-10 18:57:14 +03:00
Pekka Enberg
1d748de273
Merge 'btree index selection on rightmost pointer in balance_non_root' from Pere Diaz Bou
...
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com >
Closes #1297
2025-04-10 18:39:51 +03:00
Pekka Enberg
712a4caa22
stress: Fix per-thread query generation
2025-04-10 18:39:20 +03:00
Pere Diaz Bou
62d0febdb6
panic on corruption
2025-04-10 16:01:24 +02:00
Pere Diaz Bou
b35d805a81
tracing lock stress
2025-04-10 16:01:24 +02:00
Pere Diaz Bou
8e93471d00
fix cell index selection while balancing
...
The cell index doesn't move in `move_to` unless we don't need to check
the next cell. With the rightmost pointer, on the other hand, we advance
the cell index by 1 even though that page is the one we are moving to.
2025-04-10 16:01:24 +02:00
Pere Diaz Bou
4755acb571
init tracing in stress tool
2025-04-10 16:01:24 +02:00
Pere Diaz Bou
0c4e56ecf9
Merge 'Add support to load log file with stress test' from Pere Diaz Bou
...
Run with: `RUST_BACKTRACE=1 cargo run -p limbo_stress -- -t 1 -l`
and then, if you want to repeat the same plan:
`RUST_BACKTRACE=1 cargo run -p limbo_stress -- -t 1 -L`
Closes #1296
2025-04-10 16:01:11 +02:00
Jussi Saurio
457bded14d
optimizer: refactor optimizer to support multicolumn index scans
2025-04-10 15:53:02 +03:00
Jussi Saurio
afad06fb23
vdbe/explain: add key info to Seek/Idx insns
2025-04-10 15:06:45 +03:00
Jussi Saurio
3d1b4c5292
test/fuzz: modify compound index scan fuzz to utilize both pk columns in where clause
2025-04-10 15:06:18 +03:00
Pere Diaz Bou
cdcbcafbdd
clippy
2025-04-10 13:46:40 +02:00
Pere Diaz Bou
f795a9e331
Add support to load log file with stress test
2025-04-10 13:41:10 +02:00
Jussi Saurio
579d04f521
Merge 'io/linux: make syscallio the default (io_uring is really slow)' from Jussi Saurio
...
context: https://github.com/tursodatabase/limbo/issues/1275
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com >
Closes #1295
2025-04-10 13:55:06 +03:00
Jussi Saurio
60a13c129f
io/linux: make syscallio the default (io_uring is really slow)
2025-04-10 13:32:26 +03:00
Pekka Enberg
53633e8b6f
core/btree: Add PageContent::new() helper
2025-04-10 13:14:38 +03:00
Pekka Enberg
6ffa9cf56a
Merge 'Stress improvements' from Pekka Enberg
...
Closes #1292
2025-04-10 12:18:53 +03:00
Pekka Enberg
277efeb5ee
Merge 'VDBE code cleanups' from Pekka Enberg
...
Closes #1291
2025-04-10 12:10:22 +03:00
Pekka Enberg
3fd378cf9f
Fix Antithesis Dockerfile to include JavaScript bindings
2025-04-10 12:08:31 +03:00
Pekka Enberg
441cd637b5
stress: Make database file configurable
2025-04-10 11:59:25 +03:00
Pekka Enberg
c4d983bcfe
stress: Log SQL statements to a file
2025-04-10 11:59:25 +03:00
Pekka Enberg
39cee1b146
stress: Increase default number of iterations
2025-04-10 11:59:25 +03:00
Pekka Enberg
f50662205e
stress: Fix schema creation
2025-04-10 11:59:25 +03:00
Pekka Enberg
207563208f
stress: Add support for INSERT, DELETE, and UPDATE
2025-04-10 11:59:25 +03:00
Pekka Enberg
6aaa105321
stress: Add schema generation support
2025-04-10 11:43:32 +03:00
Pekka Enberg
31f0d174d7
core/vdbe: Move exec_*() functions to execute.rs
2025-04-10 09:42:03 +03:00
Pekka Enberg
3fd51cdf06
core/vdbe: Move Insn implementation close to struct definition
2025-04-10 09:28:43 +03:00
Pekka Enberg
5906d7971a
core/vdbe: Clean up imports
2025-04-10 09:25:15 +03:00