Commit Graph

4161 Commits

Author SHA1 Message Date
Anton Harniakou
6c8eef2fac Support DROP INDEX
This commit adds support for DROP INDEX.
Bytecode produced by this commit differs from SQLite's bytecode, mainly
because we don't do autovacuum or repacking of pages like SQLite does.
2025-05-04 12:13:16 +03:00
Jussi Saurio
7137f4ab3b Merge 'Feature: Composite Primary key constraint' from Pedro Muniz
Closes #1384. This PR implements the Primary Key constraint for inserts. As
can be seen in the issue, if you created an index with a Primary Key
constraint, an insert could trigger a `Unique Constraint` error but still
insert the record. SQLite uses the opcode `NoConflict` to check whether the
record already exists in the Btree. As we did not have this opcode yet, I
implemented it. It is very similar to `NotFound`, with the difference
that if any value in the record is Null, it immediately jumps to the
offset. The added benefit of implementing this is that we now fully
support Composite Primary Keys. Also, I think that with the current
implementation it will be trivial to implement the Unique opcode for
Insert. To support Updates, I need to understand more of the plan
optimizer and find where we are making the record and opening the
autoindex.
For testing, I have written a test generator that generates many different
tables with a varying number of Primary Keys.
```sql
limbo> CREATE TABLE users (id INT, username TEXT, PRIMARY KEY (id, username));
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> explain INSERT INTO users VALUES (1, 'alice');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenWrite          0     2     0                    0
2     Integer            1     2     0                    0   r[2]=1
3     String8            0     3     0     alice          0   r[3]='alice'
4     OpenWrite          1     3     0                    0
5     NewRowId           0     1     0                    0
6     Copy               2     5     0                    0   r[5]=r[2]
7     Copy               3     6     0                    0   r[6]=r[3]
8     Copy               1     7     0                    0   r[7]=r[1]
9     MakeRecord         5     3     8                    0   r[8]=mkrec(r[5..7])
10    NoConflict         1     12    5     2              0   key=r[5]
11    Halt               1555  0     0     users.id, users.username  0
12    IdxInsert          1     8     5                    0   key=r[8]
13    MakeRecord         2     2     4                    0   r[4]=mkrec(r[2..3])
14    Insert             0     4     1                    0
15    Halt               0     0     0                    0
16    Transaction        0     1     0                    0   write=true
17    Goto               0     1     0                    0
limbo> INSERT INTO users VALUES (1, 'alice');
  × Runtime error: UNIQUE constraint failed: users.id, users.username (19)
limbo> INSERT INTO users VALUES (1, 'bob');
limbo> INSERT INTO users VALUES (1, 'bob');
  × Runtime error: UNIQUE constraint failed: users.id, users.username (19)
```
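The `NoConflict` behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the `Value` enum, the `UniqueIndex` alias (a `BTreeSet` standing in for the index btree) and the `no_conflict` function are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Hypothetical column value; `Null` stands in for SQL NULL.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Value {
    Null,
    Int(i64),
    Text(String),
}

/// Hypothetical stand-in for a unique index: an ordered set of composite keys.
type UniqueIndex = BTreeSet<Vec<Value>>;

/// Returns `true` when execution should jump (no conflict, the insert may
/// proceed) and `false` when it should fall through to the failing `Halt`.
fn no_conflict(index: &UniqueIndex, key: &[Value]) -> bool {
    // Difference from `NotFound`: any NULL in the key means the row cannot
    // conflict, so jump immediately without probing the index.
    if key.iter().any(|v| *v == Value::Null) {
        return true;
    }
    // Otherwise behave like `NotFound`: jump only if the key is absent.
    !index.contains(key)
}
```

In the `explain` output above, the jump target corresponds to `IdxInsert` at address 12 and the fall-through to `Halt` at address 11.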

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1393
2025-04-24 23:25:30 +03:00
Pekka Enberg
4d0c40a435 One more fix to Antithesis Dockerfile 2025-04-24 21:17:36 +03:00
Pekka Enberg
117dbe6c8c Fix Antithesis Docker file some more 2025-04-24 21:12:40 +03:00
Pekka Enberg
fa5d6dcf6b Fix Antithesis Docker file 2025-04-24 21:03:19 +03:00
Pekka Enberg
31677c9c94 scripts/antithesis: Build Docker image for x86-64 2025-04-24 20:55:30 +03:00
Pekka Enberg
2a5eb8e5bc stress: Make Clippy happy 2025-04-24 20:46:26 +03:00
Pekka Enberg
ebc2e475b6 Merge 'Add Antithesis Tests' from eric-dinh-antithesis
This PR adds 2 autonomous test suites for use with Antithesis: `bank-test`
and `stress-composer`. It also modifies the existing `limbo_stress` test to
run as a singleton and modifies other Antithesis-related configuration
files.
**bank-test**
- `first_setup.py`
  - initializes a DB table with columns accounts and balance
  - generates random balances for each account
  - stores initial state of the table
- `parallel_driver_generate_transaction.py`
  - selects 2 accounts from the table as sender and receiver
  - generates a random value which is subtracted from sender and added
to receiver
- `anytime/eventually/finally_validate.py`
  - checks that sum of initial balances == sum of current balances
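
The invariant that the bank-test validators check can be sketched as below. The actual suite is written in Python and runs against a database; this in-memory version only illustrates the property, and `transfer` and `total` are hypothetical names.

```rust
/// Hypothetical in-memory stand-in for the accounts table:
/// the slice index is the account, the value is its balance.
fn transfer(balances: &mut [i64], sender: usize, receiver: usize, amount: i64) {
    // Subtract from the sender and add the same amount to the receiver,
    // so the sum over all accounts is unchanged.
    balances[sender] -= amount;
    balances[receiver] += amount;
}

/// Sum of all balances; the validators compare this against the initial sum.
fn total(balances: &[i64]) -> i64 {
    balances.iter().sum()
}
```

Any sequence of transfers should leave `total` equal to the value stored at setup, whatever the interleaving of drivers.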
**stress-composer**
- Breaks `limbo_stress` into component parts
- `first_setup.py`
  - creates up to 10 tables with up to 10 columns
  - stores table details in a separate db
- `parallel_driver_insert.py`
  - randomly generates and executes up to 100 insert statements into a
single table using random values derived from the table details
- `parallel_driver_update.py`
  - randomly generates and executes up to 100 updates into a single
table using random values derived from the table details
- `parallel_driver_delete.py`
  - randomly generates and executes up to 100 deletes from a single
table using random values derived from the table details

Closes #1401
2025-04-24 20:44:42 +03:00
eric-dinh-antithesis
27e15364c4 stress: suppress logfile since it's too big 2025-04-24 12:27:58 -04:00
eric-dinh-antithesis
b8885777dc stress: move sdk setup_complete from limbo_stress to docker-entrypoint 2025-04-24 12:27:05 -04:00
eric-dinh-antithesis
75ae5dbd13 stress: update docker-compose 2025-04-24 12:26:00 -04:00
eric-dinh-antithesis
8390233b99 Dockerfile.antithesis: update limbo_stress build step 2025-04-24 12:25:19 -04:00
eric-dinh-antithesis
5953d32e4d Dockerfile.antithesis: add symbols for rust, cataloging for python, and antithesis tests to image, update entrypoint 2025-04-24 12:24:44 -04:00
eric-dinh-antithesis
62e2745c3c Dockerfile.antithesis: install dependencies 2025-04-24 12:23:22 -04:00
eric-dinh-antithesis
364a78b270 Cargo.toml: add profile for antithesis builds for full debug 2025-04-24 12:22:03 -04:00
eric-dinh-antithesis
f993a22023 antithesis-tests: add all tests 2025-04-24 12:20:41 -04:00
pedrocarlo
2e147b20a8 Adjustments and explicitly just emitting NoConflict on unique indexes 2025-04-24 13:13:39 -03:00
Jussi Saurio
80d39929ad Merge 'types: refactor serialtype again to make it faster' from Jussi Saurio
Basically, serialtype got slower in #1398, maybe because of the wasted
space of `enum SerialType` being 16 bytes, so I've now refactored
`SerialType` to be a transparent newtype wrapper over `u64` and
introduced a separate `SerialTypeKind` enum.
At least on my machine the perf regression is gone, and performance may
even be a bit better than before.
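A minimal sketch of what such a newtype could look like, assuming the standard SQLite record-format serial type codes. Only the names `SerialType` and `SerialTypeKind` come from the PR; the variant names and methods here are illustrative guesses, not the merged code.

```rust
/// Newtype over the raw serial type code; `repr(transparent)` keeps it
/// exactly the size of a `u64`.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SerialType(u64);

/// Hypothetical classification, computed on demand instead of being stored.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SerialTypeKind {
    Null, I8, I16, I24, I32, I48, I64, F64, ConstInt0, ConstInt1, Blob, Text,
}

impl SerialType {
    /// Classify the code; 10 and 11 are reserved in the SQLite file format.
    fn kind(self) -> Option<SerialTypeKind> {
        use SerialTypeKind::*;
        match self.0 {
            0 => Some(Null),
            1 => Some(I8),
            2 => Some(I16),
            3 => Some(I24),
            4 => Some(I32),
            5 => Some(I48),
            6 => Some(I64),
            7 => Some(F64),
            8 => Some(ConstInt0),
            9 => Some(ConstInt1),
            10 | 11 => None,
            n if n % 2 == 0 => Some(Blob),
            _ => Some(Text),
        }
    }

    /// Payload size in bytes for this serial type.
    fn payload_size(self) -> Option<u64> {
        match self.0 {
            0 | 8 | 9 => Some(0),
            1 => Some(1),
            2 => Some(2),
            3 => Some(3),
            4 => Some(4),
            5 => Some(6),
            6 | 7 => Some(8),
            10 | 11 => None,
            n if n % 2 == 0 => Some((n - 12) / 2),
            n => Some((n - 13) / 2),
        }
    }
}
```

With this layout `std::mem::size_of::<SerialType>()` is 8 bytes rather than 16.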

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1399
2025-04-24 18:59:31 +03:00
Jussi Saurio
7921d7c2e0 types: refactor serialtype again to make it faster 2025-04-24 17:28:31 +03:00
Jussi Saurio
2ffeefe165 Merge 'core/types: remove duplicate serialtype implementation' from Jussi Saurio
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1398
2025-04-24 16:17:17 +03:00
Jussi Saurio
04adf8242a faster validate 2025-04-24 16:05:12 +03:00
Jussi Saurio
af6a783f4d core/types: remove duplicate serialtype implementation 2025-04-24 15:38:47 +03:00
Jussi Saurio
0c800524af Merge 'Bugfix: Explain command should display syntax errors in CLI' from Anton Harniakou
Closes #1392

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1396
2025-04-24 15:11:59 +03:00
Anton Harniakou
fdf3dd9796 Bugfix: Explain command should display syntax errors in CLI
Closes #1392
2025-04-24 13:25:00 +03:00
Jussi Saurio
dc3e97887f Merge 'replace vec with array in btree balancing' from Lâm Hoàng Phúc
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1354
2025-04-24 11:22:07 +03:00
Jussi Saurio
2e8042510e Merge 'Pragma page size reading' from Anton Harniakou
1) Fix a bug where cli pretty mode would not print pragma results;
2) Add ability to read page_size using PRAGMA page_size;

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1394
2025-04-24 11:08:55 +03:00
Jussi Saurio
6ff5ff49b7 Merge 'perf/btree: use binary search for Index seek operations' from Jussi Saurio
## Beef
Followup to #1357 which did the same treatment for table btrees only.
After this PR, all of our seeks use binary search for both interior and
leaf pages.
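As a rough illustration of the idea (not the btree code itself), a `SeekGE` over the sorted keys of one page can be expressed as a binary search for the first key that is greater than or equal to the target. The sorted slice of `(i64, i64)` composite keys and the `seek_ge` name are assumptions made for this sketch; real pages hold cells that have to be decoded before comparison.

```rust
/// Returns the position of the first key >= `target`, or `None` if every
/// key on the (hypothetical) page is smaller than the target.
fn seek_ge(keys: &[(i64, i64)], target: (i64, i64)) -> Option<usize> {
    // `partition_point` binary-searches for the first element where the
    // predicate turns false, i.e. the first key that is not < target.
    let pos = keys.partition_point(|k| *k < target);
    if pos < keys.len() {
        Some(pos)
    } else {
        None
    }
}
```

This takes O(log n) comparisons per page instead of the O(n) of a linear scan over the cells.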
## Perf comparison
using TPC-H 1GB db for this query:
```sql
limbo> explain select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     18    0                    0   Start at 18
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     7     0                    0   table=sqlite_autoindex_partsupp_1, root=7
3     Rewind             0     14    0                    0   Rewind lineitem
4       Column           0     1     2                    0   r[2]=lineitem.l_partkey
5       IsNull           2     13    0                    0   if (r[2]==NULL) goto 13
6       Column           0     2     3                    0   r[3]=lineitem.l_suppkey
7       IsNull           3     13    0                    0   if (r[3]==NULL) goto 13
8       SeekGE           1     13    2                    0   key=[2..3] <-- index seek here, for every row in lineitem
9         IdxGT          1     13    2                    0   key=[2..3]
10        Integer        1     5     0                    0   r[5]=1
11        AggStep        0     5     4     count          0   accum=r[4] step(r[5])
12      Next             1     9     0                    0
13    Next               0     4     0                    0
14    AggFinal           0     4     0     count          0   accum=r[4]
15    Copy               4     1     0                    0   r[1]=r[4]
16    ResultRow          1     1     0                    0   output=r[1]
17    Halt               0     0     0                    0
18    Transaction        0     0     0                    0   write=false
19    Goto               0     1     0                    0
```
main:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│   6001215 │
└───────────┘
Command stats:
----------------------------
total: 40.292102375 s (this includes parsing/coloring of cli app)
```
PR:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│   6001215 │
└───────────┘
Command stats:
----------------------------
total: 14.021689916 s (this includes parsing/coloring of cli app)
```
almost 3x faster. buzzkill: still 3x slower than sqlite :)

Closes #1387
2025-04-24 10:53:35 +03:00
Anton Harniakou
51fc1773ea Fix missing documentation warning; improve the documentation message 2025-04-24 10:36:23 +03:00
Jussi Saurio
c88c579154 Merge 'expr.is_nonnull(): return true if col.primary_key || col.notnull' from Jussi Saurio
This avoids redundant `IsNull` instructions during index seeks if the
seek key columns are primary keys of other tables, which they often are.
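The rule in the title can be sketched as below. The `Column` struct is a pared-down, hypothetical stand-in whose field names follow the PR title, and `isnull_guards_needed` is an invented helper showing how a planner might use the result.

```rust
/// Hypothetical, pared-down column metadata.
struct Column {
    primary_key: bool,
    notnull: bool,
}

/// A column is treated as never NULL if it is a primary key or NOT NULL.
fn is_nonnull(col: &Column) -> bool {
    col.primary_key || col.notnull
}

/// For each seek-key column, decide whether an `IsNull` guard is still needed.
fn isnull_guards_needed(seek_key: &[Column]) -> Vec<bool> {
    seek_key.iter().map(|col| !is_nonnull(col)).collect()
}
```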

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1388
2025-04-24 10:32:00 +03:00
Anton Harniakou
0a69ea0138 Support reading db page size using PRAGMA page_size 2025-04-24 10:12:02 +03:00
pedrocarlo
9dd1ced5ad added tests 2025-04-23 20:38:08 -03:00
pedrocarlo
b6036cc79d Primary key constraint working 2025-04-23 16:44:13 -03:00
Jussi Saurio
c09e4d1d38 Merge 'Numeric Types Overhaul' from Levy A.
### Summary
  - Sqlite compatible string to float conversion
    - Accompanied with the new `cast_real` fuzz target
  - `NonNan` wrapper type over `f64`
    - Now we can guarantee that operations that can result in a NaN
are handled
  - `Numeric` and `NullableInteger` types that encapsulate all numeric
and bitwise operations
    - This is now guaranteed to be 100% compatible with sqlite with the
`expression` fuzz target (with the exception of the commented out
operation that will be implemented in a later PR)
One thing that might be reworked here is the heavy use of traits and
operator overloading, but it looks reasonable to me.
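A minimal sketch of the `NonNan` idea, assuming a constructor that rejects NaN and operators that return `Option` so callers have to handle the NaN case. The method names and the choice of `Option` as the output type are assumptions for illustration, not the PR's actual API.

```rust
use std::ops::{Add, Div};

/// Wrapper over `f64` that, by construction, never holds a NaN.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct NonNan(f64);

impl NonNan {
    /// Returns `None` for NaN, so a `NonNan` value is always a real number
    /// or an infinity.
    fn new(value: f64) -> Option<Self> {
        if value.is_nan() {
            None
        } else {
            Some(NonNan(value))
        }
    }

    fn get(self) -> f64 {
        self.0
    }
}

impl Add for NonNan {
    // The result may be NaN (e.g. inf + -inf), so it is re-validated.
    type Output = Option<NonNan>;
    fn add(self, rhs: NonNan) -> Option<NonNan> {
        NonNan::new(self.0 + rhs.0)
    }
}

impl Div for NonNan {
    // 0.0 / 0.0 is NaN, so division is also re-validated.
    type Output = Option<NonNan>;
    fn div(self, rhs: NonNan) -> Option<NonNan> {
        NonNan::new(self.0 / rhs.0)
    }
}
```

Because every operator returns `Option<NonNan>`, the type system forces each call site to decide what a NaN result should turn into.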

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1386
2025-04-23 18:34:32 +03:00
Jussi Saurio
9e1f15c679 Merge 'python: add UV project for 'scripts'' from Jussi Saurio
mainly so i don't have to install pygithub every time i want to `uv run
scripts/merge-pr.py`

Closes #1385
2025-04-23 18:33:57 +03:00
Jussi Saurio
a7488496d5 expr.is_nonnull(): return true if col.primary_key || col.notnull 2025-04-23 18:10:33 +03:00
Jussi Saurio
af703110f8 btree: remove extra iter_dir argument that can be derived from seek_op 2025-04-23 17:38:48 +03:00
Jussi Saurio
044339efc7 btree: rename tablebtree_move_to_binsearch -> tablebtree_move_to 2025-04-23 17:35:22 +03:00
Jussi Saurio
8c338438dd btree: use binary search for index interior cell seek 2025-04-23 17:34:32 +03:00
Jussi Saurio
7a133f422f btree: use binary search for index leaves 2025-04-23 17:34:32 +03:00
Jussi Saurio
8743dcd0da btree: extract indexbtree_seek() into a function like tablebtree_seek() 2025-04-23 17:34:32 +03:00
Jussi Saurio
48071b7ad7 tests/fuzz/compound_index_seek: order select cols by definition order 2025-04-23 17:34:32 +03:00
Jussi Saurio
517390a4ea tests/fuzz/compound_index_seek: show which table had failed query 2025-04-23 16:57:43 +03:00
Anton Harniakou
5c18c1c57a Draw table if it contains any row
Some tables can be headerless, for example results of PRAGMA calls
2025-04-23 16:36:43 +03:00
Levy A.
8ff906e353 fix: decrease even more nested operations
this is a worrying trend
2025-04-23 10:15:49 -03:00
Levy A.
613a332e99 doc: add doc for DoubleDouble 2025-04-23 10:13:32 -03:00
Levy A.
2cbb59e3f9 refactor: renaming and better types 2025-04-23 09:53:37 -03:00
Levy A.
ed27f22e2f comment out incompatible operations 2025-04-23 08:34:58 -03:00
Levy A.
f1ee92bf2d numeric types overhaul 2025-04-23 08:34:58 -03:00
Jussi Saurio
3bbd443286 python: add UV project for 'scripts'
mainly so i don't have to install pygithub every time i want to
`uv run scripts/merge-pr.py`
2025-04-23 10:32:38 +03:00
Jussi Saurio
fd2b274556 Merge 'Python script to compare vfs performance' from Preston Thorpe
This PR adds a Python script that uses the `TestLimboShell` setup to run
some semi-naive benchmarks/comparisons against the `io_uring` and `syscall`
IO back-ends.
### Usage:
```sh
make bench-vfs SQL="insert into products (name, price) values ('testing', randomblob(1024*4));" N=50
```
The script will execute the given `SQL` `N` times with each back-end,
compute the mean runtime for each, and display the results.
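The measurement loop amounts to something like the following. The script itself is Python and drives the CLI shell; this Rust sketch only shows the timing logic, and `mean_runtime` is a hypothetical helper.

```rust
use std::time::{Duration, Instant};

/// Runs `run` exactly `n` times and returns the mean wall-clock time per run.
fn mean_runtime<F: FnMut()>(mut run: F, n: u32) -> Duration {
    assert!(n > 0, "need at least one run to compute a mean");
    let mut total = Duration::ZERO;
    for _ in 0..n {
        let start = Instant::now();
        run();
        total += start.elapsed();
    }
    total / n
}
```

Calling it once per back-end with the same workload gives the two numbers to compare.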
![image](https://github.com/user-attachments/assets/b2399196-dbdd-4b98-8210-536e68979edd)
😬

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1377
2025-04-23 10:25:56 +03:00