turso

mirror of https://github.com/aljazceru/turso.git synced 2026-02-12 11:44:21 +01:00

Author	SHA1	Message	Date
meteorgan	eabe5e1631	temporarily comment the pragma-page-count-empty test case	2025-04-26 21:45:18 +08:00
meteorgan	f3f09a5b7b	Fix pragma page_count	2025-04-26 21:45:18 +08:00
Pekka Enberg	bde2d4f0a3	Fix Antithesis docker-compose.yaml	2025-04-26 09:14:24 +03:00
Jussi Saurio	0d77ea9446	Merge 'Optimization: only initialize `Rustyline` if we are in a tty' from Pedro Muniz This is a small nitpick, but it will be useful for #1258. If we are testing or just piping some SQL through stdin, we can skip initializing `Rustyline` and save some execution time. On the `Select 1` bench, I got a minor performance bump, but it becomes less apparent on more complex queries. Closes #1372	2025-04-25 23:02:42 +03:00
Jussi Saurio	454a409cae	Merge 'refactor database open_file and open' from meteorgan reduce redundant code Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1406	2025-04-25 22:03:49 +03:00
Jussi Saurio	3553d05c32	Merge 'Give name to hard-coded page_size values' from Anton Harniakou Related to #1379 I guess there are more hard-coded values. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1404	2025-04-25 22:03:43 +03:00
meteorgan	6a860e75b8	fix cargo clippy	2025-04-25 22:06:44 +08:00
meteorgan	0202fa3ed0	add back one comment	2025-04-25 21:57:35 +08:00
Jussi Saurio	fe65d6e991	Merge 'Performance: hoist entire expressions out of hot loops if they are constant' from Jussi Saurio

## Problem:
- We have cases where we are evaluating expressions in a hot loop that only need to be evaluated once. For example: `CAST('2025-01-01' as DATETIME)` -- the value of this never changes, so we should only run it once.
- We have no robust way of doing this right now for entire _expressions_ -- the only existing facility we have is `program.mark_last_insn_constant()`, which has no concept of how many instructions translating a given _expression_ spends, and breaks very easily for this reason.

## Main ideas of this PR:
- Add `expr.is_constant()` determining whether the expression is compile-time constant. Tries to be conservative and not deem something compile-time constant if there is no certainty.
- Whenever we think a compile-time constant expression is about to be translated into bytecode in `translate_expr()`, start a so-called `constant span`, which means a range of instructions that are part of a compile-time constant expression.
- At the end of translating the program, all `constant spans` are hoisted outside of any table loops so they only get evaluated once.
- The target offsets of any jump instructions (e.g. `Goto`) are moved to the correct place, taking into account all instructions whose offsets were shifted due to moving the compile-time constant expressions around.
- An escape hatch wrapper `translate_expr_no_constant_opt()` is added for cases where we should not hoist constants even if we otherwise could. Right now the only example of this is cases where we are reusing the same register(s) in multiple iterations of some kind of loop, e.g. `VALUES(...)` or in the `coalesce()` function implementation.

## Performance effects
Here is an example of a modified/simplified TPC-H query where the `CAST()` calls were previously run millions of times in a hot loop, but now they are optimized out of the loop.

BYTECODE PLAN BEFORE:
```sql
limbo> explain select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     26    0                    0   Start at 26
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     9     0                    0   table=orders, root=9
3     OpenRead           2     8     0                    0   table=customer, root=8
4     Rewind             0     25    0                    0   Rewind lineitem
5     Column             0     10    5                    0   r[5]=lineitem.l_shipdate
6     String8            0     7     0     1995-03-29     0   r[7]='1995-03-29'
7     Function           0     7     6     cast           0   r[6]=func(r[7..8]) <-- CAST() executed millions of times
8     Le                 5     6     24                   0   if r[5]<=r[6] goto 24
9     Column             0     0     9                    0   r[9]=lineitem.l_orderkey
10    SeekRowid          1     9     24                   0   if (r[9]!=orders.rowid) goto 24
11    Column             1     4     10                   0   r[10]=orders.o_orderdate
12    String8            0     12    0     1995-03-29     0   r[12]='1995-03-29'
13    Function           0     12    11    cast           0   r[11]=func(r[12..13])
14    Ge                 10    11    24                   0   if r[10]>=r[11] goto 24
15    Column             1     1     14                   0   r[14]=orders.o_custkey
16    SeekRowid          2     14    24                   0   if (r[14]!=customer.rowid) goto 24
17    Column             2     6     15                   0   r[15]=customer.c_mktsegment
18    Ne                 15    16    24                   0   if r[15]!=r[16] goto 24
19    Column             0     0     1                    0   r[1]=lineitem.l_orderkey
20    Integer            3     2     0                    0   r[2]=3
21    Column             1     4     3                    0   r[3]=orders.o_orderdate
22    Column             1     7     4                    0   r[4]=orders.o_shippriority
23    ResultRow          1     4     0                    0   output=r[1..4]
24    Next               0     5     0                    0
25    Halt               0     0     0                    0
26    Transaction        0     0     0                    0   write=false
27    String8            0     8     0     DATETIME       0   r[8]='DATETIME'
28    String8            0     13    0     DATETIME       0   r[13]='DATETIME'
29    String8            0     16    0     FURNITURE      0   r[16]='FURNITURE'
30    Goto               0     1     0
```

BYTECODE PLAN AFTER:
```sql
limbo> explain select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     21    0                    0   Start at 21
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     9     0                    0   table=orders, root=9
3     OpenRead           2     8     0                    0   table=customer, root=8
4     Rewind             0     20    0                    0   Rewind lineitem
5     Column             0     10    5                    0   r[5]=lineitem.l_shipdate
6     Le                 5     6     19                   0   if r[5]<=r[6] goto 19
7     Column             0     0     9                    0   r[9]=lineitem.l_orderkey
8     SeekRowid          1     9     19                   0   if (r[9]!=orders.rowid) goto 19
9     Column             1     4     10                   0   r[10]=orders.o_orderdate
10    Ge                 10    11    19                   0   if r[10]>=r[11] goto 19
11    Column             1     1     14                   0   r[14]=orders.o_custkey
12    SeekRowid          2     14    19                   0   if (r[14]!=customer.rowid) goto 19
13    Column             2     6     15                   0   r[15]=customer.c_mktsegment
14    Ne                 15    16    19                   0   if r[15]!=r[16] goto 19
15    Column             0     0     1                    0   r[1]=lineitem.l_orderkey
16    Column             1     4     3                    0   r[3]=orders.o_orderdate
17    Column             1     7     4                    0   r[4]=orders.o_shippriority
18    ResultRow          1     4     0                    0   output=r[1..4]
19    Next               0     5     0                    0
20    Halt               0     0     0                    0
21    Transaction        0     0     0                    0   write=false
22    String8            0     7     0     1995-03-29     0   r[7]='1995-03-29'
23    String8            0     8     0     DATETIME       0   r[8]='DATETIME'
24    Function           1     7     6     cast           0   r[6]=func(r[7..8]) <-- CAST() executed twice
25    String8            0     12    0     1995-03-29     0   r[12]='1995-03-29'
26    String8            0     13    0     DATETIME       0   r[13]='DATETIME'
27    Function           1     12    11    cast           0   r[11]=func(r[12..13])
28    String8            0     16    0     FURNITURE      0   r[16]='FURNITURE'
29    Integer            3     2     0                    0   r[2]=3
30    Goto               0     1     0                    0
```

EXECUTION RUNTIME BEFORE:
```sql
limbo> select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 3.633396667 s (this includes parsing/coloring of cli app)
```

EXECUTION RUNTIME AFTER:
```sql
limbo> select l_orderkey, 3 as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where c_mktsegment = 'FURNITURE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < cast('1995-03-29' as datetime) and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 2.0923475 s (this includes parsing/coloring of cli app)
```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1359	2025-04-25 16:55:41 +03:00
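The hoisting and jump-retargeting steps described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Insn` enum, the `Span` alias and `hoist_constant_spans()` are hypothetical names, and the sketch assumes the last instruction is the closing `Goto` back into the main body, as in the plans shown in the commit message.

```rust
/// Toy instruction set (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Insn {
    /// Unconditional jump to an instruction offset.
    Goto { target: usize },
    /// Any non-jump instruction, identified by a label.
    Op(&'static str),
}

/// Half-open range of instruction offsets that make up one constant expression.
type Span = std::ops::Range<usize>;

/// Moves every instruction inside a constant span to just before the final
/// instruction, then rewrites jump targets to the shifted offsets.
fn hoist_constant_spans(insns: &[Insn], spans: &[Span]) -> Vec<Insn> {
    assert!(!insns.is_empty());
    let last = insns.len() - 1;
    let is_const = |i: usize| spans.iter().any(|s| s.contains(&i));

    // New ordering of the old offsets: loop body, hoisted constants, final Goto.
    let mut order: Vec<usize> = (0..last).filter(|&i| !is_const(i)).collect();
    order.extend((0..last).filter(|&i| is_const(i)));
    order.push(last);

    // Map each old offset to its new offset.
    let mut new_offset = vec![0usize; insns.len()];
    for (new, &old) in order.iter().enumerate() {
        new_offset[old] = new;
    }

    // Assumption of this sketch: a jump that pointed at a hoisted instruction
    // should land on the next instruction that stayed in place.
    let resolve = |mut t: usize| {
        while t < last && is_const(t) {
            t += 1;
        }
        t
    };

    order
        .iter()
        .map(|&old| match &insns[old] {
            Insn::Goto { target } => Insn::Goto { target: new_offset[resolve(*target)] },
            other => other.clone(),
        })
        .collect()
}
```

Given a six-instruction program whose offset 1 is a constant `cast`, the constant ends up after `Transaction` and both jumps are retargeted, which mirrors how the `Init` and final `Goto` targets change between the two plans above.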
meteorgan	f464d15f8b	refactor database open_file and open	2025-04-25 21:45:18 +08:00
Anton Harniakou	dd7c0ad1c8	Give name to hard-coded page_size values	2025-04-25 14:15:15 +03:00
Jussi Saurio	7137f4ab3b	Merge 'Feature: Composite Primary key constraint' from Pedro Muniz

Closes #1384. This PR implements the Primary Key constraint for inserts. As can be seen in the issue, if you created an Index with a Primary Key constraint, it could trigger a `Unique Constraint` error, but still insert the record. SQLite uses the opcode `NoConflict` to check if the record already exists in the Btree. As we did not have this opcode yet, I implemented it. It is very similar to `NotFound`, with the difference that if any value in the Record is Null, it will immediately jump to the offset. The added benefit of implementing this is that we now fully support Composite Primary Keys. Also, I think with the current implementation, it will be trivial to implement the Unique opcode for Insert. To support Updates, I need to understand more of the plan optimizer and find where we are making the Record and opening the autoindex. For testing, I have written a test generator to generate many different tables that can have a varying number of Primary Keys.

```sql
limbo> CREATE TABLE users (id INT, username TEXT, PRIMARY KEY (id, username));
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> explain INSERT INTO users VALUES (1, 'alice');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     OpenWrite          0     2     0                    0
2     Integer            1     2     0                    0   r[2]=1
3     String8            0     3     0     alice          0   r[3]='alice'
4     OpenWrite          1     3     0                    0
5     NewRowId           0     1     0                    0
6     Copy               2     5     0                    0   r[5]=r[2]
7     Copy               3     6     0                    0   r[6]=r[3]
8     Copy               1     7     0                    0   r[7]=r[1]
9     MakeRecord         5     3     8                    0   r[8]=mkrec(r[5..7])
10    NoConflict         1     12    5     2              0   key=r[5]
11    Halt               1555  0     0     users.id, users.username  0
12    IdxInsert          1     8     5                    0   key=r[8]
13    MakeRecord         2     2     4                    0   r[4]=mkrec(r[2..3])
14    Insert             0     4     1                    0
15    Halt               0     0     0                    0
16    Transaction        0     1     0                    0   write=true
17    Goto               0     1     0                    0
limbo> INSERT INTO users VALUES (1, 'alice');
× Runtime error: UNIQUE constraint failed: users.id, users.username (19)
limbo> INSERT INTO users VALUES (1, 'bob');
limbo> INSERT INTO users VALUES (1, 'bob');
× Runtime error: UNIQUE constraint failed: users.id, users.username (19)
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1393	2025-04-24 23:25:30 +03:00
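The `NoConflict` behaviour described in this commit (like `NotFound`, except that a Null anywhere in the key means an immediate jump) can be sketched as below. The `Value` enum, the `BTreeSet` standing in for the index btree, and the `no_conflict()` function are all assumptions made for illustration; they are not the VDBE types from the repository.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for a register value (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Value {
    Null,
    Integer(i64),
    Text(String),
}

/// Returns true when execution should jump past the constraint-failure `Halt`,
/// i.e. when inserting `key` cannot conflict with an existing index entry.
fn no_conflict(index: &BTreeSet<Vec<Value>>, key: &[Value]) -> bool {
    // A key containing Null never conflicts, so jump immediately.
    if key.iter().any(|v| *v == Value::Null) {
        return true;
    }
    // Otherwise behave like NotFound: jump only if the key is absent.
    !index.contains(key)
}
```

With an index holding `(1, 'alice')`, the key `(1, 'alice')` does not jump and falls through to the `Halt`, while `(1, 'bob')` and `(NULL, 'alice')` both jump to the insert.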
Pekka Enberg	4d0c40a435	One more fix to Antithesis Dockerfile	2025-04-24 21:17:36 +03:00
Pekka Enberg	117dbe6c8c	Fix Antithesis Docker file some more	2025-04-24 21:12:40 +03:00
Pekka Enberg	fa5d6dcf6b	Fix Antithesis Docker file	2025-04-24 21:03:19 +03:00
Pekka Enberg	31677c9c94	scripts/antithesis: Build Docker image for x86-64	2025-04-24 20:55:30 +03:00
Pekka Enberg	2a5eb8e5bc	stress: Make Clippy happy	2025-04-24 20:46:26 +03:00
Pekka Enberg	ebc2e475b6	Merge 'Add Antithesis Tests' from eric-dinh-antithesis

This PR adds 2 autonomous test suites for use with Antithesis: `bank-test` and `stress-composer`. It also modifies the existing `limbo_stress` test to run as a singleton and modifies other Antithesis-related configuration files.

bank-test
- `first_setup.py`
  - initializes a DB table with columns accounts and balance
  - generates random balances for each account
  - stores initial state of the table
- `parallel_driver_generate_transaction.py`
  - selects 2 accounts from the table as sender and receiver
  - generates a random value which is subtracted from sender and added to receiver
- `anytime/eventually/finally_validate.py`
  - checks that sum of initial balances == sum of current balances

stress-composer
- Breaks `limbo_stress` into component parts
- `first_setup.py`
  - creates up to 10 tables with up to 10 columns
  - stores table details in a separate db
- `parallel_driver_insert.py`
  - randomly generates and executes up to 100 insert statements into a single table using random values derived from the table details
- `parallel_driver_update.py`
  - randomly generates and executes up to 100 updates into a single table using random values derived from the table details
- `parallel_driver_delete.py`
  - randomly generates and executes up to 100 deletes from a single table using random values derived from the table details

Closes #1401	2025-04-24 20:44:42 +03:00
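The invariant that `bank-test` validates can be illustrated with a small in-memory model, sketched here in Rust rather than the Python of the actual scripts. The `transfer()` and `total()` helpers are hypothetical and only model the balance arithmetic, not the database access.

```rust
/// Moves `amount` from one account to another (indices into `balances`).
fn transfer(balances: &mut [i64], sender: usize, receiver: usize, amount: i64) {
    balances[sender] -= amount;
    balances[receiver] += amount;
}

/// Sum of all balances; the validators compare this against the initial sum.
fn total(balances: &[i64]) -> i64 {
    balances.iter().sum()
}
```

Because every transfer subtracts and adds the same value, the total stays fixed no matter how many transfers run or in which order, which is why the check is meaningful at any point during the run.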
eric-dinh-antithesis	27e15364c4	stress: suppress logfile since it's too big	2025-04-24 12:27:58 -04:00
eric-dinh-antithesis	b8885777dc	stress: move sdk setup_complete from limbo_stress to docker-entrypoint	2025-04-24 12:27:05 -04:00
eric-dinh-antithesis	75ae5dbd13	stress: update docker-compose	2025-04-24 12:26:00 -04:00
eric-dinh-antithesis	8390233b99	Dockerfile.antithesis: update limbo_stress build step	2025-04-24 12:25:19 -04:00
eric-dinh-antithesis	5953d32e4d	Dockerfile.antithesis: add symbols for rust, cataloging for python, and antithesis tests to image, update entrypoint	2025-04-24 12:24:44 -04:00
eric-dinh-antithesis	62e2745c3c	Dockerfile.antithesis: install dependencies	2025-04-24 12:23:22 -04:00
eric-dinh-antithesis	364a78b270	Cargo.toml: add profile for antithesis builds for full debug	2025-04-24 12:22:03 -04:00
eric-dinh-antithesis	f993a22023	antithesis-tests: add all tests	2025-04-24 12:20:41 -04:00
pedrocarlo	2e147b20a8	Adjustments and explicitly just emitting NoConflict on unique indexes	2025-04-24 13:13:39 -03:00
Jussi Saurio	80d39929ad	Merge 'types: refactor serialtype again to make it faster' from Jussi Saurio Basically, serialtype got slower in #1398, maybe because of the wasted space of `enum SerialType` being 16 bytes, so I've now refactored `SerialType` to be a transparent newtype wrapper over `u64` and introduced a separate `SerialTypeKind` enum. At least on my machine the perf regression was nullified, if not even a bit better. Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1399	2025-04-24 18:59:31 +03:00
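A transparent newtype over `u64` plus a separate kind enum might look roughly like the sketch below. The names `SerialType` and `SerialTypeKind` come from the commit message; the variant names and the `kind()` and `size()` methods are assumptions, and the code mapping follows the serial type table of the SQLite record format rather than the repository's source.

```rust
/// 8-byte wrapper around the raw serial type code.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SerialType(u64);

/// Classification of a serial type code (variant names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SerialTypeKind {
    Null, I8, I16, I24, I32, I48, I64, F64, ConstInt0, ConstInt1, Blob, Text,
}

impl SerialType {
    /// Classifies the code; codes 10 and 11 are reserved and yield `None`.
    fn kind(self) -> Option<SerialTypeKind> {
        use SerialTypeKind::*;
        Some(match self.0 {
            0 => Null, 1 => I8, 2 => I16, 3 => I24, 4 => I32, 5 => I48,
            6 => I64, 7 => F64, 8 => ConstInt0, 9 => ConstInt1,
            10 | 11 => return None,
            n if n % 2 == 0 => Blob,
            _ => Text,
        })
    }

    /// Payload size in bytes for this serial type.
    fn size(self) -> Option<u64> {
        use SerialTypeKind::*;
        Some(match self.kind()? {
            Null | ConstInt0 | ConstInt1 => 0,
            I8 => 1, I16 => 2, I24 => 3, I32 => 4, I48 => 6,
            I64 | F64 => 8,
            Blob => (self.0 - 12) / 2,
            Text => (self.0 - 13) / 2,
        })
    }
}
```

The point of the split is that the value passed around stays 8 bytes, and the enum is only materialized when a caller actually needs to branch on the kind.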
Jussi Saurio	7921d7c2e0	types: refactor serialtype again to make it faster	2025-04-24 17:28:31 +03:00
Jussi Saurio	2ffeefe165	Merge 'core/types: remove duplicate serialtype implementation' from Jussi Saurio Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1398	2025-04-24 16:17:17 +03:00
Jussi Saurio	04adf8242a	faster validate	2025-04-24 16:05:12 +03:00
Jussi Saurio	af6a783f4d	core/types: remove duplicate serialtype implementation	2025-04-24 15:38:47 +03:00
Jussi Saurio	0c800524af	Merge 'Bugfix: Explain command should display syntax errors in CLI' from Anton Harniakou Closes #1392 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1396	2025-04-24 15:11:59 +03:00
Anton Harniakou	fdf3dd9796	Bugfix: Explain command should display syntax errors in CLI Closes #1392	2025-04-24 13:25:00 +03:00
Jussi Saurio	dc3e97887f	Merge 'replace vec with array in btree balancing' from Lâm Hoàng Phúc Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1354	2025-04-24 11:22:07 +03:00
Jussi Saurio	2e8042510e	Merge 'Pragma page size reading' from Anton Harniakou 1) Fix a bug where cli pretty mode would not print pragma results; 2) Add ability to read page_size using PRAGMA page_size; Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1394	2025-04-24 11:08:55 +03:00
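For context on what `PRAGMA page_size` reports: in the SQLite file format the page size is a 2-byte big-endian integer at offset 16 of the database header, with the value 1 standing for 65536. The sketch below decodes it from raw header bytes; `page_size_from_header()` is a hypothetical helper and says nothing about how the PR itself reads the value.

```rust
/// Decodes the page size from the start of a SQLite database file.
/// Returns `None` if the magic string is missing or the value is invalid.
fn page_size_from_header(header: &[u8]) -> Option<u32> {
    // The header starts with the 16-byte magic string "SQLite format 3\0".
    if header.len() < 18 || !header.starts_with(b"SQLite format 3\0") {
        return None;
    }
    // Bytes 16..18 hold the page size as a big-endian u16.
    let raw = u16::from_be_bytes([header[16], header[17]]);
    match raw {
        1 => Some(65536), // special encoding for the largest page size
        n if n >= 512 && n.is_power_of_two() => Some(u32::from(n)),
        _ => None,
    }
}
```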
Jussi Saurio	c3441f9685	vdbe: move comments if instructions were moved around in emit_constant_insns()	2025-04-24 11:05:21 +03:00
Jussi Saurio	029e5eddde	Fix existing resolve_label() calls to work with new system	2025-04-24 11:05:21 +03:00
Jussi Saurio	e557503091	expr.rs: use constant spans to optimize constant expressions	2025-04-24 11:05:21 +03:00
Jussi Saurio	0f5c791784	vdbe: refactor label resolution to account for insn offsets changing	2025-04-24 11:05:21 +03:00
Jussi Saurio	b4b38bdb3c	vdbe: resolve labels for InitCoroutine::start_offset	2025-04-24 11:05:21 +03:00
Jussi Saurio	47f3f3bda3	vdbe: replace constant_insns with constant_spans	2025-04-24 11:05:21 +03:00
Jussi Saurio	e5bab63522	add expr.is_constant()	2025-04-24 11:05:21 +03:00
Jussi Saurio	5bed331505	add Func::is_deterministic()	2025-04-24 11:05:21 +03:00
Jussi Saurio	b36c898842	rename check_constant() to less confusing name	2025-04-24 11:05:21 +03:00
Jussi Saurio	6ff5ff49b7	Merge 'perf/btree: use binary search for Index seek operations' from Jussi Saurio

## Beef
Followup to #1357 which did the same treatment for table btrees only. After this PR, all of our seeks use binary search for both interior and leaf pages.

## Perf comparison using TPC-H 1GB db for this query:
```sql
limbo> explain select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     18    0                    0   Start at 18
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     7     0                    0   table=sqlite_autoindex_partsupp_1, root=7
3     Rewind             0     14    0                    0   Rewind lineitem
4     Column             0     1     2                    0   r[2]=lineitem.l_partkey
5     IsNull             2     13    0                    0   if (r[2]==NULL) goto 13
6     Column             0     2     3                    0   r[3]=lineitem.l_suppkey
7     IsNull             3     13    0                    0   if (r[3]==NULL) goto 13
8     SeekGE             1     13    2                    0   key=[2..3] <-- index seek here, for every row in lineitem
9     IdxGT              1     13    2                    0   key=[2..3]
10    Integer            1     5     0                    0   r[5]=1
11    AggStep            0     5     4     count          0   accum=r[4] step(r[5])
12    Next               1     9     0                    0
13    Next               0     4     0                    0
14    AggFinal           0     4     0     count          0   accum=r[4]
15    Copy               4     1     0                    0   r[1]=r[4]
16    ResultRow          1     1     0                    0   output=r[1]
17    Halt               0     0     0                    0
18    Transaction        0     0     0                    0   write=false
19    Goto               0     1     0                    0
```

main:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│ 6001215   │
└───────────┘
Command stats:
----------------------------
total: 40.292102375 s (this includes parsing/coloring of cli app)
```

PR:
```sql
limbo> select count(1) from lineitem join partsupp on l_partkey = ps_partkey and l_suppkey = ps_suppkey;
┌───────────┐
│ count (1) │
├───────────┤
│ 6001215   │
└───────────┘
Command stats:
----------------------------
total: 14.021689916 s (this includes parsing/coloring of cli app)
```

almost 3x faster. buzzkill: still 3x slower than sqlite :)

Closes #1387	2025-04-24 10:53:35 +03:00
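The idea of a binary-search `SeekGE` over the sorted cells of a page can be sketched as follows. This is a simplified model under stated assumptions: a page is a sorted slice of two-column integer keys, and `seek_ge()` is a hypothetical helper built on the standard library's `partition_point`, not the btree code from the PR.

```rust
/// Returns the index of the first cell whose key is >= `key`, or `None`
/// if every cell on the page sorts before it. `cells` must be sorted.
fn seek_ge(cells: &[(i64, i64)], key: (i64, i64)) -> Option<usize> {
    // partition_point binary-searches for the first element where the
    // predicate is false, i.e. the first cell that is not less than `key`.
    let idx = cells.partition_point(|cell| *cell < key);
    if idx < cells.len() {
        Some(idx)
    } else {
        None
    }
}
```

Tuples compare lexicographically, which matches how a composite key such as `(l_partkey, l_suppkey)` is ordered in the index, so each seek needs O(log n) comparisons per page instead of a linear scan.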
Anton Harniakou	51fc1773ea	Fix missing documentation warning; improve the documentation message	2025-04-24 10:36:23 +03:00
Jussi Saurio	c88c579154	Merge 'expr.is_nonnull(): return true if col.primary_key || col.notnull' from Jussi Saurio This avoids redundant `IsNull` instructions during index seeks if the seek key columns are primary keys of other tables, which they often are. Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1388	2025-04-24 10:32:00 +03:00
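The rule in the commit title can be illustrated with a toy model. The `Column` struct and the `null_checks_needed()` helper are assumptions for illustration; only the `primary_key || notnull` condition is taken from the commit.

```rust
/// Minimal column metadata (hypothetical shape).
struct Column {
    name: &'static str,
    primary_key: bool,
    notnull: bool,
}

/// A column is treated as non-nullable if it is a primary key or NOT NULL.
fn is_nonnull(col: &Column) -> bool {
    col.primary_key || col.notnull
}

/// Names of the seek-key columns that would still need an `IsNull` guard.
fn null_checks_needed(seek_key: &[Column]) -> Vec<&'static str> {
    seek_key
        .iter()
        .filter(|col| !is_nonnull(col))
        .map(|col| col.name)
        .collect()
}
```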
Anton Harniakou	0a69ea0138	Support reading db page size using PRAGMA page_size	2025-04-24 10:12:02 +03:00
pedrocarlo	9dd1ced5ad	added tests	2025-04-23 20:38:08 -03:00


4183 Commits