turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-08 02:34:20 +01:00

Author	SHA1	Message	Date
Zaid Humayun	e994adfb40	Persisting database header and pointer map page to cache This commit ensures that the metadata in the database header and the pointer map pages allocated are correctly persisted to the page cache. This was not being done earlier.	2025-06-06 23:14:25 +05:30
Zaid Humayun	1f5025541c	addresses comment https://github.com/tursodatabase/limbo/pull/1600#discussion_r2115796655 by @jussisaurio this commit ensures that ptrmap operations return a CursorResult so operation can be suspended & later retried	2025-06-06 23:14:25 +05:30
Zaid Humayun	5827a33517	Beginnings of AUTOVACUUM This commit introduces AUTOVACUUM to Limbo. It introduces the concept of ptrmap pages and also adds some additional instructions that are required to make AUTOVACUUM PRAGMA work	2025-06-06 23:14:22 +05:30
Jussi Saurio	2087393d22	Merge 'Write database header via normal pager route' from meteorgan Closes: #1613 Closes #1634	2025-06-04 09:39:14 +03:00
Pekka Enberg	c6ef19396d	Merge 'Add support for pragma table-valued functions' from Piotr Rżysko This PR adds support for table-valued functions for PRAGMAs (see the [PRAGMA functions section](https://www.sqlite.org/pragma.html)). Additionally, it introduces built-in table-valued functions. I considered using extensions for this, but there are several reasons in favor of a dedicated mechanism: * It simplifies the use of internal functions, structs, etc. For example, when implementing `json_each` and `json_tree`, direct access to internals was necessary: https://github.com/tursodatabase/limbo/pull/1088 * It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/limbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my hardware show that `pragma_table_info()` implemented as an extension is 2.5× slower than the built-in version. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1642	2025-06-04 09:08:10 +03:00
meteorgan	f2bf6251cd	write database header via normal pager route	2025-06-03 22:06:08 +08:00
Jussi Saurio	c488c32d43	Merge 'Make cursor seek reentrant' from Pedro Muniz Closes #1628. Every function that calls `process_overflow_read` needs to be reentrant. I did not change it here, but it would include `get_prev_record` and `get_next_record`. Maybe `tablebtree_move_to` did not need to use the state machine, but I included it as a safeguard. Edit: Closes #1625 . When I implemented `restore_context`, I forgot to add a `return_if_io` after calling it in `next` 🤦‍♂️ Edit: Closes #1617 . Just tested it and it also solves this bug. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1636	2025-06-03 14:24:40 +03:00
Jussi Saurio	5f586b7b24	Merge 'Small tracing enhancement' from Pedro Muniz Instrument trace_insn to debug print the stack pc and instruction. Also, disable rustyline logs for the CLI as it is too noisy to work with. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1635	2025-06-02 17:41:52 +03:00
Piotr Rzysko	4d35e36b77	Introduce virtual table types	2025-06-01 07:45:57 +02:00
Piotr Rzysko	b291179554	Extract cursor logic from VirtualTable into VirtualTableCursor	2025-06-01 07:45:57 +02:00
Piotr Rzysko	6300deb77f	Move VTabOpaqueCursor to vtab module	2025-06-01 07:45:57 +02:00
pedrocarlo	c9c73f2497	fix explain panicking on None CursorKey	2025-05-31 01:19:26 -03:00
pedrocarlo	bc563266b3	add instrumentation to more functions for debugging + adjust how cursors are opened	2025-05-30 20:35:50 -03:00
pedrocarlo	33480540f1	make cursor seek reentrant	2025-05-30 13:25:46 -03:00
pedrocarlo	0757109676	instrument trace_insn	2025-05-30 11:33:22 -03:00
Pere Diaz Bou	d4f1b8e068	update i64::MAX comment	2025-05-30 14:02:05 +02:00
Pere Diaz Bou	da4190a23e	Convert u64 rowid to i64 Rowids can be negative, therefore let's swap to i64	2025-05-30 13:07:31 +02:00
Jussi Saurio	819a6138d0	Merge 'Fix: aggregate regs must be initialized as NULL at the start' from Jussi Saurio Again found when fuzzing nested where clause subqueries: Aggregate registers need to be NULLed at the start because the same registers might be reused on another invocation of a subquery, and if they are not NULLed, the 2nd invocation of the same subquery will have values left over from the first invocation. Reviewed-by: Preston Thorpe (@PThorpe92) Closes #1614	2025-05-30 09:39:37 +03:00
Jussi Saurio	5632a6046e	Merge 'Fix: allow DeferredSeek on more than one cursor per program' from Jussi Saurio Found while fuzzing nested subqueries. Since subqueries result in nested plans, it quickly revealed that there can be multiple `DeferredSeek` instructions issued for different cursors, but our `ProgramState` only supported one at a time. Closes #1610	2025-05-30 09:39:23 +03:00
Jussi Saurio	f8257df77b	Fix: aggregate regs must be initialized as NULL at the start	2025-05-29 18:44:53 +03:00
Jussi Saurio	69133b3b2e	Fix: allow DeferredSeek on more than one cursor per program	2025-05-29 16:05:47 +03:00
Jussi Saurio	cc405dea7e	Use new TableReferences struct everywhere	2025-05-29 11:44:56 +03:00
Jussi Saurio	592ba41137	Add assertion forbidding duplicate cursor keys	2025-05-29 01:04:45 +03:00
Jussi Saurio	77ce4780d9	Fix ProgramBuilder::cursor_ref not having unique keys Currently we have this: `program.alloc_cursor_id(Option<String>, CursorType)` where the String is the table's name or alias ('users' or 'u' in the query). This is problematic because this can happen: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` There are two cursors, both with identifier 't'. This causes a bug where the program will use the same cursor for both the main query and the subquery, since they are keyed by 't'. Instead introduce `CursorKey`, which is a combination of: 1. `TableInternalId`, and 2. index name (`Option<String>`), in case of index cursors. This should provide key uniqueness for cursors: `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` here the first 't' will have a different `TableInternalId` than the second 't', so there is no clash.	2025-05-29 00:59:24 +03:00
Pere Diaz Bou	28bd24b7d4	clear page cache on transaction failure This is the first step towards rollback, since we still don't spill pages with WAL, we can simply invalidate page cache in case of failure.	2025-05-28 15:54:28 +02:00
Jussi Saurio	ad0f2bb399	Merge 'Small VDBE insn tweaks' from Jussi Saurio 1. allow calling op_null with Insn::BeginSubrtn - BeginSubrtn is identical to Null, but named differently so that its use in context is clearer 2. Insn::Return: add possibility to fallthrough on non-integer values as per sqlite spec Closes #1588	2025-05-27 20:19:31 +03:00
meteorgan	d9d3a5ecbb	Use the SetCookie opcode to implement user_version pragma	2025-05-28 00:31:11 +08:00
Jussi Saurio	6914d61180	allow calling op_null with Insn::BeginSubrtn	2025-05-27 19:09:15 +03:00
Jussi Saurio	70965f4b28	Insn::Return: add possibility to fallthrough on non-integer values as per sqlite spec	2025-05-27 19:09:10 +03:00
Jussi Saurio	a88e1c38f3	Merge 'Fix bug: op_vopen should replace cursor slot, not add new one' from Jussi Saurio Found this when reviewing #1528 locally and this was crashing ```sql INSERT INTO t SELECT * FROM generate_series(1,10,1); ``` Reason was that `op_vopen` was not replacing the already allocated cursor slot, but using `.insert()` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #1583	2025-05-27 12:50:11 +03:00
Pere Diaz Bou	312bb5205a	Merge 'Reset idx delete state after successful finish' from Pere Diaz Bou If we don't reset the state of `IdxDelete`, next `IdxDelete` will start in `Deleting` state which is completely wrong since it should seek from the start. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1584	2025-05-27 11:31:25 +02:00
Pere Diaz Bou	a5a8a52a07	reset-idx-delete-state	2025-05-27 10:47:21 +02:00
Jussi Saurio	360b1fcdae	Fix bug: op_vopen should replace cursor slot, not add new one	2025-05-27 10:52:36 +03:00
Jussi Saurio	b72b99c973	Merge 'feature: `INSERT INTO <table> SELECT`' from Pedro Muniz Closes #1528 . - Modified `translate_select` so that the caller can define if the statement is a top-level statement or a subquery. - Refactored `translate_insert` to offload the translation of multi-row VALUES and SELECT statements to `translate_select` - I did not try to change much of `populate_column_registers` as I did not want to break `translate_virtual_table_insert`. Ideally, I would want to unite this remaining logic folding `populate_column_registers` into `populate_columns_multiple_rows` and the `translate_virtual_table_insert` into `translate_insert`. But, I think this may be best suited for a separate PR. ## TODO - ~Tests~ - Done - ~Need to emit a temp table when we are selecting and inserting into the Same Table - https://github.com/sqlite/sqlite/blob/master/src/insert.c#L1369~ - Done - Optimization when tables have the exact same schema - open an Issue about it - Virtual Tables do not benefit yet from this feature - open an Issue about it Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1566	2025-05-27 10:50:26 +03:00
Jussi Saurio	3ba9f2ab97	Small cleanups to pager/wal/vdbe - mostly naming - Instead of using a confusing CheckpointStatus for many different things, introduce the following statuses: * PagerCacheflushStatus - cacheflush can result in either: - the WAL being written to disk and fsynced - but also a checkpoint to the main DB file, and fsyncing the main DB file Reflect this in the type. * WalFsyncStatus - previously CheckpointStatus was also used for this, even though fsyncing the WAL doesn't checkpoint. * CheckpointStatus/CheckpointResult is now used only for actual checkpointing. - Rename HaltState to CommitState (program.halt_state -> program.commit_state) - Make WAL a non-optional property in Pager * This gets rid of a lot of if let Some(...) boilerplate * For ephemeral indexes, provide a DummyWAL implementation that does nothing. - Rename program.halt() to program.commit_txn() - Add some documentation comments to structs and functions	2025-05-26 10:37:34 +03:00
pedrocarlo	e3fd1e589e	support using an INSERT SELECT that references the same table in both statements	2025-05-25 19:15:28 -03:00
Jussi Saurio	7c07c09300	Add stable internal_id property to TableReference Currently our "table id"/"table no"/"table idx" references always use the direct index of the `TableReference` in the plan, e.g. in `SelectPlan::table_references`. For example: ```rust Expr::Column { table: 0, column: 3, .. } ``` refers to the 0'th table in the `table_references` list. This is a fragile approach because it assumes the table_references list is stable for the lifetime of the query processing. This has so far been the case, but there exist certain query transformations, e.g. subquery unnesting, that may fold new table references from a subquery (which has its own table ref list) into the table reference list of the parent. If such a transformation is made, then potentially all of the Expr::Column references to tables will become invalid. Consider this example: ```sql -- Assume tables: users(id, age), orders(user_id, amount) -- Get total amount spent per user on orders over $100 SELECT u.id, sub.total FROM users u JOIN (SELECT user_id, SUM(amount) as total FROM orders o WHERE o.amount > 100 GROUP BY o.user_id) sub WHERE u.id = sub.user_id -- Before subquery unnesting: -- Main query table_references: [users, sub] -- u.id refers to table 0, column 0 -- sub.total refers to table 1, column 1 -- -- Subquery table_references: [orders] -- o.user_id refers to table 0, column 0 -- o.amount refers to table 0, column 1 -- -- After unnesting and folding subquery tables into main query, -- the query might look like this: SELECT u.id, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100 GROUP BY u.id; -- Main query table_references: [users, orders] -- u.id refers to table index 0 (correct) -- o.amount refers to table index 0 (incorrect, should be 1) -- o.user_id refers to table index 0 (incorrect, should be 1) ``` We could ofc traverse every expression in the subquery and rewrite the table indexes to be correct, but if we instead use stable identifiers for each table reference, then all the column references will continue to be correct. Hence, this PR introduces a `TableInternalId` used in `TableReference` as well as `Expr::Column` and `Expr::Rowid` so that this kind of query transformations can happen with less pain.	2025-05-25 20:26:17 +03:00
PThorpe92	c2ec6caae1	Finish integrating xConnect into vtable open api	2025-05-24 14:49:58 -04:00
Jussi Saurio	597020bc0c	Merge 'Support values statement and values in select' from meteorgan Close: #866 limbo output: ``` limbo> explain values(1, 2); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 5 0 0 Start at 5 1 Integer 1 1 0 0 r[1]=1 2 Integer 2 2 0 0 r[2]=2 3 ResultRow 1 2 0 0 output=r[1..2] 4 Halt 0 0 0 0 5 Goto 0 1 0 0 limbo> explain values(1, 2), (3, 4); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 2 0 0 r[2]=1 3 Integer 2 3 0 0 r[3]=2 4 Yield 1 0 0 0 5 Integer 3 2 0 0 r[2]=3 6 Integer 4 3 0 0 r[3]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 11 Copy 2 4 0 0 r[4]=r[2] 12 Copy 3 5 0 0 r[5]=r[3] 13 ResultRow 4 2 0 0 output=r[4..5] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 limbo> explain select * from (values(1, 2), (3, 4)); addr opcode p1 p2 p3 p4 p5 comment ---- ----------------- ---- ---- ---- ------------- -- ------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 2 0 0 r[2]=1 3 Integer 2 3 0 0 r[3]=2 4 Yield 1 0 0 0 5 Integer 3 2 0 0 r[2]=3 6 Integer 4 3 0 0 r[3]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 11 Copy 2 4 0 0 r[4]=r[2] 12 Copy 3 5 0 0 r[5]=r[3] 13 ResultRow 4 2 0 0 output=r[4..5] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Transaction 0 0 0 0 write=false 17 Goto 0 1 0 0 ``` sqlite output: ``` sqlite> explain values(1, 2); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 5 0 0 Start at 5 1 Integer 1 1 0 0 r[1]=1 2 Integer 2 2 0 0 r[2]=2 3 ResultRow 1 2 0 0 output=r[1..2] 4 Halt 0 0 0 0 5 Goto 0 1 0 0 sqlite> explain values(1, 2), (3, 4); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 4 0 0 r[4]=1 3 Integer 2 5 0 0 r[5]=2 4 Yield 1 0 0 0 5 Integer 3 4 0 0 r[4]=3 6 Integer 4 5 0 0 r[5]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 next row of 2-ROW VALUES CLAUSE 11 Copy 4 8 0 2 r[8]=r[4] 12 Copy 5 9 0 2 r[9]=r[5] 13 ResultRow 8 2 0 0 output=r[8..9] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 sqlite> explain select * from (values(1, 2), (3, 4)); addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 16 0 0 Start at 16 1 InitCoroutine 1 9 2 0 2 Integer 1 4 0 0 r[4]=1 3 Integer 2 5 0 0 r[5]=2 4 Yield 1 0 0 0 5 Integer 3 4 0 0 r[4]=3 6 Integer 4 5 0 0 r[5]=4 7 Yield 1 0 0 0 8 EndCoroutine 1 0 0 0 9 InitCoroutine 1 0 2 0 10 Yield 1 15 0 0 next row of 2-ROW VALUES CLAUSE 11 Copy 4 8 0 2 r[8]=r[4] 12 Copy 5 9 0 2 r[9]=r[5] 13 ResultRow 8 2 0 0 output=r[8..9] 14 Goto 0 10 0 0 15 Halt 0 0 0 0 16 Goto 0 1 0 0 ``` Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1549	2025-05-23 13:56:31 +03:00
Zaid Humayun	4312d371fb	addresses comment https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102606810 by @jussisaurio this commit changes the btree_destroy() signature to return an Option<usize>. This more closely resembles Rust semantics instead of passing a pointer to a usize. However, I'm unsure if I'm handling the cursor result correctly	2025-05-23 00:46:05 +05:30
meteorgan	34e05ef974	make values work in subquery	2025-05-23 00:30:04 +08:00
Zaid Humayun	4072a41c9c	Drop Table now uses an ephemeral table as a scratch table Now when dropping a table, an ephemeral table is created as a scratch table. If a root page of some other table is moved into the page occupied by the root page of the table being dropped, that row is first written into an ephemeral table. Then on a next pass, it is deleted from the schema table and then re-inserted with the new root page. This happens during AUTOVACUUM when deleting a root page will force the last root page to move into the slot being vacated by the root page of the table being deleted	2025-05-22 19:39:46 +05:30
Jussi Saurio	533a00eae3	Fix bug in op_decr_jump_zero()	2025-05-22 11:40:49 +03:00
Jussi Saurio	8bec75d804	Merge 'Initial Support for Nested Translation' from Pedro Muniz This PR introduces some modifications to the Program Builder to allow us to use nested parsing. By focusing the emission of Init and the last Goto (prologue and epilogue), inside the ProgramBuilder, we can just not emit them if we are parsing/translating in a nested context. For this PR, I only migrated insert to use these functions as I need them to support Insert statements that use `SELECT FROM` syntax. Nested parsing overall enables code reuse for us and arguably is one of the only ways to parse deeply nested queries without a lot of code duplication. #1528 Closes #1543	2025-05-22 10:52:00 +03:00
Jussi Saurio	c7f984c5c8	Merge 'Page cache fixes' from Pere Diaz Bou This PR builds on top of https://github.com/tursodatabase/limbo/pull/1368 and adds a few things like allowing inserting pages with the same page key, fixing fuzz tests by adding transactions, and some minor improvements to cacheflush. Closes #1523	2025-05-22 10:12:56 +03:00
Jussi Saurio	fc150b12c9	Merge 'CSV virtual table extension' from Piotr Rżysko This PR adds a port of [SQLite's CSV virtual table extension](https://www.sqlite.org/csv.html). Planned follow-ups: * Pass detailed error messages from `VTabModule::create`, not just `ResultCode`s. * Address the TODO in `VTabModuleImpl::create_schema`. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1544	2025-05-22 09:48:53 +03:00
pedrocarlo	53bf5d5ef5	adjust translate functions to take a program instead of `Option<ProgramBuilder>` + remove any Init emission in translate functions + use epilogue in all places necessary	2025-05-21 16:41:10 -03:00
pedrocarlo	1c12535d9f	push prologue to top-level translate function	2025-05-21 15:50:43 -03:00
pedrocarlo	3090dd91fa	push translate_ctx creation outside of prologue	2025-05-21 13:06:25 -03:00
pedrocarlo	f5d6d11d16	extract prologue and epilogue to program builder	2025-05-21 12:47:51 -03:00
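
Note on commits 77ce4780d9 and 592ba41137: the cursor-key change described there can be illustrated with a small, self-contained Rust sketch. This is not the project's code. `TableInternalId` and `CursorKey` are named in the commit message, but their fields here, the `CursorAllocator` type, and the `alloc` and `demo` functions are hypothetical stand-ins. The sketch only shows why keying cursors by a stable table id (plus an optional index name) keeps the two `t` references in `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` apart, and how an assertion can forbid duplicate keys.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `TableInternalId` named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TableInternalId(usize);

/// Hypothetical cursor key: a stable table id plus an optional index name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CursorKey {
    table_id: TableInternalId,
    index_name: Option<String>,
}

/// Toy allocator (an assumption, not the real ProgramBuilder):
/// hands out sequential cursor ids and refuses duplicate keys.
#[derive(Default)]
struct CursorAllocator {
    ids: HashMap<CursorKey, usize>,
}

impl CursorAllocator {
    fn alloc(&mut self, key: CursorKey) -> usize {
        // Mirrors the idea of "assertion forbidding duplicate cursor keys".
        assert!(!self.ids.contains_key(&key), "duplicate cursor key: {key:?}");
        let id = self.ids.len();
        self.ids.insert(key, id);
        id
    }
}

/// Models `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`:
/// both references are named `t`, but each gets its own internal id.
fn demo() -> (usize, usize) {
    let mut alloc = CursorAllocator::default();
    let outer = alloc.alloc(CursorKey { table_id: TableInternalId(1), index_name: None });
    let inner = alloc.alloc(CursorKey { table_id: TableInternalId(2), index_name: None });
    (outer, inner)
}

fn main() {
    let (outer, inner) = demo();
    println!("outer cursor = {outer}, inner cursor = {inner}");
}
```

If both references were keyed by the string `t` instead, the second `alloc` call would hit the assertion, which is the clash the commit message describes.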


741 Commits