turso

mirror of https://github.com/aljazceru/turso.git synced 2026-01-02 16:04:20 +01:00

Author	SHA1	Message	Date
Jussi Saurio	47ef30b22e	btree: fix interior cell replacement in btrees with depth >=3 When a divider cell is deleted from an index interior page, the following algorithm is used: 1. Find predecessor: Move to largest key in left subtree (self.prev()) 2. Create replacement: Convert predecessor leaf cell to interior cell format, using original cell's left child pointer 3. Replace: Drop original cell from parent page, insert replacement at same position 4. Cleanup: Delete predecessor from leaf page The error in our logic was that we always expected to only traverse down one level of the btree: ```rust let parent_page = self.stack.parent_page().unwrap(); let leaf_page = self.stack.top(); ``` This meant that when the deletion happened on say, level 1, and the replacement cell was taken from level 3, we actually inserted the replacement cell into level 2 instead of level 1. In #2106, this manifested as the following chain of pages, going from parent to children: 3 -> 111 -> 119 Cell was deleted from page 3 (whose left pointer is 111), and a replacement cell was taken from 119, incorrectly inserted into 111, and its left child pointer also set as 111! The fix is quite trivial: store the page we are on before we start traversing down. Closes #2106	2025-07-16 10:12:59 +03:00
Jussi Saurio	f482424d77	Merge 'small refactor: rename "amount" to "extra_amount"' from Nikita Sivukhin Small refactoring to reduce confusion (I was caught in this trap and set `amount` to one in the CDC branch during development). This PR also fixes the broken `concat_ws` emit logic. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2100	2025-07-16 06:51:35 +03:00
Diego Reis	0e9771ac07	refactor: Change redundant "Status" enums to IOResult Let's unify the semantics of "something done" or yields I/O into a single type	2025-07-15 20:56:18 -03:00
Diego Reis	d0af54ae77	refactor: Change CursorResult to IOResult The reasoning here is to treat I/O operations (Either is "Done" or yields to IO) with the same generic type.	2025-07-15 20:52:25 -03:00
Nikita Sivukhin	c018b06bf5	fix bug in concat_ws translation	2025-07-16 00:48:17 +04:00
Nikita Sivukhin	f7fb2aac5e	adjust extra_amount for schema translation code	2025-07-16 00:47:59 +04:00
Nikita Sivukhin	be0a607ba8	rename amount -> extra_amount	2025-07-16 00:46:17 +04:00
Jussi Saurio	86b1b0d009	Merge 'fix record header size calculations and incorrect assumptions' from Jussi Saurio - remove assumptions that record header size fits into 1 byte or serial type fits into 1 byte - add tests for record header size calculation ```sql turso> CREATE TABLE t(x TEXT, y); CREATE INDEX t_idx ON t(x); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'a', 1); -- 1000 bytes of 'a' INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'b', 2); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'c', 3); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'd', 4); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'e', 5); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'f', 6); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'g', 7); INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'h', 8); SELECT COUNT() FROM t WHERE x >= replace(hex(zeroblob(100)), '00', 'a'); ┌───────────┐ │ COUNT () │ ├───────────┤ │ 8 │ └───────────┘ ``` Fixes #2096 Fixes #2088 Reviewed-by: Nikita Sivukhin (@sivukhin) Closes #2098	2025-07-15 19:09:31 +03:00
Jussi Saurio	fda92d43a2	adjust comment in header size test	2025-07-15 18:52:27 +03:00
Jussi Saurio	025ddd98a6	Merge 'bench: add insert benchmark (batch sizes: 1,10,100)' from Jussi Saurio ``` Insert rows in batches/limbo_insert_1_rows time: [344.71 µs 363.45 µs 379.31 µs] Insert rows in batches/sqlite_insert_1_rows time: [575.12 µs 769.16 µs 983.30 µs] Insert rows in batches/limbo_insert_10_rows time: [1.4964 ms 1.5694 ms 1.6334 ms] Insert rows in batches/sqlite_insert_10_rows time: [510.79 µs 766.56 µs 1.0677 ms] Insert rows in batches/limbo_insert_100_rows time: [5.5177 ms 5.6806 ms 5.8619 ms] Insert rows in batches/sqlite_insert_100_rows time: [439.91 µs 879.43 µs 1.4260 ms] ``` Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2092	2025-07-15 18:12:59 +03:00
Jussi Saurio	927a1f158a	Merge 'btree: unify table&index seek page boundary handling' from Jussi Saurio ## Background PR #2065 fixed a bug with table btree seeks concerning boundaries of leaf pages. The issue was that if we were e.g. looking for the first key greater than (GT) 100, we always assumed the key would either be found on the left child page of a given divider (e.g. divider 102) or not at all, which is incorrect. #2065 has more discussion and documentation about this, so read that one for more context. ## This PR We already had similar handling for index btrees as #2065 introduced for table btrees, but it was baked into the `BTreeCursor` struct's seek handling itself, whereas #2065 handled this on the VDBE side. This PR unifies this handling for both table and index btrees by always doing the additional cursor advancement in the VDBE. Unfortunately, unlike table btrees, index btrees may also need to do an additional advance when they are looking for an exact match. This resulted in a bigger refactor than anticipated, since there are quite a few VDBE instructions that may perform a seek, e.g.: `IdxInsert`, `IdxDelete`, `Found`, `NotFound`, `NoConflict`. All of these can potentially end up in a similar situation where the cursor needs one more advance after the initial seek, and they were currently calling `cursor.seek()` directly and expecting the `BTreeCursor` to handle the auto-advance fallback internally. For this reason, I have 1. removed the "TryAdvance"-ish logic from the index btree internals and 2. extracted a common VDBE helper `fn seek_internal()` - heavily based on the existing `op_seek_internal()`, but decoupled from instructions and the program counter - which all the interested VDBE instructions will call to delegate their seek logic. Closes #2083 Reviewed-by: Nikita Sivukhin (@sivukhin) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2084	2025-07-15 18:02:52 +03:00
Jussi Saurio	932536a03f	compare_records: fix assumption that header size is 1 byte and serial type is 1 byte	2025-07-15 17:57:52 +03:00
Jussi Saurio	7c353095ed	types: fix and unify record header size calculation	2025-07-15 17:38:02 +03:00
Jussi Saurio	3a861e1618	bench: add insert benchmark (batch sizes: 1,10,100)	2025-07-15 13:41:56 +03:00
Jussi Saurio	beaf393476	Merge 'Treat table-valued functions as tables' from Piotr Rżysko First step toward resolving https://github.com/tursodatabase/limbo/issues/1643. ### This PR With this change, the following two queries are considered equivalent: ```sql SELECT value FROM generate_series(5, 50); SELECT value FROM generate_series WHERE start = 5 AND stop = 50; ``` Arguments passed in parentheses to the virtual table name are now matched to hidden columns. Additionally, I fixed two bugs related to virtual tables. ### TODO (I'll handle this in a separate PR) Column references are still not supported as table-valued function arguments. The only difference is that previously, a query like: ```sql SELECT one.value, series.value FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series; ``` would cause a panic. Now, it returns a proper error message instead. Adding support for column references is more nuanced for two main reasons: * We need to ensure that in joins where a TVF depends on other tables, those other tables are processed first. For example, in: ```sql SELECT one.value, series.value FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one; ``` the one table must be processed by the top-level loop, and series must be nested. * For outer joins involving TVFs, the arguments must be treated as `ON` predicates, not `WHERE` predicates. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1727	2025-07-15 12:23:45 +03:00
Jussi Saurio	0ab0af912c	Merge 'bindings/js: fix more tests' from Mikaël Francoeur Six more tests passing on Turso. The commits can be reviewed separately. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2085	2025-07-15 12:17:15 +03:00
meteorgan	d7bdfeb711	reinitialize WalFileShare when reset page size	2025-07-15 16:34:07 +08:00
meteorgan	b42a1ef272	minor improvements based on PR comments	2025-07-15 16:34:07 +08:00
meteorgan	f123c77ee8	fix set page_size in pager	2025-07-15 16:34:07 +08:00
meteorgan	e2ab673624	fix self.pager.replace() panic	2025-07-15 16:34:07 +08:00
meteorgan	bf69b86e94	fix: not all pragma need transaction	2025-07-15 16:34:07 +08:00
meteorgan	a6faab17e9	fix query page size	2025-07-15 16:34:07 +08:00
meteorgan	cf126824de	Support set page size	2025-07-15 16:34:07 +08:00
Mikaël Francoeur	e25064959b	return info object	2025-07-14 14:35:48 -04:00
Jussi Saurio	553396e9ca	btree: unify table&index seek page boundary handling PR #2065 fixed a bug with table btree seeks concerning boundaries of leaf pages. The issue was that if we were e.g. looking for the first key greater than (GT) 100, we always assumed the key would either be found on the left child page of a given divider (e.g. divider 102) or not at all, which is incorrect. #2065 has more discussion and documentation about this, so read that one for more context. Anyway: We already had similar handling for index btrees, but it was baked into the `BTreeCursor` struct's seek handling itself, whereas #2065 handled this on the VDBE side. This PR unifies this handling for both table and index btrees by always doing the additional cursor advancement in the VDBE. Unfortunately, since indexes may also need to do an additional advance when they are looking for an exact match, this resulted in a bigger refactor than anticipated, since there are quite a few VDBE instructions that may perform a seek, e.g.: `IdxInsert`, `IdxDelete`, `Found`, `NotFound`, `NoConflict`. All of these can potentially end up in a similar situation where the cursor needs one more advance after the initial seek. For this reason, I have extracted a common VDBE helper `fn seek_internal()` which all the interested VDBE instructions will call to delegate their seek logic.	2025-07-14 16:46:43 +03:00
Pekka Enberg	55cf9c8f02	Merge 'Add async header accessor functionality' from Zaid Humayun This PR addresses https://github.com/tursodatabase/turso/issues/1828 in a phased manner. Making database header access async in one PR would be complicated. This PR adds an async API to `header_accessor.rs` and ports over some of `pager.rs` to use this API. This will allow gradual porting of all call sites. Once all call sites are ported over, one mechanical rename will fix everything in the repo so we don't have any `<header_name>_async` functions. Also, porting header accessors over from sync to async would be a good way for first-time contributors to get introduced to the Limbo codebase. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1966	2025-07-14 13:08:29 +03:00
Pekka Enberg	1653b0883a	Merge 'core/vector: Euclidean distance support for vector search' from KarinaMilet This PR provides Euclidean distance support for limbo's vector search. At the same time, some type abstractions are introduced, such as `DistanceCalculator`, etc. This is because I hope to unify the current vector module in the future to make it more structured, clearer, and more extensible. While practicing Euclidean distance for Limbo, I discovered that many checks could be done using the type system or in advance, rather than waiting until the distance is calculated. By building these checks into the type system or doing them ahead of time, this would allow us to explore more efficient computations, such as automatic vectorization or SIMD acceleration, which is future work. Reviewed-by: Nikita Sivukhin (@sivukhin) Closes #1986	2025-07-14 13:07:20 +03:00
Pekka Enberg	90532eabdf	Merge 'b-tree: fix bug in case when no matching rows was found in seek in the leaf page' from Nikita Sivukhin The current table B-Tree seek code relies on the invariant that if key `K` is present in an interior page then it must also be present in a leaf page. This is generally not true if data was ever deleted from the table, because the leaf row whose key was used as a divider in the interior pages can be deleted. Also, the SQLite spec says nothing about such an invariant, so the `turso-db` implementation of the B-Tree should not rely on it. This PR introduces 3 options for the B-Tree `seek` result: `Found`, `NotFound` and `TryAdvance`, the last of which is generated when the leaf page has no match for `seek_op` but the DB doesn't know whether a neighbor page can have matching data. There is an alternative approach where we move the cursor to the neighbor page in `seek` itself, but I was afraid to introduce such changes because the analogous `seek` function from SQLite works exactly like the current version of the code, and I think some query planner internals (for insertion) can rely on the fact that repositioning will leave the cursor at the position of insertion: > If an exact match is not found, then the cursor is always left pointing at a leaf page which would hold the entry if it were present. The cursor might point to an entry that comes before or after the key. Also, this PR introduces new B-tree fuzz tests which generate a table B-tree from scratch and execute operations over it. This can help to reach some non-trivial states and also generate huge DBs faster (that's how this bug was discovered). Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #2065	2025-07-14 12:57:09 +03:00
Pekka Enberg	1a0d618a41	Merge 'Assert I/O read and write sizes' from Pere Diaz Bou Let's assert for now that we do not read/write less bytes than expected. This should be fixed to retrigger several reads/writes if we couldn't read/write enough but for now let's assert. Closes #2078	2025-07-14 12:22:18 +03:00
Nikita Sivukhin	413d93f041	fix after rebase	2025-07-14 13:05:20 +04:00
Nikita Sivukhin	5bd3287826	add comments	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	aceaf182b1	remove comment	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	6e2ccdff20	add btree fuzz tests which generate seed file from scratch	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	f9cd5fad4c	add small comment	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	fc400906d5	handle case when target seek page has no matching entries	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	03b2725cc7	return SeekResult from seek operation - Apart from regular states Found/NotFound seek result has TryAdvance value which tells caller to advance the cursor in necessary direction because the leaf page which would hold the entry if it was present actually has no matching entry (but neighbouring page can have match)	2025-07-14 13:01:15 +04:00
Nikita Sivukhin	77bf6c287d	introduce proper state machine for seek op code	2025-07-14 13:01:14 +04:00
Pekka Enberg	9285d8b83b	Merge 'Fix: OP_NewRowId to generate semi random rowid when largest rowid is `i64::MAX`' from Krishna Vishal - `OP_NewRowId` now generates new rowid semi randomly when the largest rowid in the table is `i64::MAX`. - Introduced new `LimboError` variant `DatabaseFull` to signify that database might be full (SQLite behaves this way returning `SQLITE_FULL`). Now: ```SQL turso> CREATE TABLE q(x INTEGER PRIMARY KEY, y); turso> INSERT INTO q VALUES (9223372036854775807, 1); turso> INSERT INTO q(y) VALUES (2); turso> INSERT INTO q(y) VALUES (3); turso> SELECT * FROM q; ┌─────────────────────┬───┐ │ x │ y │ ├─────────────────────┼───┤ │ 1841427626667347484 │ 2 │ ├─────────────────────┼───┤ │ 4000338366725695791 │ 3 │ ├─────────────────────┼───┤ │ 9223372036854775807 │ 1 │ └─────────────────────┴───┘ ``` Fixes: https://github.com/tursodatabase/turso/issues/1977 Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #1985	2025-07-14 11:56:09 +03:00
Pekka Enberg	0b544717a1	Merge 'do not check rowid alias for null' from Nikita Sivukhin Simple PR to fix a minor issue where `INTEGER PRIMARY KEY NOT NULL` (`NOT NULL` is obviously redundant here) prevents the user from inserting anything into the table, as the rowid-alias column is always set to null by `turso-db`. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #2063	2025-07-14 11:55:06 +03:00
Pere Diaz Bou	3a34f21434	io/windows: pread return bytes read	2025-07-14 10:44:56 +02:00
Pere Diaz Bou	340391538a	io: change comment for assert	2025-07-14 10:36:06 +02:00
Pere Diaz Bou	88ff218810	io: assert small I/O Let's assert for now that we do not read/write less bytes than expected. This should be fixed to retrigger several reads/writes if we couldn't read/write enough but for now let's assert.	2025-07-14 10:19:41 +02:00
Nikita Sivukhin	0457567714	more clippy fixes	2025-07-14 12:09:39 +04:00
Krishna Vishal	12f9743443	Remove unused imports	2025-07-14 13:13:54 +05:30
Krishna Vishal	ab0cb06755	split seek and getting rowid as two separate states	2025-07-14 13:11:41 +05:30
Krishna Vishal	3e880c34d6	Make `op_new_rowid` re-entrant Introduce `OpNewRowidState` state machine remove `get_new_rowid` from vdbe/mod.rs	2025-07-14 13:11:40 +05:30
Krishna Vishal	98ca275b33	Add a way to semi randomly generate rowid when the max rowid reaches `i64::MAX`. We do this by attempting to generate random values smaller than `i64::MAX` for 100 times and returns `DatabaseFull` error on failure - Introduced `DatabaseFull` error variant Fixes: https://github.com/tursodatabase/turso/issues/1977	2025-07-14 13:09:34 +05:30
Nikita Sivukhin	b330c6b70e	fix clippy	2025-07-14 11:38:08 +04:00
Nikita Sivukhin	e94ebbad04	remove unwanted changes	2025-07-14 11:27:51 +04:00
Nikita Sivukhin	cc04f11bd6	remove clone	2025-07-14 11:27:51 +04:00
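
The `Found` / `NotFound` / `TryAdvance` seek result described in 90532eabdf and 03b2725cc7 can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyCursor`, `seek_gt`, `advance` and `seek_gt_with_advance` are hypothetical names, leaf pages are modelled as sorted vectors, and the caller passes in the leaf that the interior dividers would have pointed at. It is not the actual `BTreeCursor` or VDBE code.

```rust
/// Seek outcomes named in the commit messages above.
#[derive(Debug, PartialEq)]
enum SeekResult {
    Found,
    NotFound,
    TryAdvance,
}

/// Hypothetical toy cursor: each leaf page is a sorted Vec of keys.
struct ToyCursor<'a> {
    leaves: &'a [Vec<i64>],
    leaf: usize,
    slot: usize,
}

impl<'a> ToyCursor<'a> {
    fn new(leaves: &'a [Vec<i64>]) -> Self {
        Self { leaves, leaf: 0, slot: 0 }
    }

    /// Key under the cursor, if the cursor points at a valid entry.
    fn current(&self) -> Option<i64> {
        self.leaves.get(self.leaf)?.get(self.slot).copied()
    }

    /// Look for the first key > target, but only inside `leaf_idx`.
    fn seek_gt(&mut self, leaf_idx: usize, target: i64) -> SeekResult {
        let leaves = self.leaves;
        let page = &leaves[leaf_idx];
        self.leaf = leaf_idx;
        match page.iter().position(|&k| k > target) {
            Some(i) => {
                self.slot = i;
                SeekResult::Found
            }
            None => {
                // Park the cursor past the end of this leaf.
                self.slot = page.len();
                if leaf_idx + 1 < leaves.len() {
                    // A neighbouring leaf may still hold a match.
                    SeekResult::TryAdvance
                } else {
                    SeekResult::NotFound
                }
            }
        }
    }

    /// Step to the next entry, crossing leaf boundaries if needed.
    fn advance(&mut self) -> Option<i64> {
        self.slot += 1;
        while self.leaf < self.leaves.len() && self.slot >= self.leaves[self.leaf].len() {
            self.leaf += 1;
            self.slot = 0;
        }
        self.current()
    }
}

/// Caller-side handling: one extra advance when the seek says TryAdvance.
fn seek_gt_with_advance(cur: &mut ToyCursor, leaf_idx: usize, target: i64) -> Option<i64> {
    match cur.seek_gt(leaf_idx, target) {
        SeekResult::Found => cur.current(),
        SeekResult::NotFound => None,
        SeekResult::TryAdvance => cur.advance(),
    }
}
```

With leaves `[95, 100]` and `[105, 110]`, a GT seek for 100 that lands on the first leaf finds nothing there, returns `TryAdvance`, and the extra advance yields 105 from the neighbouring leaf.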

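
The rowid fallback described in 98ca275b33 (up to 100 random attempts, then a `DatabaseFull` error) might look roughly like the sketch below. Everything here is an assumption made for illustration: `new_rowid`, `ToyError`, the `BTreeSet` standing in for the table, the injected `rng` closure, and the restriction to positive candidates. The real `op_new_rowid` is a re-entrant state machine over a B-tree cursor.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the `DatabaseFull` error variant.
#[derive(Debug, PartialEq)]
enum ToyError {
    DatabaseFull,
}

/// Attempt count taken from the commit message.
const MAX_ATTEMPTS: usize = 100;

/// Pick a rowid for a new row. `existing` models the table's rowids and
/// `rng` is any source of random i64 values (injected to keep this testable).
fn new_rowid(existing: &BTreeSet<i64>, mut rng: impl FnMut() -> i64) -> Result<i64, ToyError> {
    match existing.iter().next_back() {
        // Empty table: start at 1.
        None => Ok(1),
        // Common case: one past the current maximum.
        Some(&max) if max < i64::MAX => Ok(max + 1),
        // Maximum is i64::MAX: probe random unused values instead.
        Some(_) => {
            for _ in 0..MAX_ATTEMPTS {
                let candidate = rng();
                // Assumption: only positive candidates below i64::MAX are accepted.
                if candidate > 0 && candidate < i64::MAX && !existing.contains(&candidate) {
                    return Ok(candidate);
                }
            }
            Err(ToyError::DatabaseFull)
        }
    }
}
```

Passing the random source as a closure is a design choice for this sketch only; it lets a caller substitute a fixed sequence to exercise the `DatabaseFull` path deterministically.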

3366 Commits
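
For context on 7c353095ed and 932536a03f: in the SQLite record format the header starts with a varint holding the total header size (including that varint itself), followed by one varint serial type per column, and a text value of N bytes has serial type 2N+13. The sketch below shows why neither value is guaranteed to fit in one byte. The function names are hypothetical and this is not the code from those commits.

```rust
/// Bytes needed to encode `v` as a SQLite varint (1 to 9).
fn varint_len(v: u64) -> usize {
    let mut len = 1;
    let mut rest = v >> 7;
    while rest != 0 && len < 9 {
        len += 1;
        rest >>= 7;
    }
    len
}

/// Serial type of a text value that is `n` bytes long.
fn text_serial_type(n: u64) -> u64 {
    2 * n + 13
}

/// Total record header size for the given column serial types.
/// The leading size varint counts its own length, so grow it until it fits.
fn header_size(serial_types: &[u64]) -> usize {
    let body: usize = serial_types.iter().map(|&t| varint_len(t)).sum();
    let mut size_len = 1;
    loop {
        let total = body + size_len;
        let needed = varint_len(total as u64);
        if needed == size_len {
            return total;
        }
        size_len = needed;
    }
}
```

A 1001-byte text value, as in the example from #2098, has serial type 2015, which already takes two varint bytes. A record with 127 one-byte serial types has a 129-byte header, because the size varint itself grows to two bytes.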