10455 Commits

Author SHA1 Message Date
Jussi Saurio
27b1c1a1db Merge 'Fix self-insert with nested subquery' from Mikaël Francoeur
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312
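The second problem can be pictured with a minimal sketch. The types below are hypothetical, simplified stand-ins (the real `TableReferences` is more involved); the only point is that the lookup has to recurse into FROM subqueries instead of stopping at the top level:

```rust
// Hypothetical, simplified stand-ins for the planner's types.
#[derive(Debug)]
enum TableSource {
    /// A plain table, identified by name.
    Table(String),
    /// A subquery in the FROM clause, carrying its own table references.
    Subquery(TableReferences),
}

#[derive(Debug, Default)]
struct TableReferences {
    sources: Vec<TableSource>,
}

impl TableReferences {
    /// Returns true if `name` is referenced at this level or inside any
    /// nested FROM subquery.
    fn contains_table(&self, name: &str) -> bool {
        self.sources.iter().any(|source| match source {
            TableSource::Table(t) => t.eq_ignore_ascii_case(name),
            // Recurse instead of only inspecting the top-level tables.
            TableSource::Subquery(inner) => inner.contains_table(name),
        })
    }
}
```

With this shape, an `INSERT INTO t SELECT ... FROM (SELECT ... FROM t)` would be recognized as a self-insert even though `t` only appears inside the subquery.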

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3436
2025-10-01 08:56:16 +03:00
Jussi Saurio
8a08f085e8 Merge 'Fix SQLite database file pending byte page' from Pedro Muniz
SQLite has a crazy easter egg where, at the 1 GiB file offset, it
creates a `PENDING_BYTE_PAGE` that is used only by the VFS layer and is
never read or written.
To properly test this, I took inspiration from SQLite's testing
framework and defined a helper method that is conditionally compiled
when the `test_helper` feature is enabled.
https://github.com/sqlite/sqlite/blob/7e38287da43ea3b661da3d8c1f431aa907d648c9/src/main.c#L4327
As the `PENDING_BYTE` is normally at the 1 GiB mark, I created a
function that sets the static `PENDING_BYTE` atomic to whatever
value we want. This means we can test this unusual behaviour at any DB
file size we want.
`fuzz_pending_byte_database` is the test that fuzzes different pending
byte offsets and does an integrity check at the end to confirm we are
compatible with SQLite.
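A minimal sketch of the idea, not the actual implementation: the function names `set_pending_byte`, `pending_byte_page` and `next_page_number` are assumptions made for illustration. The page-number formula (`PENDING_BYTE / page_size + 1`) follows how SQLite computes the page containing the pending byte.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Default offset of the pending byte: 1 GiB, as in SQLite.
const DEFAULT_PENDING_BYTE: u64 = 0x4000_0000;

/// Process-wide pending-byte offset that a test helper may override.
static PENDING_BYTE: AtomicU64 = AtomicU64::new(DEFAULT_PENDING_BYTE);

/// Hypothetical test helper: move the pending byte to an arbitrary offset
/// so the special page can be exercised with small database files.
fn set_pending_byte(offset: u64) {
    PENDING_BYTE.store(offset, Ordering::SeqCst);
}

/// 1-based number of the page that contains the pending byte.
fn pending_byte_page(page_size: u64) -> u64 {
    PENDING_BYTE.load(Ordering::SeqCst) / page_size + 1
}

/// Page number that follows `current`, skipping the pending-byte page,
/// which must never be read or written.
fn next_page_number(current: u64, page_size: u64) -> u64 {
    let next = current + 1;
    if next == pending_byte_page(page_size) {
        next + 1
    } else {
        next
    }
}
```

With 4096-byte pages the default offset puts the special page at number 262145; moving the offset to 64 KiB puts it at page 17, which is cheap to reach in a fuzz test.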
Closes #2749

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3431
2025-10-01 08:55:44 +03:00
Jussi Saurio
65abe3efdc Merge 'MVCC: Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables' from Jussi Saurio
**Handle table ID / rootpages properly for both checkpointed and
non-checkpointed tables**
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.
**We cannot use root page as the MVCC table ID directly because:**
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
**Hence:**
- MVCC table ids are always negative
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
- on checkpoint when the table is allocated a real root page, we update
the row in sqlite_schema and in MV store's internal mapping
**On recovery:**
- All sqlite_schema tables are read directly from disk and assigned
`table_id = -1 * root_page` -- root_page on disk must be positive
- Logical log is deserialized and inserted into MV store
- Schema changes from logical_log are captured into the DB's global
schema
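The scheme above can be sketched roughly as follows. This is an illustration under assumptions: the `TableIdMap` type and all method names are hypothetical, and only the sign convention and the `table_id = -1 * root_page` rule come from the description.

```rust
use std::collections::HashMap;

/// Sketch of a table ID newtype; valid IDs are always negative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct MvTableId(i64);

impl MvTableId {
    /// On recovery, a table read from disk gets `table_id = -1 * root_page`.
    /// A root page on disk must be positive, otherwise `None` is returned.
    fn from_root_page(root_page: i64) -> Option<Self> {
        if root_page > 0 {
            Some(MvTableId(-root_page))
        } else {
            None
        }
    }
}

/// Hypothetical mapping from table ID to the real root page, which is
/// unknown until the table has been checkpointed.
#[derive(Default)]
struct TableIdMap {
    root_pages: HashMap<MvTableId, Option<i64>>,
}

impl TableIdMap {
    /// Register a table created in MVCC that has no B-tree on the pager yet.
    fn register(&mut self, id: MvTableId) {
        self.root_pages.entry(id).or_insert(None);
    }

    /// On checkpoint the table is allocated a real root page.
    fn on_checkpoint(&mut self, id: MvTableId, root_page: i64) {
        self.root_pages.insert(id, Some(root_page));
    }

    /// Real root page, if the table has been checkpointed.
    fn root_page(&self, id: MvTableId) -> Option<i64> {
        self.root_pages.get(&id).copied().flatten()
    }
}
```

A negative `rootpage` value in `sqlite_schema` then reads as "not checkpointed yet", while a positive one is a real page number.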
**Note about recovery:**
I changed MVCC recovery to happen on DB initialization which should
prevent any races, so no need for `recover_lock`, right @pereman2 ?

Closes #3419
2025-10-01 08:55:10 +03:00
Pekka Enberg
a09fd83544 Add Mold linker setup to CONTRIBUTING.md 2025-10-01 07:49:31 +03:00
Pekka Enberg
16540724aa Beta 2025-10-01 07:18:25 +03:00
Preston Thorpe
6fd2ad2f5e Merge 'support multiple conflict clauses in upsert' from Nikita Sivukhin
This PR implements support for chains of `ON CONFLICT` clauses, e.g.
```
INSERT INTO ct(id, x, y) VALUES (4, 'x', 'y1'), (5, 'a1', 'b'), (3, '_', '_')
  ON CONFLICT(x) DO UPDATE SET x = excluded.x || '-' || x, y = excluded.y || '@' || y, z = 'x' 
  ON CONFLICT(y) DO UPDATE SET x = excluded.x || '+' || x, y = excluded.y || '!' || y, z = 'y' 
  ON CONFLICT DO UPDATE SET x = excluded.x || '#' || x, y = excluded.y || '%' || y, z = 'fallback';
```
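As a rough model of how such a chain resolves, assuming the SQLite semantics where clauses are checked in the order written and a final clause without a conflict target acts as a catch-all. The `ConflictClause` type and `pick_clause` function are hypothetical and limited to single-column targets:

```rust
/// One `ON CONFLICT ... DO UPDATE` clause. `target` is `None` for a clause
/// without a conflict target, which matches any uniqueness violation.
struct ConflictClause {
    target: Option<&'static str>,
    /// Value assigned to `z` in the example statement, kept here only to
    /// tell the clauses apart.
    z_value: &'static str,
}

/// Pick the first clause, in the order written, whose target matches the
/// column whose uniqueness constraint was violated.
fn pick_clause<'a>(
    clauses: &'a [ConflictClause],
    violated_column: &str,
) -> Option<&'a ConflictClause> {
    clauses.iter().find(|clause| match clause.target {
        Some(column) => column.eq_ignore_ascii_case(violated_column),
        None => true,
    })
}
```

For the statement above, a violation on `y` would select the second clause, while a violation on `id` would fall through to the last one.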

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3453
2025-09-30 19:50:59 -04:00
Nikita Sivukhin
7869ac348e rewrite MaybeLazy and add some test 2025-10-01 02:24:29 +04:00
Jussi Saurio
63f9913dbb Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN
Closes #2470

In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can
turn the LEFT JOIN into an INNER JOIN because NULL values will never be equal to
'foo'. In fact, we have this optimization already.

However, there was a dumb bug where `WhereTerm`s involving this join still retained
their `from_outer_join` state, resulting in forcing the evaluation of those terms
at the original join index, which results in completely wrong bytecode if the join
optimizer decides to reorder the join as `s JOIN t` instead. Effectively it will
evaluate `t.a=s.a` after table `s` is open but table `t` is not open yet.

This PR fixes that issue by clearing `from_outer_join` properly from the relevant
`WhereTerm`s.
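A minimal sketch of the fix, with a hypothetical, stripped-down `WhereTerm` (the real struct has more fields, and the field type here is an assumption):

```rust
/// Simplified stand-in for the planner's WHERE term.
#[derive(Debug, PartialEq)]
struct WhereTerm {
    expr: String,
    /// Assumed representation: index of the outer-joined table whose ON
    /// clause this term came from, or `None` for an ordinary term.
    from_outer_join: Option<usize>,
}

/// When the LEFT JOIN on `right_table` is turned into an INNER JOIN, its
/// terms stop being pinned to the original join position, so the join
/// optimizer is free to evaluate them wherever the reordered plan allows.
fn clear_outer_join_state(terms: &mut [WhereTerm], right_table: usize) {
    for term in terms.iter_mut() {
        if term.from_outer_join == Some(right_table) {
            term.from_outer_join = None;
        }
    }
}
```

Terms belonging to other outer joins are left untouched, since those joins are still LEFT JOINs.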
2025-10-01 00:33:22 +03:00
Jussi Saurio
d4d50b564a fix even more tests 2025-09-30 23:22:07 +03:00
Jussi Saurio
adc5b7b27f remove monkey print 2025-09-30 22:57:21 +03:00
Jussi Saurio
fe871188bf fix tests again 2025-09-30 22:54:48 +03:00
Jussi Saurio
fb2878973f fix sort order of write set 2025-09-30 22:54:36 +03:00
Jussi Saurio
509bde109e mvcc benchmark compilation fix 2025-09-30 22:27:28 +03:00
Jussi Saurio
fd84fd0683 fix test compilation errors 2025-09-30 22:27:28 +03:00
Jussi Saurio
e68c652f8f Add some table ID integrity checks to logical log recovery 2025-09-30 22:27:28 +03:00
Duy Dang
5ceab1b3f4 Circle detection for views 2025-10-01 02:12:21 +07:00
pedrocarlo
65cd4d998d page_size can be 0 when it is not initialized, so account for that 2025-09-30 15:58:38 -03:00
Pekka Enberg
229d96abf2 Merge 'core/vdbe: Don't clear parameters in Statement::reset()' from Pekka Enberg
As per SQLite API, sqlite3_reset() does *not* clear bind parameters.
Instead they're persistent across statement reset and only cleared with
sqlite3_clear_bindings().
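The intended behaviour can be sketched with a toy statement type. Everything here is hypothetical except the semantics being mirrored: reset rewinds execution but keeps bindings, and only an explicit clear sets them back to NULL.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
}

/// Toy prepared statement: bound parameters plus a program counter.
struct Statement {
    params: Vec<Value>,
    pc: usize,
}

impl Statement {
    fn new(param_count: usize) -> Self {
        Statement { params: vec![Value::Null; param_count], pc: 0 }
    }

    /// Bind a parameter; indexes are 1-based, as in the SQLite API.
    /// Panics if `index` is 0 or out of range.
    fn bind(&mut self, index: usize, value: Value) {
        self.params[index - 1] = value;
    }

    /// Stand-in for executing one step of the program.
    fn step(&mut self) {
        self.pc += 1;
    }

    /// Rewind execution state only; bound parameters persist.
    fn reset(&mut self) {
        self.pc = 0;
    }

    /// Explicitly set every parameter back to NULL.
    fn clear_bindings(&mut self) {
        for param in self.params.iter_mut() {
            *param = Value::Null;
        }
    }
}
```

This lets a caller bind once and re-run the same statement after each reset without rebinding.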

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #3466
2025-09-30 21:57:59 +03:00
Nikita Sivukhin
33c46f77ce add js test 2025-09-30 22:46:21 +04:00
Nikita Sivukhin
f4263bf472 fix clippy 2025-09-30 22:43:58 +04:00
Nikita Sivukhin
9ef05adc5e fix upsert conflict handling 2025-09-30 22:39:55 +04:00
pedrocarlo
d5e365def4 add test 2025-09-30 15:26:13 -03:00
Pekka Enberg
25ffd4f01e core/vdbe: Don't clear parameters in Statement::reset()
As per SQLite API, sqlite3_reset() does *not* clear bind parameters.
Instead they're persistent across statement reset and only cleared with
sqlite3_clear_bindings().
2025-09-30 20:22:09 +03:00
Pavan-Nambi
15b5cefa6f format shall not be used 2025-09-30 22:31:59 +05:30
pedrocarlo
aa5055e563 fuzz tests for pending_byte 2025-09-30 13:52:40 -03:00
Nikita Sivukhin
73f68dfcfb remove unnecessary log 2025-09-30 20:47:39 +04:00
Nikita Sivukhin
f6d829f52d simplify upsert codegen 2025-09-30 20:47:39 +04:00
Nikita Sivukhin
3590f9882d support multiple conflict clauses in upsert 2025-09-30 20:47:39 +04:00
Nikita Sivukhin
18e8c037e9 fix tests 2025-09-30 20:45:00 +04:00
pedrocarlo
3d5978c718 add special hipp pending page that is supposed to be ignored 2025-09-30 13:43:10 -03:00
pedrocarlo
ddfe56bbb9 fix substr handling with utf-8 and blobs 2025-09-30 13:38:32 -03:00
Pekka Enberg
0157ffec7f Merge 'stress: add option to choose how many tables to generate' from Pere Diaz Bou
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3460
2025-09-30 19:37:35 +03:00
Preston Thorpe
3456d61ac0 Merge 'Index search fixes' from Nikita Sivukhin
This PR bundles 2 fixes:
1. Index search must skip NULL values
2. UPDATE must avoid using an index whose column is used in the SET
clause
    * Using it was an optimization to avoid a full scan for `UPDATE t
SET ... WHERE col = ?`, but instead of this hack we must properly load
the updated row set into an ephemeral index and flush it after the
update finishes, rather than modifying the B-tree in place
    * So, for now we remove this optimization completely and quietly
wait for a proper one to land
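To illustrate the first fix only, here is a sketch that models an ascending index as a sorted slice in which NULL (`None`) sorts before every value, as it does in SQLite. The function name and the slice-based model are assumptions, not the actual cursor code:

```rust
/// Return the index entries that satisfy `key < bound`.
/// `index` must be sorted ascending, with NULLs (`None`) first.
fn seek_less_than(index: &[Option<i64>], bound: i64) -> &[Option<i64>] {
    // A comparison with NULL is never true, so the scan has to start
    // after the leading run of NULL keys rather than at the first entry.
    let first_non_null = index.partition_point(|key| key.is_none());
    // The scan ends at the first key that is not below the bound.
    let end = index.partition_point(|key| match key {
        None => true,
        Some(value) => *value < bound,
    });
    &index[first_non_null..end]
}
```

Starting the scan at position 0 instead would wrongly return the NULL rows for a predicate such as `col < 5`.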

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3459
2025-09-30 12:34:52 -04:00
Pekka Enberg
b511b23e70 Merge 'Make encryption opt in via flag' from Avinash Sajjanshetty
We had the encryption feature behind a compile-time flag, but it wasn't
enabled by default. This patch:
- enables the compile-time flag by default
- adds an opt-in runtime flag, `experimental-encryption`
- keeps the runtime flag disabled by default

Closes #3457
2025-09-30 19:31:28 +03:00
Nikita Sivukhin
c84486c411 clippy logged in as jussi - so I need to fix more stuff 2025-09-30 18:45:17 +04:00
Nikita Sivukhin
4772c0406e make connect() method optional and call it implicitly on first query execution
- mostly needed for Drizzle - because other clients with ESM can just use await connect(...) wrapper
2025-09-30 18:40:01 +04:00
pedrocarlo
642679889a simplify exec_trim code + only pattern match on whitespace char 2025-09-30 11:09:47 -03:00
Nikita Sivukhin
bf5567de35 fix clippy
- the proper fix is to nuke it actually :)
2025-09-30 18:06:42 +04:00
Jussi Saurio
64ce33bd5c Move resolution of tableid/rootpage inside MvCursor constructor 2025-09-30 17:04:37 +03:00
Nikita Sivukhin
4a9309fe31 fix clippy 2025-09-30 17:58:12 +04:00
Nikita Sivukhin
e5aa836ad5 add simple test 2025-09-30 17:57:25 +04:00
Nikita Sivukhin
f1597dea90 fix all combinations of iteration direction and index order to properly handle nulls 2025-09-30 17:57:03 +04:00
Jussi Saurio
7c897d382f Implement MvTableId newtype for better type safety of table ids 2025-09-30 16:54:22 +03:00
Jussi Saurio
0ba4c6c00e use negative table id in mvcc tests 2025-09-30 16:53:12 +03:00
Jussi Saurio
a52dbb7842 Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.

We cannot use root page as the MVCC table ID directly because:
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.

Hence, we:

- store the mapping between table id and btree rootpage
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
2025-09-30 16:53:12 +03:00
Jussi Saurio
a1bdad58b6 mvcc: add test to verify that reading both checkpointed and non-checkpointed tables works 2025-09-30 16:53:11 +03:00
Avinash Sajjanshetty
eb438237fe update documentation 2025-09-30 19:15:48 +05:30
Pere Diaz Bou
9993a83be4 stress: add option to choose how many tables to generate 2025-09-30 15:41:57 +02:00
Avinash Sajjanshetty
a360efa6e0 enable encryption feature flag by default 2025-09-30 19:04:25 +05:30
Nikita Sivukhin
c211fd1359 handle btree-table search properly
- a btree-table doesn't have nulls in its keys, so the seek operation does some conversions and we shouldn't emit SeekGT { Null } in this case
2025-09-30 17:05:39 +04:00