Commit Graph

1402 Commits

Author SHA1 Message Date
Jussi Saurio
d2f5e67b25 Merge 'Fix COLLATE' from Jussi Saurio
Fixes the following problems with COLLATE:
- Fix: incorrectly used e.g. `x COLLATE NOCASE = 'fOo'` as index
constraint on an index whose column was not case-insensitively collated
- Fix: various ephemeral indexes (in GROUP BY, ORDER BY, DISTINCT) and
subqueries did not retain proper collation information of columns
- Fix: collation of a given expression was not determined properly
according to SQLite's rules
Adds TCL tests and fuzz test
Closes #3476
Closes #1524
Closes #3305

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3538
2025-10-03 09:34:24 +03:00
Preston Thorpe
990740ab73 Merge 'core/vdbe: Don't clear cursors in ProgramState::reset()' from Pekka Enberg
We don't need to clear the cursors explicitly because OpenRead and
OpenWrite will anyway replace them.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3526
2025-10-02 17:32:05 -04:00
Jussi Saurio
58ea9e4c3c clippy 2025-10-02 21:49:33 +03:00
Jussi Saurio
8e2e557da4 Collate: fix Insn::Compare to use collation seq of each compared column 2025-10-02 21:49:33 +03:00
Pekka Enberg
16f1c1ac8b core/vdbe: Don't clear cursors in ProgramState::reset()
We don't need to clear the cursors explicitly because OpenRead and
OpenWrite will anyway replace them.
2025-10-02 14:36:46 +03:00
Jussi Saurio
0e3132d24b Dont try to destroy btree in mvcc mode 2025-10-02 14:12:12 +03:00
Pekka Enberg
4c5a7cda08 core/vdbe: Avoid cloning Arc<MvStore> on every VDBE step
The VDBE step() function was taking Arc<MvStore> by value, causing it to
be cloned on every single step of query execution. This resulted in
thousands of atomic reference count increments/decrements per query,
showing up as a major hotspot in profiling.

Changed step() and related functions to take Option<&Arc<MvStore>>
instead, passing a reference rather than cloning the Arc. This eliminates
the unnecessary atomic operations while maintaining the same semantics.
2025-10-02 12:28:11 +03:00
Jussi Saurio
f48165eb72 fix/vdbe: reset op_transaction state properly
essentially after the first runthrough of `op_transaction` per a
given `ProgramState`, we weren't resetting the instruction state
to `Start´ at all, which means we didn't do any transaction state
checking/updating.

PR includes a rust bindings regression test that used to panic before
this change, and I bet it also fixes this issue in turso-go:

https://github.com/tursodatabase/turso-go/issues/28
2025-10-02 08:40:41 +03:00
Jussi Saurio
78cccdd87a Merge 'Substr fix UTF-8' from Pedro Muniz
Fixes:
- `start_value` and `length_value` should be casted to integers
- proper handling of utf-8 characters
- do not need to cast blob to string, as substr in blobs refers to byte
indexes and not char-indexes

Closes #3465
2025-10-02 06:55:38 +03:00
Jussi Saurio
3bc6311bfd mvcc: dont use mv store for ephemeral tables 2025-10-01 10:16:02 +03:00
Jussi Saurio
3ff6b44de2 Merge 'Fix index bookkeeping in DROP COLUMN' from Jussi Saurio
Closes #3448. Nasty bug - see issue for details

Closes #3449
2025-10-01 08:57:08 +03:00
Jussi Saurio
fb7e3918b3 Merge 'simplify exec_trim code + only pattern match on whitespace char' from Pedro Muniz
Consolidates the `exec_trim`, `exec_rtrim`, `exec_ltrim` code and only
pattern matches on whitespace character.
Fixes #3319

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3437
2025-10-01 08:56:39 +03:00
Jussi Saurio
27b1c1a1db Merge 'Fix self-insert with nested subquery' from Mikaël Francoeur
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3436
2025-10-01 08:56:16 +03:00
Jussi Saurio
65abe3efdc Merge 'MVCC: Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables' from Jussi Saurio
**Handle table ID / rootpages properly for both checkpointed and non-
checkpointed tables**
Table ID is an opaque identifier that is only meaningful to the MV
store.
Each checkpointed MVCC table corresponds to a single B-tree on the
pager,
which naturally has a root page.
**We cannot use root page as the MVCC table ID directly because:**
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
**Hence:**
- MVCC table ids are always negative
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
- on checkpoint when the table is allocated a real root page, we update
the row in sqlite_schema and in MV store's internal mapping
**On recovery:**
- All sqlite_schema tables are read directly from disk and assigned
`table_id = -1 * root_page` -- root_page on disk must be positive
- Logical log is deserialized and inserted into MV store
- Schema changes from logical_log are captured into the DB's global
schema
**Note about recovery:**
I changed MVCC recovery to happen on DB initialization which should
prevent any races, so no need for `recover_lock`, right @pereman2 ?

Closes #3419
2025-10-01 08:55:10 +03:00
pedrocarlo
d5e365def4 add test 2025-09-30 15:26:13 -03:00
Pekka Enberg
25ffd4f01e core/vdbe: Don't clear parameters in Statement::reset()
As per SQLite API, sqlite3_reset() does *not* clear bind parameters.
Instead they're persistent across statement reset and only cleared with
sqlite3_clear_bindings().
2025-09-30 20:22:09 +03:00
pedrocarlo
ddfe56bbb9 fix substr handling with utf-8 and blobs 2025-09-30 13:38:32 -03:00
pedrocarlo
642679889a simplify exec_trim code + only pattern match on whitespace char 2025-09-30 11:09:47 -03:00
Jussi Saurio
64ce33bd5c Move resolution of tableid/rootpage inside MvCursor constructor 2025-09-30 17:04:37 +03:00
Jussi Saurio
a52dbb7842 Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.

We cannot use root page as the MVCC table ID directly because:
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.

Hence, we:

- store the mapping between table id and btree rootpage
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
2025-09-30 16:53:12 +03:00
Jussi Saurio
35b584f050 Merge 'core: change root_page to i64' from Pere Diaz Bou
Closes #3454
2025-09-30 12:50:23 +03:00
Pekka Enberg
1b991156f3 Merge 'core/vdbe: Fix BEGIN after BEGIN CONCURRENT check' from Pekka Enberg
We're supposed to error out only when "is_begin" is true.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3450
2025-09-30 11:34:40 +03:00
Pekka Enberg
9b83fe7abf core/vdbe: Fix BEGIN after BEGIN CONCURRENT check
We're supposed to error out only when "is_begin" is true.
2025-09-30 10:55:13 +03:00
Jussi Saurio
9681377c51 Merge 'sum() can throw integer overflow' from Duy Dang
close #3311

Closes #3416
2025-09-30 10:38:37 +03:00
Jussi Saurio
594bdce999 Merge 'sum should identify if there is num in strings/prefix of strings' from Pavan Nambi
closes #3285
maybe adding seperate func for it is stupid but i kept running into
issues with not closing some random `}` and it got annoying real quick
so i just moved tht into its own func. - and as i am using same logic in
3 places i think it's ok.

Closes #3412
2025-09-30 10:37:17 +03:00
Jussi Saurio
dc1861d806 Assert we have the only strong reference instead of falling back to COW 2025-09-30 10:12:20 +03:00
Jussi Saurio
6bff9e53e5 Fix index bookkeeping in DROP COLUMN
See #3448 which this issue closes.
2025-09-30 10:00:16 +03:00
Diego Reis
c9421e034d fix(3306): substr scalar should also work with non-text values 2025-09-29 21:42:21 -03:00
Mikaël Francoeur
dc231abb2e fix self-insert bug 2025-09-29 17:18:19 -04:00
Pere Diaz Bou
0f631101df core: change page idx type from usize to i64
MVCC is like the annoying younger cousin (I know because I was him) that
needs to be treated differently. MVCC requires us to use root_pages that
might not be allocated yet, and the plan is to use negative root_pages
for that case. Therefore, we need i64 in order to fit this change.
2025-09-29 18:38:43 +02:00
Preston Thorpe
da599a1fb8 Merge 'quoting fix' from Nikita Sivukhin
This PR moves part of string normalization to the parser layer.
Now, we dequote and unescape values in the parser, but we still need to
lowercase them for proper ignore-case comparison logic in the planner.
The reason to not lowercase in the parser is following:
1. SQLite (and tursodb) have ident->string conversion rule and by
lowercasing value early we will loose original representation
2. Some things like column names are preserve the case right now and we
better to not change this behaviour.

Closes #3344
2025-09-29 12:33:42 -04:00
Pekka Enberg
1f86400cec Merge 'core/vdbe: Wrap Program::n_change with AtomicI64' from Pekka Enberg
Closes #3424
2025-09-29 18:11:18 +03:00
Pekka Enberg
5f9287304b core/vdbe: Wrap Program::n_change with AtomicI64 2025-09-29 17:09:33 +03:00
Nikita Sivukhin
a142c59de4 use explicit null if it set instead of column default value 2025-09-29 16:28:09 +04:00
Nikita Sivukhin
86a95e813d Merge branch 'main' into quoting-fix-attempt-2 2025-09-29 10:58:51 +04:00
Duy Dang
8562737d77 sum() can throw integer overflow 2025-09-29 00:27:59 +07:00
Pavan-Nambi
074a363c30 sum should identify if there is num in strings/prefix of strings 2025-09-28 17:23:55 +05:30
Pekka Enberg
3d3e39a958 Merge 'Make Sorter Send and Sync' from Pekka Enberg
Closes #3398
2025-09-27 16:51:27 +03:00
Pekka Enberg
5ff0044961 Merge 'length shall not count when it sees nullc' from Pavan Nambi
fixes #3317

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3356
2025-09-27 16:50:50 +03:00
Pekka Enberg
7b6fc0f3b6 core/vdbe: Wrap SortedChunk::total_bytes_read with AtomicUsize 2025-09-27 14:35:31 +03:00
Pekka Enberg
61b3f56997 core/vdbe: Wrap SortedChunk::io_state with RwLock 2025-09-27 14:28:55 +03:00
Pekka Enberg
5f39987ec0 core/vdbe: Wrap SortedChunk::buffer_len with AtomicUsize 2025-09-27 14:23:02 +03:00
Pekka Enberg
b31818f77c core/vdbe: Wrap SortedChunk::buffer with RwLock 2025-09-27 14:23:02 +03:00
Jussi Saurio
283fba2e0d use normalized table name 2025-09-27 09:53:11 +03:00
Jussi Saurio
67d320960d ALTER TABLE: prevent dropping/renaming column referenced in VIEW 2025-09-27 09:45:15 +03:00
Jussi Saurio
3137357092 ALTER TABLE: prevent dropping indexed column in VDBE layer 2025-09-27 09:45:15 +03:00
Preston Thorpe
e43184fb98 Merge 'translate: refactor arguments and centralize parameter context' from Preston Thorpe
This PR contains _no semantic changes_.
I made this cleanup on another branch when I was working on subqueries,
as the inconsistency with passing around `Schema` and `SymbolTable` had
been kinda bothering me. By adding `param_ctx` to the program, it
prevents accidentally resetting it to zero.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3382
2025-09-26 13:05:29 -04:00
PThorpe92
5fcc187434 translate: refactor arguments and centralize parameter context 2025-09-26 12:06:44 -04:00
Nikita Sivukhin
52f3216211 fix avg aggregation
- ignore NULL rows as SQLite do
- emit NULL instead of NaN when no rows were aggregated
- adjust agg column alias name
2025-09-26 17:11:06 +04:00
Pavan-Nambi
fdabbed539 length shall not count when it sees nullc 2025-09-26 15:07:33 +05:30