Commit Graph

1688 Commits

Author SHA1 Message Date
Jussi Saurio
d2f5e67b25 Merge 'Fix COLLATE' from Jussi Saurio
Fixes the following problems with COLLATE:
- Fix: incorrectly used e.g. `x COLLATE NOCASE = 'fOo'` as index
constraint on an index whose column was not case-insensitively collated
- Fix: various ephemeral indexes (in GROUP BY, ORDER BY, DISTINCT) and
subqueries did not retain proper collation information of columns
- Fix: collation of a given expression was not determined properly
according to SQLite's rules
Adds TCL tests and fuzz test
Closes #3476
Closes #1524
Closes #3305

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3538
2025-10-03 09:34:24 +03:00
Jussi Saurio
58ea9e4c3c clippy 2025-10-02 21:49:33 +03:00
Jussi Saurio
8e2e557da4 Collate: fix Insn::Compare to use collation seq of each compared column 2025-10-02 21:49:33 +03:00
Jussi Saurio
edd4651b97 Collate: add proper collation info for GROUP BY sorter columns 2025-10-02 21:49:33 +03:00
Jussi Saurio
f02757fe11 Collate: add proper collation to FROM-clause subquery result cols 2025-10-02 21:49:33 +03:00
Jussi Saurio
edfe0cb4fe Collate: prevent using an index if collation sequences don't match 2025-10-02 21:49:33 +03:00
Jussi Saurio
d42f3c7cbb Collate: compute collations properly for ORDER BY 2025-10-02 21:49:33 +03:00
Jussi Saurio
5a5f49933d Collate: add proper collation info to DISTINCT indexes 2025-10-02 21:49:33 +03:00
Jussi Saurio
f4ee0457b2 Collate: add proper collation info to compound select deduplication indexes 2025-10-02 21:49:33 +03:00
Jussi Saurio
e1fcd7b5e9 Collate: add get_collseq_from_expr()
Determines collation sequence to use for a given Expr
based on SQLite collation rules.
2025-10-02 21:49:33 +03:00
PThorpe92
43aba0ee95 Fix integer affinity for rowid expr type 2025-10-02 14:29:53 -04:00
Pekka Enberg
dc1463c70d Merge 'Improve error handling for cyclic views' from Duy Dang
The cycle is detected by marking a seen view, if a seen view is process
again, that's a cycle and we throw an error.
Close #3404

Closes #3467
2025-10-02 19:33:12 +03:00
Jussi Saurio
fa6ee6b850 Merge 'Fix: JOIN USING should pick columns from left table, not right' from Jussi Saurio
Closes #3468
Closes #3479

Closes #3485
2025-10-02 10:16:38 +03:00
Jussi Saurio
e65eae764c Merge 'Resolve appropriate column name for rowid alias/PK' from Preston Thorpe
closes https://github.com/tursodatabase/turso/issues/3512

Closes #3513
2025-10-02 06:59:18 +03:00
Jussi Saurio
30e6524c4e Fix: JOIN USING should pick columns from left table, not right
Closes #3468
Closes #3479
2025-10-02 06:56:52 +03:00
Jussi Saurio
c0da38e24a Merge 'Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN' from Jussi Saurio
Closes #3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.

Closes #3475
2025-10-02 06:56:07 +03:00
PThorpe92
efac598232 Resolve appropriate column name for rowid alias/PK 2025-10-01 21:49:42 -04:00
Mikaël Francoeur
6307774201 reject FROM clauses 2025-10-01 14:20:23 -04:00
Jussi Saurio
b2f9854b1c Add more documentation for WhereTerm::from_outer_join 2025-10-01 13:42:36 +03:00
Jussi Saurio
3ff6b44de2 Merge 'Fix index bookkeeping in DROP COLUMN' from Jussi Saurio
Closes #3448. Nasty bug - see issue for details

Closes #3449
2025-10-01 08:57:08 +03:00
Jussi Saurio
27b1c1a1db Merge 'Fix self-insert with nested subquery' from Mikaël Francoeur
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3436
2025-10-01 08:56:16 +03:00
Jussi Saurio
65abe3efdc Merge 'MVCC: Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables' from Jussi Saurio
**Handle table ID / rootpages properly for both checkpointed and non-
checkpointed tables**
Table ID is an opaque identifier that is only meaningful to the MV
store.
Each checkpointed MVCC table corresponds to a single B-tree on the
pager,
which naturally has a root page.
**We cannot use root page as the MVCC table ID directly because:**
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.
**Hence:**
- MVCC table ids are always negative
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
- on checkpoint when the table is allocated a real root page, we update
the row in sqlite_schema and in MV store's internal mapping
**On recovery:**
- All sqlite_schema tables are read directly from disk and assigned
`table_id = -1 * root_page` -- root_page on disk must be positive
- Logical log is deserialized and inserted into MV store
- Schema changes from logical_log are captured into the DB's global
schema
**Note about recovery:**
I changed MVCC recovery to happen on DB initialization which should
prevent any races, so no need for `recover_lock`, right @pereman2 ?

Closes #3419
2025-10-01 08:55:10 +03:00
Jussi Saurio
63f9913dbb Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN
Closes #2470

In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a = 'foo'` we can
remove the LEFT JOIN because NULL values will be equal to 'foo'. In fact, we have
this optimization already.

However, there was a dumb bug where `WhereTerm`s involving this join still retained
their `from_outer_join` state, resulting in forcing the evaluation of those terms
at the original join index, which results in completely wrong bytecode if the join
optimizer decides to reorder the join as `s JOIN t` instead. Effectively it will
evaluate `t.a=s.a` after table `s` is open but table `t` is not open yet.

This PR fixes that issue by clearing `from_outer_join` properly from the relevant
`WhereTerm`s.
2025-10-01 00:33:22 +03:00
Duy Dang
5ceab1b3f4 Circle detection for views 2025-10-01 02:12:21 +07:00
Nikita Sivukhin
f4263bf472 fix clippy 2025-09-30 22:43:58 +04:00
Nikita Sivukhin
9ef05adc5e fix upsert conflict handling 2025-09-30 22:39:55 +04:00
Nikita Sivukhin
73f68dfcfb remove unnecessary log 2025-09-30 20:47:39 +04:00
Nikita Sivukhin
f6d829f52d simplify upsert codegen 2025-09-30 20:47:39 +04:00
Nikita Sivukhin
3590f9882d support multiple conflict clauses in upsert 2025-09-30 20:47:39 +04:00
Preston Thorpe
3456d61ac0 Merge 'Index search fixes' from Nikita Sivukhin
This PR bundles 2 fixes:
1. Index search must skip NULL values
2. UPDATE must avoid using index which column is used in the SET clause
    * This was an optimization to not do full scan in case of `UPDATE t
SET ... WHERE col = ?` but instead of doing this hacks we must properly
load updated row set to the ephemeral index and flush it after update
will be finished instead of modifying BTree inplace
    * So, for now we completely remove this optimization and quitely
wait for proper optimization to land

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3459
2025-09-30 12:34:52 -04:00
Nikita Sivukhin
c84486c411 clippy logged in as jussi - so I need to fix more stuff 2025-09-30 18:45:17 +04:00
Nikita Sivukhin
bf5567de35 fix clippy
- the proper fix is to nuke it actually :)
2025-09-30 18:06:42 +04:00
Nikita Sivukhin
4a9309fe31 fix clippy 2025-09-30 17:58:12 +04:00
Nikita Sivukhin
f1597dea90 fix all combinations of iteration direction and index order to properly handle nulls 2025-09-30 17:57:03 +04:00
Jussi Saurio
a52dbb7842 Handle table ID / rootpages properly for both checkpointed and non-checkpointed tables
Table ID is an opaque identifier that is only meaningful to the MV store.
Each checkpointed MVCC table corresponds to a single B-tree on the pager,
which naturally has a root page.

We cannot use root page as the MVCC table ID directly because:
- We assign table IDs during MVCC commit, but
- we commit pages to the pager only during checkpoint
which means the root page is not easily knowable ahead of time.

Hence, we:

- store the mapping between table id and btree rootpage
- sqlite_schema rows will have a negative rootpage column if the
  table has not been checkpointed yet.
2025-09-30 16:53:12 +03:00
Nikita Sivukhin
c211fd1359 handle btree-table search properly
- btree-table doesn't have nulls in keys - so seek operation do some conversions and we shouldn't emit SeekGT { Null } in this case
2025-09-30 17:05:39 +04:00
Jussi Saurio
81e7c26f55 Merge 'Anonymous params fix' from Nikita Sivukhin
This PR auto-assign ids for anonymous variables straight into parser.
Otherwise - it's pretty easy to mess up with traversal order in the core
code and assign ids incorrectly.
For example, before the fix, following code worked incorrectly because
parameter values were assigned first to conflict clause instead of
values:
```rs
let mut stmt = conn.prepare("INSERT INTO test VALUES (?, ?), (?, ?) ON CONFLICT DO UPDATE SET v = ?")?;
stmt.bind_at(1.try_into()?, Value::Integer(1));
stmt.bind_at(2.try_into()?, Value::Integer(20));
stmt.bind_at(3.try_into()?, Value::Integer(3));
stmt.bind_at(4.try_into()?, Value::Integer(40));
stmt.bind_at(5.try_into()?, Value::Integer(66));
```

Closes #3455
2025-09-30 15:48:35 +03:00
Nikita Sivukhin
a32ed53bd8 remove optimization
- even if index search will return only 1 row - it will call next in the loop - and we incorrecty can process same row values multiple times
- the following query failed with this optimization:

turso> CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT, c0 INT);
turso> CREATE UNIQUE INDEX idx_p1_0 ON t(c0);
turso> insert into t values (null, 'uu', -1);
turso> insert into t values (null, 'uu', -2);
turso> UPDATE t SET c0 = NULL WHERE c0 = -1;
turso> SELECT * FROM t
┌────┬────┬────┐
│ id │ k  │ c0 │
├────┼────┼────┤
│  1 │ uu │    │
├────┼────┼────┤
│  2 │ uu │    │
└────┴────┴────┘
2025-09-30 16:37:41 +04:00
Nikita Sivukhin
e9b8b0265d skip NULL in case of search over index 2025-09-30 16:16:04 +04:00
Nikita Sivukhin
e111226f3b add comment 2025-09-30 15:28:50 +04:00
Nikita Sivukhin
ab92102cd8 remove parameter id assign logic from core 2025-09-30 13:58:59 +04:00
Jussi Saurio
35b584f050 Merge 'core: change root_page to i64' from Pere Diaz Bou
Closes #3454
2025-09-30 12:50:23 +03:00
Jussi Saurio
e6a2e2a9cf Merge 'Remove double-quoted identifier assert' from Diego Reis
Closes #3301
Not every identifier should be double-quoted

Closes #3440
2025-09-30 10:34:49 +03:00
Jussi Saurio
6bff9e53e5 Fix index bookkeeping in DROP COLUMN
See #3448 which this issue closes.
2025-09-30 10:00:16 +03:00
Pekka Enberg
9a08fb9e43 core/translate: Remove useless comment from logical.rs 2025-09-30 07:35:12 +03:00
Pekka Enberg
6053bb6556 Merge 'Fix materialized views with complex expressions' from Glauber Costa
SQLite supports complex expressions in group by columns - because of
course it does...
So we need to make sure that a column is created for this expression if
it doesn't exist already, and compute it, the same way we compute pre-
projections in the filter operator.
Fixes #3363
Fixes #3366
Fixes #3365

Closes #3429
2025-09-30 07:34:51 +03:00
Diego Reis
90f4d69774 fix(3301): Remove identifier assert assumption
Not every identifier should be double-quoted
2025-09-29 22:33:21 -03:00
Mikaël Francoeur
dc231abb2e fix self-insert bug 2025-09-29 17:18:19 -04:00
Pekka Enberg
75f088740d Merge 'core: Disallow CREATE INDEX when MVCC is enabled' from Pekka Enberg
MVCC does currently not support indexes. Therefore,
- Fail if a database with indexes is opened with MVCC
- Disallow `CREATE INDEX` when MVCC is enabled
Fixes: #3108

Closes #3426
2025-09-29 20:53:16 +03:00
Glauber Costa
2fde976605 Fix materialized views with complex expressions
SQLite supports complex expressions in group by columns - because of
course it does...

So we need to make sure that a column is created for this expression if
it doesn't exist already, and compute it, the same way we compute
pre-projections in the filter operator.

Fixes #3363
Fixes #3366
Fixes #3365
2025-09-29 11:56:21 -05:00