Commit Graph

175 Commits

Author SHA1 Message Date
Jussi Saurio
0173d31c04 clippy: collapse nested if 2025-10-14 15:51:31 +03:00
Jussi Saurio
4b80678898 Allow case where cursor for btree is already opened
When populating an ephemeral table for UPDATE, it may open a cursor
on the (permanent) table - in this case we don't need to open it
again in the UPDATE loop
2025-10-14 15:32:48 +03:00
Jussi Saurio
f5ee4807da Properly differentiate between source and target in UPDATE
- Encode information about ephemeral source table in OperationMode::UPDATE
  if present
- Use OperationMode information to correctly resolve cursors in UPDATE
2025-10-14 14:17:28 +03:00
Jussi Saurio
691dce6b8a Make decision about UpdatePlan::ephemeral_plan _after_ optimizer
An ephemeral table is required if the b-tree key of the table (rowid)
or the index (index key) is affected by the UPDATE.
2025-10-14 14:17:28 +03:00
Jussi Saurio
c2fe13ad4f Update documentation of UpdatePlan::ephemeral_plan
It now better reflects when it is used.
2025-10-14 12:18:53 +03:00
Jussi Saurio
3669437482 Add vibecoded tests for ColumnUsedMask 2025-10-13 14:03:34 +03:00
Jussi Saurio
e055ed9a8d Allow arbitrarily many columns in a table
Use roaring bitmaps because ColumnUsedMask is likely to be
sparsely populated.
2025-10-13 13:30:26 +03:00
Jussi Saurio
59a1c2ae2e Disallow joining more than 63 tables
Returns an error instead of panicing
2025-10-13 13:30:03 +03:00
Nikita Sivukhin
4313f57ecb Optimize range scans 2025-10-09 11:47:41 +03:00
Jussi Saurio
f02757fe11 Collate: add proper collation to FROM-clause subquery result cols 2025-10-02 21:49:33 +03:00
Jussi Saurio
c0da38e24a Merge 'Clear WhereTerm 'from_outer_join' state when LEFT JOIN is optimized to INNER JOIN' from Jussi Saurio
Closes #3470
## Background
In a query like `SELECT * FROM t LEFT JOIN s ON t.a=s.a WHERE s.a =
'foo'` we can remove the LEFT JOIN and replace it with an `INNER JOIN`
because NULL values will never be equal to 'foo'. Rewriting as `INNER
JOIN` allows the optimizer to also reorder the table join order to come
up with a more efficient query plan. In fact, we have this optimization
already.
## Problem
However, there is a dumb bug where `WhereTerm`s involving this join
still retain their `from_outer_join` state, resulting in forcing the
evaluation of those terms at the original join index, which results in
completely wrong bytecode if the join optimizer decides to reorder the
join as `s JOIN t` instead. Effectively it will evaluate `t.a=s.a` after
table `s` is open but table `t` is not open yet.
## Fix
This PR fixes that issue by clearing `from_outer_join` properly from the
relevant `WhereTerm`s.

Closes #3475
2025-10-02 06:56:07 +03:00
Jussi Saurio
b2f9854b1c Add more documentation for WhereTerm::from_outer_join 2025-10-01 13:42:36 +03:00
Jussi Saurio
27b1c1a1db Merge 'Fix self-insert with nested subquery' from Mikaël Francoeur
There were 2 problems:
1. The SELECT wasn't propagating which register it used for its results,
so sometimes the INSERT read bad data.
2. `TableReferences::contains_table` was only checking the top-level
tables, not the nested tables in FROM queries. This condition is used to
emit "template 4", the bytecode template for self-inserts.
Closes https://github.com/tursodatabase/turso/issues/3312

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3436
2025-10-01 08:56:16 +03:00
Nikita Sivukhin
a32ed53bd8 remove optimization
- even if index search will return only 1 row - it will call next in the loop - and we incorrecty can process same row values multiple times
- the following query failed with this optimization:

turso> CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT, c0 INT);
turso> CREATE UNIQUE INDEX idx_p1_0 ON t(c0);
turso> insert into t values (null, 'uu', -1);
turso> insert into t values (null, 'uu', -2);
turso> UPDATE t SET c0 = NULL WHERE c0 = -1;
turso> SELECT * FROM t
┌────┬────┬────┐
│ id │ k  │ c0 │
├────┼────┼────┤
│  1 │ uu │    │
├────┼────┼────┤
│  2 │ uu │    │
└────┴────┴────┘
2025-09-30 16:37:41 +04:00
Mikaël Francoeur
dc231abb2e fix self-insert bug 2025-09-29 17:18:19 -04:00
Nikita Sivukhin
c4b3074575 format 2025-09-26 13:01:49 +04:00
Nikita Sivukhin
fdf8ca88fd introduce exact(...) function - because enum variant will disappear 2025-09-26 13:01:49 +04:00
PThorpe92
6dc7d04c5a Replace translate_epxr with translate_condition_expr and fix constraint error 2025-09-20 15:02:06 -04:00
TcMits
88119888d0 reduce allocation needed for break_predicate_at_and_boundaries 2025-09-18 10:52:29 +07:00
Piotr Rzysko
f5efcbe745 Add support for window functions
Adds initial support for window functions. For now, only existing
aggregate functions can be used as window functions—no specialized
window-specific functions are supported yet.

Currently, only the default frame definition is implemented:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW EXCLUDE NO OTHERS.
2025-09-13 11:12:44 +02:00
Piotr Rzysko
c81cd16230 Extract QueryDestination::placeholder_for_subquery 2025-09-13 10:49:14 +02:00
Piotr Rzysko
5f2a3e1242 Handle dummy argument for count() and count(*) in translation
Two main reasons for this change:
* Improve readability by moving the logic for this special case closer
  to the code that relies on it.
* Decouple AggFunc from the Aggregate struct. In the future, window
  function processing will use AggFunc directly, without necessarily
  depending on Aggregate.
2025-09-13 10:49:14 +02:00
Pekka Enberg
f88f39082a core/vdbe: Fix MakeRecord affinity handling
The MakeRecord instruction now accepts an optional affinity_str
parameter that applies column-specific type conversions before creating
records. When provided, the affinity string is applied
character-by-character to each register using the existing
apply_affinity_char() function, matching SQLite's behavior.

Fixes #2040
Fixes #2041
2025-09-08 18:49:13 +03:00
Glauber Costa
08b2e685d5 Persistence for DBSP-based materialized views
This fairly long commit implements persistence for materialized view.
It is hard to split because of all the interdependencies between components,
so it is a one big thing. This commit message will at least try to go into
details about the basic architecture.

Materialized Views as tables
============================

Materialized views are now a normal table - whereas before they were a virtual
table.  By making a materialized view a table, we can reuse all the
infrastructure for dealing with tables (cursors, etc).

One of the advantages of doing this is that we can create indexes on view
columns.  Later, we should also be able to write those views to separate files
with ATTACH write.

Materialized Views as Zsets
===========================

The contents of the table are a ZSet: rowid, values, weight. Readers will
notice that because of this, the usage of the ZSet data structure dwindles
throughout the codebase. The main difference between our materialized ZSet and
the standard DBSP ZSet, is that obviously ours is backed by a BTree, not a Hash
(since SQLite tables are BTrees)

Aggregator State
================

In DBSP, the aggregator nodes also have state. To store that state, there is a
second table.  The table holds all aggregators in the view, and there is one
table per view. That is __turso_internal_dbsp_state_{view_name}. The format of
that table is similar to a ZSet: rowid, serialized_values, weight. We serialize
the values because there will be many aggregators in the table. We can't rely
on a particular format for the values.

The Materialized View Cursor
============================

Reading from a Materialized View essentially means reading from the persisted
ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have
a carefully crafted implementation of seek that takes care of merging weights
and stitching the two sets together.
2025-09-05 07:04:33 -05:00
Pekka Enberg
44357f93a2 Merge branch 'main' into 2025-08-21-make-limit-and-offset-expr 2025-09-04 09:54:45 +03:00
Piotr Rzysko
3ad4016080 Fix handling of zero-argument grouped aggregations
This commit consolidates the creation of the Aggregate struct, which was
previously handled differently in `prepare_one_select_plan` and
`resolve_aggregates`. That discrepancy caused inconsistent handling of
zero-argument aggregates.

The queries added in the new tests would previously trigger a panic.
2025-08-31 12:02:09 +02:00
bit-aloo
a16bee4574 move to new parser 2025-08-26 19:56:24 +05:30
bit-aloo
28439efd09 make offset and limit Expr 2025-08-26 19:56:11 +05:30
Pekka Enberg
26ba09c45f Revert "Merge 'Remove double indirection in the Parser' from Pedro Muniz"
This reverts commit 71c1b357e4, reversing
changes made to 6bc568ff69 because it
actually makes things slower.
2025-08-26 14:58:21 +03:00
pedrocarlo
d3240844ec refactor Core to remove the double indirection 2025-08-25 22:59:31 -03:00
Levy A.
4ba1304fb9 complete parser integration 2025-08-21 15:23:59 -03:00
Levy A.
186e2f5d8e switch to new parser 2025-08-21 15:19:16 -03:00
Jussi Saurio
5da76c9125 Allow index in UPDATE for point queries (i.e. max 1 row affected) 2025-08-14 15:58:01 +03:00
Nikita Sivukhin
5d0ada9fb9 add "updates" column for cdc table 2025-08-11 12:46:15 +04:00
Jussi Saurio
c498196c7b fix/perf: fix regression in SELECT 1 benchmark
Do not start a read transaction when a SELECT is not going to access
the database, which means we can avoid checking whether the schema has
changed.
2025-08-05 15:10:55 +03:00
Piotr Rzysko
8fb4fbf8af Make WhereTerm::consumed a plain bool
Now that virtual tables are integrated into the optimizer, this field no
longer needs to be wrapped in Cell<bool>.
2025-08-05 05:48:28 +02:00
Piotr Rzysko
82491ceb6a Integrate virtual tables with optimizer
This change connects virtual tables with the query optimizer.
The optimizer now considers virtual tables during join order search
and invokes their best_index callbacks to determine feasible access
paths.

Currently, this is not a visible change, since none of the existing
extensions return information indicating that a plan is invalid.
2025-08-05 05:48:28 +02:00
Piotr Rzysko
718598eab8 Introduce scan type
Different scan parameters are required for different table types.
Currently, index and iteration direction are only used by B-tree tables,
while the remaining table types don’t require any parameters. Planning
access to virtual tables, however, will require passing additional
information from the planner, such as the virtual table index (distinct
from a B-tree index) and the constraints that must be forwarded to the
`filter` method.
2025-08-04 20:27:22 +02:00
Jussi Saurio
574c15b5e4 perf: fix logic error in is_simple_count() 2025-07-29 09:11:54 +03:00
Pere Diaz Bou
752a876f9a change every Rc to Arc in schema internals 2025-07-28 10:51:17 +02:00
Pekka Enberg
669b231714 Merge 'parser: Distinguish quoted identifiers and unify Id into Name enum' from bit-aloo
Closes: #1947
This PR replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).
In the process, the separate `Id` wrapper type has been coalesced into
the `Name` enum, simplifying the AST and reducing duplication in
identifier handling logic.
While this increases the size of some AST nodes (notably
`yyStackEntry`).
cc: @levydsa

Reviewed-by: Levy A. (@levydsa)
Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2251
2025-07-25 12:08:54 +03:00
Glauber Costa
988b16f962 Support ATTACH (read only)
Support for attaching databases. The main difference from SQLite is that
we support an arbitrary number of attached databases, and we are not
bound to just 100ish.

We for now only support read-only databases. We open them as read-only,
but also, to keep things simple, we don't patch any of the insert
machinery to resolve foreign tables.  So if an insert is tried on an
attached database, it will just fail with a "no such table" error - this
is perfect for now.

The code in core/translate/attach.rs is written by Claude, who also
played a key part in the boilerplate for stuff like the .databases
command and extending the pragma database_list, and also aided me in
the test cases.
2025-07-24 19:19:48 -05:00
bit-aloo
9a54ef214e parser: Distinguish quoted identifiers and unify Id into Name enum
This commit replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).

In the process, the separate `Id` wrapper type has been coalesced into the
`Name` enum, simplifying the AST and reducing duplication in identifier
handling logic.

While this increases the size of some AST nodes (notably `yyStackEntry`),
it improves correctness and makes source structure more explicit for
later phases.
2025-07-24 14:40:19 +05:30
Piotr Rzysko
000d70f1f3 Propagate info about hidden columns 2025-07-14 07:16:53 +02:00
Nils Koch
828d4f5016 fix clippy errors for rust 1.88.0 (auto fix) 2025-07-12 18:58:41 +03:00
meteorgan
4a516ab414 Support except operator for compound select 2025-07-08 22:57:20 +08:00
Levy A.
ffd6844b5b refactor: remove PseudoTable from Table
the only reason for `PseudoTable` to exist, is to provide column
information for `PseudoCursor` creation. this should not be part of the
schema.
2025-06-30 14:31:58 -03:00
Pekka Enberg
725c3e4ddc Rename limbo_sqlite3_parser crate to turso_sqlite3_parser 2025-06-29 12:34:46 +03:00
Pekka Enberg
eb0de4066b Rename limbo_ext crate to turso_ext 2025-06-29 12:14:08 +03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00