Commit Graph

1297 Commits

Author SHA1 Message Date
Jussi Saurio
eb7fa9693d Merge 'Return error on attempting to drop index associated with PRIMARY KEY and UNIQUE constraints' from
Closes issue #2455. Also includes tests.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2461
2025-08-07 09:00:02 +03:00
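The check described in the merge above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Index` struct, its `backs_constraint` flag, and `check_drop_index` are hypothetical names, and the error text is modeled on SQLite's wording and may differ in turso.

```rust
/// Hypothetical, simplified view of an index entry in the schema.
pub struct Index {
    pub name: String,
    /// Assumption: true when the index was created implicitly to back a
    /// PRIMARY KEY or UNIQUE constraint (an "autoindex").
    pub backs_constraint: bool,
}

/// Returns Ok(()) if DROP INDEX may proceed, or an error message otherwise.
pub fn check_drop_index(index: &Index) -> Result<(), String> {
    if index.backs_constraint {
        // Wording modeled on SQLite's error; the real message may differ.
        return Err(
            "index associated with UNIQUE or PRIMARY KEY constraint cannot be dropped"
                .to_string(),
        );
    }
    Ok(())
}
```

An explicitly created index such as `t_xy` passes the check, while an index that backs a constraint is rejected before any schema change happens.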
Glauber Costa
071330a739 implement the JournalMode vdbe instruction
We do this already, but not through any opcode.
Move it to an opcode for compatibility reasons.
2025-08-06 19:30:19 -05:00
Glauber Costa
f36974f086 implement the MaxPgCount opcode
It is used by the pragma max_page_count, which is also implemented.
2025-08-06 13:20:15 -05:00
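As a rough sketch of what `pragma max_page_count` is expected to do, the rule below follows SQLite's documented behavior: the limit cannot be lowered below the current database size, and the pragma always returns the resulting limit. The function name and parameters are hypothetical and are not taken from the commit.

```rust
/// Sketch of the max_page_count update rule, modeled on SQLite's documented
/// behavior. All names here are illustrative assumptions.
///
/// * `current_max`: the limit in effect before the call
/// * `db_pages`:    the current size of the database in pages
/// * `requested`:   the new limit, or 0 to only query the current one
///
/// Returns the limit in effect after the call.
pub fn apply_max_page_count(current_max: u32, db_pages: u32, requested: u32) -> u32 {
    if requested == 0 {
        // Query form: nothing changes.
        return current_max;
    }
    // The limit may not fall below the number of pages already in use.
    requested.max(db_pages)
}
```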
Levy A.
c9e1eca8dc feat: add DropColumn instruction 2025-08-06 13:39:30 -03:00
Pekka Enberg
0c9216d1cc Merge 'cdc: emit entries for schema changes' from Nikita Sivukhin
This PR emits CDC entries as changes in the `sqlite_schema` table for DDL
statements: `CREATE TABLE` / `CREATE INDEX` / etc.
The logic is a bit tricky as under the hood `turso` can do some implicit
DDL operations like:
1. Creating auto-indexes in case of `CREATE TABLE`
2. Deletion of all attached indices in case of `DROP TABLE`
```
turso> PRAGMA unstable_capture_data_changes_conn('full');
turso> CREATE TABLE t(x, y, z UNIQUE, q, PRIMARY KEY (x, y));
turso> CREATE INDEX t_xy ON t(x, y);
turso> CREATE TABLE q(a, b, c);
turso> ALTER TABLE q DROP COLUMN b;

turso> SELECT
change_id, 
id,
change_type, 
table_name,
bin_record_json_object(table_columns_json_array(table_name), before) AS before,
bin_record_json_object(table_columns_json_array(table_name), after) AS after
FROM turso_cdc;
┌───────────┬────┬─────────────┬───────────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ change_id │ id │ change_type │ table_name    │ before                                                              │ after                                                               │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         1 │  2 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"t","tbl_name":"t","rootpage":3,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         2 │  5 │           1 │ sqlite_schema │                                                                     │ {"type":"index","name":"t_xy","tbl_name":"t","rootpage":6,"sql":"C… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         3 │  6 │           1 │ sqlite_schema │                                                                     │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
├───────────┼────┼─────────────┼───────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────┤
│         4 │  6 │           0 │ sqlite_schema │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │ {"type":"table","name":"q","tbl_name":"q","rootpage":7,"sql":"CREA… │
└───────────┴────┴─────────────┴───────────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘
```
For now, CDC captures only explicit operations and ignores all implicit
ones. The reasoning is that one use case for CDC is to apply logical
changes as is with simple SQL statements. If implicit operations were
logged to the CDC table too, replaying them with simple SQL statements
would be hard (for example, creation of `autoindices` will always work;
implicit deletion of indices for `DROP TABLE` can also cause trouble and
force us to use `DROP INDEX IF EXISTS ...` statements, and we would need
to filter out autoindices in this case too).
Also, to simplify the PR, for now `DatabaseTape` from the `turso-sync`
package just ignores all schema changes from the CDC table.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2426
2025-08-06 14:48:27 +03:00
rajajisai
3dd08864f9 Edit comment 2025-08-05 21:27:47 -07:00
rajajisai
f49575d547 Return error when trying to drop an index associated with UNIQUE or PRIMARY KEY constraints 2025-08-05 21:17:25 -07:00
Nikita Sivukhin
c0d5c55d5c fix tests and clippy 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
c6a87d61c7 emit CDC entries if necessary for schema changes 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
0b4c1ac802 refactor code a little bit 2025-08-06 01:03:48 +04:00
Glauber Costa
d1be7ad0bb implement the collseq bytecode instruction
SQLite generates those in aggregations like min / max with collation
information either in the table definition or in the column expression.

We currently generate the wrong result here, and properly generating the
bytecode instruction fixes it.
2025-08-05 13:49:04 -05:00
Jussi Saurio
c498196c7b fix/perf: fix regression in SELECT 1 benchmark
Do not start a read transaction when a SELECT is not going to access
the database, which means we can avoid checking whether the schema has
changed.
2025-08-05 15:10:55 +03:00
Jussi Saurio
c9c5565867 Merge 'Integrate virtual tables with optimizer' from Piotr Rżysko
This PR integrates virtual tables into the query optimizer. It is a
follow-up to https://github.com/tursodatabase/turso/pull/1727.
The most immediate improvement is better support for inner joins
involving TVFs, particularly when TVF arguments are column references.
### Example
The following two queries are semantically equivalent, but require
different join orders to be valid:
```sql
-- TVF depends on `t.id`, so `t` must be evaluated in outer loop
SELECT t.id, series.value
FROM target t, generate_series(t.id, 3) series;

-- Equivalent query, but with reversed table order in the FROM clause
SELECT t.id, series.value
FROM generate_series(t.id, 3) series, target t;
```
Without optimizer integration, the second query would fail because the
planner would attempt to evaluate `generate_series` before `t`. With
this change, the optimizer detects column dependencies and produces the
correct join order in both cases.
### TODO
Support for outer joins with TVFs is still missing and will be addressed
in a follow-up PR.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2439
2025-08-05 09:22:08 +03:00
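The dependency rule from the example above (a TVF must be placed after every table its arguments reference) can be illustrated with a small sketch. This is a greedy ordering under stated assumptions, not the optimizer's actual cost-based join order search; `TableRef`, `depends_on`, and `valid_join_order` are hypothetical names.

```rust
/// Hypothetical description of one FROM-clause entry.
pub struct TableRef {
    pub name: &'static str,
    /// Positions (in the FROM list) of the entries whose columns this
    /// entry's arguments reference. Empty for ordinary tables.
    pub depends_on: Vec<usize>,
}

/// Returns a join order (outermost loop first) in which every entry comes
/// after all entries it depends on, or None if no such order exists.
pub fn valid_join_order(tables: &[TableRef]) -> Option<Vec<usize>> {
    let mut order = Vec::with_capacity(tables.len());
    let mut placed = vec![false; tables.len()];
    while order.len() < tables.len() {
        // Pick the first unplaced entry whose dependencies are all placed.
        let next = (0..tables.len()).find(|&i| {
            !placed[i] && tables[i].depends_on.iter().all(|&d| placed[d])
        })?;
        placed[next] = true;
        order.push(next);
    }
    Some(order)
}
```

For the second query in the example, `generate_series` is listed first but depends on `target`, so the sketch places `target` in the outer loop regardless of the order in the FROM clause.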
Piotr Rzysko
59ec2d3949 Replace ConstraintInfo::plan_info with ConstraintInfo::index
The side of the binary expression no longer needs to be stored in
`ConstraintInfo`, since the optimizer now guarantees that it is always
on the right. As a result, only the index of the corresponding constraint
needs to be preserved.
2025-08-05 05:48:29 +02:00
Piotr Rzysko
8fb4fbf8af Make WhereTerm::consumed a plain bool
Now that virtual tables are integrated into the optimizer, this field no
longer needs to be wrapped in Cell<bool>.
2025-08-05 05:48:28 +02:00
Piotr Rzysko
99f87c07c1 Support column references in table-valued function arguments
This change extends table-valued function support by allowing arguments
to be column references, not only literals.

Virtual tables can now reject a plan by returning an error from
best_index (e.g., when a TVF argument references a table that appears
later in the join order). The planner uses this information to exclude
invalid plans during join order search.
2025-08-05 05:48:28 +02:00
Piotr Rzysko
82491ceb6a Integrate virtual tables with optimizer
This change connects virtual tables with the query optimizer.
The optimizer now considers virtual tables during join order search
and invokes their best_index callbacks to determine feasible access
paths.

Currently, this is not a visible change, since none of the existing
extensions return information indicating that a plan is invalid.
2025-08-05 05:48:28 +02:00
Piotr Rzysko
521eb2368e Return error when no valid plan exists
Replace panics with proper errors when a valid plan does not exist.
Currently, this never happens because a naive plan is always available.
However, once virtual tables are integrated into the planner, it may
occur—for example, when table-valued function arguments are column
references, and the function cannot be placed in the join order so that
its arguments can be evaluated.

Although this change is effectively a no-op for now, it is extracted
into a separate commit to avoid polluting the one that introduces
virtual table integration with the planner.
2025-08-04 20:27:23 +02:00
Piotr Rzysko
c80cd370cb Remove cost_upper_bound_ordered
It was redundant, as it was always equal to cost_upper_bound.
2025-08-04 20:27:23 +02:00
Piotr Rzysko
718598eab8 Introduce scan type
Different scan parameters are required for different table types.
Currently, index and iteration direction are only used by B-tree tables,
while the remaining table types don’t require any parameters. Planning
access to virtual tables, however, will require passing additional
information from the planner, such as the virtual table index (distinct
from a B-tree index) and the constraints that must be forwarded to the
`filter` method.
2025-08-04 20:27:22 +02:00
Piotr Rzysko
9167b30c7c Introduce AccessMethodParams
Previously, AccessMethod stored fields like `iter_dir`, `index`, and
`constraint_refs` directly, but these only applied to BTree tables.
Other table types (virtual tables, subqueries) either ignored these
fields or required different parameters entirely.

This change prepares the planner to handle virtual table access methods
with their own specialized parameters.
2025-08-04 20:23:44 +02:00
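One way to picture the split described in the commit above is an enum with one variant per table type. The field names `iter_dir`, `index`, and `constraint_refs` come from the commit message; everything else (the variant names, the field types, `idx_num`, `constraints`, and `describe`) is an assumption made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IterationDirection {
    Forwards,
    Backwards,
}

/// Sketch: per-table-type access parameters (shape is hypothetical).
#[derive(Debug, PartialEq)]
pub enum AccessMethodParams {
    /// Parameters that only make sense for B-tree tables.
    BTreeTable {
        iter_dir: IterationDirection,
        index: Option<String>,
        constraint_refs: Vec<usize>,
    },
    /// Parameters for virtual tables: an index number chosen by the table
    /// itself plus the constraints to forward to its `filter` method.
    VirtualTable { idx_num: i32, constraints: Vec<usize> },
    /// Subqueries need no parameters in this sketch.
    Subquery,
}

/// Produces a short human-readable description of the access method.
pub fn describe(params: &AccessMethodParams) -> String {
    match params {
        AccessMethodParams::BTreeTable { index: Some(name), .. } => {
            format!("btree scan via index {}", name)
        }
        AccessMethodParams::BTreeTable { index: None, .. } => {
            "btree full scan".to_string()
        }
        AccessMethodParams::VirtualTable { idx_num, constraints } => format!(
            "virtual table scan, idx_num={}, {} forwarded constraints",
            idx_num,
            constraints.len()
        ),
        AccessMethodParams::Subquery => "subquery scan".to_string(),
    }
}
```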
Piotr Rzysko
61234eeb19 Add ResultCode to best_index result
The `best_index` implementation now returns a ResultCode along with the
IndexInfo. This allows it to signal specific outcomes, such as errors or
constraint violations. This change aligns better with SQLite’s xBestIndex
contract, where cases like missing constraints or invalid combinations of
constraints must not result in a valid plan.
2025-08-04 20:18:44 +02:00
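A minimal sketch of a `best_index` that can reject a plan is shown below, assuming a `generate_series`-like table whose `start` argument is required. The types here are simplified stand-ins with hypothetical shapes (including the `Result`-based return type and the `START_COLUMN` constant); they are not the extension API itself.

```rust
#[derive(Debug, PartialEq)]
pub enum ResultCode {
    Error,
    ConstraintViolation,
}

/// Simplified stand-in: one WHERE-clause constraint offered by the planner.
pub struct ConstraintInfo {
    pub column: u32,
    /// False when the right-hand side cannot be evaluated at this position
    /// in the join order (for example, it references a later table).
    pub usable: bool,
}

#[derive(Debug, PartialEq)]
pub struct ConstraintUsage {
    pub argv_index: Option<u32>,
}

#[derive(Debug, PartialEq)]
pub struct IndexInfo {
    pub constraint_usages: Vec<ConstraintUsage>,
}

/// Hypothetical column number of the required `start` argument.
pub const START_COLUMN: u32 = 1;

/// Accepts the plan only if `start` is present and usable.
pub fn best_index(constraints: &[ConstraintInfo]) -> Result<IndexInfo, ResultCode> {
    let mut next_argv = 1;
    let mut has_start = false;
    let mut usages = Vec::with_capacity(constraints.len());
    for c in constraints {
        if c.column == START_COLUMN {
            if !c.usable {
                // The argument is not available here: reject this plan.
                return Err(ResultCode::ConstraintViolation);
            }
            has_start = true;
            usages.push(ConstraintUsage { argv_index: Some(next_argv) });
            next_argv += 1;
        } else {
            // Not passed to filter; one usage entry per constraint is kept.
            usages.push(ConstraintUsage { argv_index: None });
        }
    }
    if !has_start {
        return Err(ResultCode::ConstraintViolation);
    }
    Ok(IndexInfo { constraint_usages: usages })
}
```

A planner can treat the error as "this join position is invalid" and move on to another candidate order instead of failing the whole query.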
Piotr Rzysko
c465ce6e7b Clarify semantics of argv_index
Extend the documentation of `argv_index` and add validations enforcing
the requirements it must meet.
2025-08-04 19:31:18 +02:00
Piotr Rzysko
b0460a589f Ensure argv_index is either None or >= 1
Previously, there were two ways to indicate that a constraint should not
be passed to the filter function: setting `argv_index` to `None` or to
a value less than 1. This was redundant, so now only `None` is used.
2025-08-04 19:27:53 +02:00
Piotr Rzysko
c6f398122d Add validation for constraint usage length returned by best_index
Additional changes:
- Update IndexInfo documentation to clarify that constraint_usages must
  have exact 1:1 correspondence with input ConstraintInfo array. The code
  translating constraints into VFilter arguments heavily relies on this.
- Fix best_index implementation in test extension to comply with new
  validation requirements by returning usage entry for each constraint
2025-08-04 19:25:10 +02:00
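The two validations mentioned in this commit and the `argv_index` commits above it (exact 1:1 correspondence with the input constraints, and `argv_index` being either `None` or at least 1) could look roughly like this. `ConstraintUsage` is a simplified stand-in and `validate_usages` is a hypothetical helper, not the project's function.

```rust
/// Simplified stand-in for the usage entry returned by best_index.
pub struct ConstraintUsage {
    pub argv_index: Option<u32>,
}

/// Checks a best_index result against the constraints that were passed in.
pub fn validate_usages(
    num_constraints: usize,
    usages: &[ConstraintUsage],
) -> Result<(), String> {
    // One usage entry is required for every input constraint.
    if usages.len() != num_constraints {
        return Err(format!(
            "expected {} constraint usages, got {}",
            num_constraints,
            usages.len()
        ));
    }
    // "Do not pass to filter" is expressed only as None, never as 0.
    for (i, usage) in usages.iter().enumerate() {
        if usage.argv_index == Some(0) {
            return Err(format!("constraint {}: argv_index must be None or >= 1", i));
        }
    }
    Ok(())
}
```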
pedrocarlo
0779c23bbf fix merge conflicts 2025-08-04 12:32:34 -03:00
pedrocarlo
d2019e95f3 pass schema to epilogue for schema_version checking + do not Pragma Schema Version in open_with_flags to avoid infinite loop in reprepare. Just access the database header directly 2025-08-04 12:32:34 -03:00
pedrocarlo
736748cdf7 Simplify program epilogue by tracking the transaction mode and rollback status in the ProgramBuilder and then calling epilogue just once 2025-08-04 12:32:34 -03:00
pedrocarlo
c567636deb Adjust Transaction OpCode to accept schema cookie + check if cookie changed 2025-08-04 12:32:34 -03:00
pedrocarlo
54636241c2 store Sql String inside Program for reprepare 2025-08-04 12:32:34 -03:00
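Taken together, the commits above suggest a flow along these lines: a compiled program remembers the schema cookie it was prepared against and its SQL text, the transaction step compares that cookie with the current one, and a mismatch triggers a reprepare from the stored SQL. The sketch below illustrates that idea only; `Program`, `prepare`, `check_cookie`, and `run` are hypothetical names with invented shapes.

```rust
#[derive(Debug, PartialEq)]
pub enum TxnError {
    SchemaChanged,
}

/// Hypothetical compiled statement: keeps its SQL so it can be reprepared.
#[derive(Debug, PartialEq)]
pub struct Program {
    pub sql: String,
    pub schema_cookie: u32,
}

/// Stand-in for compilation: records the cookie seen at prepare time.
pub fn prepare(sql: &str, current_cookie: u32) -> Program {
    Program { sql: sql.to_string(), schema_cookie: current_cookie }
}

/// What the transaction step would check before running the program.
pub fn check_cookie(program: &Program, header_cookie: u32) -> Result<(), TxnError> {
    if program.schema_cookie != header_cookie {
        return Err(TxnError::SchemaChanged);
    }
    Ok(())
}

/// Returns the program that should run and whether a reprepare happened.
pub fn run(program: Program, header_cookie: u32) -> (Program, bool) {
    match check_cookie(&program, header_cookie) {
        Ok(()) => (program, false),
        // Schema changed since prepare: recompile from the stored SQL.
        Err(TxnError::SchemaChanged) => (prepare(&program.sql, header_cookie), true),
    }
}
```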
Jussi Saurio
506bb5f67f Merge 'Direct schema mutation – add instruction' from Levy A.
Resolves #2378.
```
`ALTER TABLE _ RENAME TO _`/limbo_rename_table/
                        time:   [15.645 ms 15.741 ms 15.850 ms]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
`ALTER TABLE _ RENAME TO _`/sqlite_rename_table/
                        time:   [34.728 ms 35.260 ms 35.955 ms]
Found 15 outliers among 100 measurements (15.00%)
  8 (8.00%) high mild
  7 (7.00%) high severe
  ```
<img width="1000" height="199" alt="image" src="https://github.com/user-attachments/assets/ad943355-b57d-43d9-8a84-850461b8af41" />

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2399
2025-08-04 16:55:38 +03:00
Jussi Saurio
8a1723b3c8 fix/core/translate: ALTER TABLE DROP COLUMN: ensure schema cookie is updated even when target table is empty 2025-08-04 15:05:00 +03:00
Levy A.
1e177053cb feat: add RenameTable instruction
direct schema mutation, no reparsing
2025-08-01 21:11:25 -03:00
Diego Reis
7c70ac2c4a Fix #2390
Single quotes inside a string literal have to be doubled
2025-08-01 11:37:13 -03:00
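The quoting rule named in this fix is standard SQL: a single quote inside a string literal is written as two single quotes. A minimal sketch of that rule follows; `quote_sql_string` is a hypothetical helper and is not taken from the commit.

```rust
/// Wraps `s` in single quotes and doubles every embedded single quote,
/// producing a valid SQL string literal.
pub fn quote_sql_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('\'');
    for ch in s.chars() {
        if ch == '\'' {
            // Emit the extra quote that escapes the one pushed below.
            out.push('\'');
        }
        out.push(ch);
    }
    out.push('\'');
    out
}
```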
Jussi Saurio
86b1232268 chore: enable indexes by default 2025-08-01 15:44:56 +03:00
Jussi Saurio
7259751eba Merge 'Support the OFFSET clause for Compound select' from meteorgan
Closes #2376
2025-08-01 10:18:13 +03:00
Jussi Saurio
77666b1eb5 Merge 'Fix parser error for repetition in row values' from Diego Reis
Closes #1948
This PR also adds pretty basic support for
[row values in UPDATE statements](https://sqlite.org/rowvalue.html#row_values_in_update_statements),
but it only accepts expressions like:
```sql
UPDATE t SET (a, b) = (2 + 2, 'joe');
```
SQLite, by contrast, also accepts a full `SELECT` as the right-hand side, like:
```sql
UPDATE tab3 
   SET (a,b,c) = (SELECT x,y,z
                    FROM tab4
                   WHERE tab4.w=tab3.d)
 WHERE tab3.e BETWEEN 55 AND 66;
 ```
I noticed we don't explicitly have the concept of row values; maybe
some plumbing in that area could solve it?
If there is a way to implement that with our current infrastructure
(a.k.a. skill issue from my side), please comment here.

Closes #2355
2025-08-01 10:17:05 +03:00
meteorgan
6262ff4267 support offset for values 2025-08-01 00:46:46 +08:00
Glauber Costa
9d41fa4489 implement IN patterns for non-conditional SELECT queries
Extracts the core logic of IN from the conditional version, and uses the
conditional metadata to determine the jump. It then uses the AddImm
operator we just added to force the integer conversion at the end (like
SQLite does).
2025-07-31 08:11:41 -05:00
meteorgan
a0f5554b08 support the OFFSET clause for Compound select 2025-07-31 17:43:54 +08:00
Jussi Saurio
f619556344 Merge 'Direct DatabaseHeader reads and writes – with_header and with_header_mut' from Levy A.
This PR introduces two methods to the pager, very much inspired by
`with_schema` and `with_schema_mut`. `Pager::with_header` and
`Pager::with_header_mut` pass the closure a shared and a unique
reference, respectively, transmuted from the `PageRef` buffer.
This PR also adds type-safe wrappers for `Version`, `PageSize`,
`CacheSize` and `TextEncoding`, as they have special in-memory
representations.
Writing the `DatabaseHeader` is just a single `memcpy` now.
```rs
pub fn write_database_header(&self, header: &DatabaseHeader) {
    let buf = self.as_ptr();
    buf[0..DatabaseHeader::SIZE].copy_from_slice(bytemuck::bytes_of(header));
}
```
`HeaderRef` and `HeaderRefMut` are used in the `with_header*` methods,
but can also be used on their own when there are multiple reads and
writes to the header, where putting everything in a closure would add
too much nesting.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2234
2025-07-31 10:02:47 +03:00
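To illustrate why a field like `PageSize` benefits from a type-safe wrapper, the sketch below decodes the page size from a raw 100-byte SQLite database header. Per the SQLite file format, the page size is a big-endian 16-bit value at byte offset 16, and the stored value 1 stands for 65536. This sketch copies two bytes with the standard library only; it is not the zero-copy, `bytemuck`-based approach of the PR, and the method names are assumptions.

```rust
/// Size of the SQLite database header in bytes.
pub const HEADER_SIZE: usize = 100;

/// Wrapper around the raw on-disk page size field.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageSize(u16);

impl PageSize {
    /// Reads the raw field from bytes 16..18 of the header (big-endian).
    pub fn from_header(header: &[u8; HEADER_SIZE]) -> Self {
        PageSize(u16::from_be_bytes([header[16], header[17]]))
    }

    /// Returns the page size in bytes. The stored value 1 encodes 65536,
    /// which does not fit in 16 bits.
    pub fn get(self) -> u32 {
        if self.0 == 1 {
            65536
        } else {
            u32::from(self.0)
        }
    }
}
```

Keeping the raw value private means callers cannot accidentally treat the special encoding `1` as a one-byte page.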
Diego Reis
ab01b4e8ca Refactor UPDATE .. SET row values logic and add some comments 2025-07-31 00:08:15 -03:00
Diego Reis
31c73f3c9a Add basic support for row values in UPDATE .. SET statements
e.g. `.. SET (a, b) = (1, 2)` is equivalent to `.. SET a = 1, b = 2`.

For repeated LHS names such as `(a, a)`, the last RHS value prevails, so
`.. SET (a, a) = (1, 2)` is equivalent to `.. SET a = 2`.
2025-07-31 00:08:15 -03:00
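The equivalence described in this commit can be sketched as a small flattening step over column names and value expressions. This is an illustration on plain strings, not the translator's code; `flatten_row_value_set` is a hypothetical name, and rejecting a length mismatch is an assumption.

```rust
/// Turns `SET (c1, c2, ..) = (v1, v2, ..)` into a list of single-column
/// assignments. For a repeated column name, the last value wins.
pub fn flatten_row_value_set(
    columns: &[&str],
    values: &[&str],
) -> Result<Vec<(String, String)>, String> {
    if columns.len() != values.len() {
        return Err(format!(
            "{} columns assigned {} values",
            columns.len(),
            values.len()
        ));
    }
    let mut out: Vec<(String, String)> = Vec::new();
    for (col, val) in columns.iter().zip(values.iter()) {
        if let Some(existing) = out.iter_mut().find(|entry| entry.0.as_str() == *col) {
            // Column already assigned: overwrite with the later value.
            existing.1 = (*val).to_string();
        } else {
            out.push(((*col).to_string(), (*val).to_string()));
        }
    }
    Ok(out)
}
```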
Diego Reis
3834f441c4 Accept parsing SET statements with repeated names, like `.. SET (a, a) = (1, 2)`
2025-07-31 00:08:12 -03:00
PThorpe92
07137c7aaf Merge 'Implement the Cast opcode' from Glauber Costa
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.
Instead, there is a single Cast opcode, that takes the affinity as a
parameter.
Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2352
2025-07-30 22:32:09 -04:00
Glauber Costa
4bd1582e7d Implement the Cast opcode
Our compat matrix mentions a couple of opcodes: ToInt, ToBlob, etc.
Those opcodes do not exist.

Instead, there is a single Cast opcode, that takes the affinity as a
parameter.

Currently we just call a function when we need to cast. This PR fixes
the compat file, implements the cast opcode, and in at least one
instance, when explicitly using the CAST keyword, uses that opcode
instead of a function in the generated bytecode.
2025-07-30 20:44:54 -05:00
Levy A.
e35fdb8263 feat: zero-copy DatabaseHeader 2025-07-30 17:33:59 -03:00
Jussi Saurio
e128bd477e Merge 'Support VALUES clauses for compound select' from meteorgan
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2293
2025-07-30 21:34:40 +03:00
Jussi Saurio
7caef278a5 Merge 'Rewrite the WAL' from Preston Thorpe
closes #1893
Adds some fairly extensive tests but I'll continue to add some python
tests on top of the unit tests.
## Restart:
tested 
- open new DB
- create table and do a bunch of inserts
- `pragma wal_checkpoint(RESTART);`
- close db file
- re-open and verify we can read the wal/repopulate the frame cache
- verify min|max frame
tested 
- open same DB
- add more inserts
- `pragma wal_checkpoint(RESTART);`
- do _more_ inserts
- close
- re-open
- verify checksums/max_frame are valid
- verify row count
## Truncate
tested 
- open new db
- create table and add inserts
- `pragma wal_checkpoint(truncate);`
- close file
- verify WAL file is empty (32 bytes, header only)
- re-open file
- verify content/row count
tested 
- open db
- create table and insert many rows
- `pragma wal_checkpoint(truncate);`
- insert _more_ rows
- close db file
- verify WAL file is valid
- re-open file
- verify we can read entire file/repopulate the frame cache
<img width="541" height="315" alt="image" src="https://github.com/user-attachments/assets/0470c795-5116-4866-b913-78c07b06b68c" />
```
# header
magic=0x377f0682
version=3007000
page_size=4096
seq=2
salt=ec475ff2-7ea94342
checksum=c9464aff-c571cc22
```

Closes #2179
2025-07-30 18:50:49 +03:00
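The header dump above matches the 32-byte WAL header of the SQLite file format: eight big-endian 32-bit fields (magic, format version, page size, checkpoint sequence, two salts, two checksum words). A minimal parsing sketch is shown below; the struct and function names are hypothetical, and the sketch does not verify the checksum.

```rust
/// A WAL file that holds only its header is exactly this many bytes.
pub const WAL_HEADER_SIZE: usize = 32;

#[derive(Debug, PartialEq)]
pub struct WalHeader {
    pub magic: u32,
    pub version: u32,
    pub page_size: u32,
    pub checkpoint_seq: u32,
    pub salt: [u32; 2],
    pub checksum: [u32; 2],
}

/// Reads one big-endian u32 at `offset`.
fn be_u32(buf: &[u8; WAL_HEADER_SIZE], offset: usize) -> u32 {
    u32::from_be_bytes([
        buf[offset],
        buf[offset + 1],
        buf[offset + 2],
        buf[offset + 3],
    ])
}

/// Parses the header fields; only the magic number is validated here.
pub fn parse_wal_header(buf: &[u8; WAL_HEADER_SIZE]) -> Result<WalHeader, String> {
    let magic = be_u32(buf, 0);
    // The low bit of the magic selects the checksum byte order.
    if magic != 0x377f_0682 && magic != 0x377f_0683 {
        return Err(format!("bad WAL magic: {:#010x}", magic));
    }
    Ok(WalHeader {
        magic,
        version: be_u32(buf, 4),
        page_size: be_u32(buf, 8),
        checkpoint_seq: be_u32(buf, 12),
        salt: [be_u32(buf, 16), be_u32(buf, 20)],
        checksum: [be_u32(buf, 24), be_u32(buf, 28)],
    })
}
```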
PThorpe92
f4becd1296 Allow using !passive checkpoint methods in pragma wal_checkpoint 2025-07-30 14:08:33 +03:00