Commit Graph

229 Commits

Author SHA1 Message Date
Pekka Enberg
506908e648 Merge 'translate: disallow creating/dropping internal tables' from Jussi Saurio
edit: we can't disallow the 'turso_' prefix though, because
turso-sync-engine uses it
Closes #3313

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #3338
2025-09-26 10:40:09 +03:00
Pekka Enberg
9461e22c06 Merge 'Improve DBSP view serialization' from Glauber Costa
Improve serialization for DBSP views.
The serialization code was written organically, without much forward
thinking about stability as we evolved the table and operator format.
Now that this is done, we are at a point where we can actually make it
suck less and take a considerable step towards making this production
ready.
We also add a simple version check (in the table name, because that is
much easier than reading contents in parse_schema_row) to prevent views
from being used if we had to do anything to evolve the format of the circuit
(including the operators).
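
A minimal sketch of what a version check encoded in the table name could look like. The `__turso_internal_dbsp_state_v{N}_{view}` layout, the constant names, and both helper functions are assumptions made for illustration; the actual naming scheme in the PR may differ.

```rust
/// Hypothetical prefix: the version number follows directly after the `v`.
const STATE_PREFIX: &str = "__turso_internal_dbsp_state_v";
/// Hypothetical current circuit format version.
const CURRENT_VERSION: u32 = 1;

/// Splits a state-table name into (version, view name) if it matches the assumed layout.
fn parse_state_table_name(name: &str) -> Option<(u32, &str)> {
    let rest = name.strip_prefix(STATE_PREFIX)?;
    // The first underscore separates the version from the view name.
    let (version, view) = rest.split_once('_')?;
    Some((version.parse().ok()?, view))
}

/// Accepts the view only when its stored version matches the current one;
/// otherwise returns a message asking the user to drop the view.
fn check_view_version(name: &str) -> Result<&str, String> {
    match parse_state_table_name(name) {
        Some((v, view)) if v == CURRENT_VERSION => Ok(view),
        Some((v, view)) => Err(format!(
            "view {} uses circuit format v{}, expected v{}; drop and recreate it",
            view, v, CURRENT_VERSION
        )),
        None => Err(format!("{} is not a DBSP state table", name)),
    }
}
```

Keeping the version in the name means a mismatch can be detected from the schema row alone, without deserializing any operator state.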

Closes #3351
2025-09-26 09:18:45 +03:00
Jussi Saurio
cfa449a0c0 Merge 'Disallow multiple primary keys in table definition' from Jussi Saurio
Closes #3309

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #3340
2025-09-26 09:16:14 +03:00
Jussi Saurio
abb0c704af translate: disallow creating/dropping internal tables
we can't disallow 'turso_' prefix though, because turso-sync-engine
uses it
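
A rough sketch of the kind of name check this implies. The function name and the exact prefix list are assumptions; `sqlite_` is reserved by SQLite itself, and `__turso_internal_` is taken from the DBSP state table naming elsewhere in this log. The plain `turso_` prefix is deliberately not in the list.

```rust
/// Returns true when `name` looks like an internal table that users must not
/// create or drop. The prefix list is illustrative, not the project's actual list.
fn is_internal_table(name: &str) -> bool {
    const RESERVED_PREFIXES: [&str; 2] = ["sqlite_", "__turso_internal_"];
    // Table names are compared case-insensitively.
    let lower = name.to_ascii_lowercase();
    RESERVED_PREFIXES.iter().any(|prefix| lower.starts_with(prefix))
}
```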
2025-09-26 09:15:32 +03:00
Jussi Saurio
00b69467f3 Merge 'Add CAST to fuzzer' from Levy A.
Adds `CAST` to the fuzzer while fixing some incompatibility bugs.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3314
2025-09-26 09:13:49 +03:00
Jussi Saurio
252da9254a fix another incorrect test 2025-09-26 08:59:37 +03:00
Jussi Saurio
3170077952 Fix incorrect test 2025-09-26 08:59:37 +03:00
Jussi Saurio
6d6fc91da3 Disallow multiple primary keys in table definition 2025-09-26 08:59:36 +03:00
Glauber Costa
1b5e74060a make sure that we are able to prevent views from being corrupted
as we make changes to the way materialized views are generated (think
adding new operators, changing the id of existing operators, etc), we
will need to persist the topology of the circuit itself. This is a
change that I believe to be premature. For now, it is enough to reserve
the first operator id for it, and add a version number to the table
name. We can just detect that something changed, and ask the user to
drop the view. We can get away with it due to the fact that the views
are experimental.
2025-09-25 22:52:08 -03:00
PThorpe92
52e9bb8949 Create sentinel constant value for rowid to handle direct update of rowid col with no alias 2025-09-25 19:15:28 -04:00
Pere Diaz Bou
91cff65e44 Merge 'Autoincrement' from Pavan Nambi
fixes #1976
and #1605
```zsh
turso> DROP TABLE IF EXISTS t;
CREATE TABLE t (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT
);
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t    │   1 │
└──────┴─────┘
turso> DROP TABLE IF EXISTS t;
CREATE TABLE t (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT
);
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t    │   1 │
└──────┴─────┘
turso> INSERT INTO t (name) VALUES ('A'); SELECT * FROM sqlite_sequence;
┌──────┬─────┐
│ name │ seq │
├──────┼─────┤
│ t    │   2 │
└──────┴─────┘
turso>
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #2983
2025-09-25 18:57:24 +02:00
Jussi Saurio
c18c44b032 fix: result columns have varying binding precedence
In e.g. `SELECT x AS y, y AS x FROM t ORDER BY x;`, the `x` in the
`ORDER BY` should reference t.y, which has been aliased as `x` for this
query. The same goes for GROUP BY, JOIN ON etc. but NOT for WHERE.

Previously we had wrong precedence in `bind_and_rewrite_expr`.
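
The precedence rule can be sketched as follows. This is not the code in `bind_and_rewrite_expr`; the `Resolved` and `Clause` types and the `resolve` function are hypothetical, and only `ORDER BY` and `WHERE` are modelled.

```rust
#[derive(Debug, PartialEq)]
enum Resolved<'a> {
    /// The name matched a result-column alias; holds the aliased source column.
    Alias(&'a str),
    /// The name matched a column of the source table directly.
    Column(&'a str),
}

#[derive(Clone, Copy)]
enum Clause {
    Where,
    OrderBy,
}

/// `aliases` holds (alias, underlying column) pairs from the SELECT list.
fn resolve<'a>(
    name: &str,
    clause: Clause,
    aliases: &[(&str, &'a str)],
    columns: &[&'a str],
) -> Option<Resolved<'a>> {
    // Aliases win in ORDER BY, but WHERE never sees them.
    if matches!(clause, Clause::OrderBy) {
        for &(alias, target) in aliases {
            if alias.eq_ignore_ascii_case(name) {
                return Some(Resolved::Alias(target));
            }
        }
    }
    // Fall back to the table's own columns.
    columns
        .iter()
        .copied()
        .find(|column| column.eq_ignore_ascii_case(name))
        .map(Resolved::Column)
}
```

For `SELECT x AS y, y AS x FROM t ORDER BY x`, the alias list is `[("y", "x"), ("x", "y")]`, so `x` in `ORDER BY` resolves to `t.y`, while the same name in `WHERE` resolves to `t.x`.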
2025-09-25 08:07:37 +03:00
Levy A.
5dfd67b118 feat: add CAST to fuzzer 2025-09-24 18:06:55 -03:00
Jussi Saurio
726bc24e78 Support referring to rowid as _rowid_ or oid 2025-09-24 09:17:28 +03:00
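
For illustration, recognizing the three spellings could look like the sketch below. The function name is hypothetical, and the sketch ignores the SQLite rule that a user-defined column with one of these names takes precedence over the rowid.

```rust
/// Returns true if `name` is one of the spellings SQLite accepts for the rowid.
fn is_rowid_alias(name: &str) -> bool {
    ["rowid", "_rowid_", "oid"]
        .iter()
        .any(|alias| alias.eq_ignore_ascii_case(name))
}
```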
Pavan Nambi
f1ac855441 Merge branch 'main' into cdc_fail_autoincrement 2025-09-22 21:11:26 +05:30
Pavan-Nambi
215307a9bd cleanup - remove comments 2025-09-22 21:08:02 +05:30
Jussi Saurio
eada24b508 Store in-memory index definitions most-recently-seen-first
This solves an issue where an INSERT statement conflicts with
multiple indices. In that case, sqlite iterates the linked list
`pTab->pIndex` in order and handles the first conflict encountered.
The newest parsed index is always added to the head of the list.

To be compatible with this behavior, we also need to put the most
recently parsed index definition first in our indexes list for a given
table.
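
A small sketch of the ordering, assuming a per-table `VecDeque`. The `Schema` and `IndexDef` types here are simplified stand-ins, not the project's real definitions.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, Clone, PartialEq)]
struct IndexDef {
    name: String,
    table: String,
}

#[derive(Default)]
struct Schema {
    /// Per-table index list, most recently parsed first.
    indexes: HashMap<String, VecDeque<IndexDef>>,
}

impl Schema {
    /// Mirrors SQLite adding the newest index at the head of `pTab->pIndex`.
    fn add_index(&mut self, index: IndexDef) {
        self.indexes
            .entry(index.table.clone())
            .or_default()
            .push_front(index);
    }

    /// Index names in the order conflicts would be checked.
    fn index_names(&self, table: &str) -> Vec<&str> {
        self.indexes
            .get(table)
            .into_iter()
            .flatten()
            .map(|index| index.name.as_str())
            .collect()
    }
}
```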
2025-09-22 10:11:50 +03:00
Pavan Nambi
47194d7658 Merge branch 'tursodatabase:main' into cdc_fail_autoincrement 2025-09-21 16:03:38 +05:30
PThorpe92
03149bc92d Remove unused imports 2025-09-20 18:32:37 -04:00
PThorpe92
1ed3fc52f7 Add method to validate the Where Expr from a partial index 2025-09-20 17:41:56 -04:00
PThorpe92
a0f574d279 Add where_clause expr field to Index 2025-09-20 14:38:47 -04:00
Pavan-Nambi
c9ab268bf5 remove extra checks for int column autoincr 2025-09-20 12:59:14 +05:30
Glauber Costa
0b3317d449 extract columns from all tables in case of joins.
Our code for views needs to extract the list of columns used in the view.
We currently extract only from "the base table", but once we have joins,
we need a more complex structure, that keeps the mapping of
(tables, columns).

This actually affects both views and materialized views: for views, the
queries with joins work just fine, because views are just aliases for
a query. But the list of columns returned by pragma table_info on the
view is incorrect. We add a test to make sure it is fixed.

For materialized views, we add extensive tests to make sure that the
columns are extracted correctly.
2025-09-19 03:57:11 -05:00
Pere Diaz Bou
d53c64e84b core/schema: parse schema rows for MVCC transactions 2025-09-18 18:22:06 +02:00
Pavan-Nambi
020921f803 Merge remote-tracking branch 'upstream/main' into cdc_fail_autoincrement 2025-09-18 19:27:19 +05:30
Pekka Enberg
182565fe0c core: Wrap MvCursor in Arc<RwLock<>>
Make it Send and Sync.
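
As a sketch of why the wrapper matters: `Arc<RwLock<T>>` is `Send + Sync` whenever `T` is, so the handle can cross thread boundaries. The `MvCursor` struct below is a stand-in with a made-up field, and `advance_on_worker` is a hypothetical helper.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in for the real cursor type; only the wrapping matters here.
#[derive(Debug, Default)]
struct MvCursor {
    position: u64,
}

/// Compiles only if `T` can be moved to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

/// Mutates the cursor on another thread, then reads it back on this one.
fn advance_on_worker(cursor: Arc<RwLock<MvCursor>>) -> u64 {
    assert_send_sync::<Arc<RwLock<MvCursor>>>();
    let worker = Arc::clone(&cursor);
    thread::spawn(move || {
        worker.write().unwrap().position += 1;
    })
    .join()
    .unwrap();
    let position = cursor.read().unwrap().position;
    position
}
```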
2025-09-17 12:46:55 +03:00
Jussi Saurio
9a2797963a Merge 'Remove LimboResult enum and InsnFunctionStepResult::Busy variant' from Jussi Saurio
We can just use `LimboError::Busy` for both of these.

Reviewed-by: Pekka Enberg <penberg@iki.fi>

Closes #3170
2025-09-17 12:06:54 +03:00
Jussi Saurio
dc103da2ed Remove LimboResult
this is only used for returning LimboResult::Busy, and we already
have LimboError::Busy, so it only adds confusion.
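
With a single error variant, a retry loop only has to look at the `Err` side. The sketch below uses a stand-in `LimboError` enum and a hypothetical `step_with_busy_handler` helper; it is not the project's busy handler.

```rust
/// Stand-in for the real error type; only the `Busy` variant matters here.
#[derive(Debug, PartialEq)]
enum LimboError {
    Busy,
    Other(String),
}

/// Retries `step` while it reports `Busy`, up to `max_retries` extra attempts.
fn step_with_busy_handler<T>(
    mut step: impl FnMut() -> Result<T, LimboError>,
    max_retries: u32,
) -> Result<T, LimboError> {
    let mut attempts = 0;
    loop {
        match step() {
            // Busy arrives as an error, so it must be matched on the Err side.
            Err(LimboError::Busy) if attempts < max_retries => attempts += 1,
            other => return other,
        }
    }
}
```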

Moreover, the current busy handler was not handling LimboError::Busy,
because it's returned as an error, not as Ok. So this may fix the
"busy handler not working" issue in the perf thrpt benchmark.
2025-09-17 11:04:44 +03:00
Pekka Enberg
17e9f05ea4 core: Convert Rc<Pager> to Arc<Pager> 2025-09-17 09:32:49 +03:00
Glauber Costa
3565e7978a Add an index to the dbsp internal table
And also change the schema of the main table. I have come to see the
current key-value schema as inadequate for non-aggregate operators.
Calculating Min/Max, for example, doesn't fit in this schema because
we have to be able to track existing values and index them.

Another alternative is to keep one table per operator type, but this
quickly leads to an explosion of tables.
2025-09-15 22:30:48 -05:00
Pavan-Nambi
0effb981e6 autoincrement functionality works as well as sqlite now, handled all edge cases that we are aware of
- The code now prevents dropping or indexing `sqlite_sequence`
- make sure that AUTOINCREMENT only works on a single `INTEGER PRIMARY KEY`
- handles `i64::MAX` gracefully by returning `SQLITE_FULL`
- also AUTOINCREMENT now works in both column and table constraints.
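
The `i64::MAX` case can be sketched with a checked addition. The function and error names are hypothetical; the rule that the next rowid is one more than the largest ever used, and that the insert fails with `SQLITE_FULL` once that is the maximum, follows SQLite's documented AUTOINCREMENT behavior.

```rust
#[derive(Debug, PartialEq)]
enum InsertError {
    /// Corresponds to SQLITE_FULL: no larger rowid is available.
    Full,
}

/// `seq` is the value stored for the table in `sqlite_sequence`, if any.
fn next_autoincrement_rowid(
    max_rowid_in_table: Option<i64>,
    seq: Option<i64>,
) -> Result<i64, InsertError> {
    // AUTOINCREMENT never reuses a rowid, so take the larger of the two.
    let largest = max_rowid_in_table.unwrap_or(0).max(seq.unwrap_or(0));
    // checked_add returns None on overflow instead of wrapping around.
    largest.checked_add(1).ok_or(InsertError::Full)
}
```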

fmt
2025-09-13 16:35:36 +05:30
Pavan-Nambi
fdb4f98e11 Merge remote-tracking branch 'upstream/main' into cdc_fail_autoincrement 2025-09-13 07:17:18 +05:30
Preston Thorpe
b09dcceeef Merge 'Fixes views' from Glauber Costa
This is a collection of fixes for materialized views ahead of adding
support for JOINs.
These are mostly issues with how we assume there is a single table, with a
single delta, when in fact we have to send more than one.
Those are things that are just objectively wrong, so I am sending it
separately to make the JOIN PR smaller.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3009
2025-09-12 07:43:32 -04:00
Pavan-Nambi
7191f1cc1c Merge remote-tracking branch 'upstream/main' into cdc_fail_autoincrement 2025-09-12 15:17:12 +05:30
Pekka Enberg
d80814fa2c core/schema: Optimize get_dependent_materialized_views() when no views
Eliminates get_dependent_materialized_views() overhead when there are no
views. Note that we need to optimize the case when there are views as
well because this ends up being pretty hot in write-intensive workloads.
2025-09-12 07:22:18 +03:00
Glauber Costa
8997670936 include dbsp tables in the list of tables that cannot be modified 2025-09-11 05:30:46 -07:00
Jussi Saurio
e3bd00883b Fix creation of automatic indexes
indexes with the naming scheme "sqlite_autoindex_<tblname>_<number>"
are automatically created when a table is created with UNIQUE or
PRIMARY KEY definitions.

these indexes must map to the table definition SQL in definition order,
i.e. sqlite_autoindex_foo_1 must be the first instance of UNIQUE or
PRIMARY KEY and so on.

this commit fixes our autoindex creation / parsing so that this invariant
is upheld.
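
A sketch of the numbering invariant, with a simplified `Constraint` enum and a hypothetical `autoindex_names` helper. It ignores the SQLite special case where an `INTEGER PRIMARY KEY` aliases the rowid and gets no automatic index.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Constraint {
    PrimaryKey,
    Unique,
    NotNull,
}

/// Pairs each UNIQUE / PRIMARY KEY constraint, in definition order,
/// with its `sqlite_autoindex_<tblname>_<number>` name.
fn autoindex_names(table: &str, constraints: &[Constraint]) -> Vec<(String, Constraint)> {
    constraints
        .iter()
        .filter(|c| matches!(c, Constraint::PrimaryKey | Constraint::Unique))
        .enumerate()
        // Numbering starts at 1 and only counts constraints that need an index.
        .map(|(i, c)| (format!("sqlite_autoindex_{}_{}", table, i + 1), *c))
        .collect()
}
```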
2025-09-11 14:11:30 +03:00
Jussi Saurio
f17997fc5d Extract methods for populating indices/views from schema 2025-09-11 09:51:46 +03:00
Jussi Saurio
07944e23b5 Extract common logic for handling sqlite_schema rows 2025-09-11 09:45:40 +03:00
Pavan-Nambi
e5d3594fa2 fmt 2025-09-10 07:35:20 +05:30
Pavan-Nambi
a04bde12a9 resolve errors that came after merging 2025-09-10 07:34:59 +05:30
Pavan-Nambi
6728384b47 Merge remote-tracking branch 'origin/main' into cdc_fail_autoincrement 2025-09-10 07:30:22 +05:30
Pavan-Nambi
b833e71c20 inserting ain't working
hell yeah

concurrency tests passing now woosh

finally write tests passed

Most of the cdc tests are passing yay

autoincrement draft

remove shared schema code that broke transactions

sequence table should reset if table is dropped

fmt

fmt

fmt
2025-09-09 20:07:52 +05:30
Jussi Saurio
a0613ef781 Avoid allocating errors and then immediately falling back in affinity
On the syscall IO backend, on TPC-H query 12, the _dominating_ part
of the stack trace is trying to construct affinities from a character,
failing, allocating an error and a string, and then immediately falling back to
Blob affinity and dropping the error and string.

Since I'm on vacation I won't spend cycles on figuring out why we are passing
an incorrect affinity in `flags.get_affinity()` and instead make this lazy PR
just to improve performance and stop doing silly things :]
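
One way to avoid the allocation is to return an `Option` and fall back with `unwrap_or`. The sketch below assumes the affinity characters follow SQLite's internal codes ('A' through 'E'); the enum and function names are illustrative, not the project's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Affinity {
    Blob,
    Text,
    Numeric,
    Integer,
    Real,
}

/// Returns None for an unknown character instead of building an error string.
fn affinity_from_char(c: char) -> Option<Affinity> {
    match c {
        'A' => Some(Affinity::Blob),
        'B' => Some(Affinity::Text),
        'C' => Some(Affinity::Numeric),
        'D' => Some(Affinity::Integer),
        'E' => Some(Affinity::Real),
        _ => None,
    }
}

/// The fallback path no longer allocates anything.
fn affinity_or_blob(c: char) -> Affinity {
    affinity_from_char(c).unwrap_or(Affinity::Blob)
}
```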
2025-09-05 18:34:23 +03:00
Glauber Costa
08b2e685d5 Persistence for DBSP-based materialized views
This fairly long commit implements persistence for materialized views.
It is hard to split because of all the interdependencies between components,
so it lands as one big change. This commit message will at least try to go into
detail about the basic architecture.

Materialized Views as tables
============================

Materialized views are now a normal table - whereas before they were a virtual
table.  By making a materialized view a table, we can reuse all the
infrastructure for dealing with tables (cursors, etc).

One of the advantages of doing this is that we can create indexes on view
columns.  Later, we should also be able to write those views to separate files
with ATTACH write.

Materialized Views as Zsets
===========================

The contents of the table are a ZSet: rowid, values, weight. Readers will
notice that because of this, the usage of the ZSet data structure dwindles
throughout the codebase. The main difference between our materialized ZSet and
the standard DBSP ZSet is that ours is backed by a BTree, not a hash table
(since SQLite tables are BTrees).

Aggregator State
================

In DBSP, the aggregator nodes also have state. To store that state, there is a
second table.  The table holds all aggregators in the view, and there is one
table per view. That is __turso_internal_dbsp_state_{view_name}. The format of
that table is similar to a ZSet: rowid, serialized_values, weight. We serialize
the values because there will be many aggregators in the table. We can't rely
on a particular format for the values.

The Materialized View Cursor
============================

Reading from a Materialized View essentially means reading from the persisted
ZSet, and enhancing that with data that exists within the transaction.
Transaction data is ephemeral, so we do not materialize this anywhere: we have
a carefully crafted implementation of seek that takes care of merging weights
and stitching the two sets together.
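
To illustrate the weight merging, here is an eager sketch using `BTreeMap`. The real cursor merges lazily inside `seek`; the `Row` type and `merged_view` function are simplifications invented for this example.

```rust
use std::collections::BTreeMap;

/// A row is identified by (rowid, values); the map value is its weight.
type Row = (i64, String);

/// Adds the transaction's delta weights to the persisted weights and keeps
/// only rows whose combined weight is positive, in BTree (rowid) order.
fn merged_view(persisted: &BTreeMap<Row, i64>, tx_delta: &BTreeMap<Row, i64>) -> Vec<Row> {
    let mut weights = persisted.clone();
    for (row, weight) in tx_delta {
        *weights.entry(row.clone()).or_insert(0) += *weight;
    }
    weights
        .into_iter()
        .filter(|(_, weight)| *weight > 0)
        .map(|(row, _)| row)
        .collect()
}
```

A delta weight of -1 cancels a persisted row and +1 adds a new one, so a reader inside the transaction sees uncommitted changes without anything being written back.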
2025-09-05 07:04:33 -05:00
Preston Thorpe
2ea2be6f85 Merge 'prevent modification to system tables.' from Glauber Costa
SQLite does not allow us to modify system tables, but we do. Let's fix
it.

Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #2855
2025-09-04 19:57:04 -04:00
Glauber Costa
032eabb3a4 prevent modification to system tables.
SQLite does not allow us to modify system tables, but we do.
Let's fix it.
2025-09-04 17:34:47 -05:00
TcMits
bfff05faba merge main 2025-09-02 18:25:20 +07:00
TcMits
6e87b08d64 faster type_from_name 2025-09-01 14:38:38 +07:00
TcMits
37f33dc45f add eq/contains/starts_with/ends_with_ignore_ascii_case 2025-08-31 16:18:42 +07:00