Commit Graph

86 Commits

Author SHA1 Message Date
Levy A.
4ba1304fb9 complete parser integration 2025-08-21 15:23:59 -03:00
Levy A.
186e2f5d8e switch to new parser 2025-08-21 15:19:16 -03:00
Jussi Saurio
dd2e0ea596 Fix: always emit rowid when column is rowid alias
SQLite does not store the rowid alias column in the record at all
when it is a rowid alias, because the rowid is always stored anyway
in the record header.
2025-08-21 16:40:10 +03:00
Pekka Enberg
2fa501158c Merge 'turso-cdc: add updates column for cdc table' from Nikita Sivukhin
This PR adds new `updates` column to the CDC table. This column holds
updated fields of the row in the following format:
```
[C boolean values where true set for changed columns]
[C values with updates where NULL is set for not-changed columns]
```
For example:
```
turso> UPDATE t SET y = 'turso', q = 'db' WHERE rowid = 1;
turso> SELECT bin_record_json_object('["x","y","z","q","x","y","z","q"]', updates) as updates FROM turso_cdc;
┌──────────────────────────────────────────────────────────────────┐
│ updates                                                          │
├──────────────────────────────────────────────────────────────────┤
│ {"x":0,"y":1,"z":0,"q":1,"x":null,"y":"turso","z":null,"q":"db"} │
└──────────────────────────────────────────────────────────────────┘
```
Also, this column works differently for `ALTER TABLE` statements where
update value for `sql` will be equal to the original `ALTER TABLE`:
```
turso> ALTER TABLE t ADD COLUMN t;
turso> SELECT bin_record_json_object('["type","name","tbl_name","rootpage","sql","type","name","tbl_name","rootpage","sql"]', updates) as updates FROM turso_cdc WHERE rowid = 2;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ updates                                                                                                                                           │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"type":0,"name":0,"tbl_name":0,"rootpage":0,"sql":1,"type":null,"name":null,"tbl_name":null,"rootpage":null,"sql":"ALTER TABLE t ADD COLUMN t;"} │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
This will help turso-db to implement logical replication which supports
both column-level updates and schema changes

Closes #2538
2025-08-12 09:50:16 +03:00
Nikita Sivukhin
5d0ada9fb9 add "updates" column for cdc table 2025-08-11 12:46:15 +04:00
Glauber Costa
145d6eede7 Implement very basic views using DBSP
This is just the bare minimum that I needed to convince myself that this
approach will work. The only views that we support are slices of the
main table: no aggregations, no joins, no projections.

drop view is implemented.
view population is implemented.
deletes, inserts and updates are implemented.

much like indexes before, a flag must be passed to enable views.
2025-08-10 23:34:04 -05:00
Glauber Costa
e255fc9a81 Add table name to the delete bytecode
When building views (soon), it will be important to know which table
is being deleted. Getting from the cursor id is very cumbersome.

What we are doing here is symmetrical to op_insert, and sqlite also
passes table information in one of the registers (p4)
2025-08-10 22:50:25 -05:00
Glauber Costa
e9b8f6fba9 Add table name to the delete bytecode
When building views (soon), it will be important to know which table
is being deleted. Getting from the cursor id is very cumbersome.

What we are doing here is symmetrical to op_insert, and sqlite also
passes table information in one of the registers (p4)
2025-08-10 16:10:45 -05:00
Nikita Sivukhin
c0d5c55d5c fix tests and clippy 2025-08-06 01:03:49 +04:00
Nikita Sivukhin
c6a87d61c7 emit CDC entries if necessary for schema changes 2025-08-06 01:03:49 +04:00
pedrocarlo
736748cdf7 Simplify program epilogue by tracking the transaction mode and rollback status in the ProgramBuilder and then calling epilogue just once 2025-08-04 12:32:34 -03:00
Jussi Saurio
86b1232268 chore: enable indexes by default 2025-08-01 15:44:56 +03:00
Pere Diaz Bou
752a876f9a change every Rc to Arc in schema internals 2025-07-28 10:51:17 +02:00
bit-aloo
3cb2db933d remove Id 2025-07-24 14:40:24 +05:30
bit-aloo
9a54ef214e parser: Distinguish quoted identifiers and unify Id into Name enum
This commit replaces the `Name(pub String)` struct with a `Name` enum that
explicitly models how the name appeared in the source either as an
unquoted identifier (`Ident`) or a quoted string (`Quoted`).

In the process, the separate `Id` wrapper type has been coalesced into the
`Name` enum, simplifying the AST and reducing duplication in identifier
handling logic.

While this increases the size of some AST nodes (notably `yyStackEntry`),
it improves correctness and makes source structure more explicit for
later phases.
2025-07-24 14:40:19 +05:30
Glauber Costa
2a2468026c emit SetCookie after DropTable
The SetCookie opcode is used, among other things, to notify the
transaction of schema changes. We are not issuing it on DropTable.

Without it, the transaction thinks the schema hasn't changed, and does
not update the schema of the connection back to the database.

SQLite will, of course, issue it:

35    DropTable      0     0     0     foo            0
36    SetCookie      0     1     2                    0

Unfortunately I don't have a unit test that breaks with this, because
the one that is supposed to break is having, let's put it this way,
bigger problems.
2025-07-23 19:34:41 -05:00
Glauber Costa
57a1113460 make readonly a property of the database
There's no such thing as a read-only connection.
In a normal connection, you can have many attached databases. Some
r/o, some r/w.

To properly fix that, we also need to fix the OpenWrite opcode. Right
now we are passing a name, which is the name of the table. That
parameter is not used anywhere. That is also not what the SQLite opcode
specifies. Same as OpenRead, the p3 register should be the database
index.

With that change, we can - for now - pass the index 0, which is all
we support anyway, and then use that to test if we are r/o.
2025-07-22 09:41:32 -05:00
Glauber Costa
65312baee6 fix opcodes missing a database register
Two of the opcodes we implement (OpenRead and Transaction) should have
an opcode specifying the database to use, but they don't.

Add it, and for now always use 0 (the main database).
2025-07-20 12:27:26 -05:00
Nils Koch
8dc066503e chore: fix clippy errors 2025-07-16 19:34:42 +01:00
Nikita Sivukhin
f7fb2aac5e adjust extra_amount for schema translation code 2025-07-16 00:47:59 +04:00
Nikita Sivukhin
be0a607ba8 rename amount -> extra_amount 2025-07-16 00:46:17 +04:00
Jussi Saurio
beaf393476 Merge 'Treat table-valued functions as tables' from Piotr Rżysko
First step toward resolving
https://github.com/tursodatabase/limbo/issues/1643.
### This PR
With this change, the following two queries are considered equivalent:
```sql
SELECT value FROM generate_series(5, 50);
SELECT value FROM generate_series WHERE start = 5 AND stop = 50;
```
Arguments passed in parentheses to the virtual table name are now
matched to hidden columns.
Additionally, I fixed two bugs related to virtual tables.
### TODO (I'll handle this in a separate PR)
Column references are still not supported as table-valued function
arguments. The only difference is that previously, a query like:
```sql
SELECT one.value, series.value
FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series;
```
would cause a panic. Now, it returns a proper error message instead.
Adding support for column references is more nuanced for two main
reasons:
* We need to ensure that in joins where a TVF depends on other tables,
those other tables are processed first. For example, in:
```sql
SELECT one.value, series.value
FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one;
```
the one table must be processed by the top-level loop, and series must
be nested.
* For outer joins involving TVFs, the arguments must be treated as `ON`
predicates, not `WHERE` predicates.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1727
2025-07-15 12:23:45 +03:00
Piotr Rzysko
000d70f1f3 Propagate info about hidden columns 2025-07-14 07:16:53 +02:00
Zaid Humayun
9dd746d1ce fixes issues where double quotes are not removed from around table nam 2025-07-13 17:26:58 +05:30
Jussi Saurio
a48b6d049a Another post-rebase clippy round with 1.88.0 2025-07-12 19:10:56 +03:00
Pere Diaz Bou
ba988685cf set cookie create virtual table 2025-07-03 12:36:48 +02:00
Pere Diaz Bou
abf1699dd2 set scheam version and update shared schema in txn 2025-07-03 12:36:48 +02:00
Nikita Sivukhin
c9c5ef4e25 remote query_mode from ProgramBuilderOpts and from function arguments
- mode never changes and ProgramBuilder already created with proper mode set correctly
2025-07-02 13:24:12 +04:00
Levy A.
ffd6844b5b refactor: remove PseudoTable from Table
the only reason for `PseudoTable` to exist, is to provide column
information for `PseudoCursor` creation. this should not be part of the
schema.
2025-06-30 14:31:58 -03:00
Pekka Enberg
9c1b7897ac Fix URLs to point to github.com/tursodatabase/turso 2025-06-30 11:23:53 +03:00
Pekka Enberg
725c3e4ddc Rename limbo_sqlite3_parser crate to turso_sqlite3_parser 2025-06-29 12:34:46 +03:00
Pekka Enberg
eb0de4066b Rename limbo_ext crate to turso_ext 2025-06-29 12:14:08 +03:00
Pekka Enberg
2fc5c0ce5c Switch to runtime flag for enabling indexes
Makes it easier to test the feature:

```
$ cargo run --  --experimental-indexes
Limbo v0.0.22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE t(x);
limbo> CREATE INDEX t_idx ON t(x);
limbo> DROP INDEX t_idx;
```
2025-06-26 10:07:28 +03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00
Pere Diaz Bou
48ae6766d7 fix comp errors 2025-06-17 19:33:23 +02:00
Pere Diaz Bou
f91d2c5e99 fix disable in write cases 2025-06-17 19:33:23 +02:00
Pere Diaz Bou
b5f2f375b8 disable alter, delete, create index, insert and update for indexes 2025-06-17 19:33:23 +02:00
Pekka Enberg
db4945eada Merge 'Fix update queries to set n_changes ' from Kim Seon Woo
- `Update` query doesn't update `n_changes`. Let's make it work
- Add `InsertFlags` to add meta information related to insert operations
- For update query, add `UPDATE` flag
- Currently, the update query executes `Insn::Delete` and `Insn::Insert`
internally, it increases `n_change` by 2. So, for the update query,
let's skip increasing `n_change` for the `Insn::Insert`
https://github.com/tursodatabase/limbo/issues/1681

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1683
2025-06-16 16:30:20 +03:00
Levy A.
b88cb99ff0 fix warnings and some refactoring 2025-06-11 14:19:06 -03:00
Levy A.
15e0cab8d8 refactor+fix: precompute default values from schema 2025-06-11 14:18:39 -03:00
Levy A.
de2ac89ad2 feat: complete ALTER TABLE implementation 2025-06-11 14:17:36 -03:00
김선우
a9c096bb01 Skip increasing n_changes for insn::Insert when it's UPDATE query 2025-06-07 17:23:23 +09:00
Jussi Saurio
77ce4780d9 Fix ProgramBuilder::cursor_ref not having unique keys
Currently we have this:

program.alloc_cursor_id(Option<String>, CursorType)`

where the String is the table's name or alias ('users' or 'u' in
the query).

This is problematic because this can happen:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.

Instead introduce `CursorKey`, which is a combination of:

1. `TableInternalId`, and
2. index name (Option<String> -- in case of index cursors.

This should provide key uniqueness for cursors:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
2025-05-29 00:59:24 +03:00
Zaid Humayun
bc5d93c18a Addresses comment https://github.com/tursodatabase/limbo/pull/1548#discussion_r2103333264 by @jussisaurio
this commit uses more descriptive names for registers
2025-05-23 10:19:08 +05:30
Zaid Humayun
780d0cb5a8 addresses https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102599487, https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102601189 & https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102602774 by @jussisaurio
this commit addresses comments regarding using decsriptive variable names for the loops and loop labels. Also, adds documentation for instructions that cause jumps in both loops
2025-05-23 00:31:19 +05:30
Zaid Humayun
3a8b18c481 addressed comment https://github.com/tursodatabase/limbo/pull/1548#discussion_r2102595246 by @jussisaurio
this commit addresses the comment regarding inline code documentation
2025-05-23 00:12:27 +05:30
Zaid Humayun
1999eac891 Fixes test Testing: can drop kv_store vtable
Earlier this test broke because the code to translate the drop table was not checking to see if a table was not a virtual table.
SQLite does this with a macro called IsVirtualTable. Here, I check with the existing methods on the BTree struct
2025-05-22 23:53:48 +05:30
Zaid Humayun
4072a41c9c Drop Table now uses an ephemeral table as a scratch table
Now when dropping a table, an ephemeral table is created as a scratch table. If a root page of some other table is moved into the page occupied by the root page of the table being dropped, that row is first written into an ephemeral table. Then on a next pass, it is deleted from the schema table and then re-inserted with the new root page.

This happens during AUTOVACUUM when deleting a root page will force the last root page to move into the slot being vacated by the root page of the table being deleted
2025-05-22 19:39:46 +05:30
Jussi Saurio
8bec75d804 Merge 'Initial Support for Nested Translation' from Pedro Muniz
This PR introduces some modifications to the Program Builder to allow us
to use nested parsing. By focusing the emission of Init and the last
Goto (prologue and epilogue), inside the ProgramBuilder, we can just not
emit them if we are parsing/translating in a nested context. For this
PR, I only migrated insert to use these functions as I need them to
support Insert statements that use `SELECT FROM` syntax. Nested parsing
overall enables code reuse for us and arguably is one of the only ways
to parse deeply nested queries without a lot of code duplication.
#1528

Closes #1543
2025-05-22 10:52:00 +03:00
pedrocarlo
53bf5d5ef5 adjust translate functions to take a program instead of Option<ProgramBuilder> + remove any Init emission in traslate functions + use epilogue in all places necessary 2025-05-21 16:41:10 -03:00