Commit Graph

66 Commits

Author SHA1 Message Date
Jussi Saurio
77ce4780d9 Fix ProgramBuilder::cursor_ref not having unique keys
Currently we have this:

`program.alloc_cursor_id(Option<String>, CursorType)`

where the String is the table's name or alias ('users' or 'u' in
the query).

This is problematic because this can happen:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.

Instead introduce `CursorKey`, which is a combination of:

1. `TableInternalId`, and
2. index name (`Option<String>`), in case of index cursors.

This should provide key uniqueness for cursors:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
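
A minimal sketch of the keying scheme described above. Only the names `CursorKey` and `TableInternalId` come from this commit message; `CursorAllocator`, its methods, and the field names are hypothetical and are not the real `ProgramBuilder` API.

```rust
use std::collections::HashMap;

/// Stand-in for the stable per-table-reference id (representation assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TableInternalId(usize);

/// Cursor key: the table reference plus, for index cursors, the index name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CursorKey {
    pub table_reference_id: TableInternalId,
    pub index_name: Option<String>,
}

impl CursorKey {
    /// Key for a plain table cursor.
    pub fn table(id: TableInternalId) -> Self {
        CursorKey { table_reference_id: id, index_name: None }
    }

    /// Key for a cursor over a named index of that table reference.
    pub fn index(id: TableInternalId, name: &str) -> Self {
        CursorKey { table_reference_id: id, index_name: Some(name.to_string()) }
    }
}

/// Hypothetical allocator used only to illustrate key uniqueness.
#[derive(Default)]
pub struct CursorAllocator {
    next_table_id: usize,
    by_key: HashMap<CursorKey, usize>,
}

impl CursorAllocator {
    /// Hand out a fresh id for each table reference, even for the same table name.
    pub fn new_table_id(&mut self) -> TableInternalId {
        let id = TableInternalId(self.next_table_id);
        self.next_table_id += 1;
        id
    }

    /// Allocate a cursor id for a key; panics if the key was already used.
    pub fn alloc_cursor_id(&mut self, key: CursorKey) -> usize {
        let id = self.by_key.len();
        let previous = self.by_key.insert(key, id);
        assert!(previous.is_none(), "duplicate cursor key");
        id
    }

    /// Look up the cursor id that was allocated for a key, if any.
    pub fn resolve_cursor_id(&self, key: &CursorKey) -> Option<usize> {
        self.by_key.get(key).copied()
    }
}
```

With this shape, the outer and inner `t` each get their own `TableInternalId`, so their keys differ even though the table name is the same.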
2025-05-29 00:59:24 +03:00
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that these kinds of query
transformations can happen with less pain.
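
A rough sketch of why stable ids survive folding, under simplified assumptions: `ColumnRef`, `resolve`, and `fold_into_parent` are hypothetical helpers for illustration, and the real `TableReference` and `Expr::Column` carry more fields than shown here.

```rust
/// Stand-in for the stable identifier (representation assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableInternalId(usize);

/// Simplified table reference: just the stable id and a name.
#[derive(Debug, Clone)]
pub struct TableReference {
    pub internal_id: TableInternalId,
    pub name: String,
}

/// A column reference that names its table by stable id, not by list position.
#[derive(Debug, Clone, Copy)]
pub struct ColumnRef {
    pub table: TableInternalId,
    pub column: usize,
}

/// Find the table a column refers to, wherever it currently sits in the list.
pub fn resolve<'a>(
    tables: &'a [TableReference],
    col: &ColumnRef,
) -> Option<&'a TableReference> {
    tables.iter().find(|t| t.internal_id == col.table)
}

/// Fold a subquery's table references into the parent's list,
/// as a transformation like subquery unnesting might do.
pub fn fold_into_parent(parent: &mut Vec<TableReference>, subquery: Vec<TableReference>) {
    parent.extend(subquery);
}
```

After folding, `orders` moves from position 0 in the subquery list to position 1 in the parent list, yet a `ColumnRef` carrying its `TableInternalId` still resolves to it without any expression rewriting.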
2025-05-25 20:26:17 +03:00
Pekka Enberg
5ed187ba61 sqlite3-parser: Remove scanner trace-logging
It spams the logs like no tomorrow, and is mostly useless.
2025-05-22 12:37:28 +03:00
pedrocarlo
4dc1431428 handling edge case when passing duplicate a multi-column unique index 2025-05-14 11:46:24 -03:00
Anton Harniakou
525b7fdbaa Add PRAGMA schema_version 2025-04-30 09:41:04 +03:00
Anton Harniakou
51fc1773ea Fix missing documentation warning; improve the documentation message 2025-04-24 10:36:23 +03:00
Anton Harniakou
0a69ea0138 Support reading db page size using PRAGMA page_size 2025-04-24 10:12:02 +03:00
pedrocarlo
946b59f4ee even better BadNumber 2025-04-11 11:01:35 -03:00
pedrocarlo
a2ca9e5a46 better BadNumber 2025-04-11 10:09:00 -03:00
pedrocarlo
4d1ecd2d50 better MalformedHexInteger 2025-04-11 09:44:02 -03:00
Pekka Enberg
16bc28b0af sqlite3-parser: Change debug logging to trace level
SQL scanner at debug level spams the logs pretty hard when debugging...
2025-04-03 07:46:07 +03:00
PThorpe92
9c8083231c Implement create virtual table and VUpdate opcode 2025-02-17 20:44:44 -05:00
Glauber Costa
fbe439f6c2 Implement the legacy_file_format pragma
easy implementation, sqlite claims it is a noop now

"This pragma no longer functions. It has become a no-op. The capabilities
formerly provided by PRAGMA legacy_file_format are now available using
the SQLITE_DBCONFIG_LEGACY_FILE_FORMAT option to the sqlite3_db_config()
C-language interface."
2025-02-14 09:50:29 -05:00
Pekka Enberg
ac54c35f92 Switch to workspace dependencies
...makes it easier to specify a version, which is needed for `cargo publish`.
2025-02-12 17:28:04 +02:00
Pekka Enberg
d221f158cc Merge 'Add read implementation of user_version pragma with ReadCookie opcode' from Jonathan Webb
Just a bare bones implementation of ReadCookie and support for the
user_version pragma

Closes #909
2025-02-10 12:12:15 +02:00
Aarni Koskela
eaea02c567 Fix a handful of typos 2025-02-09 18:08:29 +02:00
Jussi Saurio
a8685c8086 sqlite3-parser: box the Expr in Vacuum 2025-02-09 14:24:55 +02:00
Jussi Saurio
9b0997a60d sqlite3-parser: separate boxed CreateVirtualTable struct 2025-02-09 14:24:55 +02:00
Jussi Saurio
36a3cb1d5e sqlite3-parser: box AlterTable 2025-02-09 14:11:34 +02:00
Jussi Saurio
72a055e5fe sqlite3-parser: box Pragma 2025-02-09 13:10:52 +02:00
Jussi Saurio
23c4106433 sqlite3-parser: separate boxed Insert struct 2025-02-09 13:10:21 +02:00
Jussi Saurio
f0d7d82e1d sqlite3-parser: Box the Expr in Detach 2025-02-09 13:10:21 +02:00
Jussi Saurio
32887518ce sqlite3-parser: separate boxed Delete struct 2025-02-09 13:10:21 +02:00
Jussi Saurio
40a8dc14cd sqlite3-parser: separate boxed SelectInner struct 2025-02-09 12:54:30 +02:00
Jussi Saurio
f75aca67bb sqlite3-parser: separate boxed TriggerCmd struct variants 2025-02-09 12:53:12 +02:00
Jussi Saurio
af920a317c sqlite3-parser: separate boxed Update struct 2025-02-09 12:53:12 +02:00
Jussi Saurio
575b484740 sqlite3-parser: separate boxed CreateTrigger struct 2025-02-09 12:50:00 +02:00
Jussi Saurio
358fda2ec7 sqlite3-parser: box the create table body 2025-02-09 12:42:53 +02:00
Jussi Saurio
d177f6195b sqlite3-parser: box big members of createindex 2025-02-09 12:34:53 +02:00
Jussi Saurio
4faadd86b0 sqlite3-parser: box the InsertBody 2025-02-08 18:10:26 +02:00
Jussi Saurio
781aa3b5d6 sqlite3-parser: box the having clause in GroupBy 2025-02-08 18:10:26 +02:00
Jussi Saurio
2a82091cb3 sqlite3-parser: box the where clause in Update 2025-02-08 18:10:26 +02:00
Jussi Saurio
7426204204 sqlite3-parser: box Following and Preceding in FrameBound 2025-02-08 18:10:26 +02:00
Jussi Saurio
ac7f9d67b7 sqlite3-parser: box large members of Upsert 2025-02-08 18:10:25 +02:00
Jussi Saurio
f341474fee sqlite3-parser: box large members of CreateTrigger 2025-02-08 18:10:25 +02:00
Jussi Saurio
0dba39b025 sqlite3-parser: box everything in Attach 2025-02-08 18:10:25 +02:00
Jussi Saurio
670dac5939 sqlite3-parser: box the where clause in Delete 2025-02-08 18:10:25 +02:00
Jonathan Webb
98e9d33478 Add read implementation of user_version pragma with ReadCookie opcode 2025-02-07 09:23:48 -05:00
krishvishal
a88a5353a3 Add unit test for checking SELECT FROM foo;
returns only one error and stops parsing.
2025-02-04 04:51:38 +05:30
krishvishal
2faaa9f719 Add fix for parser not stopping at the first error encountered.
Fix is to track whether an error has been encountered in the `Parser` struct and set it to true wherever/whenever
we encounter an error. At the beginning of `next()` we return `Ok(None)` to signal to `FallibleIterator` that we should stop parsing.

Fixes https://github.com/tursodatabase/limbo/issues/865
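
The error-tracking idea can be sketched as follows. This is a toy stand-in, not the real `Parser`: the `had_error` field name, the pre-split statement input, and the "contains `SELECT FROM`" failure rule are all assumptions, and `next` only mirrors the `Result<Option<T>, E>` shape of `FallibleIterator::next` without using that crate.

```rust
/// Toy parser over pre-split statements.
pub struct Parser<'a> {
    inputs: std::slice::Iter<'a, &'a str>,
    /// Set once any statement fails to parse (hypothetical field name).
    had_error: bool,
}

impl<'a> Parser<'a> {
    pub fn new(inputs: &'a [&'a str]) -> Self {
        Parser { inputs: inputs.iter(), had_error: false }
    }

    /// Returns `Ok(None)` when the input is exhausted or after an earlier error.
    pub fn next(&mut self) -> Result<Option<String>, String> {
        // Stop for good once an error has been reported.
        if self.had_error {
            return Ok(None);
        }
        match self.inputs.next() {
            None => Ok(None),
            // Toy rule: a SELECT with no result columns is a syntax error.
            Some(stmt) if stmt.contains("SELECT FROM") => {
                self.had_error = true;
                Err(format!("syntax error in: {}", stmt))
            }
            Some(stmt) => Ok(Some((*stmt).to_string())),
        }
    }
}
```

A caller looping on `next()` sees exactly one `Err` and then `Ok(None)`, instead of further errors produced from the remaining input.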
2025-02-04 04:45:34 +05:30
Glauber Costa
a3387cfd5f implement the pragma page_count
To do that, we also have to implement the vdbe opcode Pagecount.
2025-02-01 19:39:46 -05:00
Glauber Costa
b37317f68b avoid allocations during pragma_list
If we keep the pragma list sorted when declaring it, we can avoid
a vector allocation when printing the pragma_list.
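
One way to picture this, as a sketch only: the pragma names below are taken from other commits in this log, while `PRAGMA_NAMES`, `is_sorted`, and `write_pragma_list` are hypothetical and not the actual implementation. Because the list is declared in sorted order, printing can iterate over it directly instead of collecting into a `Vec` and sorting.

```rust
use std::fmt::Write;

/// Pragma names declared in sorted order (illustrative subset).
pub const PRAGMA_NAMES: &[&str] = &[
    "legacy_file_format",
    "page_count",
    "page_size",
    "schema_version",
    "table_info",
    "user_version",
];

/// Check that the declaration order is already sorted.
pub fn is_sorted(names: &[&str]) -> bool {
    names.windows(2).all(|pair| pair[0] <= pair[1])
}

/// Write one pragma name per line, straight from the static list.
pub fn write_pragma_list<W: Write>(out: &mut W) -> std::fmt::Result {
    // Guard the invariant in debug builds rather than sorting at runtime.
    debug_assert!(is_sorted(PRAGMA_NAMES));
    for name in PRAGMA_NAMES {
        writeln!(out, "{}", name)?;
    }
    Ok(())
}
```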
2025-01-31 11:35:51 -05:00
Glauber Costa
62efbde661 use strum package to simplify PragmaName enum management
The pragma list will only grow. The strum crate can be used to:
* automatically convert to string from enum
* automatically convert to enum from string
* implement an iterator over all elements of the enum
2025-01-31 06:44:56 -05:00
Glauber Costa
016b815b59 implement pragma table_info
Both () and = variants covered. It is important to make sure that
the transaction is a read transaction, so we cannot hide all that logic
inside update_pragma, and have to make our decision before that.
2025-01-30 20:00:20 -05:00
Jussi Saurio
a26fc1619a Merge 'Fix select X'1'; causes limbo to go in infinite loop' from Krishna Vishal
Closes https://github.com/tursodatabase/limbo/issues/730.
Fixed `blob_literal` function to make it return `TK_ILLEGAL` token which
in turn now causes parser to stop and return `UnrecognizedTokenError`.
Added `TK_ILLEGAL` to `TokenType` Enum.
Behavior now:
```sql
SELECT X'1';  -- Odd number of hex digits -> TK_ILLEGAL -> unrecognized token error

  × unrecognized token at (1, 12)
   ╭────
 1 │ SELECT X'1';
   ·        ▲
   ·        ╰── here
   ╰────

SELECT X'AB'; -- Valid blob -> TK_BLOB -> parses successfully
SELECT X'G1'; -- Invalid hex digit -> TK_ILLEGAL -> unrecognized token error

  × unrecognized token at (1, 13)
   ╭────
 1 │ SELECT X'G1';
   ·        ▲
   ·        ╰── here
   ╰────

```
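
The classification shown above can be sketched like this. The `TokenType` enum here is a two-variant stand-in for the `TK_BLOB` and `TK_ILLEGAL` token types, and `classify_blob_body` is a hypothetical helper rather than the tokenizer's actual function; it takes only the text between the quotes of `X'...'`.

```rust
/// Stand-in for the two token types relevant to blob literals.
#[derive(Debug, PartialEq, Eq)]
pub enum TokenType {
    Blob,
    Illegal,
}

/// Classify the text between the quotes of an `X'...'` literal.
/// A valid blob has only hex digits and an even number of them.
pub fn classify_blob_body(hex: &str) -> TokenType {
    let all_hex = hex.bytes().all(|b| b.is_ascii_hexdigit());
    if all_hex && hex.len() % 2 == 0 {
        TokenType::Blob
    } else {
        TokenType::Illegal
    }
}
```

Under these rules `AB` is a blob, while `1` (odd length) and `G1` (non-hex digit) are illegal.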

Reviewed-by: Jussi Saurio (@jussisaurio)

Closes #736
2025-01-21 11:32:10 +02:00
Krishna Vishal
b43c1544d4 Tokenizer::split() function returns TK_ILLEGAL instead of Error. 2025-01-20 19:37:15 +05:30
Krishna Vishal
04fd5a40d6 Finalize the parser in the case of Error while running queries. This resets the parser stack and prevents triggering the assertion and thereby panic.
Closes https://github.com/tursodatabase/limbo/issues/742
2025-01-20 16:10:35 +05:30
sonhmai
66d6291f32 add scaffolding for supporting wal checkpoint 2025-01-20 08:34:13 +07:00
Krishna Vishal
ee88781c2a chore: cargo fmt 2025-01-19 06:23:19 +05:30
Krishna Vishal
5aee588078 Fixed blob_literal function to make it return UnrecognisedToken error.
Added `TK_ILLEGAL` to `TokenType` Enum
2025-01-19 06:11:45 +05:30