Commit Graph

109 Commits

Author SHA1 Message Date
pedrocarlo
0f2849f7e1 serde and serde_json as workspace dependencies 2025-06-09 11:38:15 -03:00
Pere Diaz Bou
8cd7c7e82e Merge 'fix: make keyword_token safe by validating UTF-8 input' from ankit
This PR fixes an unsound usage of
`unsafe { str::from_utf8_unchecked(word) }` in the public function
`keyword_token` in mod.rs.
The function now uses `std::str::from_utf8(word).ok()?` to safely handle
invalid UTF-8, eliminating the unsoundness.
No logic or API changes.
Code compiles and tests pass (where possible).
Closes: https://github.com/tursodatabase/libsql/issues/1859

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1677
2025-06-09 16:07:37 +02:00
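
The UTF-8 fix described in 8cd7c7e82e can be illustrated with a minimal sketch. Only `std::str::from_utf8(word).ok()?` comes from the commit message; the `TokenKind` enum, the function name `keyword_token_sketch`, and the keyword table are hypothetical stand-ins, not the parser's real definitions.

```rust
/// Hypothetical token kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Select,
    From,
}

/// Sketch of a keyword lookup that validates its input instead of
/// assuming it is UTF-8 (the real `keyword_token` may look different).
pub fn keyword_token_sketch(word: &[u8]) -> Option<TokenKind> {
    // `from_utf8` returns `Err` for invalid UTF-8; `.ok()?` turns that
    // into an early `None` rather than undefined behavior.
    let s = std::str::from_utf8(word).ok()?;
    match s.to_ascii_uppercase().as_str() {
        "SELECT" => Some(TokenKind::Select),
        "FROM" => Some(TokenKind::From),
        _ => None,
    }
}
```

With this shape, a caller passing arbitrary bytes such as `&[0xff, 0xfe]` gets `None` back, the same result as for a non-keyword identifier.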
Jussi Saurio
eec7c0529c Merge 'Beginnings of AUTOVACUUM' from Zaid Humayun
This PR adds the beginnings of
[AUTOVACUUM](https://www.sqlite.org/lang_vacuum.html) to Limbo. It adds
a feature flag called `omit_autovacuum` which is analogous to
`SQLITE_OMIT_AUTOVACUUM`. It is off by default, same as SQLite.
It introduces the concept of
[pointer map pages](https://www.sqlite.org/fileformat.html#pointer_map_or_ptrmap_pages),
which are reverse index pages used to map pages to their parents. This
is used to swap pages (when a table is deleted, for instance) to keep
root pages clustered at the beginning of the file. It's also used while
creating a table to ensure that root pages are clustered at the
beginning (although this isn't completely implemented yet).
Finally, it also adds a couple of missing instructions like `Int64` that
are required for `PRAGMA` commands related to `auto_vacuum` settings.
<img width="1512" alt="Screenshot 2025-05-28 at 8 47 51 PM"
src="https://github.com/user-attachments/assets/d52eb74f-5b79-4d52-9401-1bdc2dcc304d" />

Closes #1600
2025-06-09 08:20:24 +03:00
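
To make the pointer map layout mentioned in eec7c0529c concrete, here is a rough sketch of the page arithmetic as described in the SQLite file format documentation: 5-byte entries, with the first ptrmap page at page 2. The function names are hypothetical, and the sketch deliberately ignores the lock-byte (pending byte) page adjustment that SQLite applies; it is not taken from the Limbo implementation.

```rust
/// One pointer-map entry: 1 type byte + 4-byte big-endian parent page number.
const PTRMAP_ENTRY_SIZE: u32 = 5;

/// Page number of the ptrmap page covering `page_no` (1-based page numbers).
/// Page 1 has no ptrmap entry, so `None` is returned for it.
/// Assumption: the lock-byte page special case is ignored in this sketch.
pub fn ptrmap_page_for(page_no: u32, usable_size: u32) -> Option<u32> {
    if page_no < 2 {
        return None;
    }
    // Each group is one ptrmap page followed by the pages it describes.
    let pages_per_group = usable_size / PTRMAP_ENTRY_SIZE + 1;
    let group = (page_no - 2) / pages_per_group;
    Some(group * pages_per_group + 2)
}

/// Byte offset of the entry for `page_no` inside its ptrmap page.
/// Returns `None` if `page_no` is itself a ptrmap page (or page 1).
pub fn ptrmap_entry_offset(page_no: u32, usable_size: u32) -> Option<u32> {
    let map_page = ptrmap_page_for(page_no, usable_size)?;
    if map_page == page_no {
        return None;
    }
    Some(PTRMAP_ENTRY_SIZE * (page_no - map_page - 1))
}
```

With a usable page size of 4096 bytes, 819 entries fit on one ptrmap page, so page 2 describes pages 3 through 821 and the next ptrmap page is page 822.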
Zaid Humayun
5827a33517 Beginnings of AUTOVACUUM
This commit introduces AUTOVACUUM to Limbo. It introduces the concept of ptrmap pages and also adds some additional instructions that are required to make the AUTOVACUUM PRAGMA work.
2025-06-06 23:14:22 +05:30
ankit
4c3c72b666 fix: make keyword_token safe by validating UTF-8 input 2025-06-06 16:25:49 +05:30
pedrocarlo
22d1a1eaa8 fix blob printing 2025-06-04 12:06:43 -03:00
pedrocarlo
ebee9516ba clippy 2025-06-04 12:06:43 -03:00
pedrocarlo
5f379fe2d6 when no context is needed use Display Impl 2025-06-04 12:06:43 -03:00
pedrocarlo
f90bebbfbc small fix and remove dbg 2025-06-04 12:06:43 -03:00
pedrocarlo
6773dca595 impl ToSqlString for VACUUM stmt + tests 2025-06-04 12:06:43 -03:00
pedrocarlo
d86ff4dea3 impl ToSqlString for UPDATE stmt + tests 2025-06-04 12:06:42 -03:00
pedrocarlo
659ef8fcf7 impl ToSqlString for REINDEX, RELEASE, ROLLBACK, SAVEPOINT stmt + tests 2025-06-04 12:06:42 -03:00
pedrocarlo
e9cbd29dd1 impl ToSqlString for PRAGMA stmt + tests 2025-06-04 12:06:42 -03:00
pedrocarlo
260a26d612 impl ToSqlString for INSERT stmt + tests 2025-06-04 12:06:42 -03:00
pedrocarlo
5710976d95 impl ToSqlString for DROP INDEX, TABLE, TRIGGER, and VIEW stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
8ff0a3c780 impl ToSqlString for DETACH stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
0dc5ca668c test for DELETE + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
3b1da29b50 impl ToSqlString for DELETE stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
6ba6fae2c6 CREATE VIRTUAL TABLE tests 2025-06-04 12:06:42 -03:00
pedrocarlo
c8f9e29262 impl ToSqlString for CREATE VIRTUAL TABLE stmt + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
563ab2fdf3 impl ToSqlString for CREATE VIEW stmt + tests 2025-06-04 12:06:42 -03:00
pedrocarlo
87b9540b4a test for create trigger + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
a22d06cd66 impl ToSqlString for CREATE TRIGGER stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
ba215c8ba9 test for create table + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
8a7cc7669d impl ToSqlString for CREATE TABLE stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
a8f5257240 impl ToSqlString for CREATE INDEX stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
b47e3a990e impl ToSqlString for COMMIT stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
4f736daa7c impl ToSqlString for BEGIN stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
2a2132e479 impl ToSqlString for Attach stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
355e9a2c96 impl ToSqlString for analyze stmt 2025-06-04 12:06:42 -03:00
pedrocarlo
43b1d4f5da alter table tests + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
7fb3d40ec2 implement ToSqlString for AlterTable 2025-06-04 12:06:42 -03:00
pedrocarlo
e0c2a09d71 more tests for select + fixes 2025-06-04 12:06:42 -03:00
pedrocarlo
5b6ed60133 simpler select tests + fixes to printing 2025-06-04 12:06:42 -03:00
pedrocarlo
fb01541708 impl ToSqlString for Select 2025-06-04 12:06:42 -03:00
pedrocarlo
0b0e724f54 implement ToSqlString for Expr 2025-06-04 12:06:42 -03:00
pedrocarlo
1dc73bc49e initial stubs for ast::Select 2025-06-04 12:06:42 -03:00
pedrocarlo
2ac2990b4c to_sql_string trait definition 2025-06-04 12:06:42 -03:00
Jussi Saurio
77ce4780d9 Fix ProgramBuilder::cursor_ref not having unique keys
Currently we have this:

`program.alloc_cursor_id(Option<String>, CursorType)`

where the String is the table's name or alias ('users' or 'u' in
the query).

This is problematic because this can happen:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

There are two cursors, both with identifier 't'. This causes a bug
where the program will use the same cursor for both the main query
and the subquery, since they are keyed by 't'.

Instead introduce `CursorKey`, which is a combination of:

1. `TableInternalId`, and
2. index name (`Option<String>`) -- in case of index cursors.

This should provide key uniqueness for cursors:

`SELECT * FROM t WHERE EXISTS (SELECT * FROM t)`

here the first 't' will have a different `TableInternalId` than the
second `t`, so there is no clash.
2025-05-29 00:59:24 +03:00
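
The keying scheme from 77ce4780d9 could look roughly like the sketch below. `CursorKey` and `TableInternalId` are named in the commit message; the field names, the `usize` representation, and the `CursorRegistry` type are assumptions made for illustration and may not match `ProgramBuilder`.

```rust
use std::collections::HashMap;

/// Stand-in for the `TableInternalId` from the commit message.
/// The inner `usize` is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TableInternalId(pub usize);

/// Cursor key: table identity plus an optional index name.
/// Field names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CursorKey {
    pub table_reference_id: TableInternalId,
    pub index_name: Option<String>,
}

/// Hypothetical registry standing in for the cursor bookkeeping.
#[derive(Default)]
pub struct CursorRegistry {
    next_cursor_id: usize,
    by_key: HashMap<CursorKey, usize>,
}

impl CursorRegistry {
    /// Allocates a fresh cursor id and remembers it under `key`.
    pub fn alloc_cursor_id(&mut self, key: CursorKey) -> usize {
        let id = self.next_cursor_id;
        self.next_cursor_id += 1;
        self.by_key.insert(key, id);
        id
    }

    /// Looks up the cursor id previously allocated for `key`.
    pub fn resolve(&self, key: &CursorKey) -> Option<usize> {
        self.by_key.get(key).copied()
    }
}
```

Because the two occurrences of `t` in `SELECT * FROM t WHERE EXISTS (SELECT * FROM t)` would carry different `TableInternalId` values, they produce two distinct map entries, whereas a map keyed by the string `"t"` would let the second insert overwrite the first.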
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could of course traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that these kinds of query
transformations can happen with less pain.
2025-05-25 20:26:17 +03:00
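
The idea in 7c07c09300 can be sketched in a few lines: column references carry a stable id instead of a list position, so folding subquery tables into the parent list does not invalidate them. Only `TableInternalId`, `TableReference`, and the `internal_id` property come from the commit message; `ColumnRef`, `resolve_table`, and `unnest_into_parent` are hypothetical helpers, and the real planner types are richer than this.

```rust
/// Stable identifier for a table reference (inner type is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableInternalId(pub usize);

/// Simplified stand-in for the planner's `TableReference`.
#[derive(Debug, Clone)]
pub struct TableReference {
    pub internal_id: TableInternalId,
    pub identifier: String,
}

/// Simplified stand-in for `Expr::Column`: it names the table by id,
/// not by its position in `table_references`.
#[derive(Debug, Clone, Copy)]
pub struct ColumnRef {
    pub table: TableInternalId,
    pub column: usize,
}

/// Finds a table by its stable id, wherever it sits in the list.
pub fn resolve_table(refs: &[TableReference], id: TableInternalId) -> Option<&TableReference> {
    refs.iter().find(|t| t.internal_id == id)
}

/// Hypothetical unnesting step: drop the subquery's own entry from the
/// parent list and fold the subquery's tables into the parent.
pub fn unnest_into_parent(
    parent: &mut Vec<TableReference>,
    subquery_id: TableInternalId,
    subquery_tables: Vec<TableReference>,
) {
    parent.retain(|t| t.internal_id != subquery_id);
    parent.extend(subquery_tables);
}
```

In the example from the commit message, `orders` moves from position 0 in the subquery's list to position 1 in the parent's list, yet a `ColumnRef` for `o.amount` still resolves to `orders` because its id never changed.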
Pekka Enberg
5ed187ba61 sqlite3-parser: Remove scanner trace-logging
It spams the logs like no tomorrow, and is mostly useless.
2025-05-22 12:37:28 +03:00
pedrocarlo
510c70e919 Create CollationSeq enum and functions. Move strum to workspace dependency to avoid version mismatch with Parser 2025-05-19 15:22:14 -03:00
pedrocarlo
4dc1431428 handling edge case when passing a duplicate multi-column unique index 2025-05-14 11:46:24 -03:00
Anton Harniakou
525b7fdbaa Add PRAGMA schema_version 2025-04-30 09:41:04 +03:00
Anton Harniakou
51fc1773ea Fix missing documentation warning; improve the documentation message 2025-04-24 10:36:23 +03:00
Anton Harniakou
0a69ea0138 Support reading db page size using PRAGMA page_size 2025-04-24 10:12:02 +03:00
pedrocarlo
c99c6a4be5 Activate Bench for comparison 2025-04-11 13:40:56 -03:00
pedrocarlo
946b59f4ee even better BadNumber 2025-04-11 11:01:35 -03:00
pedrocarlo
a2ca9e5a46 better BadNumber 2025-04-11 10:09:00 -03:00
pedrocarlo
4d1ecd2d50 better MalformedHexInteger 2025-04-11 09:44:02 -03:00