Commit Graph

1199 Commits

Author SHA1 Message Date
Jussi Saurio
483dc27539 Merge 'make most instrumentation levels to be Debug or Trace instead' from Pedro Muniz
Span creation in debug mode is very slow and impacts our ability to run
the Simulator faster.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2146
2025-07-17 23:45:07 +03:00
Jussi Saurio
1b52b5c764 Merge 'chore: update rust to version 1.88.0' from Nils Koch
This PR updates to version Rust 1.88.0 ([Release
notes](https://releases.rs/docs/1.88.0/)) and fixes all the clippy
errors that come with the new Rust version.
This is possible in the latest Rust version:
```rust
if let Some(foo) = bar && foo.is_cool() {
  ...
}
```
There are three complications in the migration (so far):
- A BUNCH of Clippy warnings (mostly fixed in
https://github.com/tursodatabase/limbo/pull/1827)
- Windows cross compilation failed; linking `advapi32` on windows fixes
it
  - Since Rust 1.87.0, advapi32 is not linked by default anymore
([Release notes](https://github.com/rust-
lang/rust/blob/master/RELEASES.md#compatibility-notes-1),
[PR](https://github.com/rust-lang/rust/pull/138233))
- Rust is more strict with FFIs and aligning pointers now. CI checks
failed with error below
  - Fixed in https://github.com/tursodatabase/turso/pull/2064
```
thread 'main' panicked at
core/ext/vtab_xconnect.rs:64:25:
misaligned pointer dereference: address must be
a multiple of 0x8 but is 0x7ffd9d901554
```

Closes #1807
2025-07-17 23:35:33 +03:00
pedrocarlo
c15f1e02d3 make most instrumentation levels to be Debug or Trace instead. Span creation in debug mode is very slow and impacts our ability to run the Simulator fast enough 2025-07-17 16:48:24 -03:00
Jussi Saurio
a45ac11462 translate/create index: fix wrong collations 2025-07-17 21:07:48 +03:00
Pekka Enberg
d2158ff201 Merge 'Clean up AST unparsing, remove ToSqlString' from Levy A.
Enables formatting `Expr::Column` by adding the context to `ToTokens`
instead of creating a new unparsing implementation for each node.
`ToTokens` implemented for:
- [x] `UpdatePlan`
- [x] `Plan`
- [x] `JoinedTable`
- [x] `SelectPlan`
- [x] `DeletePlan`

Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #1949
2025-07-17 08:44:31 +03:00
Nils Koch
8dc066503e chore: fix clippy errors 2025-07-16 19:34:42 +01:00
Nikita Sivukhin
97b82fe6d8 rename operation_xxx to change_xxx to make naming more consistent 2025-07-16 20:16:24 +04:00
Levy A.
714225b9f0 remove ToSqlString trait 2025-07-16 12:16:34 -03:00
Levy A.
6fe2505425 add more ToTokens impls 2025-07-16 12:16:31 -03:00
Levy A.
373a4a26c4 fix: comma function 2025-07-16 12:16:28 -03:00
Levy A.
765b90aeb9 feat: implement ToTokens for UpdatePlan 2025-07-16 12:16:23 -03:00
Nikita Sivukhin
c018b06bf5 fix bug in concat_ws translation 2025-07-16 00:48:17 +04:00
Nikita Sivukhin
f7fb2aac5e adjust extra_amount for schema translation code 2025-07-16 00:47:59 +04:00
Nikita Sivukhin
be0a607ba8 rename amount -> extra_amount 2025-07-16 00:46:17 +04:00
Jussi Saurio
beaf393476 Merge 'Treat table-valued functions as tables' from Piotr Rżysko
First step toward resolving
https://github.com/tursodatabase/limbo/issues/1643.
### This PR
With this change, the following two queries are considered equivalent:
```sql
SELECT value FROM generate_series(5, 50);
SELECT value FROM generate_series WHERE start = 5 AND stop = 50;
```
Arguments passed in parentheses to the virtual table name are now
matched to hidden columns.
Additionally, I fixed two bugs related to virtual tables.
### TODO (I'll handle this in a separate PR)
Column references are still not supported as table-valued function
arguments. The only difference is that previously, a query like:
```sql
SELECT one.value, series.value
FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series;
```
would cause a panic. Now, it returns a proper error message instead.
Adding support for column references is more nuanced for two main
reasons:
* We need to ensure that in joins where a TVF depends on other tables,
those other tables are processed first. For example, in:
```sql
SELECT one.value, series.value
FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one;
```
the one table must be processed by the top-level loop, and series must
be nested.
* For outer joins involving TVFs, the arguments must be treated as `ON`
predicates, not `WHERE` predicates.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1727
2025-07-15 12:23:45 +03:00
meteorgan
f123c77ee8 fix set page_size in pager 2025-07-15 16:34:07 +08:00
meteorgan
e2ab673624 fix self.pager.replace() panic 2025-07-15 16:34:07 +08:00
meteorgan
bf69b86e94 fix: not all pragma need transaction 2025-07-15 16:34:07 +08:00
meteorgan
a6faab17e9 fix query page size 2025-07-15 16:34:07 +08:00
meteorgan
cf126824de Support set page size 2025-07-15 16:34:07 +08:00
Pekka Enberg
1653b0883a Merge 'core/vector: Euclidean distance support for vector search' from KarinaMilet
This PR provides Euclidean distance support for limbo's vector search.
At the same time, some type abstractions are introduced, such as
`DistanceCalculator`, etc. This is because I hope to unify the current
vector module in the future to make it more structured, clearer, and
more extensible.
While practicing Euclidean distance for Limbo, I discovered that many
checks could be done using the type system or in advance, rather than
waiting until the distance is calculated. By building these checks into
the type system or doing them ahead of time, this would allow us to
explore more efficient computations, such as automatic vectorization or
SIMD acceleration, which is future work.

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #1986
2025-07-14 13:07:20 +03:00
Pekka Enberg
0b544717a1 Merge 'do not check rowid alias for null' from Nikita Sivukhin
Simple PR to check minor issue that `INTEGER PRIMARY KEY NOT NULL` (`NOT
NULL` is redundant here obviously) will prevent user to insert anything
to the table as rowid-alias column always set to null by `turso-db`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2063
2025-07-14 11:55:06 +03:00
Nikita Sivukhin
e94ebbad04 remove unwanted changes 2025-07-14 11:27:51 +04:00
Nikita Sivukhin
81cd04dd65 add bin_record_json_object and table_columns_json_array functions 2025-07-14 11:19:45 +04:00
Nikita Sivukhin
eed89993f9 fix clippy 2025-07-14 11:17:32 +04:00
Nikita Sivukhin
5409812610 properly implement generation of before/after records for new modes 2025-07-14 11:17:32 +04:00
Nikita Sivukhin
fabb00f385 fix test 2025-07-14 11:16:06 +04:00
Nikita Sivukhin
b258c10c9a generate before/after row values in modification statements 2025-07-14 11:16:06 +04:00
Nikita Sivukhin
9129991b62 add id,before,after,full modes 2025-07-14 11:16:06 +04:00
Piotr Rzysko
30ae6538ee Treat table-valued functions as tables
With this change, the following two queries are considered equivalent:
```sql
SELECT value FROM generate_series(5, 50);
SELECT value FROM generate_series WHERE start = 5 AND stop = 50;
```
Arguments passed in parentheses to the virtual table name are now
matched to hidden columns.

Column references are still not supported as table-valued function
arguments. The only difference is that previously, a query like:
```sql
SELECT one.value, series.value
FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series;
```
would cause a panic. Now, it returns a proper error message instead.

Adding support for column references is more nuanced for two main
reasons:
- We need to ensure that in joins where a TVF depends on other tables,
those other tables are processed first. For example, in:
```sql
SELECT one.value, series.value
FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one;
```
the one table must be processed by the top-level loop, and series must
be nested.
- For outer joins involving TVFs, the arguments must be treated as ON
predicates, not WHERE predicates.
2025-07-14 07:16:53 +02:00
Piotr Rzysko
9102f4a2f4 Extract table parsing into separate method
This is in preparation for reusing the method when parsing `TableCall`s.
2025-07-14 07:16:53 +02:00
Piotr Rzysko
44b1b1852a Fix referencing virtual table predicates
We need to enumerate first and filter afterward — not the other way
around — because we later use the indexes produced by `enumerate` to
access the original `predicates` slice.
2025-07-14 07:16:53 +02:00
Piotr Rzysko
319cdbe3af Don't use search for virtual tables
Previously, the test queries added in this commit would fail with:
thread 'main' panicked at core/schema.rs:129:34:
not implemented
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:74:14
   2: core::panicking::panic
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:148:5
   3: limbo_core::schema::Table::get_root_page
             at ./core/schema.rs:129:34
   4: limbo_core::translate::main_loop::init_loop
             at ./core/translate/main_loop.rs:260:44
   5: limbo_core::translate::emitter::emit_query
             at ./core/translate/emitter.rs:568:5
   6: limbo_core::translate::emitter::emit_program_for_select
             at ./core/translate/emitter.rs:496:5
   7: limbo_core::translate::emitter::emit_program
             at ./core/translate/emitter.rs:187:31
   8: limbo_core::translate::select::translate_select
             at ./core/translate/select.rs:82:5
   9: limbo_core::translate::translate_inner
             at ./core/translate/mod.rs:241:13
  10: limbo_core::translate::translate
             at ./core/translate/mod.rs:95:17
  11: limbo_core::Connection::run_cmd
             at ./core/lib.rs:416:31
  12: <limbo_core::QueryRunner as core::iter::traits::iterator::Iterator>::next
             at ./core/lib.rs:916:22
  13: limbo::app::Limbo::run_query
             at ./cli/app.rs:442:27
  14: limbo::app::Limbo::handle_input_line
             at ./cli/app.rs:544:13
  15: limbo::main
             at ./cli/main.rs:51:31
  16: core::ops::function::FnOnce::call_once
2025-07-14 07:16:53 +02:00
Piotr Rzysko
000d70f1f3 Propagate info about hidden columns 2025-07-14 07:16:53 +02:00
Zaid Humayun
9dd746d1ce fixes issues where double quotes are not removed from around table nam 2025-07-13 17:26:58 +05:30
Jussi Saurio
a48b6d049a Another post-rebase clippy round with 1.88.0 2025-07-12 19:10:56 +03:00
Nils Koch
828d4f5016 fix clippy errors for rust 1.88.0 (auto fix) 2025-07-12 18:58:41 +03:00
Nikita Sivukhin
6d3bdf5b9e do not check rowid alias for null 2025-07-12 14:07:26 +04:00
Pekka Enberg
f24e254ec6 core/translate: Fix "misuse of aggregate function" error message
```
sqlite> CREATE TABLE test1(f1, f2);
sqlite> SELECT SUM(min(f1)) FROM test1;
Parse error: misuse of aggregate function min()
  SELECT SUM(min(f1)) FROM test1;
             ^--- error here
```

Spotted by SQLite TCL tests.
2025-07-10 14:29:59 +03:00
KaguraMilet
9d6ae78786 Merge branch 'tursodatabase:main' into distance 2025-07-10 19:15:08 +08:00
Mikaël Francoeur
89b0574fac return error if no tables 2025-07-09 14:58:24 -04:00
Pekka Enberg
3f10427f52 core: Fix resolve_function() error messages
We need to return the original function name, not normalized one to be
compatible with SQLite.

Spotted by SQLite TCL tests.
2025-07-09 15:30:57 +03:00
meteorgan
c6ef4898b0 fix: IdxDelete shouldn't raise error if P5 == 0 2025-07-08 22:57:20 +08:00
meteorgan
4a516ab414 Support except operator for compound select 2025-07-08 22:57:20 +08:00
Jussi Saurio
cb8a576501 op_idx_insert: introduce flag for ignoring duplicates 2025-07-08 12:14:20 +03:00
Pekka Enberg
84771f17f7 Merge 'core/translate: Fix aggregate star error handling in prepare_one_sele…' from Pekka Enberg
…ct_plan()
For example, if we attempt to do `max(*)`, let's return the error
message from `resolve_function()` to be compatible with SQLite:
```
sqlite> CREATE TABLE test1(f1, f2);
sqlite> SELECT max(*) FROM test1;
Parse error: wrong number of arguments to function max()
  SELECT max(*) FROM test1;
         ^--- error here
```
Spotted by SQLite TCL tests.

Closes #1990
2025-07-08 10:19:59 +03:00
Pekka Enberg
341f963a8e Merge 'Fix infinite loops, rollback problems, and other bugs found by I/O fault injection' from Pedro Muniz
Was running the sim with I/O faults enabled and fixed some nasty bugs.
Now, there are some more nasty bugs to fix as well. This is the command
that I use to run the simulator `cargo run -p limbo_sim -- --minimum-
tests 10 --maximum-tests 1000`
This PR mainly fixes the following bugs:
- Not decrementing in flight write counter when `pwrite` fails
- not rolling back the transaction on `step` error
- not rolling back the transaction on `run_once` error
- some functions were just being unwrapped when they could suffer io
errors
- Only change max_frame after wal sync's

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1946
2025-07-07 21:31:26 +03:00
Pekka Enberg
d4c03d426c core/translate: Fix aggregate star error handling in prepare_one_select_plan()
For example, if we attempt to do `max(*)`, let's return the error
message from `resolve_function()` to be compatible with SQLite:

```
sqlite> CREATE TABLE test1(f1, f2);
sqlite> SELECT max(*) FROM test1;
Parse error: wrong number of arguments to function max()
  SELECT max(*) FROM test1;
         ^--- error here
```

Spotted by SQLite TCL tests.
2025-07-07 19:56:09 +03:00
Pekka Enberg
0d17d35ef4 Merge 'Change data capture' from Nikita Sivukhin
This PR add basic CDC functionality to the `turso-db`.
### Feature components
1. `unstable_capture_data_changes_conn` pragma which allow user to turn
on/off CDC logging for **specific connection**
    * CDC will have multiple modes, but for now only `off` / `rowid-
only` are supported
    * Default CDC table is `turso_cdc` but user can override this with
`PRAGMA` update syntax and use arbitrary table for the CDC needs
      * This can be helpful in future if turso will need to break table
format compatibility and custom tables can be a way to migrate between
different schemas
    * Update syntax for the pragma accepts one string argument in
format, where only mode is set or custom cdc table name is provided as
second part of the string, separated with comma from the mode
```sql
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only');
turso> PRAGMA unstable_capture_data_changes_conn('off');
turso> PRAGMA unstable_capture_data_changes_conn('rowid-only,custom_cdc_table');
turso> PRAGMA unstable_capture_data_changes_conn;
┌────────────┬──────────────────┐
│ mode       │ table            │
├────────────┼──────────────────┤
│ rowid-only │ custom_cdc_table │
└────────────┴──────────────────┘
```
2. CDC table schema right now is simple but it will be evolved soon to
support logging of row values before/after the change:
```sql
CREATE TABLE custom_cdc_table (
  operation_id INTEGER PRIMARY KEY AUTOINCREMENT,
  operation_time INTEGER, -- unixepoch() at the moment of insert, can drift if machine clocks is not monotonic
  operation_type INTEGER, -- -1 = delete, 0 = update, 1 = insert
  table_name TEXT,
  id
)
```
  * Note, that `operation_id` is marked as `AUTOINCREMENT` but `turso-
db` needs to implement
https://github.com/tursodatabase/turso/issues/1976 in order to properly
support that keyword
3. Query planner changes are made in `INSERT`/`UPDATE`/`DELETE` plans in
order to emit updates to the CDC table for changes in the table
  * Note, that row `UPDATE` which change primary key generate `DELETE` +
`INSERT` statement instead of single `UPDATE`
### Implementation details
- `PRAGMA` to enable CDC is **unstable** which means that publicly
visible side-effects/public API can change in future (and it will change
soon in order to support more rich CDC modes)
- CDC table is just a regular table with its benefits and downsides:
  * benefits: user can perform maintenance operations with that table
just with regular SQL like `DELETE FROM turso_cdc WHERE operation_id <
?` to cleanup old not needed CDC entries
  * downsides: user can accidentally make unwanted change to CDC table
- Changes to CDC table is not logged to itself
  * Note, that different connections (e.g. `C1`, `C2`) can have
different CDC tables set (e.g. `A` and `B`) - in which case changes made
to CDC table `B` through connection `C1` will be reflected in CDC table
`A`

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1926
2025-07-07 18:03:07 +03:00
pedrocarlo
b85687658d change instrumentation level to INFO 2025-07-07 11:53:45 -03:00