6171 Commits

Author SHA1 Message Date
Pekka Enberg
b0971f98c2 Merge 'sim: ignore fsync faults' from Jussi Saurio
`FaultyQuery` causes frequent false positives in simulator due to the
following chain of events:
- we write rows and flush wal to disk
- inject fault during fsync which fails
- error is returned to caller, simulator thinks those rows don't exist
because the query failed
- we reopen the database i.e. read the WAL back to memory from disk, it
has those extra rows we think we didn't write
- assertion fails because table has more rows than simulator expected
More discussion about fsync behavior in issue #2091
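The false positive above can be reproduced with a tiny model. This is a minimal sketch, not the simulator's code: `WriteOutcome`, `ShadowTable`, `naive_check`, and `tolerant_check` are hypothetical names. The change itself ignores fsync faults; the tolerant check is shown only for contrast, to make clear why a failed fsync leaves the row count ambiguous.

```rust
/// Hypothetical outcome of a write as seen by the simulator.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WriteOutcome {
    /// The query returned successfully.
    Ok,
    /// The query returned an error because an injected fsync fault fired.
    FsyncFailed,
}

/// Hypothetical shadow model of one table's row count.
#[derive(Debug, Default)]
pub struct ShadowTable {
    /// Rows from writes that reported success.
    pub confirmed: usize,
    /// Rows from writes that failed at fsync; they may already be in the WAL.
    pub uncertain: usize,
}

impl ShadowTable {
    /// Record an insert of `rows` rows together with what the caller observed.
    pub fn record_insert(&mut self, rows: usize, outcome: WriteOutcome) {
        match outcome {
            WriteOutcome::Ok => self.confirmed += rows,
            WriteOutcome::FsyncFailed => self.uncertain += rows,
        }
    }

    /// Naive check: assumes a failed query wrote nothing.
    pub fn naive_check(&self, actual: usize) -> bool {
        actual == self.confirmed
    }

    /// Tolerant check: accepts any count between "no uncertain row survived"
    /// and "every uncertain row survived". This is a loose bound, not an
    /// exact model of which writes persisted.
    pub fn tolerant_check(&self, actual: usize) -> bool {
        (self.confirmed..=self.confirmed + self.uncertain).contains(&actual)
    }
}
```

With 3 confirmed rows and 2 rows written before a failed fsync, reopening the database can legitimately show 5 rows, which the naive check reports as a failure.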

Closes #2110
2025-07-16 11:15:23 +03:00
Jussi Saurio
bb0cad459e sim: ignore fsync faults
`FaultyQuery` causes frequent false positives in simulator due to
the following chain of events:

- we write rows and flush wal to disk
- inject fault during fsync which fails
- error is returned to caller, simulator thinks those rows don't exist because the query failed
- we reopen the database i.e. read the WAL back to memory from disk, it has those extra rows we think we didn't write
- assertion fails because table has more rows than simulator expected

More discussion about fsync behavior in issue #2091
2025-07-16 11:09:54 +03:00
Pekka Enberg
1a8bade9d5 Merge 'Updates to the simulator' from Alperen Keleş
- Add generation for UNION/JOIN
- Rearchitect the oracle calling conventions to simplify the code paths
- Add brute force shrinking option by @echoumcp1

Closes #2049
2025-07-16 11:03:41 +03:00
Pekka Enberg
f72ceaf177 Merge 'extensions/vtab: fix i32 being passed as i64 across FFI boundary' from Jussi Saurio
As nilskch points out in #1807, Rust 1.88.0 is stricter about alignment
checks.
Because Rust integer literals default to `i32`, we were casting a pointer
to an `i32` as a pointer to an `i64`, causing a panic when dereferenced due
to misalignment, as Rust expects it to be 8-byte aligned.
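A minimal sketch of the pattern, assuming a callee that reads an `i64` through a raw pointer. `read_i64`, `pass_as_i64`, and `is_aligned_for_i64` are hypothetical names for illustration and are not taken from the extension code.

```rust
use std::mem::align_of;

/// Hypothetical stand-in for a function on the other side of the FFI boundary
/// that reads an `i64` through a raw pointer.
///
/// # Safety
/// `ptr` must be non-null, aligned for `i64`, and point to an initialized `i64`.
pub unsafe extern "C" fn read_i64(ptr: *const i64) -> i64 {
    unsafe { ptr.read() }
}

// Buggy pattern, shown only as a comment because dereferencing it is
// undefined behavior:
//     let n = 5;                               // inferred as i32
//     let p = &n as *const i32 as *const i64;  // too small, maybe misaligned

/// Fixed pattern: widen the value first, then take a pointer to the `i64`.
pub fn pass_as_i64(value: i32) -> i64 {
    let widened: i64 = i64::from(value);
    // SAFETY: `widened` is a live, properly aligned `i64`.
    unsafe { read_i64(&widened) }
}

/// Returns true if `ptr` satisfies the alignment requirement of `i64`.
pub fn is_aligned_for_i64<T>(ptr: *const T) -> bool {
    (ptr as usize) % align_of::<i64>() == 0
}
```

A pointer to an `i32` is only guaranteed to satisfy the alignment of `i32`, so `is_aligned_for_i64` may return false for it depending on where the value happens to live.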

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #2064
2025-07-16 08:28:24 +03:00
Pekka Enberg
f4e82df00e Merge 'Fix CSV import in the shell' from Jussi Saurio
- Fix not being able to create table while importing
    * The behavior now aligns with SQLite so that if the table already
      exists, all the rows are treated as data. If the table doesn't exist,
      the first row is treated as the header from which column names for the
      new table are populated.
- Insert in batches instead of one at a time
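The two rules above can be sketched as a small planning function. This is an illustration under assumptions, not the shell's implementation: `ImportPlan`, `plan_import`, and the `batch_size` parameter are hypothetical.

```rust
/// Hypothetical plan for importing already-parsed CSV rows into a table.
#[derive(Debug, PartialEq)]
pub struct ImportPlan {
    /// Column names for a table that must be created first, if any.
    pub new_table_columns: Option<Vec<String>>,
    /// Data rows grouped into batches, one batch per insert.
    pub batches: Vec<Vec<Vec<String>>>,
}

/// If the table exists, every row is data. Otherwise the first row is
/// consumed as the header that names the new table's columns.
pub fn plan_import(
    rows: Vec<Vec<String>>,
    table_exists: bool,
    batch_size: usize,
) -> ImportPlan {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut rows = rows.into_iter();
    let new_table_columns = if table_exists { None } else { rows.next() };
    let data: Vec<Vec<String>> = rows.collect();
    // Group the remaining rows so they can be inserted several at a time.
    let batches = data.chunks(batch_size).map(|chunk| chunk.to_vec()).collect();
    ImportPlan { new_table_columns, batches }
}
```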
This was a pretty quick vibecoding effort tbh :]
Closes #2079

Closes #2094
2025-07-16 08:26:30 +03:00
alpaylan
04f5b91e87 fix faulty Update generation within delete_select 2025-07-16 00:06:35 -04:00
Jussi Saurio
f482424d77 Merge 'small refactor: rename "amount" to "extra_amount"' from Nikita Sivukhin
Small refactoring to reduce confusion (I was caught in this trap and set
`amount` to one in CDC branch during development)
This PR also fixes the broken `concat_ws` emit logic.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2100
2025-07-16 06:51:35 +03:00
Jussi Saurio
3aae46ccc7 Merge 'refactor: Changes CursorResult to IOResult' from Diego Reis
This PR unifies the concept of a result that either has something done or
yields to IO into a single type.
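A minimal sketch of what such a unified type can look like. The variant names `Done` and `IO` are assumptions based on the wording of the commit messages, and `PageRead` and `run_to_completion` are hypothetical helpers that exist only for this example.

```rust
/// Result of one step of an operation that may need to wait for I/O.
#[derive(Debug, PartialEq)]
pub enum IOResult<T> {
    /// The operation finished and produced a value.
    Done(T),
    /// The operation cannot progress until pending I/O completes.
    IO,
}

/// Hypothetical operation that needs a few I/O round trips before it is done.
pub struct PageRead {
    pub pending_io: u32,
}

impl PageRead {
    /// Advance the operation by one step.
    pub fn step(&mut self) -> IOResult<u32> {
        if self.pending_io == 0 {
            IOResult::Done(42)
        } else {
            self.pending_io -= 1;
            IOResult::IO
        }
    }
}

/// Drive the operation until it is done; returns (value, number of yields).
pub fn run_to_completion(op: &mut PageRead) -> (u32, u32) {
    let mut yields = 0;
    loop {
        match op.step() {
            IOResult::Done(value) => return (value, yields),
            // A real caller would run the I/O loop here before retrying.
            IOResult::IO => yields += 1,
        }
    }
}
```

Because every step function returns the same generic type, callers can share one pattern for "retry after I/O" instead of matching on a separate status enum per subsystem.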

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2103
2025-07-16 06:50:13 +03:00
alpaylan
28ecb083e1 fix faulty Insert::Select generation within delete_select 2025-07-15 22:35:05 -04:00
Diego Reis
0e9771ac07 refactor: Change redundant "Status" enums to IOResult
Let's unify the semantics of "something done" or "yields to I/O" into a
single type
2025-07-15 20:56:18 -03:00
Diego Reis
d0af54ae77 refactor: Change CursorResult to IOResult
The reasoning here is to treat I/O operations (either "Done" or
yielding to IO) with the same generic type.
2025-07-15 20:52:25 -03:00
Nikita Sivukhin
e15f72da2d add simple test for concat_ws bug 2025-07-16 00:52:14 +04:00
Nikita Sivukhin
c018b06bf5 fix bug in concat_ws translation 2025-07-16 00:48:17 +04:00
Nikita Sivukhin
f7fb2aac5e adjust extra_amount for schema translation code 2025-07-16 00:47:59 +04:00
Nikita Sivukhin
be0a607ba8 rename amount -> extra_amount 2025-07-16 00:46:17 +04:00
Jussi Saurio
86b1b0d009 Merge 'fix record header size calculations and incorrect assumptions' from Jussi Saurio
- remove assumptions that record header size fits into 1 byte or serial
type fits into 1 byte
- add tests for record header size calculation
```sql
turso> CREATE TABLE t(x TEXT, y);
CREATE INDEX t_idx ON t(x);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'a', 1); -- 1000 bytes of 'a'
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'b', 2);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'c', 3);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'd', 4);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'e', 5);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'f', 6);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'g', 7);
INSERT INTO t VALUES (replace(zeroblob(1000), x'00', 'a') || 'h', 8);
SELECT COUNT(*) FROM t WHERE x >= replace(hex(zeroblob(100)), '00', 'a');
┌───────────┐
│ COUNT (*) │
├───────────┤
│         8 │
└───────────┘
```
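To see why neither value is guaranteed to fit in one byte, here is a sketch of the size calculation based on the SQLite record format: the header starts with a varint holding the total header size (including itself), followed by one varint serial type per column. The function names are hypothetical and this is not the code from the PR.

```rust
/// Number of bytes a SQLite varint needs for `value`: 7 bits per byte for
/// the first 8 bytes, and a full 8 bits in the ninth byte.
pub fn varint_len(value: u64) -> usize {
    let mut len = 1;
    let mut rest = value >> 7;
    while rest != 0 && len < 9 {
        len += 1;
        rest >>= 7;
    }
    len
}

/// Serial type of a TEXT value that is `byte_len` bytes long.
pub fn text_serial_type(byte_len: u64) -> u64 {
    2 * byte_len + 13
}

/// Total header size: the serial type varints plus the varint that stores
/// the header size itself. The size varint can grow to 2 or more bytes,
/// which in turn increases the size it has to encode.
pub fn record_header_size(serial_types: &[u64]) -> usize {
    let types_len: usize = serial_types.iter().map(|&t| varint_len(t)).sum();
    let mut size_len = 1;
    loop {
        let total = types_len + size_len;
        if varint_len(total as u64) <= size_len {
            return total;
        }
        size_len += 1;
    }
}
```

A 1001-byte text key like the ones in the example has serial type 2015, which already takes two bytes as a varint.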
Fixes #2096
Fixes #2088

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2098
2025-07-15 19:09:31 +03:00
Jussi Saurio
fda92d43a2 adjust comment in header size test 2025-07-15 18:52:27 +03:00
Jussi Saurio
38183d3b3b tcl: add regression test for large text keys 2025-07-15 18:48:06 +03:00
Jussi Saurio
025ddd98a6 Merge 'bench: add insert benchmark (batch sizes: 1,10,100)' from Jussi Saurio
```
Insert rows in batches/limbo_insert_1_rows
                        time:   [344.71 µs 363.45 µs 379.31 µs]
Insert rows in batches/sqlite_insert_1_rows
                        time:   [575.12 µs 769.16 µs 983.30 µs]

Insert rows in batches/limbo_insert_10_rows
                        time:   [1.4964 ms 1.5694 ms 1.6334 ms]
Insert rows in batches/sqlite_insert_10_rows
                        time:   [510.79 µs 766.56 µs 1.0677 ms]

Insert rows in batches/limbo_insert_100_rows
                        time:   [5.5177 ms 5.6806 ms 5.8619 ms]
Insert rows in batches/sqlite_insert_100_rows
                        time:   [439.91 µs 879.43 µs 1.4260 ms]
```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2092
2025-07-15 18:12:59 +03:00
Jussi Saurio
927a1f158a Merge 'btree: unify table&index seek page boundary handling' from Jussi Saurio
## Background
PR #2065 fixed a bug with table btree seeks concerning boundaries of
leaf pages.
The issue was that if we were e.g. looking for the first key greater
than (GT) 100, we always assumed the key would either be found on the
left child page of a given divider (e.g. divider 102) or not at all,
which is incorrect. #2065 has more discussion and documentation about
this, so read that one for more context.
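The boundary case can be shown on a simplified model. This is a sketch under assumptions, not the `BTreeCursor` code: `Tree`, `seek_gt_buggy`, and `seek_gt` are hypothetical, leaves are plain sorted vectors, and `dividers[i]` is an upper bound for the keys in `leaves[i]`.

```rust
/// Simplified two-level tree. Assumes at least one leaf and
/// `dividers.len() == leaves.len() - 1`.
pub struct Tree {
    pub leaves: Vec<Vec<i64>>,
    pub dividers: Vec<i64>,
}

impl Tree {
    /// Pick the leaf the interior page points to for `target`.
    fn leaf_for(&self, target: i64) -> usize {
        self.dividers
            .iter()
            .position(|&d| target <= d)
            .unwrap_or(self.leaves.len() - 1)
    }

    /// Buggy variant: only looks inside the leaf chosen by the divider.
    pub fn seek_gt_buggy(&self, target: i64) -> Option<i64> {
        let leaf = &self.leaves[self.leaf_for(target)];
        leaf.iter().copied().find(|&k| k > target)
    }

    /// Fixed variant: if the chosen leaf has no matching key, keep advancing
    /// into the following leaves.
    pub fn seek_gt(&self, target: i64) -> Option<i64> {
        let start = self.leaf_for(target);
        self.leaves[start..]
            .iter()
            .flat_map(|leaf| leaf.iter().copied())
            .find(|&k| k > target)
    }
}
```

With leaves `[98, 99, 100]` and `[105, 110]` and divider 102, a GT seek for 100 lands on the left leaf, finds nothing there, and has to advance once to reach 105.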
## This PR
We already had similar handling for index btrees as #2065 introduced for
table btrees, but it was baked into the `BTreeCursor` struct's seek
handling itself, whereas #2065 handled this on the VDBE side.
This PR unifies this handling for both table and index btrees by always
doing the additional cursor advancement in the VDBE.
Unfortunately, unlike table btrees, index btrees may also need to do an
additional advance when they are looking for an exact match. This
resulted in a bigger refactor than anticipated, since there are quite a
few VDBE instructions that may perform a seek, e.g.: `IdxInsert`,
`IdxDelete`, `Found`, `NotFound`, `NoConflict`. All of these can
potentially end up in a similar situation where the cursor needs one
more advance after the initial seek, and they were currently calling
`cursor.seek()` directly and expecting the `BTreeCursor` to handle the
auto-advance fallback internally.
For this reason, I have 1. removed the "TryAdvance"-ish logic from the
index btree internals and 2. extracted a common VDBE helper `fn
seek_internal()` - heavily based on the existing `op_seek_internal()`,
but decoupled from instructions and the program counter - which all the
interested VDBE instructions will call to delegate their seek logic.
Closes #2083

Reviewed-by: Nikita Sivukhin (@sivukhin)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #2084
2025-07-15 18:02:52 +03:00
Jussi Saurio
932536a03f compare_records: fix assumption that header size is 1 byte and serial type is 1 byte 2025-07-15 17:57:52 +03:00
Jussi Saurio
7c353095ed types: fix and unify record header size calculation 2025-07-15 17:38:02 +03:00
alpaylan
9347e43dfc clippy + fmt 2025-07-15 09:57:55 -04:00
alpaylan
9a921ed4b9 make the large table smaller 2025-07-15 09:56:27 -04:00
Jussi Saurio
cc47bfba02 CSV import fixes
- Fix not being able to create table while importing
    * The behavior now aligns with SQLite so that if the table already
      exists, all the rows are treated as data. If the table doesn't exist,
      the first row is treated as the header from which column names for the
      new table are populated.
- Insert in batches instead of one at a time
2025-07-15 16:44:11 +03:00
Jussi Saurio
3a861e1618 bench: add insert benchmark (batch sizes: 1,10,100) 2025-07-15 13:41:56 +03:00
Jussi Saurio
beaf393476 Merge 'Treat table-valued functions as tables' from Piotr Rżysko
First step toward resolving
https://github.com/tursodatabase/limbo/issues/1643.
### This PR
With this change, the following two queries are considered equivalent:
```sql
SELECT value FROM generate_series(5, 50);
SELECT value FROM generate_series WHERE start = 5 AND stop = 50;
```
Arguments passed in parentheses to the virtual table name are now
matched to hidden columns.
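A rough sketch of that matching step, assuming positional arguments are paired with hidden columns in declaration order. `Constraint` and `bind_args` are hypothetical names, and the integer-only arguments are a simplification.

```rust
/// Hypothetical equality constraint on a hidden column.
#[derive(Debug, PartialEq)]
pub struct Constraint {
    pub column: String,
    pub value: i64,
}

/// Pair each positional argument with the hidden column at the same index.
/// Fewer arguments than hidden columns is allowed; more is an error.
pub fn bind_args(hidden_columns: &[&str], args: &[i64]) -> Result<Vec<Constraint>, String> {
    if args.len() > hidden_columns.len() {
        return Err(format!(
            "too many arguments: got {}, at most {} allowed",
            args.len(),
            hidden_columns.len()
        ));
    }
    Ok(hidden_columns
        .iter()
        .zip(args)
        .map(|(column, &value)| Constraint {
            column: column.to_string(),
            value,
        })
        .collect())
}
```

Under this model, `generate_series(5, 50)` produces the same constraints as `WHERE start = 5 AND stop = 50`.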
Additionally, I fixed two bugs related to virtual tables.
### TODO (I'll handle this in a separate PR)
Column references are still not supported as table-valued function
arguments. The only difference is that previously, a query like:
```sql
SELECT one.value, series.value
FROM (SELECT 1 AS value) one, generate_series(one.value, 3) series;
```
would cause a panic. Now, it returns a proper error message instead.
Adding support for column references is more nuanced for two main
reasons:
* We need to ensure that in joins where a TVF depends on other tables,
those other tables are processed first. For example, in:
```sql
SELECT one.value, series.value
FROM generate_series(one.value, 3) series, (SELECT 1 AS value) one;
```
the one table must be processed by the top-level loop, and series must
be nested.
* For outer joins involving TVFs, the arguments must be treated as `ON`
predicates, not `WHERE` predicates.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1727
2025-07-15 12:23:45 +03:00
Jussi Saurio
0ab0af912c Merge 'bindings/js: fix more tests' from Mikaël Francoeur
Six more tests passing on Turso. The commits can be reviewed separately.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2085
2025-07-15 12:17:15 +03:00
Jussi Saurio
5ea35d700a Merge 'Support page_size pragma setting' from meteorgan
Closes: #1379
This PR consists of three main changes:
1. Rebuild `Pager` to set the correct page size for `buffer_pool`, `wal`
and other components when the database is uninitialized.
2. Persist the latest page size when allocating page 1.
3. Ensure all pragmas emit the correct transaction instructions,
preventing even a `page_size` read from triggering database
initialization.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2053
2025-07-15 12:14:52 +03:00
meteorgan
d7bdfeb711 reinitialize WalFileShare when reset page size 2025-07-15 16:34:07 +08:00
meteorgan
b42a1ef272 minor improvements based on PR comments 2025-07-15 16:34:07 +08:00
meteorgan
39d79d7420 add tests for page_size pragma 2025-07-15 16:34:07 +08:00
meteorgan
f123c77ee8 fix set page_size in pager 2025-07-15 16:34:07 +08:00
meteorgan
e2ab673624 fix self.pager.replace() panic 2025-07-15 16:34:07 +08:00
meteorgan
bf69b86e94 fix: not all pragma need transaction 2025-07-15 16:34:07 +08:00
meteorgan
a6faab17e9 fix query page size 2025-07-15 16:34:07 +08:00
meteorgan
cf126824de Support set page size 2025-07-15 16:34:07 +08:00
Pekka Enberg
b7db07cf2d Turso 0.1.2 2025-07-15 11:01:25 +03:00
Pekka Enberg
f4bc0ca77e Update CHANGELOG 2025-07-15 11:01:18 +03:00
alpaylan
6b96789b6d add random_expr for SELECT <expr>; 2025-07-14 18:48:02 -04:00
Mikaël Francoeur
68134fa186 support named bind parameters 2025-07-14 15:36:12 -04:00
Mikaël Francoeur
093140d84c throw on empty statement 2025-07-14 15:28:07 -04:00
Mikaël Francoeur
e25064959b return info object 2025-07-14 14:35:48 -04:00
Pekka Enberg
f15fa91695 Merge 'Gopher is biologically closer to beavers than hamsters' from David Shekunts
Biologically, a gopher is closer to beavers than to hamsters, so it would
be more correct to use the beaver emoji.
And yes, if you merge this MR I would be proud of my contribution to
open source.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #2067
2025-07-14 21:34:26 +03:00
Mikaël Francoeur
99614a3c7c support open property 2025-07-14 14:03:57 -04:00
Jussi Saurio
553396e9ca btree: unify table&index seek page boundary handling
PR #2065 fixed a bug with table btree seeks concerning boundaries
of leaf pages.

The issue was that if we were e.g. looking for the first key greater than
(GT) 100, we always assumed the key would either be found on the left child
page of a given divider (e.g. divider 102) or not at all, which is incorrect.
#2065 has more discussion and documentation about this, so read that one for
more context.

Anyway:

We already had similar handling for index btrees, but it was baked into
the `BTreeCursor` struct's seek handling itself, whereas #2065 handled this
on the VDBE side.

This PR unifies this handling for both table and index btrees by always doing
the additional cursor advancement in the VDBE.

Unfortunately, since indexes may also need to do an additional advance when they
are looking for an exact match, this resulted in a bigger refactor than anticipated,
since there are quite a few VDBE instructions that may perform a seek, e.g.:
`IdxInsert`, `IdxDelete`, `Found`, `NotFound`, `NoConflict`.

All of these can potentially end up in a similar situation where the cursor needs
one more advance after the initial seek.

For this reason, I have extracted a common VDBE helper `fn seek_internal()` which
all the interested VDBE instructions will call to delegate their seek logic.
2025-07-14 16:46:43 +03:00
Pekka Enberg
363c29af2e Merge 'test/fuzz: fix rowid_seek_fuzz not being a proper fuzz test' from Jussi Saurio
The original `rowid_seek_fuzz` test had a design flaw: it inserted
contiguous non-random integers (so not really fuzzing), which prevented
issues such as the one fixed in #2065 from being discovered.
Further, the test has at some point also been neutered a bit by only
inserting 100 values which makes the btree very small, hiding
interactions between interior pages and neighboring leaf pages.
This should not be merged until #2065 is merged.
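The design point can be sketched as follows: generate many sparse, random keys and compare every seek against a trivially correct oracle. This is an illustration, not the test from the PR. All names are hypothetical, and `BTreeSet::range` merely stands in for the seek under test.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Tiny xorshift64 PRNG so the sketch needs no external crates.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generate `count` distinct keys spread over `0..max_key`, leaving gaps
/// between keys instead of inserting 1, 2, 3, ...
pub fn random_sparse_keys(seed: u64, count: usize, max_key: u64) -> BTreeSet<u64> {
    assert!(max_key > 0 && (count as u64) <= max_key);
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut keys = BTreeSet::new();
    while keys.len() < count {
        keys.insert(rng.next_u64() % max_key);
    }
    keys
}

/// Oracle: linear scan for the first key greater than `target`.
pub fn oracle_gt(keys: &BTreeSet<u64>, target: u64) -> Option<u64> {
    keys.iter().copied().find(|&k| k > target)
}

/// Probe random targets and count disagreements with the oracle.
pub fn count_mismatches(seed: u64, key_count: usize, probes: usize) -> usize {
    let max_key = 1_000_000;
    let keys = random_sparse_keys(seed, key_count, max_key);
    let mut rng = XorShift64(seed.max(1) ^ 0x9E37_79B9_7F4A_7C15);
    (0..probes)
        .filter(|_| {
            let target = rng.next_u64() % max_key;
            let range = (Bound::Excluded(target), Bound::Unbounded);
            // Stand-in for the seek implementation being fuzzed.
            let got = keys.range(range).next().copied();
            got != oracle_gt(&keys, target)
        })
        .count()
}
```

Probing targets that fall into the gaps between keys is what exercises the page-boundary paths that contiguous inserts never reach.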

Closes #2081
2025-07-14 14:42:42 +03:00
Pekka Enberg
03d170ca05 Turso 0.1.2-pre.4 2025-07-14 13:21:41 +03:00
Pekka Enberg
d5d48db304 Merge 'build: Update cargo-dist to 0.28.6' from Pekka Enberg
Update `cargo-dist` to version 0.28.6. It should make installers more
robust to $HOME not being defined.
Refs #2073

Closes #2082
2025-07-14 13:20:54 +03:00
Pekka Enberg
fd4deda556 Merge 'Add fuzz to CI checks' from Levy A.
Closes #1869
2025-07-14 13:10:36 +03:00