1500 Commits

Author SHA1 Message Date
Pekka Enberg
db0c59413b Merge 'Implement json_array_length' from Peter Sooley
In line with [other work](#127) for JSON support, this PR adds support
for [`json_array_length`](https://www.sqlite.org/json1.html#jarraylen).
This includes a first pass at supporting the JSON path for accessing
values within the JSON.
I've added tests in Rust and Tcl.
![image](https://github.com/user-attachments/assets/0d0e3319-317b-4783-bff4-241eb2902255)
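A minimal sketch of the `json_array_length` semantics documented by SQLite: the array's length, 0 for a non-array value, and NULL (here `None`) when the path locates nothing. The `Json` enum, the `resolve` helper and the simplified path grammar (`$`, `.key`, `[index]`) are assumptions made for illustration, not the code from this PR.

```rust
/// Minimal JSON value, defined here only to keep the sketch self-contained.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Number(f64),
    Text(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Resolve a simplified path such as `$.a.b[1]` (hypothetical helper).
fn resolve<'a>(root: &'a Json, path: &str) -> Option<&'a Json> {
    let mut rest = path.strip_prefix('$')?;
    let mut current = root;
    while !rest.is_empty() {
        if let Some(after_dot) = rest.strip_prefix('.') {
            // A key runs until the next `.` or `[`.
            let end = after_dot
                .find(|c: char| c == '.' || c == '[')
                .unwrap_or(after_dot.len());
            let key = &after_dot[..end];
            current = match current {
                Json::Object(fields) => &fields.iter().find(|(k, _)| k.as_str() == key)?.1,
                _ => return None,
            };
            rest = &after_dot[end..];
        } else if let Some(after_bracket) = rest.strip_prefix('[') {
            let end = after_bracket.find(']')?;
            let index: usize = after_bracket[..end].parse().ok()?;
            current = match current {
                Json::Array(items) => items.get(index)?,
                _ => return None,
            };
            rest = &after_bracket[end + 1..];
        } else {
            return None;
        }
    }
    Some(current)
}

/// `None` models SQL NULL (the path located no element).
fn json_array_length(value: &Json, path: Option<&str>) -> Option<i64> {
    let target = match path {
        Some(p) => resolve(value, p)?,
        None => value,
    };
    Some(match target {
        Json::Array(items) => items.len() as i64,
        _ => 0, // any non-array value has length 0
    })
}
```

With a document shaped like `{"a": [1, [null, null]]}`, the path `$.a` yields 2, `$.a[0]` yields 0, and `$.missing` yields NULL.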

Closes #555
2024-12-27 10:08:22 +02:00
Pekka Enberg
5065074617 Merge 'core: disk serialization changes to align with sqlite' from Jussi Saurio
This PR grew out of investigating #532. I still can't reliably reproduce
that issue on either `main` or this branch, so I don't know whether this
PR _fixes_ anything, but it does align us more closely with SQLite.
---
Anyway: I looked at DBs created with limbo and with sqlite using
[ImHex](https://github.com/WerWolv/ImHex) and the differences seem to
be:
1. SQLite uses varint according to [the
spec](https://www.sqlite.org/fileformat.html#record_format), whereas
limbo always encodes integers as i64
2. Limbo adds 4 bytes of zeros for the overflow page pointer (even in
cases where the cell doesn't overflow)
3. Limbo adds a space after `CREATE TABLE name` before the `(` even when
user doesn't specify it?
I implemented the following:
- Fix 1: Varint serialization of i8, i16, i24, i32, i48 and i64
according to payload, instead of always using i64
- Fix 2: Removed the 4 bytes reserved for overflow page pointer in
non-overflow cases
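Fix 1 can be sketched as follows, assuming it amounts to picking the smallest integer serial type from the SQLite record format (types 1 to 6, holding 1, 2, 3, 4, 6 and 8 bytes) and writing the value big-endian. The function names are hypothetical, and the sketch leaves out serial types 8 and 9 (the constants 0 and 1).

```rust
/// Return (serial type, byte width) for the smallest integer encoding
/// that can hold `value`, per the SQLite record format.
fn int_serial_type(value: i64) -> (u8, usize) {
    const I24_MIN: i64 = -(1 << 23);
    const I24_MAX: i64 = (1 << 23) - 1;
    const I48_MIN: i64 = -(1 << 47);
    const I48_MAX: i64 = (1 << 47) - 1;

    if value >= i8::MIN as i64 && value <= i8::MAX as i64 {
        (1, 1)
    } else if value >= i16::MIN as i64 && value <= i16::MAX as i64 {
        (2, 2)
    } else if value >= I24_MIN && value <= I24_MAX {
        (3, 3)
    } else if value >= i32::MIN as i64 && value <= i32::MAX as i64 {
        (4, 4)
    } else if value >= I48_MIN && value <= I48_MAX {
        (5, 6)
    } else {
        (6, 8)
    }
}

/// Append the big-endian two's-complement payload for `value` to `out`
/// and return the serial type that belongs in the record header.
fn serialize_int(value: i64, out: &mut Vec<u8>) -> u8 {
    let (serial_type, width) = int_serial_type(value);
    // Keep only the low `width` bytes of the 8-byte big-endian form.
    out.extend_from_slice(&value.to_be_bytes()[8 - width..]);
    serial_type
}
```

Under this scheme a small value such as 1 takes a single payload byte instead of eight.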

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #550
2024-12-27 10:06:52 +02:00
Pekka Enberg
937779b8c0 Merge 'core/btree: small refactoring + documentation tweaks' from Jussi Saurio
Small follow-up to https://github.com/tursodatabase/limbo/pull/539
Contains:
- Variable renaming and comments to `btreecursor.insert_into_cell()`
- New utility methods `pagecontent.header_size()`,
`pagecontent.cell_pointer_array_size()`,
`pagecontent.unallocated_region_start()` and
`pagecontent.unallocated_region_size()`
- Refactor of `btreecursor.compute_free_space()` (plus comments and
variable renaming)
- Rename `pagecontent.cell_get_raw_pointer_region()` to
`pagecontent.cell_pointer_array_offset_and_size()` and remove its usage
in `btreecursor.defragment_page()`
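For orientation, here is a rough sketch of what helpers with these names could compute, based on the b-tree page header layout in the SQLite file format (8-byte header on leaf pages, 12 bytes on interior pages, 2-byte cell pointers). The `PageLayoutSketch` struct and its fields are assumptions for illustration and are not Limbo's actual `PageContent` type.

```rust
/// Read-only view over a raw b-tree page (hypothetical type).
struct PageLayoutSketch<'a> {
    buf: &'a [u8],
    /// Offset of the b-tree header within the page (100 on page 1, else 0).
    offset: usize,
}

impl<'a> PageLayoutSketch<'a> {
    fn read_u16(&self, pos: usize) -> u16 {
        u16::from_be_bytes([self.buf[self.offset + pos], self.buf[self.offset + pos + 1]])
    }

    /// Page types 2 and 5 are interior pages and carry a right-most pointer.
    fn is_interior(&self) -> bool {
        matches!(self.buf[self.offset], 2 | 5)
    }

    fn header_size(&self) -> usize {
        if self.is_interior() { 12 } else { 8 }
    }

    fn cell_count(&self) -> usize {
        self.read_u16(3) as usize
    }

    /// Each cell pointer is a 2-byte offset.
    fn cell_pointer_array_size(&self) -> usize {
        2 * self.cell_count()
    }

    /// Start of the cell content area; the file format reads 0 as 65536.
    fn cell_content_area(&self) -> usize {
        match self.read_u16(5) {
            0 => 65536,
            n => n as usize,
        }
    }

    /// The unallocated region begins right after the cell pointer array.
    fn unallocated_region_start(&self) -> usize {
        self.offset + self.header_size() + self.cell_pointer_array_size()
    }

    /// ...and ends where the cell content area begins.
    fn unallocated_region_size(&self) -> usize {
        self.cell_content_area() - self.unallocated_region_start()
    }
}
```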

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #543
2024-12-27 10:06:30 +02:00
Pekka Enberg
cf1a3fb3e1 Merge 'Fix: core/translate/insert: fix four issues with inserts' from Jussi Saurio
Closes #436
This PR fixes four issues:
1. Not respecting user-provided column names (e.g. `INSERT INTO foo
(b,c) values (1,2);` would just insert into the first two columns
regardless of what index `b` and `c` have)
2. Limbo would enter an infinite loop when inserting too many values
(i.e. more values than the table has columns)
3. False positive unique constraint error on non-primary key columns
when inserting multiple values, e.g.
```
limbo> create table t1(v1 int);
limbo> insert into t1 values (1),(2);
Runtime error: UNIQUE constraint failed: t1.v1 (19)
```
as seen [here](https://github.com/tursodatabase/limbo/pull/490#issuecomment-2545545562)
4. Limbo no longer uses a coroutine for INSERT when only inserting one
row. See [this comment](https://github.com/tursodatabase/limbo/issues/436#issuecomment-2533937845).
For the equivalent query, Limbo now generates:
```
limbo> EXPLAIN INSERT INTO users (name, email) VALUES ('John Doe', 'john@example.com');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     10    0                    0   Start at 10
1     OpenWriteAsync     0     2     0                    0
2     OpenWriteAwait     0     0     0                    0
3     String8            0     3     0     John Doe       0   r[3]='John Doe'
4     String8            0     4     0     john@example.com  0   r[4]='john@example.com'
5     NewRowId           0     1     0                    0
6     MakeRecord         2     3     5                    0   r[5]=mkrec(r[2..4])
7     InsertAsync        0     5     1                    0
8     InsertAwait        0     0     0                    0
9     Halt               0     0     0                    0
10    Transaction        0     1     0                    0
11    Null               0     2     0                    0   r[2]=NULL
12    Goto               0     1     0                    0
```
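The column resolution behind items 1 and 2 can be sketched roughly as below. This is an illustration rather than the code in `core/translate/insert`: `map_insert_columns` is a hypothetical helper, and the error text for a mismatched explicit column list is an assumption. The other two messages mirror the parse errors shown in this PR.

```rust
/// For each table column, return the index of the supplied value that
/// fills it, or `None` when the column is omitted (NULL or auto rowid).
fn map_insert_columns(
    table: &str,
    table_columns: &[&str],
    insert_columns: Option<&[&str]>,
    value_count: usize,
) -> Result<Vec<Option<usize>>, String> {
    let mut mapping = vec![None; table_columns.len()];
    match insert_columns {
        // No column list: values map positionally and the counts must match.
        None => {
            if value_count != table_columns.len() {
                return Err(format!(
                    "table {} has {} columns but {} values were supplied",
                    table,
                    table_columns.len(),
                    value_count
                ));
            }
            for (i, slot) in mapping.iter_mut().enumerate() {
                *slot = Some(i);
            }
        }
        // Explicit column list: look each name up in the table definition.
        Some(names) => {
            if value_count != names.len() {
                // Assumed wording, not taken from this PR.
                return Err(format!("{} values for {} columns", value_count, names.len()));
            }
            for (value_idx, &name) in names.iter().enumerate() {
                let col_idx = table_columns
                    .iter()
                    .position(|&c| c.eq_ignore_ascii_case(name))
                    .ok_or_else(|| format!("table {} has no column named {}", table, name))?;
                mapping[col_idx] = Some(value_idx);
            }
        }
    }
    Ok(mapping)
}
```

For `insert into rowidalias_b (d) values ('d only')` on a table with columns `a, b, c, d`, the mapping is `[None, None, None, Some(0)]`, so the value lands in column `d` rather than `a`.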
---
Note that this PR doesn't fix e.g. #472, which requires creating an
index on the non-rowid primary key column(s), nor does it implement
rollback (e.g. inserting two rows where one fails due to a unique
constraint still inserts the other row)
---
**EXAMPLES OF ERRONEOUS BEHAVIOR -- current head of main:**
wrong column inserted
```
limbo> create table rowidalias_b (a, b INTEGER PRIMARY KEY, c, d);

limbo> insert into rowidalias_b (d) values ('d only');
limbo> select * from rowidalias_b;
d only|1||   <-- gets inserted into column a
```
wrong column inserted
```
limbo> create table textpk (a, b text primary key, c);
limbo> insert into textpk (a,b,c) values ('a','b','c');
limbo> select * from textpk;
a|b|c
limbo> insert into textpk (b,c) values ('b','c');
limbo> select * from textpk;
a|b|c
b|c|  <--- b gets inserted into column a
```
false positive from integer check due to attempting to insert wrong
column
```
limbo> create table rowidalias_b (a, b INTEGER PRIMARY KEY, c, d);
limbo> insert into rowidalias_b (a,c) values ('lol', 'bal');
Parse error: MustBeInt: the value in the register is not an integer  <-- tries to insert c into b column
```
false positive from integer check due to attempting to insert wrong
column
```
limbo> CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT,
    email TEXT
);
limbo> INSERT INTO users (name, email) VALUES ('John Doe', 'john@example.com');
Parse error: MustBeInt: the value in the register is not an integer.   <-- tries to insert name into id column
```
allows write of nonexistent column
```
limbo> create table a(b);
limbo> insert into a (nonexistent_col) values (1);
limbo> select * from a;
1
```
hangs forever when inserting too many values
```
limbo> create table a (b integer primary key);
limbo> insert into a values (1,2);  <-- spinloops forever at 100% cpu
```
unique constraint error on non-unique column
```
limbo> create table t1(v1 int);
limbo> insert into t1 values (1),(2);
Runtime error: UNIQUE constraint failed: t1.v1 (19)
```
**EXAMPLES OF CORRECT BEHAVIOR -- this branch:**
correct column inserted
```
limbo> create table rowidalias_b (a, b INTEGER PRIMARY KEY, c, d);
limbo> insert into rowidalias_b (d) values ('d only');
limbo> select * from rowidalias_b;
|1||d only
```
correct column inserted
```
limbo> create table textpk (a, b text primary key, c);
limbo> insert into textpk (a,b,c) values ('a','b','c');
limbo> select * from textpk;
a|b|c
limbo> insert into textpk (b,c) values ('b','c');
limbo> select * from textpk;
a|b|c
|b|c
```
correct columns inserted, PK autoincremented
```
limbo> create table rowidalias_b (a, b INTEGER PRIMARY KEY, c, d);
limbo> insert into rowidalias_b (a,c) values ('lol', 'bal');
limbo> select * from rowidalias_b;
lol|1|bal|
```
correct column inserted, PK autoincremented
```
limbo> CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT,
    email TEXT
);
limbo> INSERT INTO users (name, email) VALUES ('John Doe', 'john@example.com');
limbo> select * from users;
1|John Doe|john@example.com
```
reports parse error correctly about wrong number of values
```
limbo> create table a (b integer primary key);
limbo> insert into a values (1,2);
Parse error: table a has 1 columns but 2 values were supplied
```
reports parse error correctly about nonexistent column
```
limbo> create table a(b);
limbo> insert into a (nonexistent_col) values (1);
Parse error: table a has no column named nonexistent_col
```
no unique constraint error on non-unique column
```
limbo> create table t1(v1 int);
limbo> insert into t1 values (1),(2);
limbo> select * from t1;
1
2
```
**Also, added multi-row inserts to simulator and ran into at least
this:**
```
Seed: 9444323279823516485
path to db '"/var/folders/qj/r6wpj6657x9cj_1jx_62cpgr0000gn/T/.tmpcYczRv/simulator.db"'
Initial opts SimulatorOpts { ticks: 3474, max_connections: 1, max_tables: 79, read_percent: 61, write_percent: 12, delete_percent: 27, max_interactions: 2940, page_size: 4096 }
thread 'main' panicked at core/storage/sqlite3_ondisk.rs:332:36:
called `Result::unwrap()` on an `Err` value: Corrupt("Invalid page type: 83")
```

Closes #533
2024-12-27 10:00:37 +02:00
Peter Sooley
28244b10d6 implement json_array_length 2024-12-26 15:08:11 -08:00
jussisaurio
c7448d2917 no allocation for serial types 2024-12-26 12:23:53 +02:00
jussisaurio
80933a32e9 remove space allocated for overflow pointer in non-overflow cases 2024-12-25 23:09:23 +02:00
jussisaurio
381335724a add tests for serialize() 2024-12-25 22:57:55 +02:00
jussisaurio
6bf1ab7726 add consts for integer lo/hi values and serial types 2024-12-25 22:34:34 +02:00
jussisaurio
78da71c72a encode integers with proper varint types 2024-12-25 22:08:11 +02:00
jussisaurio
c4e2a344ae parse error instead of assert! for unsupported features 2024-12-25 21:14:58 +02:00
jussisaurio
050b8744ea Dont use coroutine when inserting a single row 2024-12-25 21:14:58 +02:00
jussisaurio
c78a3e952a clean up implementation 2024-12-25 21:14:58 +02:00
jussisaurio
fa5ca68eec Add multi-row insert to simulator 2024-12-25 21:14:55 +02:00
jussisaurio
51541dd8dc fix issues with insert 2024-12-25 21:14:08 +02:00
Pekka Enberg
548f66e1cd Merge 'fix empty range error when 0 interactions are produced by creating at least 1 interaction' from Alperen Keleş
Fixes the panicking case in
https://github.com/tursodatabase/limbo/issues/548

Closes #549
2024-12-25 19:45:09 +02:00
alpaylan
e49ba4f982 fix empty range error when 0 interactions are produced by creating at least 1 interaction 2024-12-25 09:55:28 -05:00
Pekka Enberg
37e1f35df8 Fix Cargo.toml in macros crate 2024-12-25 11:54:16 +02:00
Pekka Enberg
37445ee5a9 Merge 'simulator: Kill dead code' from Pekka Enberg
...the old maybe_add_table() codepath as it is not used.

Closes #547
2024-12-25 11:50:05 +02:00
Pekka Enberg
652283efc1 simulator: Kill dead code
...the old maybe_add_table() codepath as it is not used.
2024-12-25 10:41:05 +02:00
Pekka Enberg
f873493d62 Merge 'switch the seed, database path, and plan path prints to println instead of log::info' from Alperen Keleş
Fixes https://github.com/tursodatabase/limbo/issues/545

Closes #546
2024-12-25 10:19:09 +02:00
alpaylan
28ae691bf7 switch the seed, database path, and plan path prints to println instead of log::info 2024-12-25 03:04:57 -05:00
Pekka Enberg
49b235cc92 Merge 'core: wal transaction start' from Pere Diaz Bou
This PR adds support for multiple readers and a single writer with a
custom-made lock called `LimboRwLock`. There are 5 read locks, each of
which stores the max frame allowed in that "snapshot", and any reader
will try to acquire the biggest one possible. A writer will just try to
lock the `write_lock` and, if not successful, it will return busy.
The only checkpoint mode supported for now is `PASSIVE`, but it should be
trivial to add more modes.
This still needs testing, which I will do in a separate PR.
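One way to picture the scheme described above, as a sketch and not the actual `LimboRwLock`: five read slots each hold a max frame, a reader picks the slot with the largest mark that does not exceed the frame it may see, and the writer makes a single non-blocking attempt on the write lock. All type and method names are hypothetical, "biggest one possible" is read here as "largest mark not exceeding the current max frame", and the sketch only selects a read slot without taking a lock on it.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

const READ_SLOTS: usize = 5;

#[derive(Debug, PartialEq)]
enum LockResult {
    Acquired,
    Busy,
}

/// Hypothetical stand-in for the WAL lock state.
struct WalLocksSketch {
    /// Max frame visible through each read slot.
    read_marks: [AtomicU64; READ_SLOTS],
    write_lock: AtomicBool,
}

impl WalLocksSketch {
    fn new(marks: [u64; READ_SLOTS]) -> Self {
        Self {
            read_marks: marks.map(AtomicU64::new),
            write_lock: AtomicBool::new(false),
        }
    }

    /// Pick the slot with the largest mark that is still <= `max_frame`.
    fn best_read_slot(&self, max_frame: u64) -> Option<usize> {
        self.read_marks
            .iter()
            .enumerate()
            .map(|(i, m)| (i, m.load(Ordering::SeqCst)))
            .filter(|&(_, mark)| mark <= max_frame)
            .max_by_key(|&(_, mark)| mark)
            .map(|(i, _)| i)
    }

    /// Single attempt: report busy instead of waiting for the writer.
    fn try_write_lock(&self) -> LockResult {
        match self
            .write_lock
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
        {
            Ok(_) => LockResult::Acquired,
            Err(_) => LockResult::Busy,
        }
    }

    fn unlock_write(&self) {
        self.write_lock.store(false, Ordering::SeqCst);
    }
}
```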

Closes #544
2024-12-25 09:42:03 +02:00
Pekka Enberg
ffd1c725ee Merge 'Simulator improvements' from Alperen Keleş
This PR makes two small incremental updates:
1- It adds a Clap CLI for simulator configuration, using the same Clap
version as the Limbo cli crate
2- It creates a new submodule called `simulator`, moving simulator
related structs from the large main file into their own files.
I am open to suggestions on the submodule name instead of `simulator` as
it's kind of weird to have `simulator/simulator` in the file tree.

Closes #540
2024-12-25 09:41:17 +02:00
Pere Diaz Bou
93e3b49f08 bench 2024-12-25 00:25:23 +01:00
Pere Diaz Bou
5cd84a407f fmt 2024-12-24 18:42:58 +01:00
Pere Diaz Bou
a2921bd32c core: add checkpoint mode passive 2024-12-24 18:30:58 +01:00
jussisaurio
42ea9041e1 rename cell_get_raw_pointer_region() and refactor a bit 2024-12-24 19:27:01 +02:00
Pere Diaz Bou
3bce282352 respect max_frame on checkpoint 2024-12-24 18:18:17 +01:00
Pere Diaz Bou
aed14117c9 core: transaction support 2024-12-24 18:04:30 +01:00
jussisaurio
25338b5cb4 refactor compute_free_space() 2024-12-24 19:00:22 +02:00
jussisaurio
c6b7ddf77a Improve comments in BTreeCursor::compute_free_space() 2024-12-24 10:30:27 +02:00
jussisaurio
91cca0d5b7 use more descriptive names in BTreeCursor::insert_into_cell() 2024-12-24 10:28:53 +02:00
jussisaurio
a94d4ca8bc Merge 'core/btree: improve documentation' from Jussi Saurio
This PR should have no functional changes, just variable renaming and
comments
Using `///` comment format for better IDE support

Reviewed-by: Pere Diaz Bou <penberg@iki.fi>

Closes #539
2024-12-24 09:44:15 +02:00
alpaylan
2186b3973b change the name of the simulator submodule into runner 2024-12-23 16:16:39 -05:00
jussisaurio
3ab7f7a0b8 Merge 'Use custom expr equality check in translation and planning' from Preston Thorpe
Idk how I missed these during the initial PR 👍

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #541
2024-12-23 22:58:03 +02:00
jussisaurio
c727ed7e8a rename cell_start to cell_pointer_array_start, part 2: electric boogaloo 2024-12-23 22:31:35 +02:00
jussisaurio
17440393f5 rename cell_start to cell_pointer_array_start 2024-12-23 22:30:05 +02:00
jussisaurio
81526089a4 add comment about cell_get_raw_pointer_region() 2024-12-23 22:26:49 +02:00
jussisaurio
668a0ecae8 comment about page header size difference between page types 2024-12-23 22:18:22 +02:00
jussisaurio
9ea4c95ee1 even more comments 2024-12-23 22:07:20 +02:00
jussisaurio
6a287ae1a9 add comment about cell_content_area 0 value meaning u16::MAX 2024-12-23 21:33:57 +02:00
jussisaurio
40a0bef0dc better fixme comments 2024-12-23 21:19:18 +02:00
jussisaurio
c417fe7880 add link to sqlite source about payload_overflows() 2024-12-23 21:14:20 +02:00
Pekka Enberg
0a479a9a4e Merge 'Fix file creation in GenericIO open_file function' from Dezhi Wu
`cargo test` always fails on FreeBSD. The following is one of the
errors:
```
---- tests::test_simple_overflow_page stdout ----
thread 'tests::test_simple_overflow_page' panicked at test/src/lib.rs:32:84:
called `Result::unwrap()` on an `Err` value: IOError(Os { code: 2, kind: NotFound, message: "No such file or directory" })
```
After some digging, I found that the `open_file` function in
`core/io/generic.rs` does not respect the `OpenFlags::Create` flag. This
commit adds support for file creation in the `open_file` function.
`cargo test` now passes on FreeBSD.
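A sketch of the kind of change described, using `std::fs::OpenOptions`. The `OpenFlags` enum and the `open_file` signature here are simplified assumptions and not the definitions in `core/io/generic.rs`.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Simplified stand-in for the real flags type (assumption).
#[derive(Clone, Copy, PartialEq)]
enum OpenFlags {
    None,
    Create,
}

/// Open a file for reading and writing, creating it first only when
/// the caller passes `OpenFlags::Create`.
fn open_file(path: &Path, flags: OpenFlags) -> io::Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .create(flags == OpenFlags::Create)
        .open(path)
}
```

Without the `create` call, opening a path that does not exist yet fails with `NotFound`, which matches the error in the test output above.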

Closes #537
2024-12-23 16:15:11 +02:00
Pekka Enberg
58292c1a42 Merge 'UUID support' from Preston Thorpe
#509
Started the discussion on Discord about possibly supporting UUID types
natively. This PR only implements the `sqlean` extension's functions and
behavior, with the only changes being:
1. UUIDs are returned as `blob`s by default. (That was an assumption I
made considering perf, thinking this would be preferred if UUID ended up
being a supported native type. If `text` is preferred I can change it.)
2. `uuidv7` types here can accept an argument of seconds since epoch to
customize the embedded timestamp. The function
`uuidv7_timestamp_ms(string_or_blob_v7)` allows the user to convert
their uuid7 back into the timestamp.
![image](https://github.com/user-attachments/assets/ca53ee9b-f1f1-410b-955f-acd140bd4989)
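The timestamp round trip in item 2 relies on the UUIDv7 layout, where the first 48 bits hold the Unix time in milliseconds, big-endian. Below is a minimal sketch under that layout. Both function names are hypothetical, and the random bytes are passed in by the caller so the example needs no external crates.

```rust
/// Build a UUIDv7-shaped blob from seconds since the epoch plus
/// caller-supplied random bytes (hypothetical helper).
fn uuid7_from_parts(unix_seconds: u64, random: [u8; 10]) -> [u8; 16] {
    let ms = unix_seconds * 1000;
    let mut bytes = [0u8; 16];
    // The low 48 bits of the millisecond timestamp go first, big-endian.
    bytes[..6].copy_from_slice(&ms.to_be_bytes()[2..]);
    bytes[6..].copy_from_slice(&random);
    bytes[6] = (bytes[6] & 0x0F) | 0x70; // version 7 in the high nibble
    bytes[8] = (bytes[8] & 0x3F) | 0x80; // variant bits `10`
    bytes
}

/// Recover the embedded millisecond timestamp from a UUIDv7 blob.
fn uuid7_timestamp_ms(uuid: &[u8; 16]) -> u64 {
    let mut buf = [0u8; 8];
    buf[2..].copy_from_slice(&uuid[..6]);
    u64::from_be_bytes(buf)
}
```

Passing 1_735_000_000 seconds gives a blob from which `uuid7_timestamp_ms` recovers 1_735_000_000_000.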

Closes #518
2024-12-23 13:21:13 +02:00
alpaylan
4f07342fdc catch panics, add doublecheck 2024-12-22 23:25:35 -05:00
PThorpe92
fbf42458b8 Use custom expr equality check in translation and planner 2024-12-22 21:46:31 -05:00
alpaylan
833c75080b break up the simulator primitives into their own files in the simulator submodule 2024-12-22 17:16:50 -05:00
alpaylan
9f08b621ec add clap CLI for configuring the simulator 2024-12-22 17:06:46 -05:00