Commit Graph

3601 Commits

Author SHA1 Message Date
PThorpe92
ff02d74afb Fix update queries when limit is 0 2025-03-30 12:15:25 -04:00
PThorpe92
a88ce2a4b7 Correct comment on update plan 2025-03-30 12:15:25 -04:00
PThorpe92
3f4636196c Fix error message for non btree table 2025-03-30 12:15:25 -04:00
PThorpe92
bbbd1df1ab Replace unwrap in update translation with parse error 2025-03-30 12:15:24 -04:00
PThorpe92
4ac0781991 Add update tcl tests for LIMIT clauses on update queries 2025-03-30 12:15:24 -04:00
PThorpe92
7486149643 Support LIMIT clause on update queries 2025-03-30 12:15:24 -04:00
PThorpe92
b7fca31ef6 Add comments and impl Copy on iterdir type 2025-03-30 12:15:24 -04:00
PThorpe92
3fe14f37a5 Create plan for Update queries 2025-03-30 12:15:24 -04:00
Pekka Enberg
efd537dc20 Merge 'Allocation improvements with ImmutableRecord, OwnedRecord and read_record' from Pere Diaz Bou
This PR is huge again, but I will try to introduce each improvement one
by one.
## Overview
### Remove Rc for Text and Blob.
In general copying is bad, which is why we hid it behind `Rc`s. With the
introduction of `ImmutableRecord` this matters less, because now we copy
only once anyway; no other place should copy it, so we can avoid using
`Rc`. If we were to copy it, it most likely means we are doing something
wrong.
### Reuse `Text` and `Blob` OwnedValues.
Most queries spend their time overwriting the same register over and
over. Instead of allocating a new `OwnedValue` each time, we can simply
reuse the existing `OwnedValue` and extend its internal buffer. That's
what I did, and it worked quite nicely.
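The reuse idea can be sketched like this. This is a minimal sketch; the `OwnedValue` variants and the `set_text` method are illustrative stand-ins, not Limbo's actual definitions:

```rust
// Hypothetical sketch of reusing an existing Text buffer instead of
// allocating a fresh OwnedValue on every register overwrite.
#[derive(Debug, PartialEq)]
enum OwnedValue {
    Text(String),
    Blob(Vec<u8>),
    Integer(i64),
}

impl OwnedValue {
    /// Overwrite this value with new text, reusing the existing
    /// allocation when the variant already holds a Text.
    fn set_text(&mut self, s: &str) {
        match self {
            OwnedValue::Text(buf) => {
                buf.clear();     // keeps capacity, drops contents
                buf.push_str(s); // grows only if s exceeds capacity
            }
            other => *other = OwnedValue::Text(s.to_string()),
        }
    }
}
```

Repeated calls to `set_text` then amortize the allocation: the `String`'s capacity survives across overwrites and only grows when a longer value arrives.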
### Make `Register::Record` be `ImmutableRecord`
`ImmutableRecord` basically means "serialized record", which is why all
the data is contained in a single payload buffer. A list of values
references into that payload to reduce lookup cost -- there is an
argument for building a record without this vec to reduce memory
footprint. I don't think this improvement had a direct impact on
performance, but it is a simpler way to lay out the memory: a contiguous
piece of memory instead of complicated reference-counted pointers.
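A rough sketch of that layout, with field names assumed for illustration: one contiguous payload buffer plus a side table of `(offset, len)` references into it for cheap column lookup.

```rust
// Illustrative layout only; Limbo's actual ImmutableRecord differs.
struct ImmutableRecord {
    payload: Vec<u8>,            // full serialized record, contiguous
    values: Vec<(usize, usize)>, // (offset, len) of each column in payload
}

impl ImmutableRecord {
    /// Borrow column i's bytes directly out of the payload; no copy.
    fn column(&self, i: usize) -> Option<&[u8]> {
        let (off, len) = *self.values.get(i)?;
        Some(&self.payload[off..off + len])
    }
}
```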
### Make `ImmutableRecord` reusable in `BTreeCursor`.
`BTreeCursor` used to allocate and deallocate records whenever it needed
a new one. This is an obvious waste, because we could reuse the internal
buffer and avoid the allocations. `ImmutableRecord` proved useful here:
the cursor now stores a single `ImmutableRecord` that we never
deallocate -- we just reallocate when needed and replace the current
record with the next one on demand.
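The reuse pattern in the cursor might look like this minimal sketch; the real `BTreeCursor` is far more involved, and the `Cursor` type here is a stand-in:

```rust
// One buffer per cursor, reused across rows; it only reallocates when a
// row is longer than anything seen before.
struct Cursor {
    record: Vec<u8>,
}

impl Cursor {
    fn load(&mut self, next: &[u8]) {
        self.record.clear();                 // keeps capacity
        self.record.extend_from_slice(next); // grows only on demand
    }
}
```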
### Return `Row` as a reference to Registers.
A `ResultRow` bytecode instruction gathers all the columns of a row and
returns them to the user. Previously we would create a new `Record`
struct with all the values cloned, which proved wasteful. SQLite is
smart about this, so we must be as well. A row is now a wrapper around
`struct Row { values: *const Register, count: usize }`, plus some QOL
methods to avoid using the pointers directly. I know pointers are
unsafe; that's why the row is invalidated on the next step of the VM and
must not be used beyond that point.
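A hedged sketch of that wrapper, with `Register` as a stand-in type; the safety comment mirrors the lifetime rule described above:

```rust
#[derive(Debug, PartialEq)]
enum Register {
    Integer(i64),
    Null,
}

// A borrowed view over the VM's registers: no cloning, just a pointer
// and a count. It must not outlive the VM step that produced it.
struct Row {
    values: *const Register,
    count: usize,
}

impl Row {
    /// QOL accessor hiding the raw pointer arithmetic.
    fn get(&self, i: usize) -> Option<&Register> {
        if i < self.count {
            // SAFETY: caller guarantees the registers outlive this Row.
            Some(unsafe { &*self.values.add(i) })
        } else {
            None
        }
    }
}
```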
### Inlining go brrr
`read_varint` and `read_value` are called in a tight loop, which makes
the call-stack overhead easy to see. That's why I sprinkled some
`#[inline(always)]` and saw roughly a 15% speed boost.
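For illustration, here is a simplified SQLite-style varint decoder with the attribute applied. This is not Limbo's actual `read_varint`, just a sketch of the technique: high bit set means "more bytes follow", 7 payload bits per byte, with the 9th byte contributing all 8 bits.

```rust
#[inline(always)] // removes call overhead in the hot decode loop
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut v: u64 = 0;
    for (i, &b) in buf.iter().take(9).enumerate() {
        if i == 8 {
            // 9th byte: all 8 bits are payload in SQLite's encoding
            return Some(((v << 8) | b as u64, 9));
        }
        v = (v << 7) | (b & 0x7f) as u64;
        if b & 0x80 == 0 {
            return Some((v, i + 1));
        }
    }
    None // truncated input
}
```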
### read_record with a custom `SmallVec<T>`
We tend to overuse vectors for everything, which is bad because it
requires heap allocations. We can avoid this with a simple `SmallVec`
that falls back to a `Vec` in the more complex scenarios.
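A minimal sketch of such a fallback container; Limbo's real `SmallVec<T>` differs in detail, and this version trades space (an `Option` per slot) for simplicity:

```rust
// Stack storage for up to N items, spilling to a heap Vec beyond that.
enum SmallVec<T, const N: usize> {
    Inline { buf: [Option<T>; N], len: usize },
    Heap(Vec<T>),
}

impl<T, const N: usize> SmallVec<T, N> {
    fn new() -> Self {
        SmallVec::Inline { buf: std::array::from_fn(|_| None), len: 0 }
    }

    fn push(&mut self, item: T) {
        match self {
            SmallVec::Inline { buf, len } if *len < N => {
                buf[*len] = Some(item);
                *len += 1;
            }
            SmallVec::Inline { buf, len } => {
                // Spill: move the inline items to the heap, then append.
                let mut spilled: Vec<T> =
                    buf.iter_mut().take(*len).map(|s| s.take().unwrap()).collect();
                spilled.push(item);
                *self = SmallVec::Heap(spilled);
            }
            SmallVec::Heap(v) => v.push(item),
        }
    }

    fn len(&self) -> usize {
        match self {
            SmallVec::Inline { len, .. } => *len,
            SmallVec::Heap(v) => v.len(),
        }
    }
}
```

For record reads, the common case (few columns) never touches the heap.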
## Benchmarks!
### before
```
fun/limbo » cargo bench -- limbo_execute 2>&1 | grep -B 1 "time: " | tee out.log
Execute `SELECT 1`/limbo_execute_select_1
                        time:   [43.958 ns 44.056 ns 44.154 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [407.82 ns 408.57 ns 409.41 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [2.7335 µs 2.7386 µs 2.7443 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [13.451 µs 13.485 µs 13.520 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [26.967 µs 27.077 µs 27.201 µs]
```
### after
```
fun/limbo (more-register) » cargo bench -- limbo_execute 2>&1 | grep -B 1 "time: " | tee out.log
Execute `SELECT 1`/limbo_execute_select_1
                        time:   [33.386 ns 33.440 ns 33.510 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [326.79 ns 327.37 ns 328.03 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [1.5817 µs 1.5849 µs 1.5889 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [7.3295 µs 7.3531 µs 7.3829 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [14.538 µs 14.570 µs 14.606 µs]
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1197
2025-03-30 13:17:16 +03:00
Pere Diaz Bou
578bc9e3e6 extract constant min_header_size 2025-03-30 11:12:11 +02:00
Pere Diaz Bou
8d74f4b8ab remove unnecessary partial ord 2025-03-30 11:07:23 +02:00
Pere Diaz Bou
3899f8ca17 comment header size 2025-03-30 11:03:45 +02:00
Pere Diaz Bou
541b67bd2b rename get_lazy_immutable_record -> get_immutable_record_or_create 2025-03-30 11:00:59 +02:00
Pere Diaz Bou
6ccb2e16d1 safer api for ImmutableRecord recreation 2025-03-30 11:00:13 +02:00
Pere Diaz Bou
f2f6173670 assert capacity didn't change 2025-03-30 10:37:58 +02:00
Pere Diaz Bou
3ac1795c25 fix from_register serialization 2025-03-30 10:31:39 +02:00
Pere Diaz Bou
587cdac2c1 ignore sequential write because it takes too long 2025-03-29 22:26:29 +01:00
Pere Diaz Bou
37ddf0946f revert testing.db change 2025-03-29 22:09:53 +01:00
Pere Diaz Bou
a13b33fec9 clippy again 2025-03-29 22:07:43 +01:00
Pere Diaz Bou
d9f5cd870d clippy 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
4a9c4cff02 fix comparison of immutable records in seekgt 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
9623cce986 push null refvalue too 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
34c8fd7e6c fix serial_type write 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
1bfec65f23 remove dbg 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
e504262bd5 fix rebase 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
105b421274 make read_record, read_varint and read_value faster
We make read_record faster by not allocating a Vec when it isn't needed.
That is why I introduced a simple `SmallVec<T>`: a stack-allocated list
for the simplest workloads, falling back to a heap allocation when more
space is required.

Both read_varint and read_value, at least on my Mac M4, were not being
inlined. Since these functions are called so many times, it made sense
to inline them and avoid the call overhead. With this I saw roughly a
20% improvement over the previous commit on my M4.
2025-03-29 22:04:08 +01:00
Pere Diaz Bou
3317195a53 Reusable ImmutableRecord -> allocation reduction
Improve allocation behavior of ImmutableRecords by reusing them.
An ImmutableRecord is basically a contiguous piece of memory that holds
the current record. When we move to another record, we usually
deallocate the previous one and allocate a new one -- obviously
wasteful. With this commit we reuse the ImmutableRecord, extending its
payload when needed or reusing it when we can, which makes iterating
records faster.
2025-03-29 22:04:08 +01:00
Pere Diaz Bou
ee55116ca6 return row as reference to registers 2025-03-29 22:04:08 +01:00
Pere Diaz Bou
5b7fcd27bd make column reuse blob/text fields 2025-03-29 22:02:49 +01:00
Pere Diaz Bou
78e9f1c09a append writer 2025-03-29 22:02:49 +01:00
Pere Diaz Bou
bf37fd3314 wip 2025-03-29 22:02:49 +01:00
Pekka Enberg
4ee60348f2 Merge 'Fixes probably all floating point math issues and floating point display issues.' from Ihor Andrianov
Closes #1206
Closes #447
Closes #1117
Issues:
 1. Rust's floating point math functions are non-deterministic.
 2. SQLite has complex floating point display rules.
I switched the Rust std math functions to libm for the math ops and
implemented the Display trait for OwnedValue::Float. A lot of the float
formatting, which SQLite probably inherits from C, had to be
handcrafted.
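As a rough illustration of the display rules (not the actual `Display` impl, which handles many more cases), here is a sketch covering two of the behaviors mentioned: NaN results rendering as NULL, and whole-number floats printing with a trailing `.0` the way SQLite shows them:

```rust
// Hypothetical helper; the real code implements Display for
// OwnedValue::Float and covers precision, exponents, rounding, etc.
fn display_float(f: f64) -> String {
    if f.is_nan() {
        return "NULL".to_string(); // SQLite treats NaN results as NULL
    }
    if f == f.trunc() && f.abs() < 1e15 {
        return format!("{:.1}", f); // e.g. "2.0", not "2"
    }
    format!("{}", f)
}
```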

Closes #1208
2025-03-29 17:33:53 +02:00
Ihor Andrianov
09b55a9b41 fix nan to return null 2025-03-29 14:47:08 +02:00
Ihor Andrianov
8b9f34af71 fix tests and return nan as null 2025-03-29 14:46:11 +02:00
Ihor Andrianov
922945e819 all output should go through display 2025-03-29 12:46:06 +02:00
Ihor Andrianov
6e04709c4f add comprehensive float display trait 2025-03-29 12:44:34 +02:00
Ihor Andrianov
303f1b3749 replace std math functions with libm for compat 2025-03-29 12:16:14 +02:00
Pere Diaz Bou
28904e74ae Merge 'Make BTreeCell/read_payload not allocate any data + overflow fixes' from Pere Diaz Bou
This PR has two parts:
## 1. Make reading cells faster
My benchmark was simple: run the `test_sequential_write` test and see
how long it takes. It is a nice benchmark because it writes a new row
and then runs `select * from t` after every write to check the contents.
### From:
```bash
cargo test test_sequential_write --release -- --nocapture --test-threads=1   7.21s user 0.34s system 88% cpu 8.527 total
```
### To:
```bash
cargo test test_sequential_write --release -- --nocapture --test-threads=1  3.14s user 0.31s system 82% cpu 4.161 total
```
## 2. Fix reading overflow pages.
The code that read overflow pages was wrong: `read_payload` would try to
read as many overflow pages as possible, but if they weren't in the
cache it would return `IO` and never read any more pages after that.
With this PR, every record read checks whether it needs to read from
overflow pages and, if not, simply reads from the current page.
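The shape of the fix can be sketched as a resumable read over the overflow chain; every type and name below is a stand-in for illustration, not Limbo's actual `read_payload`:

```rust
#[derive(Debug, PartialEq)]
enum ReadResult {
    Done,
    Io, // a page was not cached yet; retry once it is loaded
}

struct PayloadReader {
    remaining: Vec<u32>, // overflow page ids still to read
    payload: Vec<u8>,
}

impl PayloadReader {
    /// Returns Io when a page misses the cache; calling again later
    /// resumes from the next unread page instead of starting over.
    fn read(&mut self, cache: &dyn Fn(u32) -> Option<Vec<u8>>) -> ReadResult {
        while let Some(&page) = self.remaining.first() {
            match cache(page) {
                Some(bytes) => {
                    self.payload.extend_from_slice(&bytes);
                    self.remaining.remove(0); // progress survives an Io return
                }
                None => return ReadResult::Io,
            }
        }
        ReadResult::Done
    }
}
```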

Closes #1141
2025-03-28 12:25:34 +01:00
Pere Diaz Bou
cb85ba8e82 fix extensions.py test 2025-03-28 11:58:03 +01:00
Pekka Enberg
704b4d3baf Merge 'JavaScript binding improvements' from Pekka Enberg
Closes #1204
2025-03-28 12:39:43 +02:00
Pere Diaz Bou
c5b718ac32 fix review comments 2025-03-28 11:30:19 +01:00
Pere Diaz Bou
01cdcf719c remove ignored from sequential tests 2025-03-28 11:30:19 +01:00
Pere Diaz Bou
83d0c9a1b6 fix read overflow page procedure 2025-03-28 11:30:19 +01:00
Pere Diaz Bou
f51c20adf0 read overflow pages on demand 2025-03-28 11:12:43 +01:00
Pere Diaz Bou
dc8acf1a4a cell_get no allocations 2025-03-28 11:12:27 +01:00
Pekka Enberg
94262e4660 bindings/javascript: Fix Statement.get() implementation 2025-03-28 11:32:55 +02:00
Pekka Enberg
7348eb0aa1 bindings/javascript: Add better-sqlite3 tests 2025-03-28 11:32:55 +02:00
Pekka Enberg
11d1dcf31a bindings/javascript: Run tests in parallel 2025-03-28 10:38:40 +02:00
Pekka Enberg
387b68fc06 Merge 'Expose 'Explain' to prepared statement to allow for alternate Writer ' from Preston Thorpe
### The problem:
I often need to copy the output of an `Explain` statement to my
clipboard. Currently this is not possible, because it only writes to
stdout. For all other limbo output I can run `.output file` in the CLI,
enter my query, and in another tmux pane simply `cat file | xclip -in
-selection clipboard`.
### The solution:
Expose a `statement.explain()` method that returns the query explanation
as a string. If the user calls something like `execute` instead of
`prepare`, it defaults to `stdout` as expected, but this lets the user
access the query plan on the prepared statement and do with it what they
please.
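A hypothetical usage sketch: only the "explain() returns a String" shape comes from the PR text; the `Statement` type and the plan text here are placeholders.

```rust
// Placeholder Statement; the real one holds a compiled bytecode program.
struct Statement {
    sql: String,
}

impl Statement {
    /// Return the query explanation as a String instead of printing it,
    /// so callers can send it anywhere: a file, the clipboard, a test.
    fn explain(&self) -> String {
        // The real implementation renders the bytecode program; stubbed here.
        format!("EXPLAIN {}", self.sql)
    }
}
```

A caller can then do `std::fs::write("plan.txt", stmt.explain())` rather than scraping stdout.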

Closes #1166
2025-03-28 09:55:58 +02:00
Pekka Enberg
8caa234df3 simulator: Reduce info-level logging
Make the simulator less noisy for casual runs.
2025-03-28 08:10:57 +02:00