Pekka Enberg efd537dc20 Merge 'Allocation improvements with ImmutableRecord, OwnedRecord and read_record' from Pere Diaz Bou
This PR is huge again, but I will try to introduce each improvement one
by one.
## Overview
### Remove Rc for Text and Blob.
In general, copying is bad; that's why we hid it behind `Rc`s. With the
introduction of `ImmutableRecord` this matters less, because now we copy
the data only once and no other place should copy it, so we can avoid
`Rc` altogether. If we were to copy it again, it most likely means we
are doing something wrong.
### Reuse `Text` and `Blob` OwnedValues.
Most queries spend their time overwriting the same registers over and
over. Instead of allocating a new `OwnedValue` each time, we now reuse
the existing `OwnedValue` and extend its internal buffer. That's what I
did and it worked quite nicely.
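
A minimal sketch of the reuse idea, not the code from this PR. It assumes a simplified stand-in for `OwnedValue` with only three variants, and the `set_text` / `set_blob` helpers are hypothetical names chosen for illustration:

```rust
// Hypothetical, simplified stand-in for the PR's `OwnedValue`.
#[derive(Debug, PartialEq)]
enum OwnedValue {
    Null,
    Text(String),
    Blob(Vec<u8>),
}

impl OwnedValue {
    /// Overwrite this value with `new_text`. If the register already
    /// holds `Text`, its existing allocation is reused.
    fn set_text(&mut self, new_text: &str) {
        match self {
            OwnedValue::Text(buf) => {
                buf.clear(); // keeps the capacity
                buf.push_str(new_text); // grows only if capacity is too small
            }
            other => *other = OwnedValue::Text(new_text.to_owned()),
        }
    }

    /// Same idea for blobs (hypothetical helper).
    fn set_blob(&mut self, new_blob: &[u8]) {
        match self {
            OwnedValue::Blob(buf) => {
                buf.clear();
                buf.extend_from_slice(new_blob);
            }
            other => *other = OwnedValue::Blob(new_blob.to_vec()),
        }
    }
}
```

When a register is overwritten with a value that fits in the existing capacity, no allocation happens at all; only a change of variant or a larger value allocates.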
### Make `Register::Record` be `ImmutableRecord`
`ImmutableRecord` basically means "serialized record", which is why all
the data is contained in a single payload buffer. A list of values
references that payload to reduce the time complexity of lookups; there
is an argument for building a record without this vec to reduce the
memory footprint. I don't think this change had a direct impact on
performance, but it is a simpler way to lay out the memory: instead of
complicated reference-counted pointers, we use one contiguous piece of
memory.
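
To illustrate the layout, here is a sketch of a record that keeps every value in one payload buffer plus an index of spans into it. The type and method names (`Record`, `Kind`, `ValueRef`, `push_int`, `push_text`, `get`) and the 8-byte integer encoding are assumptions made for this example; they are not the actual `ImmutableRecord` definition:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Int,
    Text,
}

/// A borrowed view of one value inside the payload.
#[derive(Debug, PartialEq)]
enum ValueRef<'a> {
    Int(i64),
    Text(&'a str),
}

/// Hypothetical record: one contiguous buffer plus (kind, start, len) spans.
struct Record {
    payload: Vec<u8>,
    index: Vec<(Kind, usize, usize)>,
}

impl Record {
    fn new() -> Self {
        Record { payload: Vec::new(), index: Vec::new() }
    }

    fn push_int(&mut self, v: i64) {
        let start = self.payload.len();
        self.payload.extend_from_slice(&v.to_be_bytes());
        self.index.push((Kind::Int, start, 8));
    }

    fn push_text(&mut self, s: &str) {
        let start = self.payload.len();
        self.payload.extend_from_slice(s.as_bytes());
        self.index.push((Kind::Text, start, s.len()));
    }

    /// Constant-time lookup through the index; no per-value allocation.
    fn get(&self, i: usize) -> Option<ValueRef<'_>> {
        let (kind, start, len) = *self.index.get(i)?;
        let bytes = &self.payload[start..start + len];
        match kind {
            Kind::Int => {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(bytes); // Int spans are always 8 bytes
                Some(ValueRef::Int(i64::from_be_bytes(raw)))
            }
            Kind::Text => std::str::from_utf8(bytes).ok().map(ValueRef::Text),
        }
    }
}
```

Dropping the `index` vector, as the paragraph above suggests, would save memory at the cost of re-scanning the payload on every lookup.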
### Make `ImmutableRecord` reusable in `BTreeCursor`.
`BTreeCursor` used to allocate and deallocate a record every time it
needed a new one. This is a big waste because we could be reusing the
internal buffer to avoid allocations. `ImmutableRecord` proved useful
here: the cursor now stores a single `ImmutableRecord` that is never
deallocated; we only reallocate when needed and overwrite the current
record with the next one on demand.
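
A rough sketch of what "one record per cursor" could look like. `RecordBuf`, `Cursor` and `load` are hypothetical names, and the real cursor reads cells from B-tree pages rather than from a slice passed in by the caller:

```rust
/// Hypothetical stand-in for a serialized record owned by the cursor.
struct RecordBuf {
    payload: Vec<u8>,
}

struct Cursor {
    // One long-lived record; it is overwritten, never dropped, while
    // the cursor is alive.
    current: RecordBuf,
}

impl Cursor {
    fn new() -> Self {
        Cursor { current: RecordBuf { payload: Vec::new() } }
    }

    /// Copy the next cell into the existing record. The buffer grows
    /// only when `cell` is larger than anything seen before.
    fn load(&mut self, cell: &[u8]) -> &[u8] {
        self.current.payload.clear();
        self.current.payload.extend_from_slice(cell);
        &self.current.payload
    }
}
```

After the first few rows the buffer has usually reached its steady-state size, so iterating over a table stops allocating.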
### Return `Row` as a reference to registers.
A `ResultRow` bytecode instruction takes care of gathering all the
columns of a row and returning them to the user. Previously we would
create a new `Record` struct with all the values cloned, which proved to
be wasteful. SQLite is smart about this, so we must be as well. A row is
now a wrapper for `struct Row { *const Register, count: usize }`, plus
some QOL methods to avoid using pointers directly.
I know pointers are unsafe. That's why this row is invalidated on the
next step of the VM and should not be used beyond that point.
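
A sketch of such a wrapper, under stated assumptions: the field name `values`, the `Register` variants, and the `from_registers` / `len` / `get` methods are invented for this example, since the description above only gives the field types.

```rust
#[derive(Debug, PartialEq)]
enum Register {
    Null,
    Integer(i64),
    Text(String),
}

/// A row is just a pointer into the VM's register file plus a count.
/// Nothing is cloned.
struct Row {
    values: *const Register, // field name is an assumption
    count: usize,
}

impl Row {
    /// Build a row that views `regs` without copying them.
    fn from_registers(regs: &[Register]) -> Self {
        Row { values: regs.as_ptr(), count: regs.len() }
    }

    fn len(&self) -> usize {
        self.count
    }

    /// Bounds-checked access, so callers never touch the raw pointer.
    /// Assumed contract: the row is only used before the VM's next step,
    /// while the registers it points to are still alive and unchanged.
    fn get(&self, idx: usize) -> Option<&Register> {
        if idx >= self.count {
            return None;
        }
        // SAFETY: idx < count, and the contract above keeps the
        // underlying registers valid for the duration of this borrow.
        Some(unsafe { &*self.values.add(idx) })
    }
}
```

The compiler cannot check that contract here; a lifetime-parameterised `&[Register]` would let it do so, at the cost of threading the lifetime through the API.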
### Inlining go brrr
`read_varint` and `read_value` are called in a tight loop, which makes
the function-call overhead easy to see. That's why I sprinkled some
`#[inline(always)]` and saw something like a 15% speed boost.
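
For reference, this is roughly what an inlined varint decoder looks like. It follows the SQLite varint encoding (big-endian, 7 bits per byte for the first eight bytes, all 8 bits of the ninth); the signature returning `Option<(u64, usize)>` is an assumption for this sketch and may differ from the function in the codebase:

```rust
/// Decode a SQLite-style varint from the start of `buf`.
/// Returns the value and the number of bytes consumed, or `None`
/// if `buf` ends in the middle of a varint.
#[inline(always)]
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for i in 0..8 {
        let byte = *buf.get(i)?;
        // Low 7 bits carry data; the high bit means "more bytes follow".
        value = (value << 7) | u64::from(byte & 0x7f);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    // A ninth byte, if reached, contributes all 8 of its bits.
    let byte = *buf.get(8)?;
    Some(((value << 8) | u64::from(byte), 9))
}
```

`#[inline(always)]` is a strong hint rather than a free win: it removes call overhead in hot loops like record parsing but can increase code size, so it is worth confirming with a benchmark as was done here.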
### `read_record` with custom `SmallVec<T>`
We tend to overuse vectors for everything, which is quite bad because
they require heap allocations. We can avoid this with a simple
`SmallVec` that falls back to a `Vec` only in more complex scenarios.
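
One possible shape for such a type, as a sketch only. The field names and the `Copy + Default` bound are simplifications made for this example, and the `SmallVec` in the PR may be laid out differently:

```rust
/// Stores the first `N` elements inline and spills the rest to the heap.
struct SmallVec<T: Copy + Default, const N: usize> {
    inline: [T; N],
    len: usize,    // number of elements stored inline
    spill: Vec<T>, // elements past the first N; an empty Vec does not allocate
}

impl<T: Copy + Default, const N: usize> SmallVec<T, N> {
    fn new() -> Self {
        SmallVec { inline: [T::default(); N], len: 0, spill: Vec::new() }
    }

    fn push(&mut self, value: T) {
        if self.len < N {
            self.inline[self.len] = value;
            self.len += 1;
        } else {
            // Fallback path: only records with more than N values pay
            // for a heap allocation.
            self.spill.push(value);
        }
    }

    fn len(&self) -> usize {
        self.len + self.spill.len()
    }

    fn get(&self, i: usize) -> Option<T> {
        if i < self.len {
            Some(self.inline[i])
        } else {
            self.spill.get(i - self.len).copied()
        }
    }

    fn spilled(&self) -> bool {
        !self.spill.is_empty()
    }
}
```

With `N` chosen to cover the column count of typical rows, reading a record never allocates on the common path.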
## Benchmarks!
### before
```
fun/limbo » cargo bench -- limbo_execute 2>&1 | grep -B 1 "time: " | tee out.log
Execute `SELECT 1`/limbo_execute_select_1
                        time:   [43.958 ns 44.056 ns 44.154 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [407.82 ns 408.57 ns 409.41 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [2.7335 µs 2.7386 µs 2.7443 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [13.451 µs 13.485 µs 13.520 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [26.967 µs 27.077 µs 27.201 µs]
```
### after
```
fun/limbo (more-register) » cargo bench -- limbo_execute 2>&1 | grep -B 1 "time: " | tee out.log
Execute `SELECT 1`/limbo_execute_select_1
                        time:   [33.386 ns 33.440 ns 33.510 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/1
                        time:   [326.79 ns 327.37 ns 328.03 ns]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/10
                        time:   [1.5817 µs 1.5849 µs 1.5889 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/50
                        time:   [7.3295 µs 7.3531 µs 7.3829 µs]
--
Execute `SELECT * FROM users LIMIT ?`/limbo_execute_select_rows/100
                        time:   [14.538 µs 14.570 µs 14.606 µs]
```
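
To read the two runs side by side, the speedup per benchmark is simply before divided by after. The small helper below does that arithmetic on the middle value of each `time:` line above; the names `MEDIANS`, `speedup` and `report` are made up for this sketch:

```rust
/// (benchmark, before, after), using the middle value of each `time:` line.
/// Both numbers in a row share the same unit.
const MEDIANS: [(&str, f64, f64); 5] = [
    ("select_1 (ns)", 44.056, 33.440),
    ("select_rows/1 (ns)", 408.57, 327.37),
    ("select_rows/10 (µs)", 2.7386, 1.5849),
    ("select_rows/50 (µs)", 13.485, 7.3531),
    ("select_rows/100 (µs)", 27.077, 14.570),
];

/// How many times faster the "after" run is.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// One formatted line per benchmark.
fn report() -> Vec<String> {
    MEDIANS
        .iter()
        .map(|&(name, before, after)| format!("{}: {:.2}x", name, speedup(before, after)))
        .collect()
}
```

The gain grows with the number of rows returned, which is consistent with the per-row allocation savings described above.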

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1197
2025-03-30 13:17:16 +03:00

Project Limbo

Limbo is a project to build the modern evolution of SQLite.

Chat with developers on Discord


Features and Roadmap

Limbo is a work-in-progress, in-process OLTP database engine library written in Rust that has:

  • Asynchronous I/O support on Linux with io_uring
  • SQLite compatibility [doc] for SQL dialect, file formats, and the C API
  • Language bindings for JavaScript/WebAssembly, Rust, Go, Python, and Java
  • OS support for Linux, macOS, and Windows

In the future, we will also be working on:

  • Integrated vector search for embeddings and vector similarity.
  • BEGIN CONCURRENT for improved write throughput.
  • Improved schema management including better ALTER support and strict column types by default.

Getting Started

💻 Command Line
You can install the latest `limbo` release with:
curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/tursodatabase/limbo/releases/latest/download/limbo_cli-installer.sh | sh

Then launch the shell to execute SQL statements:

Limbo
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
limbo> CREATE TABLE users (id INT PRIMARY KEY, username TEXT);
limbo> INSERT INTO users VALUES (1, 'alice');
limbo> INSERT INTO users VALUES (2, 'bob');
limbo> SELECT * FROM users;
1|alice
2|bob

You can also build and run the latest development version with:

cargo run
JavaScript
npm i limbo-wasm

Example usage:

import { Database } from 'limbo-wasm';

const db = new Database('sqlite.db');
const stmt = db.prepare('SELECT * FROM users');
const users = stmt.all();
console.log(users);
🐍 Python
pip install pylimbo

Example usage:

import limbo

con = limbo.connect("sqlite.db")
cur = con.cursor()
res = cur.execute("SELECT * FROM users")
print(res.fetchone())
🐹 Go
  1. Clone the repository
  2. Build the library and set your LD_LIBRARY_PATH to include limbo's target directory
cargo build --package limbo-go
export LD_LIBRARY_PATH=/path/to/limbo/target/debug:$LD_LIBRARY_PATH
  3. Use the driver
go get github.com/tursodatabase/limbo
go install github.com/tursodatabase/limbo

Example usage:

package main

import (
    "database/sql"
    "fmt"

    _ "github.com/tursodatabase/limbo"
)

func main() {
    // Errors are ignored here for brevity; check them in real code.
    conn, _ := sql.Open("sqlite3", "sqlite.db")
    defer conn.Close()

    stmt, _ := conn.Prepare("select * from users")
    defer stmt.Close()

    rows, _ := stmt.Query()
    for rows.Next() {
        var id int
        var username string
        _ = rows.Scan(&id, &username)
        fmt.Printf("User: ID: %d, Username: %s\n", id, username)
    }
}
Java

We integrated Limbo into JDBC. For detailed instructions on how to use Limbo with Java, please refer to the README.md under bindings/java.

Contributing

We'd love to have you contribute to Limbo! Please check out the contribution guide to get started.

FAQ

How is Limbo different from Turso's libSQL?

Limbo is a project to build the modern evolution of SQLite in Rust, with a strong open contribution focus and features like native async support, vector search, and more. The libSQL project is also an attempt to evolve SQLite in a similar direction, but through a fork rather than a rewrite.

Rewriting SQLite in Rust started as an unassuming experiment and, due to its incredible success, replaces libSQL as our intended direction. At this point, libSQL is production ready and Limbo is not, although it is evolving rapidly. As the project starts to near production readiness, we plan to rename it to just "Turso". More details here.

Publications

  • Pekka Enberg, Sasu Tarkoma, Jon Crowcroft, and Ashwin Rao (2024). Serverless Runtime / Database Co-Design With Asynchronous I/O. In EdgeSys 24. [PDF]
  • Pekka Enberg, Sasu Tarkoma, and Ashwin Rao (2023). Towards Database and Serverless Runtime Co-Design. In CoNEXT-SW 23. [PDF] [Slides]

License

This project is licensed under the MIT license.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Limbo by you, shall be licensed as MIT, without any additional terms or conditions.
