mirror of https://github.com/aljazceru/turso.git synced 2025-12-27 13:04:20 +01:00


Jussi Saurio d125daf1f2 Merge 'Use more structured approach in translate_insert' from Jussi Saurio

Closes #2686
**Note**: this PR also incorporates #2700, which I cannot merge
separately because it reveals the bug described in #2686, which nothing
on `main` currently detects.
_(Also as background, the way this all started was: I was trying to
enable `UNIQUE` and `PRIMARY KEY` usage in `simulator` in #2641, but
couldn't because it would constantly fail due to #2686.)_
---
I'll admit this PR went in a different direction than I had envisioned.
I was trying to debug and minimally fix #2686, but was finding it
increasingly hard to understand the flow of `translate_insert` and fix
this specific issue without breaking something else. So, I thought it
might benefit from restructuring.
---
## Functional changes
- Fixes #2686.
    * If an index contained a `rowid` alias column, we were inserting
`NULL` into the index instead of the actual integer value.
    * The root cause of this is that SQLite does insert `NULL` in place
of the rowid alias column into the table, presumably to save space.
    * This is not a problem for tables as `SELECT`ing the rowid alias
column will always be mapped into `Insn::Rowid`, but it is a major
problem for indexes as index lookups will never find anything.
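As a rough illustration of the bug, here is a toy model in plain Rust. It is not the engine's code: `Value`, `StoredRow`, `index_key_buggy`, and `index_key_fixed` are hypothetical names invented for this sketch, and it only assumes what is described above, namely that the table record holds `NULL` for the rowid alias column while the real value lives in the rowid.
```rust
/// A stored value in this toy model: either SQL NULL or an integer.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Null,
    Integer(i64),
}

/// A table row as stored: the rowid is the key, and the record holds the
/// column values. The rowid alias column is stored as NULL in the record.
struct StoredRow {
    rowid: i64,
    record: Vec<Value>,
}

/// Buggy index key: copies the column straight out of the record,
/// so a rowid alias column yields NULL.
fn index_key_buggy(row: &StoredRow, col: usize) -> Value {
    row.record[col]
}

/// Fixed index key: substitutes the actual rowid for the alias column.
fn index_key_fixed(row: &StoredRow, col: usize, rowid_alias_col: Option<usize>) -> Value {
    if rowid_alias_col == Some(col) {
        Value::Integer(row.rowid)
    } else {
        row.record[col]
    }
}

fn main() {
    // Models a table whose column 0 is the rowid alias, holding the row (5, 10).
    let row = StoredRow {
        rowid: 5,
        record: vec![Value::Null, Value::Integer(10)],
    };
    println!("buggy index key: {:?}", index_key_buggy(&row, 0));
    println!("fixed index key: {:?}", index_key_fixed(&row, 0, Some(0)));
}
```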
## Code structure changes
The responsibility of holding the information about what to insert is
now contained in these new data structures:
```rust
/// Represents how a table should be populated during an INSERT.
#[derive(Debug)]
struct Insertion<'a> {
    /// The integer key ("rowid") provided to the VDBE.
    key: InsertionKey<'a>,
    /// The column values that will be fed to the MakeRecord instruction to insert the row.
    /// If the table has a rowid alias column, it will also be included in this record,
    /// but a NULL will be stored for it.
    col_mappings: Vec<ColumnMapping<'a>>,
    /// The register that will contain the record built using the MakeRecord instruction.
    record_reg: usize,
}

#[derive(Debug)]
enum InsertionKey<'a> {
    /// Rowid is not provided by user and will be autogenerated.
    Autogenerated { register: usize },
    /// Rowid is provided via the 'rowid' keyword.
    LiteralRowid {
        value_index: Option<usize>,
        register: usize,
    },
    /// Rowid is provided via a rowid alias column.
    RowidAlias(ColumnMapping<'a>),
}

/// Represents how a column in a table should be populated during an INSERT.
/// In a vector of ColumnMapping, the index of a given ColumnMapping is
/// the position of the column in the table.
#[derive(Debug)]
struct ColumnMapping<'a> {
    /// Column definition
    column: &'a Column,
    /// Index of the value to use from a tuple in the insert statement.
    /// This is needed because the values in the insert statement are not necessarily
    /// in the same order as the columns in the table, nor do they necessarily contain
    /// all of the columns in the table.
    /// If None, a NULL will be emitted for the column, unless it has a default value.
    /// A NULL rowid alias column's value will be autogenerated.
    value_index: Option<usize>,
    /// Register where the value will be stored for insertion into the table.
    register: usize,
}
```
---
This gets rid of a few things that are a bit hard to follow in the
current implementation:
1. Needing to keep track of "the last rowid explicit value" and other
weird edge case code related to rowids
```rust
// <old code>

// In case when both rowid and rowid-alias column provided in the query - turso-db overwrite rowid with **latest** value from the list
// As we iterate by column in natural order of their definition in scheme,
// we need to track last value_index we wrote to the rowid and overwrite rowid register only if new value_index is greater
let mut last_rowid_explicit_value = None;

...
// <more old code>

// When inserting a single row, SQLite writes the value provided for the rowid alias column (INTEGER PRIMARY KEY)
// directly into the rowid register and writes a NULL into the rowid alias column.
let write_directly_to_rowid_reg = mapping.column.is_rowid_alias;
let write_reg = if write_directly_to_rowid_reg {
    if last_rowid_explicit_value.is_some_and(|x| x > value_index) {
        continue;
    }
    last_rowid_explicit_value = Some(value_index);
    column_registers_start // rowid always the first register in the array for insertion record
} else {
    column_register
};
```
Instead, when the `Insertion` struct is constructed, it will simply
overwrite `InsertionKey` if a rowid reference is encountered multiple
times, so we naturally use the "last seen" rowid value. Moreover,
`InsertionKey` is always translated first, so we need no special logic
for it:
```rust
// <new code>
translate_key(program, insertion, &mut translate_value_fn, resolver)?;
for col in insertion.col_mappings.iter() {
    translate_column(
        program,
        col.column,
        col.register,
        col.value_index,
        &mut translate_value_fn,
        resolver,
    )?;
}
```
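The "last seen rowid wins" behaviour can be sketched in isolation. This is a simplified stand-in, not the PR's constructor: `Key` and `resolve_key` are hypothetical names, registers and lifetimes are left out, and the sketch assumes the column list of the `INSERT` is walked in the order the user wrote it.
```rust
/// Simplified stand-in for the PR's `InsertionKey` (no lifetimes or registers).
#[derive(Debug, PartialEq)]
enum Key {
    /// No rowid was supplied; it will be autogenerated.
    Autogenerated,
    /// Supplied via the `rowid` keyword; holds the index into the VALUES tuple.
    LiteralRowid(usize),
    /// Supplied via a rowid alias column; holds the index into the VALUES tuple.
    RowidAlias(usize),
}

/// Walks the column names of an INSERT in order and returns the key source.
/// Each rowid reference overwrites the previous one, so the last one wins.
fn resolve_key(insert_columns: &[&str], rowid_alias: Option<&str>) -> Key {
    let mut key = Key::Autogenerated;
    for (value_index, name) in insert_columns.iter().enumerate() {
        if name.eq_ignore_ascii_case("rowid") {
            key = Key::LiteralRowid(value_index);
        } else if rowid_alias.is_some_and(|alias| alias.eq_ignore_ascii_case(*name)) {
            key = Key::RowidAlias(value_index);
        }
    }
    key
}

fn main() {
    // Models: INSERT INTO t (rowid, id, x) VALUES (...), where `id` is the rowid alias.
    println!("{:?}", resolve_key(&["rowid", "id", "x"], Some("id")));
    // Same columns, but `rowid` is mentioned last.
    println!("{:?}", resolve_key(&["id", "x", "rowid"], Some("id")));
}
```
Because the key is a single value that is overwritten in place, no separate "last explicit value" bookkeeping is needed.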
---
2. Needing to keep track of registers in the main execution flow:
```rust
// <old code>

// allocate a register for each column in the table. if not provided by user, they will simply be set as null.
// allocate an extra register for rowid regardless of whether user provided a rowid alias column.
let num_cols = btree_table.columns.len();
let rowid_and_columns_start_register = program.alloc_registers(num_cols + 1);
let columns_start_register = rowid_and_columns_start_register + 1;
```
Now the main execution flow just uses these methods on `Insertion`:
```rust
// Create and insert the record
program.emit_insn(Insn::MakeRecord {
    start_reg: insertion.first_col_register(),
    count: insertion.col_mappings.len(),
    dest_reg: insertion.record_register(),
    index_name: None,
});
program.emit_insn(Insn::Insert {
    cursor: cursor_id,
    key_reg: insertion.key_register(),
    record_reg: insertion.record_register(),
    flag: InsertFlags::new(),
    table_name: table_name.to_string(),
});
```
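The bodies of `first_col_register`, `key_register`, and `record_register` are not shown in this description. One plausible shape, using simplified owned types instead of the real ones, might look like the sketch below. Treat the method bodies as assumptions, in particular that the column registers are contiguous and that the table has at least one column.
```rust
/// Simplified stand-ins for the PR's types (owned data, no lifetimes).
struct ColumnMapping {
    register: usize,
}

enum InsertionKey {
    Autogenerated { register: usize },
    LiteralRowid { register: usize },
    RowidAlias(ColumnMapping),
}

struct Insertion {
    key: InsertionKey,
    col_mappings: Vec<ColumnMapping>,
    record_reg: usize,
}

impl Insertion {
    /// Register holding the integer key, whichever way it was supplied.
    fn key_register(&self) -> usize {
        match &self.key {
            InsertionKey::Autogenerated { register } => *register,
            InsertionKey::LiteralRowid { register } => *register,
            InsertionKey::RowidAlias(mapping) => mapping.register,
        }
    }

    /// First of the column registers read by MakeRecord.
    /// Assumption: at least one column, and registers are contiguous.
    fn first_col_register(&self) -> usize {
        self.col_mappings[0].register
    }

    /// Register that receives the record built by MakeRecord.
    fn record_register(&self) -> usize {
        self.record_reg
    }
}

fn main() {
    // Hypothetical layout: r1 = key, r2..r3 = columns, r4 = record.
    let insertion = Insertion {
        key: InsertionKey::Autogenerated { register: 1 },
        col_mappings: vec![ColumnMapping { register: 2 }, ColumnMapping { register: 3 }],
        record_reg: 4,
    };
    println!(
        "key=r{} first_col=r{} record=r{}",
        insertion.key_register(),
        insertion.first_col_register(),
        insertion.record_register()
    );
}
```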
---
The translation of a row now uses the information in `Insertion` and the
implementation is shared between the "insert single row" and "insert
multiple rows" cases:
```rust
/// Translate the key and the columns of the insertion.
/// This function is called by both [translate_rows_single] and [translate_rows_multiple],
/// each providing a different [translate_value_fn] implementation, because for multiple rows
/// we need to emit the values in a loop, from either an ephemeral table or a coroutine,
/// whereas for the single row the translation happens in a single pass without looping.
fn translate_rows_base<'short, 'long: 'short>(
    program: &mut ProgramBuilder,
    insertion: &'short Insertion<'long>,
    mut translate_value_fn: impl FnMut(&mut ProgramBuilder, usize, usize) -> Result<()>,
    resolver: &Resolver,
) -> Result<()> {
    translate_key(program, insertion, &mut translate_value_fn, resolver)?;
    for col in insertion.col_mappings.iter() {
        translate_column(
            program,
            col.column,
            col.register,
            col.value_index,
            &mut translate_value_fn,
            resolver,
        )?;
    }

    Ok(())
}
```
This removes the duplication between `populate_columns_single_row` and
`populate_columns_multiple_rows` in the old implementation.
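The pattern of one shared driver plus a caller-supplied closure can be shown with a toy model. Everything here is illustrative: `Program` and `Column` are hypothetical stand-ins, the "instructions" are plain strings, and the closure's two `usize` arguments are assumed to mean (value index, target register), which the signature above does not spell out.
```rust
/// Toy stand-in for a bytecode program: just a list of textual instructions.
#[derive(Default)]
struct Program {
    insns: Vec<String>,
}

/// Where a table column gets its value from: an index into the VALUES tuple, or nothing.
struct Column {
    value_index: Option<usize>,
    register: usize,
}

/// Shared driver: decides what goes into each register and delegates
/// "load value N into register R" to the caller-supplied closure.
fn translate_rows_base(
    program: &mut Program,
    columns: &[Column],
    mut translate_value_fn: impl FnMut(&mut Program, usize, usize),
) {
    for col in columns {
        match col.value_index {
            Some(idx) => translate_value_fn(program, idx, col.register),
            None => program.insns.push(format!("Null r{}", col.register)),
        }
    }
}

fn main() {
    let columns = [
        Column { value_index: Some(1), register: 10 },
        Column { value_index: None, register: 11 },
        Column { value_index: Some(0), register: 12 },
    ];

    // Single row: values are evaluated directly from the VALUES expressions.
    let mut single = Program::default();
    translate_rows_base(&mut single, &columns, |prog, idx, reg| {
        prog.insns.push(format!("Expr values[{idx}] -> r{reg}"));
    });

    // Multiple rows: values are read back from a cursor inside a loop.
    let mut multiple = Program::default();
    translate_rows_base(&mut multiple, &columns, |prog, idx, reg| {
        prog.insns.push(format!("Column cursor.{idx} -> r{reg}"));
    });

    println!("{:#?}\n{:#?}", single.insns, multiple.insns);
}
```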

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #2687

2025-08-21 16:50:03 +03:00

.cargo

configure cargo for napi-rs

2025-08-08 15:45:05 +04:00

.github

Add a simple test for encryption

2025-08-20 11:47:25 +05:30

antithesis-tests

antithesis: Add unreliable libc stress template

2025-08-20 13:50:04 +03:00

assets

Add Nyrkiö to partners section in README

2025-07-10 04:54:24 +03:00

bindings

Merge 'Initial pass to support per page encryption' from Avinash Sajjanshetty

2025-08-20 11:11:24 +03:00

cli

Merge 'add metrics and implement the .stats command' from Glauber Costa

2025-08-18 20:26:48 -04:00

core

Fix condition that checks table.cols against number of provided values

2025-08-21 16:40:10 +03:00

docs

Add links to JavaScript packages

2025-08-19 19:31:32 +03:00

extensions

add remove_file method to the IO

2025-08-21 14:51:02 +04:00

fuzz

let fuzz still have its own workspace

2025-08-11 15:13:58 +03:00

licenses

fix tests and return nan as null

2025-03-29 14:46:11 +02:00

macros

add remove_file method to the IO

2025-08-21 14:51:02 +04:00

packages/turso-serverless

fix merge conflicts

2025-08-19 10:48:21 -03:00

parser

fmt

2025-08-15 17:09:30 +07:00

perf

chore: use rusqlite 0.37 with bundled sqlite everywhere

2025-08-11 15:13:57 +03:00

scripts

Remove ENV var and enable cache by default, track which pages were cached

2025-08-20 17:42:17 -04:00

simulator

add remove_file method to the IO

2025-08-21 14:51:02 +04:00

simulator-docker-runner

Fix simulator docker build by adding new sync directory

2025-08-18 15:32:22 -04:00

sqlite3

add optional upper_bound_inclusive parameter to some checkpoint modes

2025-08-21 14:12:11 +04:00

stress

stress: Don't hang if table creation fails

2025-08-20 13:50:04 +03:00

sync

Turso 0.1.4

2025-08-20 10:35:35 +03:00

testing

Add regression test for #2686

2025-08-21 16:40:10 +03:00

tests

fix clippy

2025-08-21 14:13:26 +04:00

vendored/sqlite3-parser

Add PRAGMA key to set the encryption key

2025-08-20 11:39:07 +05:30

.dockerignore

Add .dockerignore and Makefile commands to support docker

2025-07-31 00:00:44 -04:00

.github.json

Add Jussi to .github.json

2025-01-14 18:37:26 +02:00

.gitignore

Performance improvements to checkpointing. prevent serializing I/O

2025-08-20 17:26:54 -04:00

.python-version

setup uv for limbo

2025-04-15 12:45:46 -03:00

Cargo.lock

Merge 'Initial pass to support per page encryption' from Avinash Sajjanshetty

2025-08-20 11:11:24 +03:00

Cargo.toml

Turso 0.1.4

2025-08-20 10:35:35 +03:00

CHANGELOG.md

Update CHANGELOG

2025-08-20 09:33:26 +03:00

COMPAT.md

Merge 'Add support for unlikely(X)' from bit-aloo

2025-08-14 10:56:27 +03:00

CONTRIBUTING.md

Add fault injection steps to CONTRIBUTING.md

2025-08-20 13:50:04 +03:00

db.sqlite

reset statement before executing

2025-05-02 19:26:44 -03:00

dist-workspace.toml

Update cargo-dist to the latest official version

2025-08-02 04:35:52 +09:00

Dockerfile.antithesis

antithesis: Add unreliable stress template to Docker image

2025-08-20 13:50:04 +03:00

Dockerfile.cli

release and remove copies

2025-07-30 11:45:24 +02:00

flake.lock

fix: update flake dependencies

2025-07-17 20:25:40 -03:00

flake.nix

add sqlite debug cli for nix. Fix cursor delete panic. Add tracing for cell indices in btree

2025-05-14 13:30:39 -03:00

LICENSE.md

rename Limbo to Turso in the README and other files

2025-06-27 15:44:40 -05:00

Makefile

Add bench-sqlite script and makefile command for benchmarking an I/O backend against sqlite3

2025-08-18 15:11:29 -04:00

NOTICE.md

rename Limbo to Turso in the README and other files

2025-06-27 15:44:40 -05:00

PERF.md

Update PERF.md

2025-08-11 19:44:12 -04:00

Pipfile

Updated Pipfile

2024-07-12 13:07:34 -07:00

Pipfile.lock

Added Pipfile and Pipfile.lock

2024-07-12 12:38:56 -07:00

pyproject.toml

extract ruff lint rules to workspace

2025-06-20 15:59:03 -03:00

README.md

Fix MCP server mode section formatting

2025-08-20 10:58:54 +03:00

rust-toolchain.toml

chore: update rust to version 1.88.0

2025-07-16 19:17:58 +01:00

turso.png

Rename Limbo to Turso Database

2025-06-26 21:05:02 +03:00

uv.lock

Fix merge-py.py script to use github CLI and add makefile command

2025-07-31 10:20:17 -04:00

README.md

Turso Database

An in-process SQL database, compatible with SQLite.

About

Turso Database is an in-process SQL database written in Rust, compatible with SQLite.

⚠️ Warning: This software is ALPHA. Use it only for development, testing, and experimentation. We are working to make it production ready, but do not use it for critical data right now.

Features and Roadmap

SQLite compatibility for SQL dialect, file formats, and the C API [see document for details]
Change data capture (CDC) for real-time tracking of database changes.
Language support for
- Go
- JavaScript
- Java
- Python
- Rust
- WebAssembly
Asynchronous I/O support on Linux with io_uring
Cross-platform support for Linux, macOS, Windows and browsers (through WebAssembly)
Vector support including exact search and vector manipulation
Improved schema management including extended ALTER support and faster schema changes.

The database has the following experimental features:

BEGIN CONCURRENT for improved write throughput using multi-version concurrency control (MVCC).
Incremental computation using DBSP for incremental view maintenance and query subscriptions.

The following features are on our current roadmap:

Vector indexing for fast approximate vector search, similar to libSQL vector search.

Getting Started

Please see the Turso Database Manual for more information.

💻 Command Line

You can install the latest `turso` release with:

curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/tursodatabase/turso/releases/latest/download/turso_cli-installer.sh | sh

Then launch the interactive shell:

$ tursodb

This will start the Turso interactive shell where you can execute SQL statements:

Turso
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database
turso> CREATE TABLE users (id INT, username TEXT);
turso> INSERT INTO users VALUES (1, 'alice');
turso> INSERT INTO users VALUES (2, 'bob');
turso> SELECT * FROM users;
1|alice
2|bob

You can also build and run the latest development version with:

cargo run

If you prefer Docker, we've got you covered. Simply run this in the root folder:

make docker-cli-build && \
make docker-cli-run

🦀 Rust

cargo add turso

Example usage:

use turso::Builder;

// Run this inside an async function that returns a Result.
let db = Builder::new_local("sqlite.db").build().await?;
let conn = db.connect()?;

let res = conn.query("SELECT * FROM users", ()).await?;

✨ JavaScript

npm i @tursodatabase/database

Example usage:

import { connect } from '@tursodatabase/database';

const db = await connect('sqlite.db');
const stmt = db.prepare('SELECT * FROM users');
const users = stmt.all();
console.log(users);

🐍 Python

uv pip install pyturso

Example usage:

import turso

con = turso.connect("sqlite.db")
cur = con.cursor()
res = cur.execute("SELECT * FROM users")
print(res.fetchone())

🦫 Go

Clone the repository
Build the library and set your LD_LIBRARY_PATH to include turso's target directory

cargo build --package limbo-go
export LD_LIBRARY_PATH=/path/to/limbo/target/debug:$LD_LIBRARY_PATH

Use the driver

go get github.com/tursodatabase/turso
go install github.com/tursodatabase/turso

Example usage:

package main

import (
    "database/sql"
    "fmt"

    _ "github.com/tursodatabase/turso"
)

func main() {
    // Errors are ignored here for brevity; check them in real code.
    conn, _ := sql.Open("sqlite3", "sqlite.db")
    defer conn.Close()

    stmt, _ := conn.Prepare("select * from users")
    defer stmt.Close()

    rows, _ := stmt.Query()
    defer rows.Close()
    for rows.Next() {
        var id int
        var username string
        _ = rows.Scan(&id, &username)
        fmt.Printf("User: ID: %d, Username: %s\n", id, username)
    }
}

☕️ Java

We integrated Turso Database into JDBC. For detailed instructions on how to use Turso Database with Java, please refer to the README.md under bindings/java.

🤖 MCP Server Mode

The Turso CLI includes a built-in Model Context Protocol (MCP) server that allows AI assistants to interact with your databases.

Start the MCP server with:

tursodb your_database.db --mcp

The MCP server provides nine tools for database interaction:

Available Tools

open_database - Open a new database
current_database - Describe the current database
list_tables - List all tables in the database
describe_table - Describe the structure of a specific table
execute_query - Execute read-only SELECT queries
insert_data - Insert new data into tables
update_data - Update existing data in tables
delete_data - Delete data from tables
schema_change - Execute schema modification statements (CREATE TABLE, ALTER TABLE, DROP TABLE)

Example Usage

The MCP server runs as a single process that handles multiple JSON-RPC requests over stdin/stdout. Here's how to interact with it:

Example with In-Memory Database

cat << 'EOF' | tursodb --mcp
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "client", "version": "1.0"}}}
{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "schema_change", "arguments": {"query": "CREATE TABLE users (id INTEGER, name TEXT, email TEXT)"}}}
{"jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": {"name": "list_tables", "arguments": {}}}
{"jsonrpc": "2.0", "id": 4, "method": "tools/call", "params": {"name": "insert_data", "arguments": {"query": "INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')"}}}
{"jsonrpc": "2.0", "id": 5, "method": "tools/call", "params": {"name": "execute_query", "arguments": {"query": "SELECT * FROM users"}}}
EOF
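
If you would rather generate these request lines from a program than write them by hand, the sketch below builds them with the Rust standard library only. The helpers `json_escape` and `tool_call` are hypothetical names made up for this example, not part of any Turso API, and the escaping handles only quotes, backslashes, and newlines; a real client would use a JSON library.

```rust
/// Escapes the few characters this sketch expects inside a SQL string.
/// Not a complete JSON escaper; a real client would use a JSON library.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Builds one `tools/call` request line carrying a `query` argument.
fn tool_call(id: u32, tool: &str, query: &str) -> String {
    format!(
        r#"{{"jsonrpc": "2.0", "id": {id}, "method": "tools/call", "params": {{"name": "{tool}", "arguments": {{"query": "{query}"}}}}}}"#,
        query = json_escape(query),
    )
}

fn main() {
    // The initialize request is copied from the example above.
    println!(
        "{}",
        r#"{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "client", "version": "1.0"}}}"#
    );
    println!(
        "{}",
        tool_call(2, "schema_change", "CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
    );
    println!(
        "{}",
        tool_call(3, "insert_data", "INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')")
    );
    println!("{}", tool_call(4, "execute_query", "SELECT * FROM users"));
}
```

The program writes one request per line to stdout, so its output can be piped into `tursodb --mcp` in the same way as the heredoc.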

Example with Existing Database

# Working with an existing database file
cat << 'EOF' | tursodb mydb.db --mcp
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "client", "version": "1.0"}}}
{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "list_tables", "arguments": {}}}
EOF

Using with Claude Code

If you're using Claude Code, you can easily connect to your Turso MCP server using the built-in MCP management commands:

Quick Setup

Add the MCP server to Claude Code:

claude mcp add my-database -- tursodb ./path/to/your/database.db --mcp

Restart Claude Code to activate the connection
Start querying your database through natural language!

Command Breakdown

claude mcp add my-database -- tursodb ./path/to/your/database.db --mcp
#              ↑            ↑       ↑                           ↑
#              |            |       |                           |
#              Name         |       Database path               MCP flag
#                          Separator

my-database - Choose any name for your MCP server
-- - Required separator between Claude options and your command
tursodb - The Turso database CLI
./path/to/your/database.db - Path to your SQLite database file
--mcp - Enables MCP server mode

Example Usage

# For a local project database
cd /your/project
claude mcp add my-project-db -- tursodb ./data/app.db --mcp

# For an absolute path
claude mcp add analytics-db -- tursodb /Users/you/databases/analytics.db --mcp

# For a specific project (local scope)
claude mcp add project-db --local -- tursodb ./database.db --mcp

Managing MCP Servers

# List all configured MCP servers
claude mcp list

# Get details about a specific server
claude mcp get my-database

# Remove an MCP server
claude mcp remove my-database

Once configured, you can ask Claude Code to:

"Show me all tables in the database"
"What's the schema for the users table?"
"Find all posts with more than 100 upvotes"
"Insert a new user with name 'Alice' and email 'alice@example.com'"

Contributing

We'd love to have you contribute to Turso Database! Please check out the contribution guide to get started.

Found a data corruption bug? Get up to $1,000.00

SQLite is loved because it is the most reliable database in the world. The next evolution of SQLite has to match or surpass this level of reliability. Turso is built with Deterministic Simulation Testing from the ground up, and is also tested by Antithesis.

Even during Alpha, if you find a bug that leads to data corruption and demonstrate how our simulator failed to catch it, you can get up to $1,000.00. As the project matures, we will increase the size of the prize and the scope of the bugs.

More details here.

You can see an example of an awarded case on #2049.

Turso core staff are not eligible.

FAQ

Is Turso Database ready for production use?

Turso Database is currently under heavy development and is not ready for production use.

How is Turso Database different from Turso's libSQL?

Turso Database is a project to build the next evolution of SQLite in Rust, with a strong open contribution focus and features like native async support, vector search, and more. The libSQL project is also an attempt to evolve SQLite in a similar direction, but through a fork rather than a rewrite.

Rewriting SQLite in Rust started as an unassuming experiment, and due to its incredible success, replaces libSQL as our intended direction. At this point, libSQL is production ready, Turso Database is not - although it is evolving rapidly. More details here.

Publications

Pekka Enberg, Sasu Tarkoma, Jon Crowcroft, and Ashwin Rao (2024). Serverless Runtime / Database Co-Design With Asynchronous I/O. In EdgeSys ’24. [PDF]
Pekka Enberg, Sasu Tarkoma, and Ashwin Rao (2023). Towards Database and Serverless Runtime Co-Design. In CoNEXT-SW ’23. [PDF] [Slides]

License

This project is licensed under the MIT license.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Turso Database by you, shall be licensed as MIT, without any additional terms or conditions.

Partners

Thanks to all the partners of Turso!

Contributors

Thanks to all the contributors to Turso Database!

Languages

Rust 76.8%

Tcl 6.6%

C 6.4%

Dart 2.4%

Java 2.3%

Other 5.3%
