Commit Graph

98 Commits

Author SHA1 Message Date
Nikita Sivukhin
c9c5ef4e25 remove query_mode from ProgramBuilderOpts and from function arguments
- the mode never changes, and ProgramBuilder is already created with the proper mode set
2025-07-02 13:24:12 +04:00
Pekka Enberg
725c3e4ddc Rename limbo_sqlite3_parser crate to turso_sqlite3_parser 2025-06-29 12:34:46 +03:00
Pekka Enberg
09ba89e2ba core/translate: Replace todo with bail_parse_error
No point in crashing the whole app if someone attempts to change page
size.
2025-06-27 13:42:49 +03:00
Jussi Saurio
133d498724 Implement a header_accessor module so that DatabaseHeader structs aren't initialized on every access 2025-06-24 14:41:50 -03:00
Jussi Saurio
cc2e14b11c Read page 1 from pager always, no separate db_header 2025-06-24 14:41:49 -03:00
Nils Koch
2827b86917 chore: fix clippy warnings 2025-06-23 19:52:13 +01:00
pedrocarlo
20115c1e74 return parse error when calling unimplemented pragma checkpoint modes 2025-06-17 11:42:20 -03:00
Pekka Enberg
882c5ca168 Merge 'Simple integrity check on btree' from Pere Diaz Bou
This PR adds support for the instruction `IntegrityCk` which performs an
integrity check on the contents of a single table. In the next PR I will try to
implement the rest of the integrity check, where we would check that indexes
contain the correct amount of data, among other things.
<img width="1151" alt="image" src="https://github.com/user-attachments/assets/29d54148-55ba-480f-b972-e38587f0a483" />

Closes #1719
2025-06-16 13:46:26 +03:00
Pekka Enberg
90c1e3fc06 Switch Connection to use Arc instead of Rc
Connection needs to be Arc so that bindings can wrap it with `Mutex` for
multi-threading.
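A minimal sketch of why the switch matters, assuming a hypothetical stand-in `Connection` type and a made-up `execute` method (neither mirrors the real API): `Rc` is not `Send`, so it cannot be moved to another thread, whereas an `Arc` combined with a `Mutex` can be shared freely.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real connection type; only a counter is modeled.
#[derive(Default)]
struct Connection {
    statements_run: u64,
}

impl Connection {
    /// Made-up method for illustration: pretends to run a statement.
    fn execute(&mut self, _sql: &str) {
        self.statements_run += 1;
    }
}

/// Runs one statement per thread against a single shared connection and
/// returns how many statements were executed in total.
fn run_on_threads(thread_count: usize) -> u64 {
    let conn = Arc::new(Mutex::new(Connection::default()));
    let handles: Vec<_> = (0..thread_count)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let conn = Arc::clone(&conn);
            thread::spawn(move || conn.lock().unwrap().execute("SELECT 1"))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = conn.lock().unwrap().statements_run;
    total
}
```

Replacing `Arc` with `Rc` in this sketch fails to compile, because `thread::spawn` requires its closure to be `Send`.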
2025-06-16 10:43:19 +03:00
Pere Diaz Bou
9383ba207d introduce integrity_check pragma 2025-06-11 11:14:29 +02:00
Anton Harniakou
8471704e00 Don't use hard-coded column names 2025-06-09 10:40:04 +03:00
Anton Harniakou
d802075ea9 Resolve merge conflict: Add columns names to result set for pragma statement output 2025-06-09 10:40:04 +03:00
Jussi Saurio
8ffe6208a3 Merge 'Minor: use eq_ignore_ascii_case in some places' from Anton Harniakou
Use `eq_ignore_ascii_case` because it's cooler 😎 than `x.to_lowercase()
== y.to_lowercase()`.
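Beyond style, the two forms differ in cost and scope. The sketch below uses hypothetical helper names: `to_lowercase()` allocates a new `String` for each operand and applies Unicode case mapping, while `eq_ignore_ascii_case` compares in place and folds ASCII letters only.

```rust
/// Case-insensitive match without allocating; folds ASCII letters only.
fn matches_keyword(input: &str, keyword: &str) -> bool {
    input.eq_ignore_ascii_case(keyword)
}

/// The older pattern: allocates two temporary Strings and applies
/// Unicode-aware lowercasing before comparing.
fn matches_keyword_allocating(input: &str, keyword: &str) -> bool {
    input.to_lowercase() == keyword.to_lowercase()
}
```

For ASCII identifiers such as pragma names both helpers agree. They diverge on non-ASCII input, for example `Ä` versus `ä`, which only the allocating version treats as equal.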

Closes #1678
2025-06-09 08:29:56 +03:00
Zaid Humayun
e994adfb40 Persisting database header and pointer map page to cache
This commit ensures that the metadata in the database header and the pointer map pages allocated are correctly persisted to the page cache. This was not being done earlier.
2025-06-06 23:14:25 +05:30
Zaid Humayun
5827a33517 Beginnings of AUTOVACUUM
This commit introduces AUTOVACUUM to Limbo. It introduces the concept of ptrmap pages and also adds some additional instructions that are required to make AUTOVACUUM PRAGMA work
2025-06-06 23:14:22 +05:30
Anton Harniakou
bd2becf45e Use eq_ignore_ascii_case for case-insensitive comparison 2025-06-06 17:01:52 +03:00
meteorgan
a242bac340 Fix: ensure PRAGMA cache_size changes persist only for current session 2025-06-05 16:55:41 +08:00
meteorgan
f2bf6251cd write database header via normal pager route 2025-06-03 22:06:08 +08:00
meteorgan
2f82762ca2 add function parse_signed_number 2025-05-28 00:33:41 +08:00
meteorgan
d9d3a5ecbb Use the SetCookie opcode to implement user_version pragma 2025-05-28 00:31:11 +08:00
Diego Reis
2d6405b3e9 core/pragma: Remove unnecessary clone in user_version and cache_size 2025-05-23 08:43:07 -03:00
Diego Reis
2f8042da22 core/pragma: Add support for update user_version
It also changes the type from u32 to i32 since
sqlite supports negative values
2025-05-22 20:38:27 -03:00
Jussi Saurio
8bec75d804 Merge 'Initial Support for Nested Translation' from Pedro Muniz
This PR introduces some modifications to the Program Builder to allow us
to use nested parsing. By focusing the emission of Init and the last
Goto (prologue and epilogue), inside the ProgramBuilder, we can just not
emit them if we are parsing/translating in a nested context. For this
PR, I only migrated insert to use these functions as I need them to
support Insert statements that use `SELECT FROM` syntax. Nested parsing
overall enables code reuse for us and arguably is one of the only ways
to parse deeply nested queries without a lot of code duplication.
#1528

Closes #1543
2025-05-22 10:52:00 +03:00
pedrocarlo
53bf5d5ef5 adjust translate functions to take a program instead of Option<ProgramBuilder> + remove any Init emission in translate functions + use epilogue in all places necessary 2025-05-21 16:41:10 -03:00
pedrocarlo
517c7c81cd refactor to include optional program builder argument 2025-05-21 12:47:51 -03:00
Pere Diaz Bou
67e260ff71 allow delete of dirty page in cacheflush
Dirty pages can be deleted in `cacheflush`. Furthermore, there could be
multiple live references in the stack of a cursor so let's allow them to
exist while deleting.
2025-05-21 14:09:39 +02:00
Anton Harniakou
3c0b7cad74 Eliminate a superfluous read transaction when doing PRAGMA user_version 2025-05-03 10:48:27 +03:00
Jussi Saurio
6096cfb3d8 Merge 'Add PRAGMA schema_version' from Anton Harniakou
This PR adds `PRAGMA schema_version` to get the value of the schema-
version integer at offset 40 in the database header.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1427
2025-05-01 10:50:02 +03:00
Anton Harniakou
525b7fdbaa Add PRAGMA schema_version 2025-04-30 09:41:04 +03:00
meteorgan
d2dce740f7 fix some issues about page_size 2025-04-28 16:13:07 +08:00
Jussi Saurio
fe65d6e991 Merge 'Performance: hoist entire expressions out of hot loops if they are constant' from Jussi Saurio
## Problem:
- We have cases where we are evaluating expressions in a hot loop that
could only be evaluated once. For example: `CAST('2025-01-01' as
DATETIME)` -- the value of this never changes, so we should only run it
once.
- We have no robust way of doing this right now for entire _expressions_
-- the only existing facility we have is
`program.mark_last_insn_constant()`, which has no notion of how many
instructions the translation of a given _expression_ emits, and breaks very
easily for this reason.
## Main ideas of this PR:
- Add `expr.is_constant()` determining whether the expression is
compile-time constant. Tries to be conservative and not deem something
compile-time constant if there is no certainty.
- Whenever we think a compile-time constant expression is about to be
translated into bytecode in `translate_expr()`, start a so-called
`constant span`, which means a range of instructions that are part of a
compile-time constant expression.
- At the end of translating the program, all `constant spans` are
hoisted outside of any table loops so they only get evaluated once.
- The target offsets of any jump instructions (e.g. `Goto`) are moved to
the correct place, taking into account all instructions whose offsets
were shifted due to moving the compile-time constant expressions around.
- An escape hatch wrapper `translate_expr_no_constant_opt()` is added
for cases where we should not hoist constants even if we otherwise
could. Right now the only example of this is cases where we are reusing
the same register(s) in multiple iterations of some kind of loop, e.g.
`VALUES(...)` or in the `coalesce()` function implementation.
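The hoisting and jump-fixup steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `Insn`, `hoist_constant_spans`, and the half-open `(start, end)` span representation are all hypothetical, and the sketch assumes the program's last instruction is the epilogue jump back into the main body, so hoisted instructions are placed directly before it.

```rust
/// Hypothetical, simplified instruction: an opcode name plus an optional jump target.
#[derive(Debug, Clone, PartialEq)]
struct Insn {
    opcode: &'static str,
    jump_target: Option<usize>,
}

/// Moves every instruction covered by `spans` (half-open offset ranges) to just
/// before the final instruction and rewrites all jump targets accordingly.
fn hoist_constant_spans(program: &[Insn], spans: &[(usize, usize)]) -> Vec<Insn> {
    let n = program.len();
    assert!(n > 0, "program must end with the epilogue jump");
    let last = n - 1;
    let is_const =
        |i: usize| i < last && spans.iter().any(|&(start, end)| start <= i && i < end);

    // New execution order: instructions that stay, hoisted constants, final jump.
    let mut order: Vec<usize> = (0..last).filter(|&i| !is_const(i)).collect();
    order.extend((0..last).filter(|&i| is_const(i)));
    order.push(last);

    // Map each old offset to its new offset.
    let mut new_offset = vec![0usize; n];
    for (new, &old) in order.iter().enumerate() {
        new_offset[old] = new;
    }

    // A jump that pointed at a hoisted instruction now lands on the next
    // instruction that stayed in place.
    let resolve = |target: usize| {
        let kept = (target..n).find(|&i| !is_const(i)).unwrap_or(last);
        new_offset[kept]
    };

    order
        .iter()
        .map(|&old| Insn {
            opcode: program[old].opcode,
            jump_target: program[old].jump_target.map(|t| resolve(t)),
        })
        .collect()
}
```

Rewriting targets through a single old-to-new offset map keeps the fixup independent of how many spans were moved or where they sat in the loop.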
## Performance effects
Here is an example of a modified/simplified TPC-H query where the
`CAST()` calls were previously run millions of times in a hot loop, but
now they are optimized out of the loop.
**BYTECODE PLAN BEFORE:**
```sql
limbo> explain select
        l_orderkey,
        3 as revenue,
        o_orderdate,
        o_shippriority
from
        lineitem,
        orders,
        customer
where
        c_mktsegment = 'FURNITURE'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < cast('1995-03-29' as datetime)
        and l_shipdate > cast('1995-03-29' as datetime);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     26    0                    0   Start at 26
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     9     0                    0   table=orders, root=9
3     OpenRead           2     8     0                    0   table=customer, root=8
4     Rewind             0     25    0                    0   Rewind lineitem
5       Column           0     10    5                    0   r[5]=lineitem.l_shipdate
6       String8          0     7     0     1995-03-29     0   r[7]='1995-03-29'
7       Function         0     7     6     cast           0   r[6]=func(r[7..8])  <-- CAST() executed millions of times
8       Le               5     6     24                   0   if r[5]<=r[6] goto 24
9       Column           0     0     9                    0   r[9]=lineitem.l_orderkey
10      SeekRowid        1     9     24                   0   if (r[9]!=orders.rowid) goto 24
11      Column           1     4     10                   0   r[10]=orders.o_orderdate
12      String8          0     12    0     1995-03-29     0   r[12]='1995-03-29'
13      Function         0     12    11    cast           0   r[11]=func(r[12..13])
14      Ge               10    11    24                   0   if r[10]>=r[11] goto 24
15      Column           1     1     14                   0   r[14]=orders.o_custkey
16      SeekRowid        2     14    24                   0   if (r[14]!=customer.rowid) goto 24
17      Column           2     6     15                   0   r[15]=customer.c_mktsegment
18      Ne               15    16    24                   0   if r[15]!=r[16] goto 24
19      Column           0     0     1                    0   r[1]=lineitem.l_orderkey
20      Integer          3     2     0                    0   r[2]=3
21      Column           1     4     3                    0   r[3]=orders.o_orderdate
22      Column           1     7     4                    0   r[4]=orders.o_shippriority
23      ResultRow        1     4     0                    0   output=r[1..4]
24    Next               0     5     0                    0
25    Halt               0     0     0                    0
26    Transaction        0     0     0                    0   write=false
27    String8            0     8     0     DATETIME       0   r[8]='DATETIME'
28    String8            0     13    0     DATETIME       0   r[13]='DATETIME'
29    String8            0     16    0     FURNITURE      0   r[16]='FURNITURE'
30    Goto               0     1     0
```
**BYTECODE PLAN AFTER**:
```sql
limbo> explain select
        l_orderkey,
        3 as revenue,
        o_orderdate,
        o_shippriority
from
        lineitem,
        orders,
        customer
where
        c_mktsegment = 'FURNITURE'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < cast('1995-03-29' as datetime)
        and l_shipdate > cast('1995-03-29' as datetime);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     21    0                    0   Start at 21
1     OpenRead           0     10    0                    0   table=lineitem, root=10
2     OpenRead           1     9     0                    0   table=orders, root=9
3     OpenRead           2     8     0                    0   table=customer, root=8
4     Rewind             0     20    0                    0   Rewind lineitem
5       Column           0     10    5                    0   r[5]=lineitem.l_shipdate
6       Le               5     6     19                   0   if r[5]<=r[6] goto 19
7       Column           0     0     9                    0   r[9]=lineitem.l_orderkey
8       SeekRowid        1     9     19                   0   if (r[9]!=orders.rowid) goto 19
9       Column           1     4     10                   0   r[10]=orders.o_orderdate
10      Ge               10    11    19                   0   if r[10]>=r[11] goto 19
11      Column           1     1     14                   0   r[14]=orders.o_custkey
12      SeekRowid        2     14    19                   0   if (r[14]!=customer.rowid) goto 19
13      Column           2     6     15                   0   r[15]=customer.c_mktsegment
14      Ne               15    16    19                   0   if r[15]!=r[16] goto 19
15      Column           0     0     1                    0   r[1]=lineitem.l_orderkey
16      Column           1     4     3                    0   r[3]=orders.o_orderdate
17      Column           1     7     4                    0   r[4]=orders.o_shippriority
18      ResultRow        1     4     0                    0   output=r[1..4]
19    Next               0     5     0                    0
20    Halt               0     0     0                    0
21    Transaction        0     0     0                    0   write=false
22    String8            0     7     0     1995-03-29     0   r[7]='1995-03-29'
23    String8            0     8     0     DATETIME       0   r[8]='DATETIME'
24    Function           1     7     6     cast           0   r[6]=func(r[7..8]) <-- CAST() executed twice
25    String8            0     12    0     1995-03-29     0   r[12]='1995-03-29'
26    String8            0     13    0     DATETIME       0   r[13]='DATETIME'
27    Function           1     12    11    cast           0   r[11]=func(r[12..13])
28    String8            0     16    0     FURNITURE      0   r[16]='FURNITURE'
29    Integer            3     2     0                    0   r[2]=3
30    Goto               0     1     0                    0
```
**EXECUTION RUNTIME BEFORE:**
```sql
limbo> select
        l_orderkey,
        3 as revenue,
        o_orderdate,
        o_shippriority
from
        lineitem,
        orders,
        customer
where
        c_mktsegment = 'FURNITURE'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < cast('1995-03-29' as datetime)
        and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 3.633396667 s (this includes parsing/coloring of cli app)
```
**EXECUTION RUNTIME AFTER:**
```sql
limbo> select
        l_orderkey,
        3 as revenue,
        o_orderdate,
        o_shippriority
from
        lineitem,
        orders,
        customer
where
        c_mktsegment = 'FURNITURE'
        and c_custkey = o_custkey
        and l_orderkey = o_orderkey
        and o_orderdate < cast('1995-03-29' as datetime)
        and l_shipdate > cast('1995-03-29' as datetime);
┌────────────┬─────────┬─────────────┬────────────────┐
│ l_orderkey │ revenue │ o_orderdate │ o_shippriority │
├────────────┼─────────┼─────────────┼────────────────┤
└────────────┴─────────┴─────────────┴────────────────┘
Command stats:
----------------------------
total: 2.0923475 s (this includes parsing/coloring of cli app)
```

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #1359
2025-04-25 16:55:41 +03:00
Jussi Saurio
029e5eddde Fix existing resolve_label() calls to work with new system 2025-04-24 11:05:21 +03:00
Anton Harniakou
0a69ea0138 Support reading db page size using PRAGMA page_size 2025-04-24 10:12:02 +03:00
Pere Diaz Bou
00ab3d1c0c Fix ordering and implement Deref 2025-03-17 10:22:42 +01:00
Pere Diaz Bou
20f5ade95e Experiment with a custom Lock for database header 2025-03-17 10:21:34 +01:00
Pekka Enberg
b0636e4494 Merge 'Adds Drop Table' from Zaid Humayun
This PR adds support for `DROP TABLE` and addresses issue
https://github.com/tursodatabase/limbo/issues/894
It depends on https://github.com/tursodatabase/limbo/pull/785 being
merged in because it requires the implementation of `free_page`.
EDIT: The PR above has been merged.
It adds the following:
* an implementation for the `DropTable` AST instruction via a method
called `translate_drop_table`
* a couple of new instructions - `Destroy` and `DropTable`. The former
is to modify physical b-tree pages and the latter is to modify in-memory
structures like the schema hash table.
* `btree_destroy` on `BTreeCursor` to walk the tree of pages for this
table and place it in free list.
* state machine traversal for both `btree_destroy` and
`clear_overflow_pages` to ensure performant, correct code.
* unit & tcl tests
* modifies the `Null` instruction to follow SQLite semantics and accept
a second register. It will set all registers in this range to null. This
is required for `DROP TABLE`.
The screenshots below have a comparison of the bytecodes generated via
SQLite & Limbo.
Limbo has the same instruction set except for the subroutines which
involve opening an ephemeral table, copying over the triggers from the
`sqlite_schema` table and then re-inserting them back into the
`sqlite_schema` table.
This is because `OpenEphemeral` is still a WIP and is being tracked at
https://github.com/tursodatabase/limbo/pull/768
![Screenshot 2025-02-09 at 7 05 03 PM](https://github.com/user-attachments/assets/1d597001-a60c-4a76-89fd-8b90881c77c9)
![Screenshot 2025-02-09 at 7 05 35 PM](https://github.com/user-attachments/assets/ecfd2a7a-2edc-49cd-a8d1-7b4db8657444)

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #897
2025-03-06 18:27:41 +02:00
Pere Diaz Bou
e4a8ee5402 move load extensions to Connection
Extensions are loaded per connection and not per database as per SQLite
behaviour. This also helps with removing locks.
2025-03-05 14:07:48 +01:00
Pere Diaz Bou
8daf7666d1 Make database Sync + Send 2025-03-05 14:07:48 +01:00
Zaid Humayun
23a904f38d Merge branch 'main' of https://github.com/tursodatabase/limbo 2025-03-01 01:18:45 +05:30
Zaid Humayun
fbc8cd7e70 vdbe: modified the Null instruction
modified the Null instruction to more closely match SQLite semantics. Allows passing in a second register, and all registers from r1..r2 are set to null
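A rough sketch of the range behaviour, using a hypothetical `Value` enum and `op_null` helper rather than the real VDBE types. It follows SQLite's documented rule for `Null`: the first register is always cleared, and the range extends through the second register only when that register's index is greater than the first.

```rust
/// Hypothetical register value; the real VM has more variants.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
}

/// Sets `registers[first]` to NULL. If `last` is greater than `first`, every
/// register from `first` through `last` (inclusive) is set to NULL as well.
fn op_null(registers: &mut [Value], first: usize, last: usize) {
    let end = last.max(first);
    for reg in &mut registers[first..=end] {
        *reg = Value::Null;
    }
}
```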
2025-02-19 21:46:26 +05:30
PThorpe92
9c8083231c Implement create virtual table and VUpdate opcode 2025-02-17 20:44:44 -05:00
Glauber Costa
fbe439f6c2 Implement the legacy_file_format pragma
easy implementation, sqlite claims it is a noop now

"This pragma no longer functions. It has become a no-op. The capabilities
formerly provided by PRAGMA legacy_file_format are now available using
the SQLITE_DBCONFIG_LEGACY_FILE_FORMAT option to the sqlite3_db_config()
C-language interface."
2025-02-14 09:50:29 -05:00
Pekka Enberg
ac54c35f92 Switch to workspace dependencies
...makes it easier to specify a version, which is needed for `cargo publish`.
2025-02-12 17:28:04 +02:00
Jonathan Webb
98e9d33478 Add read implementation of user_version pragma with ReadCookie opcode 2025-02-07 09:23:48 -05:00
Pekka Enberg
6ea7fa06d2 Merge 'prepare perf: make ProgramBuilder aware of plan to count/estimate required memory' from Jussi Saurio
Use knowledge of query plan to inform how much memory to initially
allocate for `ProgramBuilder` vectors
Some of them are exact, some are semi-random estimates
```sql
Prepare `SELECT 1`/Limbo/SELECT 1
                        time:   [756.93 ns 758.11 ns 759.59 ns]
                        change: [-4.5974% -4.3153% -4.0393%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

Prepare `SELECT * FROM users LIMIT 1`/Limbo/SELECT * FROM users LIMIT 1
                        time:   [1.4739 µs 1.4769 µs 1.4800 µs]
                        change: [-7.9364% -7.7171% -7.4979%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...`
                        time:   [3.7440 µs 3.7520 µs 3.7596 µs]
                        change: [-5.4627% -5.1578% -4.8445%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
```
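The idea can be sketched as below. Everything here is hypothetical: `PlanEstimate`, the reduced `ProgramBuilder`, and in particular the sizing formula, which is invented for illustration and is not taken from the PR. The point is only that `Vec::with_capacity` lets the builder avoid repeated reallocation while instructions are emitted.

```rust
/// Hypothetical summary of a query plan; the real planner types are not shown here.
struct PlanEstimate {
    table_count: usize,
    result_column_count: usize,
}

/// Hypothetical, heavily simplified builder holding two growable buffers.
struct ProgramBuilder {
    insns: Vec<String>,
    result_columns: Vec<String>,
}

impl ProgramBuilder {
    fn with_estimate(estimate: &PlanEstimate) -> Self {
        // Exact: one slot per result column.
        let columns = estimate.result_column_count;
        // Rough guess (made up for this sketch): a fixed prologue/epilogue
        // budget plus a per-table and per-column allowance.
        let insns = 8 + 4 * estimate.table_count + 2 * columns;
        ProgramBuilder {
            insns: Vec::with_capacity(insns),
            result_columns: Vec::with_capacity(columns),
        }
    }
}
```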

Closes #899
2025-02-05 18:24:16 +02:00
Jussi Saurio
795576b2ec dont eagerly allocate result column name strings 2025-02-05 17:53:23 +02:00
Jussi Saurio
f599b5a752 Make programbuilder aware of plan to count/estimate required memory 2025-02-05 14:22:42 +02:00
sonhmai
2d4bf2eb62 core: move pragma statement bytecode generator to its own file. 2025-02-03 09:21:14 +07:00