Commit Graph

4349 Commits

Author SHA1 Message Date
Jussi Saurio
07415dac07 Merge 'Count optimization' from Pedro Muniz
After reading #1123, I wanted to see what optimizations I could do.
SQLite optimizes the `count` aggregation for the following case: `SELECT
count() FROM <tbl>`. This query is so widely used that SQLite has an
optimization just for it in the form of the `COUNT` opcode.
This PR implements that optimization by creating the `COUNT`
opcode and checking in the select emitter whether the query is a Simple
Count Query. If it is, we just emit the opcode instead of going through
a Rewind loop, saving execution time.
The screenshots below show a huge decrease in execution time.
- **Main**
<img width="383" alt="image" src="https://github.com/user-attachments/assets/99a9dec4-e7c5-41db-ba67-4eafa80dd2e6" />
- **Count Optimization**
<img width="435" alt="image" src="https://github.com/user-attachments/assets/e93b3233-92e6-4736-aa60-b52b2477179f" />
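
The detection step described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `SimpleSelect`, `ResultColumn`, `Plan`, and the exact set of disqualifying conditions (a WHERE clause, a GROUP BY clause, more than one table) are assumptions made for the example.

```rust
/// Hypothetical, heavily simplified result column (not Limbo's real AST).
#[derive(Debug, Clone, PartialEq)]
enum ResultColumn {
    /// `count()` or `count(*)` with no arguments.
    CountStar,
    /// Any other expression, kept as text for the sketch.
    Other(String),
}

/// Hypothetical, heavily simplified SELECT statement.
#[derive(Debug, Clone)]
struct SimpleSelect {
    columns: Vec<ResultColumn>,
    table_count: usize,
    has_where: bool,
    has_group_by: bool,
}

/// The two code paths the emitter can choose between in this sketch.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Emit a single count opcode.
    CountOpcode,
    /// Fall back to scanning the table row by row.
    RewindLoop,
}

/// Assumed conditions for a "Simple Count Query": one table, a single
/// `count()` result column, and no WHERE or GROUP BY clause.
fn is_simple_count(q: &SimpleSelect) -> bool {
    q.table_count == 1
        && !q.has_where
        && !q.has_group_by
        && q.columns.len() == 1
        && q.columns[0] == ResultColumn::CountStar
}

/// Pick the cheap path only when the query matches the simple shape.
fn choose_plan(q: &SimpleSelect) -> Plan {
    if is_simple_count(q) {
        Plan::CountOpcode
    } else {
        Plan::RewindLoop
    }
}
```

Under these assumptions, `SELECT count() FROM t` maps to `Plan::CountOpcode`, while adding a WHERE clause falls back to `Plan::RewindLoop`.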

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1443
2025-05-12 10:01:50 +03:00
Jussi Saurio
6d11035f92 Merge 'Fix bound parameters on insert statements with out of order column indexes' from Preston Thorpe
closes #1449
A quick `explain` will show the root of the problem:
![image](https://github.com/user-attachments/assets/c5e9f2cf-494f-410a-9e6f-399d4256cbc8)
vs `sqlite3`
![image](https://github.com/user-attachments/assets/79b18254-4a5d-40bb-a5f2-1322326b745f)
We process the columns in the order required to build the record.
Although we store the index of the value to be inserted into each
column, based on the order in which the values are bound, we still
lose the original ordering.
## EDIT: Fixed
![image](https://github.com/user-attachments/assets/4b40451a-6eda-4ff1-8f21-a36ba8598d00)
### Multi-row insert:
![image](https://github.com/user-attachments/assets/5b68b44c-1a96-442b-80cd-3a4df0e598ae)
## Solution:
We just needed to traverse the insert values beforehand and create an
array of N integers, each holding the position of an
`ast::Expr::Variable`. Later on we can look up the `value_index` in
that array and take the position of the matching integer to get the
parameter index to bind to the relevant variable.
(yes I know that sounds like a leetcode problem)
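
A minimal sketch of that lookup, under stated assumptions: `ValueExpr`, `variable_positions`, and `parameter_index` are hypothetical names invented for this example (the real code works on `ast::Expr`), and parameter indexes are assumed to be 1-based.

```rust
/// Hypothetical stand-in for one expression in an INSERT ... VALUES list.
#[derive(Debug, Clone, PartialEq)]
enum ValueExpr {
    /// A bound parameter such as `?`.
    Variable,
    /// Any literal value.
    Literal(i64),
}

/// Walk the VALUES list once, up front, and record the position of every
/// bound parameter in the order it appears in the SQL text.
fn variable_positions(values: &[ValueExpr]) -> Vec<usize> {
    values
        .iter()
        .enumerate()
        .filter(|(_, v)| matches!(v, ValueExpr::Variable))
        .map(|(i, _)| i)
        .collect()
}

/// Given the position of a value in the VALUES list, find where it sits in
/// the recorded array. That slot, plus one, is the (assumed 1-based)
/// parameter index. Returns `None` when the value is not a parameter.
fn parameter_index(positions: &[usize], value_index: usize) -> Option<usize> {
    positions
        .iter()
        .position(|&p| p == value_index)
        .map(|slot| slot + 1)
}
```

For `INSERT INTO t (c, a, b) VALUES (?, 5, ?)`, the recorded array is `[0, 2]`. Even if the columns are later processed in table order (`a`, `b`, `c`), value position 2 still resolves to the second parameter and value position 0 to the first.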
Thanks to @jnesss for the write-up for this bug

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1459
2025-05-12 10:00:38 +03:00
PThorpe92
ab23f2a24f Add comments and reorganize fix of ordering parameters for insert statements 2025-05-11 14:20:57 -04:00
pedrocarlo
9f726dbe62 simplify simple count detection 2025-05-10 22:36:43 -03:00
pedrocarlo
977d09fd36 small fixes 2025-05-10 22:23:01 -03:00
pedrocarlo
7508043b62 add bench for select count 2025-05-10 22:23:01 -03:00
pedrocarlo
e9b1631d3c fix is_simple_count detection 2025-05-10 22:23:01 -03:00
pedrocarlo
342bf51c88 remove state machine for count 2025-05-10 22:23:01 -03:00
pedrocarlo
655ceeca45 correct count implementation 2025-05-10 22:23:01 -03:00
pedrocarlo
b69debb8c0 create count opcode 2025-05-10 22:23:01 -03:00
PThorpe92
1c7a50de96 Update comments and correct vtab insert behavior 2025-05-10 10:03:00 -04:00
PThorpe92
4e3efe655d Add integration test for binding parameters on insert of multiple rows 2025-05-10 07:46:30 -04:00
PThorpe92
e9458de0a4 Use correct math to get value indices for nth row on multiple insert 2025-05-10 07:46:30 -04:00
PThorpe92
0d73fe0fe7 Fix parameter position on insert by handling before vdbe layer 2025-05-10 07:46:29 -04:00
PThorpe92
50f2621c12 Add several more rust tests for parameter binding 2025-05-10 07:46:29 -04:00
PThorpe92
56f5f47e86 Remove try init from tracing subscriber in tests to prevent excessive output 2025-05-10 07:46:29 -04:00
PThorpe92
c4aee50b58 Fix unclear comments in translator 2025-05-10 07:46:29 -04:00
PThorpe92
7a5422ee30 Clean up api for remap parameters and consolidate code 2025-05-10 07:46:29 -04:00
PThorpe92
9b8227dbf8 Add extensive rust integration tests for bindings parameters 2025-05-10 07:46:29 -04:00
PThorpe92
3691779408 Add tracing for remapping parameters 2025-05-10 07:46:29 -04:00
PThorpe92
3b09b9892c Comment out Go tests for binding parameters 2025-05-10 07:46:29 -04:00
PThorpe92
273711bf81 Impl Debug for Limbo value type in Go bindings 2025-05-10 07:46:28 -04:00
PThorpe92
d412e7c682 Improve naming of parameter remapping methods 2025-05-10 07:46:28 -04:00
PThorpe92
828840c371 Update bind_at api to check for recalculated parameter offset 2025-05-10 07:46:28 -04:00
PThorpe92
d908e78729 Use positional offsets in translate::expr to remap parameters to their correct offsets 2025-05-10 07:46:27 -04:00
PThorpe92
9c8dd7ebae Store current offset and value positions on program builder to remap bound parameters 2025-05-10 07:44:30 -04:00
PThorpe92
1e07e6d1b2 Add remap vec to parameters.rs to allow for reordering of arguments 2025-05-10 07:44:29 -04:00
PThorpe92
e5723b2ca1 Add test in Go bindings for parameters at diff indexes than table ordering 2025-05-10 07:44:29 -04:00
PThorpe92
c10df4788f Add current_col_idx field to program builder to keep insert order for binding params 2025-05-10 07:44:24 -04:00
Pekka Enberg
14ef25ebb8 Merge 'Add drop index' from Anton Harniakou
This commit adds support for DROP INDEX.
The bytecode produced by this commit differs from SQLite's bytecode, mainly
because we don't do autovacuum or repacking of pages like SQLite does.
Closes #1280

Closes #1444
2025-05-10 08:04:39 +03:00
Pekka Enberg
f2372e9aac Merge 'bindings/java: Remove disabled annotation for UPDATE and DELETE ' from Kim Seon Woo
## Changes
Since limbo now supports UPDATE and DELETE, I'm planning to implement
the rest of the features for integrating JDBC and limbo.
Since limbo's [total_changes() function seems to work differently from
SQLite's](https://discord.com/channels/1258658826257961020/1321869557459058778/1368830400985563176),
let's remove the `@Disabled` annotation from the
`execute_update_should_return_number_of_updated_elements` test after
the issue is handled.
## Reference
https://github.com/tursodatabase/limbo/issues/615

Closes #1451
2025-05-10 08:02:13 +03:00
Pekka Enberg
73c0bd0737 Merge 'Refactor numeric literal' from meteorgan
Closes #1461
2025-05-10 08:00:58 +03:00
Pekka Enberg
be1621e099 Merge 'EXPLAIN should show a comment for the Insert opcode' from Anton Harniakou
After this commit EXPLAIN should show a comment for `Insert`.
```
limbo> explain insert into t (age, name, id) values (20, 'max', 1);
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     9     0                    0   Start at 9
1     OpenWrite          0     2     0                    0
2     Integer            1     2     0                    0   r[2]=1
3     String8            0     3     0     max            0   r[3]='max'
4     Integer            20    4     0                    0   r[4]=20
5     NewRowId           0     1     0                    0
6     MakeRecord         2     3     5                    0   r[5]=mkrec(r[2..4])
7     Insert             0     5     1     t              0   intkey=r[1] data=r[5]
8     Halt               0     0     0                    0
9     Transaction        0     1     0                    0   write=true
10    Goto               0     1     0                    0

```

Closes #1452
2025-05-10 07:59:36 +03:00
Pekka Enberg
97396a553d Merge 'bindings/wasm: add types property for typescript setting' from 오병진
This pull request includes a small change to the
`bindings/wasm/package.json` file. The change adds a `types` field
pointing to the TypeScript declaration file for the package
(`[bindings/wasm/package.jsonR15](diffhunk://#diff-1f41234b939ba148924b9bbfedd557853ebcd22351a9c300e568ce7af0291babR15)`).

Closes #1460
2025-05-10 07:59:03 +03:00
Pekka Enberg
a105c20f69 Merge 'Implement transaction support in Go adapter' from Jonathan Ness
This PR implements basic transaction support in the Limbo Go adapter by
adding the required methods to fulfill the `driver.Tx` interface.
## Changes
- Add `Begin()` method to `limboConn` to start a transaction
- Add `BeginTx()` method with context support and proper handling of
transaction options
- Implement `Commit()` method to commit transaction changes
- Implement `Rollback()` method with appropriate error handling
- Add transaction tests
## Implementation Details
- Uses the standard SQLite transaction commands (BEGIN, COMMIT,
ROLLBACK)
- Follows the same pattern as other SQL operations in the adapter
(prepare-execute-close)
- Maintains consistent locking and error handling patterns
## Limitations
- Currently, ROLLBACK operations will return an error as they're not yet
fully supported in the underlying Limbo implementation
- Only the default isolation level is supported; all other isolation
levels return `driver.ErrSkip`
- Read-only transactions are not supported and return `driver.ErrSkip`
## Testing
- Added basic transaction tests that verify BEGIN and COMMIT operations
- Adjusted tests to work with the current Limbo implementation
capabilities
These transaction methods enable the Go adapter to be used in
applications that require transaction support while providing clear
error messages when unsupported features are requested.  I'll add to it
when Limbo supports ROLLBACK and/or additional isolation levels.

Closes #1435
2025-05-10 07:58:29 +03:00
Pekka Enberg
97ad25c506 Merge 'Initial implementation of ALTER TABLE RENAME' from Levy A.
- [x] `ALTER TABLE _ RENAME TO _`

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1456
2025-05-10 07:57:42 +03:00
Pekka Enberg
9bc1b73d67 Merge 'bindings/javascript: Improve compatibility with better-sqlite' from Diego Reis
This is probably larger than it should be, but this PR:
- Enhance error handling (instead of `unwrap`s, napi's `GenericFailure`
with a custom message)
- Add missing methods and properties for Database and Statement to make
explicit what should be done
- Implement `all()` and `run()`* methods for `Statement`
- Bump napi version to the latest stable
- Implement `DatabaseStorage::sync()` for DatabaseFile
- Add conversion from js values to limbo values
\* `run()` isn't 100% compatible yet for two reasons:
1. The function API isn't variadic -- I chatted with napi's maintainer
in Discord and he said that this isn't possible using only napi (in this
version), and we could normalize "[...] parameters at the JavaScript
layer before passing them to Rust". Something similar to
[this](https://github.com/rolldown/rolldown/blob/main/packages/rolldown/src/utils/bindingify-input-options.ts),
which I plan to do in another PR.
2. The better-sqlite version returns a result object, which isn't currently
supported by core (AFAIK); I also plan to do this in another PR.

Closes #1464
2025-05-10 07:55:19 +03:00
Pekka Enberg
9bf3f1e90d Merge 'Add time.Time and bool data types support in Go adapter' from Jonathan Ness
PR #1442 added support for using time.Time as query parameters.
Scanning time.Time values from query results still fails:
`sql: Scan error on column index 4, name "mod_time": unsupported Scan,
storing driver.Value type string into type *time.Time`
This change modifies the `toGoValue` function to detect RFC3339
formatted datetime strings and convert them back to time.Time objects
when reading from the database. This provides complete roundtrip support
for time.Time values.
Also added boolean support by converting Go bool values to integers (0
for false, 1 for true) when binding parameters, following SQLite's
convention for representing boolean values.

Reviewed-by: Preston Thorpe (@PThorpe92)

Closes #1465
2025-05-10 07:54:10 +03:00
Pekka Enberg
0949e7d2f2 Merge 'bindings/go: Upgrade ebitengine/purego to allow for use with go 1.23.9' from Preston Thorpe
Go 1.23.9 introduced a change to its linker that caused a
`duplicate symbol` error with purego 1.82.
This is the recommended fix per
https://github.com/golang/go/issues/73617

Closes #1466
2025-05-10 07:53:44 +03:00
PThorpe92
efd4767b6a Bindings/Go: Upgrade ebitengine/purego to allow for use with go 1.23.9 2025-05-09 20:25:29 -04:00
jnesss
02d141e3ce add support for bool type 2025-05-09 09:01:24 -07:00
Jussi Saurio
9a5990f87e Merge 'Add tests for INSERT with specified column-name list' from Anton Harniakou
Let's add some missing tests for the INSERT statement.

Closes #1455
2025-05-09 08:59:39 +03:00
Jussi Saurio
bda6526d28 Merge 'GROUP BY: refactor logic to support cases where no sorting is needed' from Jussi Saurio
Right now we have the following problem with GROUP BY:
- it always allocates a sorter and sorts the input rows, even when the
rows are already sorted in the right order
This PR is a refactor supporting a future PR that introduces a new
version of the optimizer which does 1. join reordering and 2. sorting
elimination based on plan cost. The PR splits GROUP BY into multiple
subsections:
1. Initializing the sorter, if needed
2. Reading rows from the sorter, if needed
3. Doing the actual grouping (this is done regardless of whether sorting
is needed)
4. Emitting rows during grouping in a subroutine (this is done
regardless of whether sorting is needed)
For example, you might currently have the following pseudo-bytecode for
GROUP BY:
```
SorterOpen (groupby_sorter)
OpenRead (users)
Rewind (users)
   <read columns from users>
   SorterInsert (groupby_sorter)
Next (users)
SorterSort (groupby_sorter)
   <do grouping>
SorterNext (groupby_sorter)
ResultRow
```
This PR allows us to do the following in cases where the rows are
already sorted:
```
OpenRead (users)
Rewind (users)
  <read columns from users>
  <do grouping>
Next (users)
ResultRow
```
---
In fact this is where the vast majority of the changes in this PR come
from -- eliminating the implied assumption that sorting for GROUP BY is
always required. The PR does not change current behavior, i.e. sorting
is always done for GROUP BY, but it adds the _ability_ to not do sorting
if the planner so decides.
The most important changes to understand are these:
```rust
/// Enum representing the source for the rows processed during a GROUP BY.
/// In case sorting is needed (which is most of the time), the variant
/// [GroupByRowSource::Sorter] encodes the necessary information about that
/// sorter.
///
/// In case where the rows are already ordered, for example:
/// "SELECT indexed_col, count(1) FROM t GROUP BY indexed_col"
/// the rows are processed directly in the order they arrive from
/// the main query loop.
#[derive(Debug)]
pub enum GroupByRowSource {
    Sorter {
        /// Cursor opened for the pseudo table that GROUP BY reads rows from.
        pseudo_cursor: usize,
        /// The sorter opened for ensuring the rows are in GROUP BY order.
        sort_cursor: usize,
        /// Register holding the key used for sorting in the Sorter
        reg_sorter_key: usize,
        /// Number of columns in the GROUP BY sorter
        sorter_column_count: usize,
        /// In case some result columns of the SELECT query are equivalent to GROUP BY members,
        /// this mapping encodes their position.
        column_register_mapping: Vec<Option<usize>>,
    },
    MainLoop {
        /// If GROUP BY rows are read directly in the main loop, start_reg_src is the first register
        /// holding the value of a relevant column.
        start_reg_src: usize,
        /// The grouping columns for a group that is not yet finalized must be placed in new registers,
        /// so that they don't get overwritten by the next group's data.
        /// This is because the emission of a group that is "done" is made after a comparison between the "current" and "next" grouping
        /// columns returns nonequal. If we don't store the "current" group in a separate set of registers, the "next" group's data will
        /// overwrite the "current" group's columns and the wrong grouping column values will be emitted.
        /// Aggregation results do not require new registers as they are not at risk of being overwritten before a given group
        /// is processed.
        start_reg_dest: usize,
    },
}

/// Enum representing the source of the aggregate function arguments
/// emitted for a group by aggregation.
/// In the common case, the aggregate function arguments are first inserted
/// into a sorter in the main loop, and in the group by aggregation phase
/// we read the data from the sorter.
///
/// In the alternative case, no sorting is required for group by,
/// and the aggregate function arguments are retrieved directly from
/// registers allocated in the main loop.
pub enum GroupByAggArgumentSource<'a> {
    /// The aggregate function arguments are retrieved from a pseudo cursor
    /// which reads from the GROUP BY sorter.
    PseudoCursor {
        cursor_id: usize,
        col_start: usize,
        dest_reg_start: usize,
        aggregate: &'a Aggregate,
    },
    /// The aggregate function arguments are retrieved from a contiguous block of registers
    /// allocated in the main loop for that given aggregate function.
    Register {
        src_reg_start: usize,
        aggregate: &'a Aggregate,
    },
}
```
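
To illustrate why already-sorted input needs no sorter, here is a small standalone sketch of streaming grouping. It is not taken from the PR: `count_per_group_presorted` is a hypothetical helper, and a plain `count` over string keys stands in for arbitrary aggregates. The "current" group is held in its own variable, loosely mirroring the `start_reg_dest` idea above of keeping the unfinished group's columns apart from the next row's data.

```rust
/// Count rows per group, assuming `keys` is already ordered so that equal
/// keys are adjacent (for example, rows read in index order).
fn count_per_group_presorted(keys: &[&str]) -> Vec<(String, usize)> {
    let mut out = Vec::new();
    // The group being accumulated; it is emitted only once a different key
    // arrives or the input ends.
    let mut current: Option<(String, usize)> = None;

    for &key in keys {
        // Compare the incoming row's key with the stored "current" key.
        let same_group = matches!(&current, Some((cur, _)) if cur.as_str() == key);
        if same_group {
            if let Some((_, count)) = current.as_mut() {
                *count += 1;
            }
        } else {
            // The key changed: the previous group is done, so emit it
            // before starting a new one.
            if let Some(done) = current.take() {
                out.push(done);
            }
            current = Some((key.to_string(), 1));
        }
    }

    // Emit the final group, if any rows were seen.
    if let Some(done) = current {
        out.push(done);
    }
    out
}
```

If the input is not ordered by the grouping key, the same key can be emitted more than once, which is the case the sorter path exists to handle.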

Closes #1438
2025-05-09 08:56:38 +03:00
jnesss
c0dd79adc2 Add time.Time support for scanning query results 2025-05-08 20:51:54 -07:00
Diego Reis
242f4d7cdc bind/js: Add tests and some fixes 2025-05-08 15:57:46 -03:00
Diego Reis
64874bca4e bind/js: Add method all() 2025-05-08 15:23:39 -03:00
Diego Reis
559263ce3c bind/js: Add execution of prepared statements with bindings 2025-05-08 15:16:19 -03:00
meteorgan
261adb5ed7 fix cargo fmt 2025-05-08 22:26:50 +08:00
meteorgan
a1f981a973 handle int64 overflow by f64 2025-05-08 22:22:55 +08:00
Diego Reis
0aa46154ab bind/js: Add conversion from js types to limbo types 2025-05-08 10:32:46 -03:00