Previously we implemented update as a simple `Delete` + `Insert`
procedure which seemed okay for the moment but it wasn't. `Delete` can
trigger balance and a post balance `seek` which will leave cursor
pointing to an invalid page which `Insert` will try to insert to.
We solve this by removing `Delete` from the execution plan and rely on
`Insert` to properly overwrite the cell where the rowid is the same as
the one we are inserting.
Previously, with the `index_experimental` feature enabled, the query in
the added test would enter an infinite loop. This happened because
`label_grouping_agg_step` pointed to a constant argument that was moved
to the end of the program. As a result, the aggregation loop would jump
to the constant, then return to the start of the main loop, rewind the
index, and re-enter the aggregation loop - causing it to repeat
indefinitely.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1792
Previously, with the `index_experimental` feature enabled, the query in
the added test would enter an infinite loop. This happened because
`label_grouping_agg_step` pointed to a constant argument that was moved
to the end of the program. As a result, the aggregation loop would jump
to the constant, then return to the start of the main loop, rewind the
index, and re-enter the aggregation loop—causing it to repeat
indefinitely.
This PR initializes an UV project in `antithesis_tests` so that we can
have an easier time to track dependencies and build pylimbo
automatically for our environment. Consequently, making it easier to
create new antithesis tests in the future with better IDE support.
Also modified our Github actions to check python linting with Ruff, and
removed unnecessary Python jobs. With that, I applied the Ruff fixes
which is the cause of the many file changes.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1782
Fixes DELETE not emitting conditional jumps at all if the associated
WhereTerm is a constant, e.g.
```sql
limbo> create table t(x);
limbo> explain DELETE FROM t WHERE 5-5;
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 7 0 0 Start at 7
1 OpenWrite 0 2 0 0 root=2; t
2 Rewind 0 6 0 0 Rewind table t
3 RowId 0 1 0 0 r[1]=t.rowid
4 Delete 0 0 0 0
5 Next 0 3 0 0
6 Halt 0 0 0 0
7 Transaction 0 1 0 0 write=true
8 Goto 0 1 0 0
```
I was adding more stuff to the simulator in a Branch of mine, and I
caught this error with delete. Upstreaming the fix here. As we do with
Update, I added the translation step for the `WhereTerms` of the query.
Edit: Closes#1732. Closes#1733. Closes#1734. Closes#1735. Closes
#1736. Closes#1738. Closes#1739. Closes#1740.
Edit: Also pushes constant where term translation to `init_loop` for
Update and Select as well.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1746
The `RowData` opcode is required to implement #1575.
I haven't found a ideal way to test this PR independently, but I
verified its functionality while working on #1575(to be committed soon),
and it performs effectively.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1756
This PR has two parts:
1. The first commit refactors how information about which registers
should be populated in the aggregation loop is calculated and
propagated. This simplification revealed a bug, which is addressed as
part of the same commit (see the included test).
2. The second commit fixes incorrect behavior for queries where complex
expressions include both aggregate and non-aggregate components. For
example, the following query previously produced incorrect results:
```sql
SELECT
CASE WHEN c0 != 'x' THEN group_concat(c1, ',') ELSE 'x' END
FROM t0
GROUP BY c0;
```
In such cases, non-aggregate columns like `c0` were not available during
the result construction for each group, leading to incorrect evaluation.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1780
It is easy to chalk this fuzzer issue to erratic floating point
behaviour, but this is not the case here.
Currently, `exec_math_log` calculates log with arbitrary bases by using
the following formula: `log_a(b) ~= ln(b) / ln(a)`. This calculation is
an approximation with lots of its floating point precision lost to
dividing the results of natural logarithms.
By using the specialized versions of the log functions (`log2` &
`log10`), we can avoid this loss of precision.
SQLite also uses these specialized log functions when possible, so it
doesn't hurt to do the same thing when aiming for parity.
This PR fixes#1763
Reviewed-by: Diego Reis (@el-yawd)
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#1786
It is easy to chalk this fuzzer issue to erratic floating point
behaviour, but this is not the case here.
Currently, `exec_math_log` calculates log with arbitrary bases by using
the following formula: `log_a(b) ~= ln(b) / ln(a)`. This calculation is
an approximation with lots of its floating point precision lost to
dividing the results of natural logarithms.
By using the specialized versions of the log functions (`log2` &
`log10`), we can avoid this loss of precision.
SQLite also uses these specialized log functions when possible, so it
doesn't hurt to do the same thing when aiming for parity.
Copies the binary from the antithesis build instead of release
Copies symbol files from the binary to the `/symbols` directory
Resolves the `Symbols were uploaded` and `Software was instrumented`
properties in the Antithesis triage reports
Closes#1783
Previously, queries like:
```
SELECT
CASE WHEN c0 != 'x' THEN group_concat(c1, ',') ELSE 'x' END
FROM t0
GROUP BY c0;
```
would return incorrect results because c0 was not copied during the
aggregation loop into a register accessible to the logic processing the
grouped results (e.g., the CASE WHEN expression in this example).
The same issue applied to expressions in the HAVING and ORDER BY clauses.
Previously, the logic for collecting non-aggregate columns was duplicated
across multiple locations and implemented inconsistently. This caused a
bug that was revealed by the refactoring in this commit (see the added
test).
In case of starting a write txn, we always start by starting a read txn.
If we don't end_read_tx then it will never release locks.
Furthermore, we add some concurrent tests that catched this issue.
The idea is quite simple: write with 4 concurrent writers and once all
are finsihed, check the count of rows written is correct.
Closes#1777
If multiple processes access the same database file concurrently, Limbo
fails with:
```
LockingError("Failed locking file. File is locked by another process
```
Make test driver robust against that.
Closes#1769
If multiple processes access the same database file concurrently, Limbo fails with:
```
LockingError("Failed locking file. File is locked by another process
```
Make test driver robust against that.
The `cargo-dist` tool attempts to find it but because we never build it,
packaging fails. Remove the extra binary. Probably better to work
towards making experimental indexes a runtime flag instead.
Closes#1779