Commit Graph

10673 Commits

Author SHA1 Message Date
pedrocarlo
b6f94b2fa1 remove dead code in sim 2025-10-09 17:25:04 -03:00
Nikita Sivukhin
7e727d07af fix bugs add tests 2025-10-09 23:23:16 +04:00
Nikita Sivukhin
10c51c8da0 add test for convert operation 2025-10-09 22:14:38 +04:00
Nikita Sivukhin
e18f26a1f1 fix bug after refactoring 2025-10-09 21:28:46 +04:00
Nikita Sivukhin
ac9a25a417 fix clippy 2025-10-09 21:19:35 +04:00
Nikita Sivukhin
5336801574 add jaccard distance 2025-10-09 21:15:39 +04:00
Nikita Sivukhin
585d11b736 implement operations for sparse vectors 2025-10-09 20:52:58 +04:00
Jussi Saurio
edf40cc65b clippy 2025-10-09 19:00:40 +03:00
Jussi Saurio
27a88b86dc Reuse a single RecordCursor per PseudoCursor 2025-10-09 18:56:49 +03:00
Jussi Saurio
812709cf8e inline collation comparison functions 2025-10-09 18:56:49 +03:00
Nikita Sivukhin
84643dc4f2 implement sparse vector operations 2025-10-09 19:19:33 +04:00
Diego Reis
d2d265a06f Small nits and code clean ups 2025-10-09 12:14:20 -03:00
Diego Reis
b8f8a87007 Refactor bytecode emission
- we were redundantly translating tmp
- Make emit_constant_insn a method of ProgramBuilder
2025-10-09 11:57:16 -03:00
Diego Reis
84e8d11764 Fix bug when jump_if_true is enabled 2025-10-09 11:57:16 -03:00
Diego Reis
625403cc2a Fix register reuse when called inside a coroutine
- On each iteration we assume that the value is NULL, so we need to
  set it to NULL for every iteration over the list. We therefore force
  this NULL not to be emitted as a constant;
- Forces a copy so IN expressions work inside an aggregation
  expression. Not ideal, but it works; we should work more on the query
  planner for sure.
2025-10-09 11:57:16 -03:00
Diego Reis
da323fa0c4 Some cleanups; now works correctly on WHERE clauses 2025-10-09 11:57:15 -03:00
Diego Reis
79958f468d Add jump_target_null to ConditionMetadata
It kind of makes sense: conditions can evaluate to 3 values: false,
true and null. Now we handle that.
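
A minimal sketch of the idea, not the actual `ConditionMetadata` definition: the struct, its field names, and the plain-integer offsets below are hypothetical. It only shows that a condition needs three jump targets once NULL is treated as its own outcome.

```rust
/// Hypothetical stand-in for the jump targets of a condition.
/// Offsets are plain integers here purely for illustration.
#[derive(Debug, Clone, Copy)]
struct JumpTargets {
    when_true: usize,
    when_false: usize,
    when_null: usize,
}

/// A SQL condition evaluates to true, false, or NULL.
/// NULL is modelled as `None`.
fn next_offset(result: Option<bool>, targets: JumpTargets) -> usize {
    match result {
        Some(true) => targets.when_true,
        Some(false) => targets.when_false,
        None => targets.when_null,
    }
}
```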
2025-10-09 11:56:14 -03:00
Diego Reis
52ed0f7997 Add IN expr optimization at the parser level instead of translation.
lhs IN () and lhs NOT IN () can be translated to false and true, respectively.
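
The rewrite can be sketched as a small constant-folding step. The `Expr` enum and `fold_empty_in_list` below are hypothetical and much simpler than the real parser AST; they only illustrate the rule stated in this message.

```rust
/// Hypothetical, heavily simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Bool(bool),
    InList { lhs: Box<Expr>, list: Vec<Expr>, not: bool },
}

/// Fold `lhs IN ()` to false and `lhs NOT IN ()` to true.
/// Any other expression is returned unchanged.
fn fold_empty_in_list(expr: Expr) -> Expr {
    match expr {
        // An empty list matches nothing, so the result depends only on NOT.
        Expr::InList { list, not, .. } if list.is_empty() => Expr::Bool(not),
        other => other,
    }
}
```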
2025-10-09 11:56:14 -03:00
Diego Reis
70fc509046 First step to fix 3277
This follows sqlite's functions almost step by step, and indeed it's
correct. But we still have to translate some of this logic to our
current semantics.
2025-10-09 11:56:14 -03:00
Jussi Saurio
0356a7102c remove another expensive assert 2025-10-09 17:50:15 +03:00
Jussi Saurio
a1a83c689b Don't yield if completion already succeeded 2025-10-09 17:50:06 +03:00
Jussi Saurio
1c35d5b342 avoid expensive Arc cloning 2025-10-09 17:43:28 +03:00
Jussi Saurio
1f310a4738 Remove expensive hot path assert 2025-10-09 17:29:18 +03:00
Mikaël Francoeur
7aa91d7c24 add pragma autovacuum_mode to simulator 2025-10-09 17:28:48 +03:00
Glauber Costa
f4116eb3d4 lie about sqlite version
I found an application in the open that expects sqlite_version() to
return a specific string (higher than 3.8...).

We had tons of those issues at Scylla, and the lesson was that you
tell your kids not to lie, but when life hits, well... you lie.

We'll add a new function, turso_version, that tells the truth.
2025-10-09 07:19:35 -07:00
Nikita Sivukhin
68632cc142 rename euclidean to L2 for consistency 2025-10-09 17:26:36 +04:00
Nikita Sivukhin
1ebf2b7c8d add f32 sparse vector type 2025-10-09 17:25:40 +04:00
Nikita Sivukhin
9e68fa7f4a simplify vector_slice operation 2025-10-09 17:11:13 +04:00
Nikita Sivukhin
d7f3a450ad return NaN for cosine distance instead of error
- errors are hard to handle in some scan operations (if something goes wrong in the middle, the whole query is aborted)
- it is more flexible to return NaN and let the user handle the situation
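
A rough sketch of the trade-off, under assumptions: the commit does not say which inputs used to produce the error, so the zero-norm case below is a guess, and `cosine_distance` and `nearest` are hypothetical helpers rather than the project's functions. Both inputs are assumed to have the same length.

```rust
/// Cosine distance, 1 - (a.b) / (|a| * |b|).
/// Assumption: a zero-norm input is the problematic case. It gives 0/0,
/// which is NaN under IEEE 754, so no explicit error path is needed.
fn cosine_distance(a: &[f32], b: &[f32]) -> f64 {
    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    1.0 - dot / (norm_a.sqrt() * norm_b.sqrt())
}

/// A scan over many rows keeps going past a NaN row instead of aborting;
/// the caller decides what to do with NaN (here: skip it).
fn nearest(query: &[f32], rows: &[Vec<f32>]) -> Option<usize> {
    rows.iter()
        .enumerate()
        .map(|(i, row)| (i, cosine_distance(query, row)))
        .filter(|(_, d)| !d.is_nan())
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```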
2025-10-09 17:06:49 +04:00
Nikita Sivukhin
14e104f830 add convert operation 2025-10-09 16:56:36 +04:00
Nikita Sivukhin
8584ee18a3 refactor parsing/deserialization 2025-10-09 16:36:39 +04:00
Jussi Saurio
bcca404551 Avoid string allocation in sorter record comparison 2025-10-09 15:34:27 +03:00
Nikita Sivukhin
a2f4376bd2 move more operations to the operations/ folder 2025-10-09 16:18:53 +04:00
Nikita Sivukhin
7e9e102f20 move vector operations under operations/ folder 2025-10-09 16:02:03 +04:00
Jussi Saurio
e0461dd78a Sorter: compute values upfront instead of deserializing on every comparison 2025-10-09 15:01:47 +03:00
Jussi Saurio
7948259d37 Merge 'optimizer: optimize range scans to use upper and lower bounds more efficiently' from Jussi Saurio
Made a new PR based on @sivukhin 's PR #2869 that had a lot of
conflicts. You can check out the PR description from there.
## The main idea is:
Before, if we had an index on `x` and had a query like `WHERE x > 100
and x < 200`, the plan would be something like:
```
- Seek to first row where x > 100
- Then, for every row, discard the row if x >= 200
```
This is highly wasteful in cases where there are a lot of rows where `x
>= 200`. Since our index is sorted on `x`, we know that once we hit the
_first_ row where `x >= 200`, we can stop iterating entirely.
So, the new plan is:
```
- Seek to first row where x > 100
- Then, iterate rows until x >= 200, and then stop
```
This also improves the situation for multi-column indexes. Imagine index
on `(x,y)` and a condition like `WHERE x = 100 and y > 100 and y < 200`.
Before, the plan was:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and discard the row if y >= 200
- Stop when x > 100
```
This also suffers from a problem where if there are a lot of rows where
`x=100` and `y >= 200`, we go through those rows unnecessarily. The new
plan is:
```
- Seek to first row where x=100 and y > 100
- Then, iterate rows while x = 100 and y < 200
- Stop when either x > 100 or y >= 200
```
Which prevents us from iterating rows like `x=100, y = 666`
unnecessarily because we know the index is sorted on `(x,y)` - once we
hit any row where `x>100` OR `x=100, y >= 200`, we can stop.
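
As an illustration only, the two plans for the `(x,y)` case can be modelled on a sorted slice. This is a sketch, not the optimizer's code: `Entry`, `seek_gt`, `scan_old`, and `scan_new` are hypothetical names, and the index is assumed to be a plain slice sorted by `(x, y)`.

```rust
/// Hypothetical index entry for an index on (x, y), kept sorted by (x, y).
type Entry = (i64, i64);

/// Seek: position of the first entry strictly greater than `key`.
fn seek_gt(index: &[Entry], key: Entry) -> usize {
    index.partition_point(|&e| e <= key)
}

/// Old plan: iterate while x = x_eq, discarding rows where y >= y_hi.
/// Returns the matching rows and how many index rows were visited.
fn scan_old(index: &[Entry], x_eq: i64, y_lo: i64, y_hi: i64) -> (Vec<Entry>, usize) {
    let mut rows = Vec::new();
    let mut visited = 0;
    for &(x, y) in &index[seek_gt(index, (x_eq, y_lo))..] {
        visited += 1;
        if x > x_eq {
            break; // only stopping condition in the old plan
        }
        if y >= y_hi {
            continue; // row is read, then thrown away
        }
        rows.push((x, y));
    }
    (rows, visited)
}

/// New plan: stop at the first row where x > x_eq or y >= y_hi.
fn scan_new(index: &[Entry], x_eq: i64, y_lo: i64, y_hi: i64) -> (Vec<Entry>, usize) {
    let mut rows = Vec::new();
    let mut visited = 0;
    for &(x, y) in &index[seek_gt(index, (x_eq, y_lo))..] {
        visited += 1;
        if x > x_eq || y >= y_hi {
            break; // sorted on (x, y): no later row can match
        }
        rows.push((x, y));
    }
    (rows, visited)
}
```

Both functions return the same rows; only the number of visited index entries differs, which is the cost the new plan avoids.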

Closes #3644
2025-10-09 14:47:15 +03:00
Jussi Saurio
f9f8eda3c3 Merge 'add Calendar-based timezone conversion support in JDBC4ResultSet' from 김민석
## Summary
Implemented Calendar-based Date/Time/Timestamp getter methods in
JDBC4ResultSet to support timezone conversions.
## Changes
- Implemented `getDate(int, Calendar)` and `getDate(String, Calendar)`
- Implemented `getTime(int, Calendar)` and `getTime(String, Calendar)`
- Implemented `getTimestamp(int, Calendar)` and `getTimestamp(String,
Calendar)`
- Fixed timezone conversion logic (changed from subtraction to addition)
- Added comprehensive test cases for all implemented methods
## Test Results
- All tests passed successfully
- New tests validate timezone conversion with UTC and Seoul (UTC+9)

Reviewed-by: Kim Seon Woo (@seonWKim)

Closes #3607
2025-10-09 12:52:09 +03:00
Jussi Saurio
e726803ab4 Merge 'translate: make bind_and_rewrite_expr() reject unbound identifiers if no referenced tables exist' from Jussi Saurio
Before, we just skipped evaluating `Id`, `Qualified` and
`DoublyQualified` if `referenced_tables` was `None`, leading to shit
like #3621. Let's eagerly return `"No such column"` parse errors in
these cases instead, and punch exceptions for cases where that doesn't
cleanly work
Top tip: use `Hide whitespace` toggle when inspecting the diff of this
PR
Closes #3621

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3626
2025-10-09 12:45:16 +03:00
Jussi Saurio
ab88e7c206 Merge 'don't allow duplicate col names in create table' from Pavan Nambi
closes https://github.com/tursodatabase/turso/issues/3637

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3641
2025-10-09 12:44:28 +03:00
Jussi Saurio
190e6f2e93 Merge 'Simulator: ignore Property::AllTableHaveExpectedContent when counting stats' from Pedro Muniz
`Property::AllTableHaveExpectedContent` adds a lot of simple Select full
table scans to check the DB state. These statements were being counted
in `InteractionStats`. The stats are used to calculate the Remaining
queries we need to make. By not counting these simple checks we allow
the simulator to create more meaningful interactions.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3643
2025-10-09 12:43:56 +03:00
Jussi Saurio
a76cdb83c5 fuzz: sometimes add another condition on the same column to exercise index range queries 2025-10-09 12:34:52 +03:00
Nikita Sivukhin
5b6e8e4b84 Float32/Float64 -> Float32Dense/Float64Dense 2025-10-09 13:28:40 +04:00
Nikita Sivukhin
4313f57ecb Optimize range scans 2025-10-09 11:47:41 +03:00
Pavan-Nambi
414f92d0a0 go back to for loop
cleanup

clippy
2025-10-09 13:50:45 +05:30
Jussi Saurio
acb3c97fea Merge 'When pwritev fails, clear the dirty pages' from Pedro Muniz
If we don't clear the dirty pages, we will initiate a rollback. In the
rollback, we will attempt to clear the whole page cache, but it will
then panic because there will still be dirty pages from the failed
writev

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #3189
2025-10-09 10:38:47 +03:00
Pekka Enberg
e3144a74aa Merge 'Add Nightly versions of benchmarks that run on Nyrkiö runners' from Henrik Ingo
Nyrkiö is piloting a GitHub Runner service, and congratulations, you are
the pilot customer! In order to reduce the risk somewhat, we'll
introduce this as a parallel workflow, so the existing benchmarks will
continue to run on the regular runners and they won't have any
discontinuity in their results because of this. Also since these runners
are actually a bit more expensive, we can manage the cost by only
running them periodically. Similarly, this allows us to fine tune the
Nyrkiö instance size, for example to have an EBS disk that is the
optimal size and IOPS configuration, so you don't pay for empty space.
I will show up in discord to discuss.
And congratulations on the beta release by the way! I worked hard to get
this done for your beta period, let's hope it works now.

Closes #3619
2025-10-09 10:09:57 +03:00
pedrocarlo
f54b1132ca ignore Property::AllTableHaveExpectedContent when counting stats, so we can generate more interesting interactions 2025-10-09 01:20:03 -03:00
Pavan-Nambi
f0d9ead19f add more tests
refactor and use sort_unstable_by_key
2025-10-09 08:28:59 +05:30
Pavan-Nambi
f138448da2 don't allow duplicate col names in create table 2025-10-09 08:09:31 +05:30
kimminseok
76320e82db lint issues with spotless 2025-10-09 11:19:29 +09:00