Add handling of malformed inputs to `read_varint`, along with test cases.
```
# 9 byte truncated to 8
read_varint(&[0x81, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80])
before -> panic index out of bounds: the len is 8 but the index is 8
after -> LimboError
# bits set without end
read_varint(&[0x80; 9])
before -> Ok((128, 9))
after -> LimboError
```
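The two cases above can be reproduced with a small standalone decoder. This is only a sketch, not the actual patch: `VarintError` is a hypothetical stand-in for `LimboError`, and the rule used to reject `[0x80; 9]` (a 9-byte encoding must carry a value that does not fit in 8 bytes) is an assumption about why that input is treated as malformed.

```rust
/// Hypothetical error type for this sketch; the real function returns a `LimboError`.
#[derive(Debug, PartialEq, Eq)]
enum VarintError {
    /// The buffer ended while the continuation bit was still set.
    Truncated,
    /// Nine bytes were used for a value that fits in eight (assumed rule).
    NonCanonical,
}

/// Decode a SQLite-style varint: up to 8 bytes carrying 7 bits each,
/// then an optional 9th byte carrying all 8 bits.
/// Returns the value and the number of bytes consumed.
fn read_varint(buf: &[u8]) -> Result<(u64, usize), VarintError> {
    let mut v: u64 = 0;
    for i in 0..8 {
        // `get` avoids the out-of-bounds panic on truncated input.
        let byte = *buf.get(i).ok_or(VarintError::Truncated)?;
        v = (v << 7) | u64::from(byte & 0x7f);
        if byte & 0x80 == 0 {
            return Ok((v, i + 1));
        }
    }
    let last = *buf.get(8).ok_or(VarintError::Truncated)?;
    // After 8 bytes `v` holds 56 bits. If its top 8 bits are zero, the final
    // value is below 2^56 and would have fit in at most 8 bytes.
    if (v >> 48) == 0 {
        return Err(VarintError::NonCanonical);
    }
    Ok(((v << 8) | u64::from(last), 9))
}
```

With this sketch, the truncated 8-byte input yields `Err(VarintError::Truncated)` instead of panicking, and `[0x80; 9]` yields `Err(VarintError::NonCanonical)` instead of `Ok((128, 9))`.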
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #2904
Fixes #2937. We don't need to check for comments in the CLI because the
parser already handles them (I guess).
How to reproduce:
```sh
turso> --whatCREATE TABLE users (
id INT PRIMARY KEY,
first_name VARCHAR(50),
age INT
);
```
Closes #2938
This fairly long commit implements persistence for materialized views. It
is hard to split because of all the interdependencies between
components, so it is one big thing. This commit message will at least
try to go into details about the basic architecture.
Materialized Views as tables
============================
Materialized views are now a normal table - whereas before they were a
virtual table. By making a materialized view a table, we can reuse all
the infrastructure for dealing with tables (cursors, etc).
One of the advantages of doing this is that we can create indexes on
view columns. Later, we should also be able to write those views to
separate files with ATTACH write.
Materialized Views as Zsets
===========================
The contents of the table are a ZSet: rowid, values, weight. Readers
will notice that because of this, the usage of the ZSet data structure
dwindles throughout the codebase. The main difference between our
materialized ZSet and the standard DBSP ZSet is that ours is backed by a
BTree, not a hash table (since SQLite tables are BTrees).
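As a rough illustration of the row shape described above, here is a minimal in-memory sketch that uses `std::collections::BTreeMap` in place of the on-disk BTree. The type `MaterializedZSet`, the `apply` method, the `Vec<String>` value representation, and the choice to drop rows whose weight reaches zero are all assumptions made for the example, not the actual implementation.

```rust
use std::collections::BTreeMap;

type RowId = i64;
type Weight = i64;

/// Hypothetical in-memory stand-in for the persisted view table:
/// each entry is (rowid, values, weight), ordered by rowid like a BTree.
#[derive(Default)]
struct MaterializedZSet {
    rows: BTreeMap<RowId, (Vec<String>, Weight)>,
}

impl MaterializedZSet {
    /// Apply a delta: add `weight` to the row's current weight.
    /// A row whose weight reaches zero is removed (assumed behavior).
    fn apply(&mut self, rowid: RowId, values: Vec<String>, weight: Weight) {
        let new_weight = self.rows.get(&rowid).map_or(0, |(_, w)| *w) + weight;
        if new_weight == 0 {
            self.rows.remove(&rowid);
        } else {
            self.rows.insert(rowid, (values, new_weight));
        }
    }

    /// Rowids currently stored, in key order.
    fn rowids(&self) -> Vec<RowId> {
        self.rows.keys().copied().collect()
    }
}
```

Inserting a row with weight `+1` and later applying the same rowid with weight `-1` leaves the set empty, which mirrors how an insert followed by a delete cancels out in a ZSet.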
Aggregator State
================
In DBSP, the aggregator nodes also have state. To store that state,
there is a second table. The table holds all aggregators in the view,
and there is one table per view. That is
__turso_internal_dbsp_state_{view_name}. The format of that table is
similar to a ZSet: rowid, serialized_values, weight. We serialize the
values because there will be many aggregators in the table. We can't
rely on a particular format for the values.
The Materialized View Cursor
============================
Reading from a Materialized View essentially means reading from the
persisted ZSet, and enhancing that with data that exists within the
transaction. Transaction data is ephemeral, so we do not materialize
this anywhere: we have a carefully crafted implementation of seek that
takes care of merging weights and stitching the two sets together.
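The merge can be pictured with a small sketch. It is deliberately simplified: it merges eagerly into a new map, whereas the commit describes doing this inside the cursor's seek without materializing anything. The names `Rows` and `visible_rows`, and the rule that only rows with a positive combined weight are visible, are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// rowid -> (values, weight); hypothetical shape shared by both inputs.
type Rows = BTreeMap<i64, (Vec<String>, i64)>;

/// Combine the persisted ZSet with the uncommitted transaction delta by
/// summing weights per rowid, then keep only rows whose combined weight
/// is positive (assumed visibility rule).
fn visible_rows(persisted: &Rows, txn_delta: &Rows) -> Vec<(i64, Vec<String>)> {
    let mut merged: Rows = persisted.clone();
    for (rowid, (values, weight)) in txn_delta {
        let entry = merged.entry(*rowid).or_insert_with(|| (values.clone(), 0));
        entry.1 += *weight;
        if *weight > 0 {
            // A positive delta carries the newest values for this row.
            entry.0 = values.clone();
        }
    }
    merged
        .into_iter()
        .filter(|(_, (_, w))| *w > 0)
        .map(|(rowid, (values, _))| (rowid, values))
        .collect()
}
```

For example, if the persisted set holds rows 1 and 2 and the transaction deletes row 2 (weight `-1`) and inserts row 3 (weight `+1`), a reader inside that transaction sees rows 1 and 3.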
Closes #2921
The CI is sometimes failing with the following error:
```
thread '<unnamed>' panicked at sqlite3/src/lib.rs:156:37:
called `Result::unwrap()` on an `Err` value: InternalError("WAL already enabled")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at library/core/src/panicking.rs:226:5:
panic in a function that cannot unwind
stack backtrace:
0: 0x7ff612e11502 - std::backtrace_rs::backtrace::libunwind::trace::h74680e970b6e0712
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
1: 0x7ff612e11502 - std::backtrace_rs::backtrace::trace_unsynchronized::ha3bf590e3565a312
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14
2: 0x7ff612e11502 - std::sys::backtrace::_print_fmt::hcf16024cbdd6c458
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/sys/backtrace.rs:66:9
3: 0x7ff612e11502 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h46a716bba2450163
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/sys/backtrace.rs:39:26
4: 0x7ff612e338a3 - core::fmt::rt::Argument::fmt::ha695e732309707b7
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/fmt/rt.rs:181:76
5: 0x7ff612e338a3 - core::fmt::write::h275e5980d7008551
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/fmt/mod.rs:1446:25
6: 0x7ff612e0f003 - std::io::default_write_fmt::hdc4119be3eb77042
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/io/mod.rs:639:11
7: 0x7ff612e0f003 - std::io::Write::write_fmt::h561a66a0340b6995
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/io/mod.rs:1914:13
8: 0x7ff612e11352 - std::sys::backtrace::BacktraceLock::print::hafb9d5969adc39a0
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/sys/backtrace.rs:42:9
9: 0x7ff612e12842 - std::panicking::default_hook::{{closure}}::hae2e97a5c4b2b777
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:300:22
10: 0x7ff612e12645 - std::panicking::default_hook::h3db1b505cfc4eb79
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:327:9
11: 0x7ff612e131e2 - std::panicking::rust_panic_with_hook::h409da73ddef13937
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:833:13
12: 0x7ff612e12f56 - std::panicking::begin_panic_handler::{{closure}}::h159b61b27f96a9c2
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:699:13
13: 0x7ff612e119f9 - std::sys::backtrace::__rust_end_short_backtrace::h5b56844d75e766fc
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/sys/backtrace.rs:168:18
14: 0x7ff612e12c1d - __rustc[4794b31dd7191200]::rust_begin_unwind
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/std/src/panicking.rs:697:5
15: 0x7ff611f56f9d - core::panicking::panic_nounwind_fmt::runtime::h4c94eb695becba00
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:117:22
16: 0x7ff611f56f9d - core::panicking::panic_nounwind_fmt::hc3cf3432011a3c3f
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/intrinsics/mod.rs:3196:9
17: 0x7ff611f57032 - core::panicking::panic_nounwind::h0c59dc9f7f043ead
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:226:5
18: 0x7ff611f57191 - core::panicking::panic_cannot_unwind::hb8732afd89555502
at /rustc/6b00bc3880198600130e1cf62b8f8a93494488cc/library/core/src/panicking.rs:331:5
19: 0x7ff611f585ee - sqlite3_open
at /home/runner/_work/turso/turso/sqlite3/src/lib.rs:127:1
20: 0x55f88e591f2c - compat::tests::test_wal_checkpoint_v2::hc70b8ddc1bc8d78d
at /home/runner/_work/turso/turso/sqlite3/tests/compat/mod.rs:188:17
```
As Rust integration tests run in parallel, they cannot use the same
test files. However, as it turns out, one test does not even need a
file, and the other two are redundant, so we can remove them.
Reviewed-by: Nikita Sivukhin (@sivukhin)
Closes #2936
They're both using the same database file which is wrong (tests run in
parallel). But more importantly, they test almost nothing, and we have a
better checkpoint test already.
Closes #1419
When submitting a `pwritev` for flushing dirty pages, in the case that
it's a commit frame, we use a new completion type which tells io_uring
to add a flag, which ensures the following:
1. If any operation in the chain fails, subsequent operations get
cancelled with -ECANCELED
2. All operations in the chain complete in order
If there is an ongoing chain of `IO_LINK`, it ends at the `fsync`
barrier, and ensures everything submitted before it has completed.
In 99% of cases, the syscall that immediately follows the
`pwritev` is going to be the fsync, but just in case, this
implementation links everything that comes between the final commit
`pwritev` and the next `fsync`.
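The linking rule can be modeled without touching io_uring at all. The sketch below is only a model of the decision "should this submission carry the link flag?"; `Op`, `Linker`, and `should_link` are hypothetical names, and no real io_uring API is used. It relies on the io_uring convention that a linked chain ends at the first submission that does not carry the link flag.

```rust
/// Hypothetical description of a submitted operation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Pwritev { commit_frame: bool },
    Pread,
    Fsync,
}

/// Tracks whether a link chain started by a commit `pwritev` is still open.
#[derive(Default)]
struct Linker {
    chain_open: bool,
}

impl Linker {
    /// Returns true if the submission for `op` should carry the link flag.
    fn should_link(&mut self, op: Op) -> bool {
        match op {
            // A commit-frame write opens (or continues) the chain.
            Op::Pwritev { commit_frame: true } => {
                self.chain_open = true;
                true
            }
            // The fsync is the barrier: it is submitted without the flag,
            // which terminates the chain.
            Op::Fsync => {
                self.chain_open = false;
                false
            }
            // Anything submitted between the commit write and the fsync
            // is linked too, so ordering is preserved up to the barrier.
            _ => self.chain_open,
        }
    }
}
```

For the sequence non-commit `pwritev`, commit `pwritev`, `pread`, `fsync`, `pread`, this model yields `false, true, true, false, false`.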
In the event that we get a partial write, if it was linked, then we
submit an additional fsync after the partial write completes, with an
`IO_DRAIN` flag after forcing a `submit`, which will mean durability is
maintained, as that fsync will flush/drain everything in the squeue
before submission.
The other option in the event of partial writes on commit frames/linked
writes is to error. I'm not sure which is the right move here. I guess
it's possible that, since the fsync completion fired, the commit could
be over without us being durable on disk. So maybe it should be an
assertion instead? Thoughts?
Closes #2909
Create one WalFileShared for a Database and update its state
accordingly. Also support case where the WAL is disabled.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #2918
SQLite does not allow us to modify system tables, but we do. Let's fix
it.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes #2855
- Otherwise, in a multi-threaded environment, another thread can think
that the completion is finished and start executing.
- This can lead to violated assertions (for example, the page must be
loaded, but since the callback has not run yet, the assert fires).
Failing scenario:
1. The main thread wants to execute a pread, so it schedules the IO and
returns control to the caller.
2. The IO thread reads the data from disk.
3. The IO thread executes complete(result).
4. The complete function sets the result of the completion to Ok.
5. The main thread enters the step loop again and checks the completion
status.
6. The completion is marked as finished/is_completed, so the main thread
continues execution.
7. The main thread checks that the page is loaded and fails the
assertion, because it is not loaded yet.
8. The IO thread executes the callback and finishes the completion.
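The ordering that avoids this race can be sketched as follows, assuming the fix is to run the callback first and only then publish the finished state. `Completion` here is a simplified, hypothetical type (an `i32` result and a boxed callback), not the project's actual struct.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Simplified, hypothetical completion: a one-shot callback plus a flag
/// that other threads poll.
struct Completion {
    callback: Mutex<Option<Box<dyn FnOnce(i32) + Send>>>,
    finished: AtomicBool,
}

impl Completion {
    fn new(callback: impl FnOnce(i32) + Send + 'static) -> Self {
        Completion {
            callback: Mutex::new(Some(Box::new(callback))),
            finished: AtomicBool::new(false),
        }
    }

    /// Called by the IO thread once the operation is done.
    fn complete(&self, result: i32) {
        // 1. Run the callback first (for example, it marks the page as loaded).
        let callback = self.callback.lock().unwrap().take();
        if let Some(callback) = callback {
            callback(result);
        }
        // 2. Only then publish "finished". The Release store pairs with the
        //    Acquire load below, so a thread that observes `true` also
        //    observes everything the callback wrote.
        self.finished.store(true, Ordering::Release);
    }

    /// Polled by the main thread in its step loop.
    fn is_completed(&self) -> bool {
        self.finished.load(Ordering::Acquire)
    }
}
```

With this order, step 6 of the scenario cannot happen before step 8: once `is_completed()` returns true, the callback has already run.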
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #2922
`commit_txn` in MVCC was hacking its way through I/O until now. After
adding this and the test for concurrent writers, we now see `busy` errors
returned as expected, because no `commit` queueing happens yet. That
will come in the next PR I open.
Closes #2895
When we create an ImmutableRow::from_value(), we are always adding a
null padding at the end. We didn't notice this before, because a SQLite
file with an extra column is as valid as any. But that column of course
should not be there.
I traced this to column_count(), which is off by one. My understanding
is that we should be returning based on serial_types, not offset.
Closes #2862
Add expression support for `LIMIT` and `OFFSET` by storing them as
`Expr` instead of fixed integers. Constant expressions are folded with
`try_fold_to_i64`, while dynamic ones emit runtime checks, including the
new `IfNeg` opcode to clamp negative or `NULL` values to zero. The
current `build_limit_offset_expr` implementation is still naive and will
be refined in future work.
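The split between folded constants and runtime checks can be illustrated with a toy expression type. Everything in this sketch is hypothetical: the `Expr` enum is a stand-in for the real AST, the signature given to `try_fold_to_i64` is a guess, and `clamp_to_zero` only models the effect the message attributes to `IfNeg` rather than the opcode itself.

```rust
/// Toy expression type standing in for the real AST (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Null,
    /// A bound parameter, only known at runtime.
    Param(usize),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
}

/// Fold a constant expression to an integer at compile time.
/// Returns `None` when the value is not known until runtime
/// (or when the arithmetic would overflow).
fn try_fold_to_i64(expr: &Expr) -> Option<i64> {
    match expr {
        Expr::Int(v) => Some(*v),
        Expr::Neg(inner) => try_fold_to_i64(inner)?.checked_neg(),
        Expr::Add(a, b) => try_fold_to_i64(a)?.checked_add(try_fold_to_i64(b)?),
        Expr::Null | Expr::Param(_) => None,
    }
}

/// Runtime clamp modeled on the described `IfNeg` behavior:
/// a negative or NULL (`None`) value becomes zero.
fn clamp_to_zero(value: Option<i64>) -> i64 {
    match value {
        Some(n) if n > 0 => n,
        _ => 0,
    }
}
```

In this model, `2 + 3` folds to `Some(5)` at compile time, while a parameter does not fold and would have to be checked and clamped at runtime.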
Fixes #2913
Closes #2720
Now supported:
- AEGIS variants: 256, 256X2, 256X4, 128L, 128X2, 128X4
- AES-GCM variants: AES-128-GCM, AES-256-GCM
With minor changes in order to make it easy to add new ciphers later
regardless of their key size.
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes #2899
This PR introduces a separate `package.browser.json` file for the
`*-browser` npm packages (`@tursodatabase/sync-browser` and
`@tursodatabase/database-browser`).
The packages are nearly identical, and the only difference is the
`package.json` content (the browser package mentions only the WASM
optional dependency, which shouldn't confuse NPM and force it to download
the WASM dependency package instead of the native one).
Because of that, an innocent "hack" is implemented which swaps
`package.json` with `package.browser.json` before publishing the
`browser` package.
Closes #2906
This PR unifies the logic for resolving aggregate functions. Previously,
bare aggregates (e.g. `SELECT max(a) FROM t1`) and aggregates wrapped in
expressions (e.g. `SELECT max(a) + 1 FROM t1`) were handled differently,
which led to duplicated code. Now both cases are resolved consistently.
The added benchmark shows a small improvement:
```
Prepare `SELECT first_name, last_name, state, city, age + 10, LENGTH(email), UPPER(first_name), LOWE...
time: [59.791 µs 59.898 µs 60.006 µs]
change: [-7.7090% -7.2760% -6.8242%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
8 (8.00%) high mild
2 (2.00%) high severe
```
For an existing benchmark, no change:
```
Prepare `SELECT first_name, count(1) FROM users GROUP BY first_name HAVING count(1) > 1 ORDER BY cou...
time: [11.895 µs 11.913 µs 11.931 µs]
change: [-0.2545% +0.2426% +0.6960%] (p = 0.34 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
2 (2.00%) high mild
5 (5.00%) high severe
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #2884