Merge 'Structured Generation for the Simulator' from Alperen Keleş

- added Arbitrary and ArbitraryFrom<T> traits for more centralized
generation
- implemented random generation for tables and structured queries

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #464
Pekka Enberg
2024-12-21 08:44:05 +02:00
10 changed files with 1359 additions and 213 deletions

74
simulator/README.md Normal file

@@ -0,0 +1,74 @@
# Limbo Simulator
The Limbo simulator uses randomized, deterministic simulations to test the behavior of the Limbo database.
Each simulation begins with a random configuration:
- the database workload distribution (percentages of reads, writes, deletes...),
- database parameters (page size),
- number of readers or writers, etc.
Based on these parameters, we randomly generate **interaction plans**. Interaction plans consist of statements/queries and assertions that are executed in order. The building blocks of interaction plans are:
- Randomly generated SQL queries satisfying the workload distribution,
- Properties, which contain multiple matching queries with assertions indicating the expected result.
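
As a rough illustration of how queries can be drawn to follow a workload distribution, here is a minimal, dependency-free sketch of weighted selection. The names `Lcg`, `Kind`, and `pick_kind` are hypothetical and exist only for this example; the simulator itself uses a seeded RNG from the `rand` crate and its own `frequency` helper.

```rust
/// Tiny deterministic generator (an LCG). This is an assumption of the
/// sketch, standing in for the seeded `rand` RNG the simulator uses.
pub struct Lcg(pub u64);

impl Lcg {
    /// Returns a pseudo-random number in `0..bound`; `bound` must be non-zero.
    pub fn below(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % bound
    }
}

/// Hypothetical query kinds, one per workload percentage.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    Read,
    Write,
    Delete,
}

/// Picks a kind with probability proportional to its weight.
/// Returns `None` when every weight is zero instead of panicking.
pub fn pick_kind(rng: &mut Lcg, weights: &[(usize, Kind)]) -> Option<Kind> {
    let total: usize = weights.iter().map(|(w, _)| *w).sum();
    if total == 0 {
        return None;
    }
    let mut roll = rng.below(total);
    for (w, kind) in weights {
        if roll < *w {
            return Some(*kind);
        }
        roll -= *w;
    }
    None // not reached: `roll` is always smaller than `total`
}
```

With weights such as `[(60, Kind::Read), (40, Kind::Write), (0, Kind::Delete)]`, a zero weight means that kind is never chosen, and the same seed always yields the same sequence of choices.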
An example of a property is the following:
```json
{
  "name": "Read your own writes",
  "queries": [
    "INSERT INTO t1 (id) VALUES (1)",
    "SELECT * FROM t1 WHERE id = 1"
  ],
  "assertions": [
    "result.rows.length == 1",
    "result.rows[0].id == 1"
  ]
}
```
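
The same property can be sketched in Rust as a list of interactions ending in an assertion closure. This is a simplified illustration, not the simulator's actual types: `ResultSet` holds plain integers here, and `read_your_own_writes` is a hypothetical helper name.

```rust
/// One result set: rows of integer cells (simplified for this sketch).
pub type ResultSet = Vec<Vec<i64>>;

/// A step of a plan: either a SQL string or a check over earlier results.
pub enum Interaction {
    Query(String),
    Assertion {
        message: String,
        check: Box<dyn Fn(&[ResultSet]) -> bool>,
    },
}

/// "Read your own writes" for a single-column table `t1 (id)`.
pub fn read_your_own_writes(id: i64) -> Vec<Interaction> {
    vec![
        Interaction::Query(format!("INSERT INTO t1 (id) VALUES ({})", id)),
        Interaction::Query(format!("SELECT * FROM t1 WHERE id = {}", id)),
        Interaction::Assertion {
            message: format!("row with id = {} not found after insert", id),
            // The most recent result set must contain a row starting with `id`.
            check: Box::new(move |stack: &[ResultSet]| {
                stack
                    .last()
                    .map_or(false, |rows| rows.iter().any(|r| r.first() == Some(&id)))
            }),
        },
    ]
}
```

Keeping the assertion as a closure over the result stack lets the property capture the inserted value at generation time and check it later, once the queries have actually run.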
The simulator executes the interaction plans in a loop, and checks the assertions. It can add random queries unrelated to the properties without
breaking the property invariants to reach more diverse states and respect the configured workload distribution.
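
A minimal sketch of that execution loop, assuming an in-memory table of ids in place of a real database connection. `Step` and `run_plan` are hypothetical names for this example; the idea of pushing query results onto a stack and clearing it after each assertion follows the simulator's `execute_interaction`.

```rust
pub type ResultSet = Vec<Vec<i64>>;

pub enum Step {
    /// Insert a row with the given id.
    Insert(i64),
    /// Select the rows whose id equals the given value.
    SelectEq(i64),
    /// Check that the most recent result set has exactly this many rows.
    AssertRows(usize),
}

/// Runs the steps in order against an in-memory table of ids.
/// Returns the index of the first failing assertion, if any.
pub fn run_plan(steps: &[Step]) -> Result<(), usize> {
    let mut table: Vec<i64> = Vec::new();
    let mut stack: Vec<ResultSet> = Vec::new();
    for (index, step) in steps.iter().enumerate() {
        match step {
            Step::Insert(id) => {
                table.push(*id);
                stack.push(Vec::new()); // an insert returns no rows
            }
            Step::SelectEq(id) => {
                let rows: ResultSet = table
                    .iter()
                    .filter(|v| *v == id)
                    .map(|v| vec![*v])
                    .collect();
                stack.push(rows);
            }
            Step::AssertRows(expected) => {
                let actual = stack.last().map_or(0, |rows| rows.len());
                if actual != *expected {
                    return Err(index);
                }
                stack.clear(); // results are consumed by the assertion
            }
        }
    }
    Ok(())
}
```

For example, `[Insert(1), Insert(7), SelectEq(1), AssertRows(1)]` still passes: the extra `Insert(7)` is an unrelated query that changes the database state without breaking the property.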
The simulator code is broken into 4 main parts:
- **Simulator(main.rs)**: The main entry point of the simulator. It generates random configurations and interaction plans, and executes them.
- **Model(model.rs, model/table.rs, model/query.rs)**: A simpler model of the database. It contains atomic actions for insertion and selection, and we use it to decide the next actions.
- **Generation(generation.rs, generation/table.rs, generation/query.rs, generation/plan.rs)**: Random generation functions for the database model and interaction plans.
- **Properties(properties.rs)**: Contains the properties that we want to test.
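
The generation code is built around two traits, `Arbitrary` and `ArbitraryFrom<T>`. The sketch below shows the pattern in a dependency-free form: as an assumption of this example, randomness comes from any `FnMut() -> u64` closure rather than `rand::Rng`, and the `ColumnType` and `Value` enums are trimmed to two variants. `arbitrary_row` is a hypothetical helper.

```rust
/// Generates a value from randomness alone.
pub trait Arbitrary {
    fn arbitrary<F: FnMut() -> u64>(next: &mut F) -> Self;
}

/// Generates a value from randomness plus some context `T`.
pub trait ArbitraryFrom<T> {
    fn arbitrary_from<F: FnMut() -> u64>(next: &mut F, t: &T) -> Self;
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnType {
    Integer,
    Text,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    Text(String),
}

impl Arbitrary for ColumnType {
    fn arbitrary<F: FnMut() -> u64>(next: &mut F) -> Self {
        if next() % 2 == 0 {
            ColumnType::Integer
        } else {
            ColumnType::Text
        }
    }
}

impl ArbitraryFrom<ColumnType> for Value {
    // The context (a column type) constrains which kind of value is produced.
    fn arbitrary_from<F: FnMut() -> u64>(next: &mut F, ty: &ColumnType) -> Self {
        match ty {
            ColumnType::Integer => Value::Integer((next() % 1000) as i64),
            ColumnType::Text => Value::Text(format!("text_{}", next() % 1000)),
        }
    }
}

/// Builds one row whose values match the given column types.
pub fn arbitrary_row<F: FnMut() -> u64>(next: &mut F, columns: &[ColumnType]) -> Vec<Value> {
    columns
        .iter()
        .map(|ty| Value::arbitrary_from(next, ty))
        .collect()
}
```

Splitting generation into a context-free trait and a context-dependent one keeps generators composable: a row generator only needs the column types, and a predicate generator only needs the table it targets.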
## Running the simulator
To run the simulator, you can use the following command:
```bash
cargo run
```
In the future, this command will invoke a clap command-line interface for configuring the simulator. For now, the simulator runs with the default configuration, which can only be changed by editing the `main.rs` file. To see the logs, set the `RUST_LOG` environment variable:
```bash
RUST_LOG=info cargo run --bin limbo_sim
```
## Adding new properties
Todo
## Adding new generation functions
Todo
## Adding new models
Todo
## Coverage with Limbo
Todo
## Automatic Compatibility Testing with SQLite
Todo

65
simulator/generation.rs Normal file

@@ -0,0 +1,65 @@
use anarchist_readable_name_generator_lib::readable_name_custom;
use rand::Rng;
pub mod plan;
pub mod query;
pub mod table;
pub trait Arbitrary {
fn arbitrary<R: Rng>(rng: &mut R) -> Self;
}
pub trait ArbitraryFrom<T> {
fn arbitrary_from<R: Rng>(rng: &mut R, t: &T) -> Self;
}
pub(crate) fn frequency<'a, T, R: rand::Rng>(
choices: Vec<(usize, Box<dyn FnOnce(&mut R) -> T + 'a>)>,
rng: &mut R,
) -> T {
let total = choices.iter().map(|(weight, _)| weight).sum::<usize>();
let mut choice = rng.gen_range(0..total);
for (weight, f) in choices {
if choice < weight {
return f(rng);
}
choice -= weight;
}
unreachable!()
}
pub(crate) fn one_of<'a, T, R: rand::Rng>(
choices: Vec<Box<dyn Fn(&mut R) -> T + 'a>>,
rng: &mut R,
) -> T {
let index = rng.gen_range(0..choices.len());
choices[index](rng)
}
pub(crate) fn pick<'a, T, R: rand::Rng>(choices: &'a Vec<T>, rng: &mut R) -> &'a T {
let index = rng.gen_range(0..choices.len());
&choices[index]
}
pub(crate) fn pick_index<R: rand::Rng>(choices: usize, rng: &mut R) -> usize {
rng.gen_range(0..choices)
}
fn gen_random_text<T: Rng>(rng: &mut T) -> String {
let big_text = rng.gen_ratio(1, 1000);
if big_text {
// let max_size: u64 = 2 * 1024 * 1024 * 1024;
let max_size: u64 = 2 * 1024; // todo: change this back to 2 * 1024 * 1024 * 1024
let size = rng.gen_range(1024..max_size);
let mut name = String::new();
for i in 0..size {
name.push(((i % 26) as u8 + b'A') as char);
}
name
} else {
let name = readable_name_custom("_", rng);
name.replace("-", "_")
}
}


@@ -0,0 +1,405 @@
use std::{fmt::Display, rc::Rc};
use limbo_core::{Connection, Result, RowResult};
use rand::SeedableRng;
use rand_chacha::ChaCha8Rng;
use crate::{
model::{
query::{Create, Insert, Predicate, Query, Select},
table::Value,
},
SimConnection, SimulatorEnv, SimulatorOpts,
};
use crate::generation::{frequency, Arbitrary, ArbitraryFrom};
use super::{pick, pick_index};
pub(crate) type ResultSet = Vec<Vec<Value>>;
pub(crate) struct InteractionPlan {
pub(crate) plan: Vec<Interaction>,
pub(crate) stack: Vec<ResultSet>,
pub(crate) interaction_pointer: usize,
}
impl Display for InteractionPlan {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
for interaction in &self.plan {
match interaction {
Interaction::Query(query) => write!(f, "{};\n", query)?,
Interaction::Assertion(assertion) => {
write!(f, "-- ASSERT: {};\n", assertion.message)?
}
Interaction::Fault(fault) => write!(f, "-- FAULT: {};\n", fault)?,
}
}
Ok(())
}
}
#[derive(Debug)]
pub(crate) struct InteractionStats {
pub(crate) read_count: usize,
pub(crate) write_count: usize,
pub(crate) delete_count: usize,
}
impl Display for InteractionStats {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"Read: {}, Write: {}, Delete: {}",
self.read_count, self.write_count, self.delete_count
)
}
}
pub(crate) enum Interaction {
Query(Query),
Assertion(Assertion),
Fault(Fault),
}
impl Display for Interaction {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Interaction::Query(query) => write!(f, "{}", query),
Interaction::Assertion(assertion) => write!(f, "ASSERT: {}", assertion.message),
Interaction::Fault(fault) => write!(f, "FAULT: {}", fault),
}
}
}
pub(crate) struct Assertion {
pub(crate) func: Box<dyn Fn(&Vec<ResultSet>) -> bool>,
pub(crate) message: String,
}
pub(crate) enum Fault {
Disconnect,
}
impl Display for Fault {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Fault::Disconnect => write!(f, "DISCONNECT"),
}
}
}
pub(crate) struct Interactions(Vec<Interaction>);
impl Interactions {
pub(crate) fn shadow(&self, env: &mut SimulatorEnv) {
for interaction in &self.0 {
match interaction {
Interaction::Query(query) => match query {
Query::Create(create) => {
env.tables.push(create.table.clone());
}
Query::Insert(insert) => {
let table = env
.tables
.iter_mut()
.find(|t| t.name == insert.table)
.unwrap();
table.rows.push(insert.values.clone());
}
Query::Delete(_) => todo!(),
Query::Select(_) => {}
},
Interaction::Assertion(_) => {}
Interaction::Fault(_) => {}
}
}
}
}
impl InteractionPlan {
pub(crate) fn new() -> Self {
InteractionPlan {
plan: Vec::new(),
stack: Vec::new(),
interaction_pointer: 0,
}
}
pub(crate) fn push(&mut self, interaction: Interaction) {
self.plan.push(interaction);
}
pub(crate) fn stats(&self) -> InteractionStats {
let mut read = 0;
let mut write = 0;
let mut delete = 0;
for interaction in &self.plan {
match interaction {
Interaction::Query(query) => match query {
Query::Select(_) => read += 1,
Query::Insert(_) => write += 1,
Query::Delete(_) => delete += 1,
Query::Create(_) => {}
},
Interaction::Assertion(_) => {}
Interaction::Fault(_) => {}
}
}
InteractionStats {
read_count: read,
write_count: write,
delete_count: delete,
}
}
}
impl ArbitraryFrom<SimulatorEnv> for InteractionPlan {
fn arbitrary_from<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Self {
let mut plan = InteractionPlan::new();
let mut env = SimulatorEnv {
opts: env.opts.clone(),
tables: vec![],
connections: vec![],
io: env.io.clone(),
db: env.db.clone(),
rng: ChaCha8Rng::seed_from_u64(rng.next_u64()),
};
let num_interactions = rng.gen_range(0..env.opts.max_interactions);
// First create at least one table
let create_query = Create::arbitrary(rng);
env.tables.push(create_query.table.clone());
plan.push(Interaction::Query(Query::Create(create_query)));
while plan.plan.len() < num_interactions {
log::debug!(
"Generating interaction {}/{}",
plan.plan.len(),
num_interactions
);
let interactions = Interactions::arbitrary_from(rng, &(&env, plan.stats()));
interactions.shadow(&mut env);
plan.plan.extend(interactions.0.into_iter());
}
log::info!("Generated plan with {} interactions", plan.plan.len());
plan
}
}
impl Interaction {
pub(crate) fn execute_query(&self, conn: &mut Rc<Connection>) -> Result<ResultSet> {
match self {
Interaction::Query(query) => {
let query_str = query.to_string();
let rows = conn.query(&query_str);
if rows.is_err() {
let err = rows.err();
log::error!(
"Error running query '{}': {:?}",
&query_str[0..query_str.len().min(4096)],
err
);
return Err(err.unwrap());
}
let rows = rows.unwrap();
assert!(rows.is_some());
let mut rows = rows.unwrap();
let mut out = Vec::new();
while let Ok(row) = rows.next_row() {
match row {
RowResult::Row(row) => {
let mut r = Vec::new();
for el in &row.values {
let v = match el {
limbo_core::Value::Null => Value::Null,
limbo_core::Value::Integer(i) => Value::Integer(*i),
limbo_core::Value::Float(f) => Value::Float(*f),
limbo_core::Value::Text(t) => Value::Text(t.to_string()),
limbo_core::Value::Blob(b) => Value::Blob(b.to_vec()),
};
r.push(v);
}
out.push(r);
}
RowResult::IO => {}
RowResult::Interrupt => {}
RowResult::Done => {
break;
}
}
}
Ok(out)
}
Interaction::Assertion(_) => {
unreachable!("unexpected: this function should only be called on queries")
}
Interaction::Fault(_) => {
unreachable!("unexpected: this function should only be called on queries")
}
}
}
pub(crate) fn execute_assertion(&self, stack: &Vec<ResultSet>) -> Result<()> {
match self {
Interaction::Query(_) => {
unreachable!("unexpected: this function should only be called on assertions")
}
Interaction::Assertion(assertion) => {
if !assertion.func.as_ref()(stack) {
return Err(limbo_core::LimboError::InternalError(
assertion.message.clone(),
));
}
Ok(())
}
Interaction::Fault(_) => {
unreachable!("unexpected: this function should only be called on assertions")
}
}
}
pub(crate) fn execute_fault(&self, env: &mut SimulatorEnv, conn_index: usize) -> Result<()> {
match self {
Interaction::Query(_) => {
unreachable!("unexpected: this function should only be called on faults")
}
Interaction::Assertion(_) => {
unreachable!("unexpected: this function should only be called on faults")
}
Interaction::Fault(fault) => {
match fault {
Fault::Disconnect => {
match env.connections[conn_index] {
SimConnection::Connected(ref mut conn) => {
conn.close()?;
}
SimConnection::Disconnected => {
return Err(limbo_core::LimboError::InternalError(
"Tried to disconnect a disconnected connection".to_string(),
));
}
}
env.connections[conn_index] = SimConnection::Disconnected;
}
}
Ok(())
}
}
}
}
fn property_insert_select<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Interactions {
// Get a random table
let table = pick(&env.tables, rng);
// Pick a random column
let column_index = pick_index(table.columns.len(), rng);
let column = &table.columns[column_index].clone();
// Generate a random value of the column type
let value = Value::arbitrary_from(rng, &column.column_type);
// Create a whole new row
let mut row = Vec::new();
for (i, column) in table.columns.iter().enumerate() {
if i == column_index {
row.push(value.clone());
} else {
let value = Value::arbitrary_from(rng, &column.column_type);
row.push(value);
}
}
// Insert the row
let insert_query = Interaction::Query(Query::Insert(Insert {
table: table.name.clone(),
values: row.clone(),
}));
// Select the row
let select_query = Interaction::Query(Query::Select(Select {
table: table.name.clone(),
predicate: Predicate::Eq(column.name.clone(), value.clone()),
}));
// Check that the row is there
let assertion = Interaction::Assertion(Assertion {
message: format!(
"row [{:?}] not found in table {} after inserting ({} = {})",
row.iter().map(|v| v.to_string()).collect::<Vec<String>>(),
table.name,
column.name,
value,
),
func: Box::new(move |stack: &Vec<ResultSet>| {
let rows = stack.last().unwrap();
rows.iter().any(|r| r == &row)
}),
});
Interactions(vec![insert_query, select_query, assertion])
}
fn create_table<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Interactions {
let create_query = Interaction::Query(Query::Create(Create::arbitrary(rng)));
Interactions(vec![create_query])
}
fn random_read<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Interactions {
let select_query = Interaction::Query(Query::Select(Select::arbitrary_from(rng, &env.tables)));
Interactions(vec![select_query])
}
fn random_write<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Interactions {
let table = pick(&env.tables, rng);
let insert_query = Interaction::Query(Query::Insert(Insert::arbitrary_from(rng, table)));
Interactions(vec![insert_query])
}
fn random_fault<R: rand::Rng>(rng: &mut R, env: &SimulatorEnv) -> Interactions {
let fault = Interaction::Fault(Fault::Disconnect);
Interactions(vec![fault])
}
impl ArbitraryFrom<(&SimulatorEnv, InteractionStats)> for Interactions {
fn arbitrary_from<R: rand::Rng>(
rng: &mut R,
(env, stats): &(&SimulatorEnv, InteractionStats),
) -> Self {
let remaining_read =
((((env.opts.max_interactions * env.opts.read_percent) as f64) / 100.0) as usize)
.saturating_sub(stats.read_count);
let remaining_write = ((((env.opts.max_interactions * env.opts.write_percent) as f64)
/ 100.0) as usize)
.saturating_sub(stats.write_count);
frequency(
vec![
(
usize::min(remaining_read, remaining_write),
Box::new(|rng: &mut R| property_insert_select(rng, env)),
),
(
remaining_read,
Box::new(|rng: &mut R| random_read(rng, env)),
),
(
remaining_write,
Box::new(|rng: &mut R| random_write(rng, env)),
),
(
remaining_write / 10,
Box::new(|rng: &mut R| create_table(rng, env)),
),
(1, Box::new(|rng: &mut R| random_fault(rng, env))),
],
rng,
)
}
}


@@ -0,0 +1,242 @@
use crate::generation::table::{GTValue, LTValue};
use crate::generation::{one_of, Arbitrary, ArbitraryFrom};
use crate::model::query::{Create, Delete, Insert, Predicate, Query, Select};
use crate::model::table::{Table, Value};
use rand::Rng;
use super::{frequency, pick};
impl Arbitrary for Create {
fn arbitrary<R: Rng>(rng: &mut R) -> Self {
Create {
table: Table::arbitrary(rng),
}
}
}
impl ArbitraryFrom<Vec<Table>> for Select {
fn arbitrary_from<R: Rng>(rng: &mut R, tables: &Vec<Table>) -> Self {
let table = pick(tables, rng);
Select {
table: table.name.clone(),
predicate: Predicate::arbitrary_from(rng, table),
}
}
}
impl ArbitraryFrom<Vec<&Table>> for Select {
fn arbitrary_from<R: Rng>(rng: &mut R, tables: &Vec<&Table>) -> Self {
let table = pick(tables, rng);
Select {
table: table.name.clone(),
predicate: Predicate::arbitrary_from(rng, *table),
}
}
}
impl ArbitraryFrom<Table> for Insert {
fn arbitrary_from<R: Rng>(rng: &mut R, table: &Table) -> Self {
let values = table
.columns
.iter()
.map(|c| Value::arbitrary_from(rng, &c.column_type))
.collect();
Insert {
table: table.name.clone(),
values,
}
}
}
impl ArbitraryFrom<Table> for Delete {
fn arbitrary_from<R: Rng>(rng: &mut R, table: &Table) -> Self {
Delete {
table: table.name.clone(),
predicate: Predicate::arbitrary_from(rng, table),
}
}
}
impl ArbitraryFrom<Table> for Query {
fn arbitrary_from<R: Rng>(rng: &mut R, table: &Table) -> Self {
frequency(
vec![
(1, Box::new(|rng| Query::Create(Create::arbitrary(rng)))),
(
100,
Box::new(|rng| Query::Select(Select::arbitrary_from(rng, &vec![table]))),
),
(
100,
Box::new(|rng| Query::Insert(Insert::arbitrary_from(rng, table))),
),
(
0,
Box::new(|rng| Query::Delete(Delete::arbitrary_from(rng, table))),
),
],
rng,
)
}
}
struct CompoundPredicate(Predicate);
struct SimplePredicate(Predicate);
impl ArbitraryFrom<(&Table, bool)> for SimplePredicate {
fn arbitrary_from<R: Rng>(rng: &mut R, (table, predicate_value): &(&Table, bool)) -> Self {
// Pick a random column
let column_index = rng.gen_range(0..table.columns.len());
let column = &table.columns[column_index];
let column_values = table
.rows
.iter()
.map(|r| &r[column_index])
.collect::<Vec<_>>();
// Pick an operator
let operator = match predicate_value {
true => one_of(
vec![
Box::new(|rng| {
Predicate::Eq(
column.name.clone(),
Value::arbitrary_from(rng, &column_values),
)
}),
Box::new(|rng| {
Predicate::Gt(
column.name.clone(),
GTValue::arbitrary_from(rng, &column_values).0,
)
}),
Box::new(|rng| {
Predicate::Lt(
column.name.clone(),
LTValue::arbitrary_from(rng, &column_values).0,
)
}),
],
rng,
),
false => one_of(
vec![
Box::new(|rng| {
Predicate::Neq(
column.name.clone(),
Value::arbitrary_from(rng, &column.column_type),
)
}),
Box::new(|rng| {
Predicate::Gt(
column.name.clone(),
LTValue::arbitrary_from(rng, &column_values).0,
)
}),
Box::new(|rng| {
Predicate::Lt(
column.name.clone(),
GTValue::arbitrary_from(rng, &column_values).0,
)
}),
],
rng,
),
};
SimplePredicate(operator)
}
}
impl ArbitraryFrom<(&Table, bool)> for CompoundPredicate {
fn arbitrary_from<R: Rng>(rng: &mut R, (table, predicate_value): &(&Table, bool)) -> Self {
// Decide if you want to create an AND or an OR
CompoundPredicate(if rng.gen_bool(0.7) {
// An AND for true requires each of its children to be true
// An AND for false requires at least one of its children to be false
if *predicate_value {
Predicate::And(
(0..rng.gen_range(1..=3))
.map(|_| SimplePredicate::arbitrary_from(rng, &(*table, true)).0)
.collect(),
)
} else {
// Create a vector of random booleans
let mut booleans = (0..rng.gen_range(1..=3))
.map(|_| rng.gen_bool(0.5))
.collect::<Vec<_>>();
let len = booleans.len();
// Make sure at least one of them is false
if booleans.iter().all(|b| *b) {
booleans[rng.gen_range(0..len)] = false;
}
Predicate::And(
booleans
.iter()
.map(|b| SimplePredicate::arbitrary_from(rng, &(*table, *b)).0)
.collect(),
)
}
} else {
// An OR for true requires at least one of its children to be true
// An OR for false requires each of its children to be false
if *predicate_value {
// Create a vector of random booleans
let mut booleans = (0..rng.gen_range(1..=3))
.map(|_| rng.gen_bool(0.5))
.collect::<Vec<_>>();
let len = booleans.len();
// Make sure at least one of them is true
if booleans.iter().all(|b| !*b) {
booleans[rng.gen_range(0..len)] = true;
}
Predicate::Or(
booleans
.iter()
.map(|b| SimplePredicate::arbitrary_from(rng, &(*table, *b)).0)
.collect(),
)
} else {
Predicate::Or(
(0..rng.gen_range(1..=3))
.map(|_| SimplePredicate::arbitrary_from(rng, &(*table, false)).0)
.collect(),
)
}
})
}
}
impl ArbitraryFrom<Table> for Predicate {
fn arbitrary_from<R: Rng>(rng: &mut R, table: &Table) -> Self {
let predicate_value = rng.gen_bool(0.5);
CompoundPredicate::arbitrary_from(rng, &(table, predicate_value)).0
}
}
impl ArbitraryFrom<(&str, &Value)> for Predicate {
fn arbitrary_from<R: Rng>(rng: &mut R, (column_name, value): &(&str, &Value)) -> Self {
one_of(
vec![
Box::new(|rng| Predicate::Eq(column_name.to_string(), (*value).clone())),
Box::new(|rng| {
Predicate::Gt(
column_name.to_string(),
GTValue::arbitrary_from(rng, *value).0,
)
}),
Box::new(|rng| {
Predicate::Lt(
column_name.to_string(),
LTValue::arbitrary_from(rng, *value).0,
)
}),
],
rng,
)
}
}


@@ -0,0 +1,196 @@
use rand::Rng;
use crate::generation::{
gen_random_text, pick, pick_index, readable_name_custom, Arbitrary, ArbitraryFrom,
};
use crate::model::table::{Column, ColumnType, Name, Table, Value};
impl Arbitrary for Name {
fn arbitrary<R: Rng>(rng: &mut R) -> Self {
let name = readable_name_custom("_", rng);
Name(name.replace("-", "_"))
}
}
impl Arbitrary for Table {
fn arbitrary<R: Rng>(rng: &mut R) -> Self {
let name = Name::arbitrary(rng).0;
let columns = (1..=rng.gen_range(1..5))
.map(|_| Column::arbitrary(rng))
.collect();
Table {
rows: Vec::new(),
name,
columns,
}
}
}
impl Arbitrary for Column {
fn arbitrary<R: Rng>(rng: &mut R) -> Self {
let name = Name::arbitrary(rng).0;
let column_type = ColumnType::arbitrary(rng);
Column {
name,
column_type,
primary: false,
unique: false,
}
}
}
impl Arbitrary for ColumnType {
fn arbitrary<R: Rng>(rng: &mut R) -> Self {
pick(
&vec![
ColumnType::Integer,
ColumnType::Float,
ColumnType::Text,
ColumnType::Blob,
],
rng,
)
.to_owned()
}
}
impl ArbitraryFrom<Vec<&Value>> for Value {
fn arbitrary_from<R: Rng>(rng: &mut R, values: &Vec<&Value>) -> Self {
if values.is_empty() {
return Value::Null;
}
pick(values, rng).to_owned().clone()
}
}
impl ArbitraryFrom<ColumnType> for Value {
fn arbitrary_from<R: Rng>(rng: &mut R, column_type: &ColumnType) -> Self {
match column_type {
ColumnType::Integer => Value::Integer(rng.gen_range(i64::MIN..i64::MAX)),
ColumnType::Float => Value::Float(rng.gen_range(-1e10..1e10)),
ColumnType::Text => Value::Text(gen_random_text(rng)),
ColumnType::Blob => Value::Blob(gen_random_text(rng).as_bytes().to_vec()),
}
}
}
pub(crate) struct LTValue(pub(crate) Value);
impl ArbitraryFrom<Vec<&Value>> for LTValue {
fn arbitrary_from<R: Rng>(rng: &mut R, values: &Vec<&Value>) -> Self {
if values.is_empty() {
return LTValue(Value::Null);
}
let index = pick_index(values.len(), rng);
LTValue::arbitrary_from(rng, values[index])
}
}
impl ArbitraryFrom<Value> for LTValue {
fn arbitrary_from<R: Rng>(rng: &mut R, value: &Value) -> Self {
match value {
Value::Integer(i) => LTValue(Value::Integer(rng.gen_range(i64::MIN..=i.saturating_sub(1)))),
Value::Float(f) => LTValue(Value::Float(rng.gen_range(-1e10..*f - 1.0))),
Value::Text(t) => {
// Either shorten the string, or make at least one character smaller and mutate the rest
let mut t = t.clone();
if rng.gen_bool(0.01) {
t.pop();
LTValue(Value::Text(t))
} else {
let mut t = t.chars().map(|c| c as u32).collect::<Vec<_>>();
let index = rng.gen_range(0..t.len());
t[index] -= 1;
// Mutate the rest of the string
for i in (index + 1)..t.len() {
t[i] = rng.gen_range('a' as u32..='z' as u32);
}
let t = t
.into_iter()
.map(|c| char::from_u32(c).unwrap_or('z'))
.collect::<String>();
LTValue(Value::Text(t))
}
}
Value::Blob(b) => {
// Either shorten the blob, or make at least one byte smaller and mutate the rest
let mut b = b.clone();
if rng.gen_bool(0.01) {
b.pop();
LTValue(Value::Blob(b))
} else {
let index = rng.gen_range(0..b.len());
b[index] -= 1;
// Mutate the rest of the blob
for i in (index + 1)..b.len() {
b[i] = rng.gen_range(0..=255);
}
LTValue(Value::Blob(b))
}
}
_ => unreachable!(),
}
}
}
pub(crate) struct GTValue(pub(crate) Value);
impl ArbitraryFrom<Vec<&Value>> for GTValue {
fn arbitrary_from<R: Rng>(rng: &mut R, values: &Vec<&Value>) -> Self {
if values.is_empty() {
return GTValue(Value::Null);
}
let index = pick_index(values.len(), rng);
GTValue::arbitrary_from(rng, values[index])
}
}
impl ArbitraryFrom<Value> for GTValue {
fn arbitrary_from<R: Rng>(rng: &mut R, value: &Value) -> Self {
match value {
Value::Integer(i) => GTValue(Value::Integer(rng.gen_range(i.saturating_add(1)..=i64::MAX))),
Value::Float(f) => GTValue(Value::Float(rng.gen_range(*f..1e10))),
Value::Text(t) => {
// Either lengthen the string, or make at least one character larger and mutate the rest
let mut t = t.clone();
if rng.gen_bool(0.01) {
t.push(rng.gen_range(0..=255) as u8 as char);
GTValue(Value::Text(t))
} else {
let mut t = t.chars().map(|c| c as u32).collect::<Vec<_>>();
let index = rng.gen_range(0..t.len());
t[index] += 1;
// Mutate the rest of the string
for i in (index + 1)..t.len() {
t[i] = rng.gen_range('a' as u32..='z' as u32);
}
let t = t
.into_iter()
.map(|c| char::from_u32(c).unwrap_or('a'))
.collect::<String>();
GTValue(Value::Text(t))
}
}
Value::Blob(b) => {
// Either lengthen the blob, or make at least one byte larger and mutate the rest
let mut b = b.clone();
if rng.gen_bool(0.01) {
b.push(rng.gen_range(0..=255));
GTValue(Value::Blob(b))
} else {
let index = rng.gen_range(0..b.len());
b[index] += 1;
// Mutate the rest of the blob
for i in (index + 1)..b.len() {
b[i] = rng.gen_range(0..=255);
}
GTValue(Value::Blob(b))
}
}
_ => unreachable!(),
}
}
}


@@ -1,12 +1,20 @@
use generation::plan::{Interaction, InteractionPlan, ResultSet};
use generation::{pick, pick_index, Arbitrary, ArbitraryFrom};
use limbo_core::{Connection, Database, File, OpenFlags, PlatformIO, Result, RowResult, IO};
use model::query::{Create, Insert, Predicate, Query, Select};
use model::table::{Column, Name, Table, Value};
use properties::{property_insert_select, property_select_all};
use rand::prelude::*;
use rand_chacha::ChaCha8Rng;
use std::cell::RefCell;
use std::io::Write;
use std::rc::Rc;
use std::sync::Arc;
use tempfile::TempDir;
use anarchist_readable_name_generator_lib::readable_name_custom;
mod generation;
mod model;
mod properties;
struct SimulatorEnv {
opts: SimulatorOpts,
@@ -23,7 +31,7 @@ enum SimConnection {
Disconnected,
}
#[derive(Debug)]
#[derive(Debug, Clone)]
struct SimulatorOpts {
ticks: usize,
max_connections: usize,
@@ -33,40 +41,10 @@ struct SimulatorOpts {
read_percent: usize,
write_percent: usize,
delete_percent: usize,
max_interactions: usize,
page_size: usize,
}
struct Table {
rows: Vec<Vec<Value>>,
name: String,
columns: Vec<Column>,
}
#[derive(Clone)]
struct Column {
name: String,
column_type: ColumnType,
primary: bool,
unique: bool,
}
#[derive(Clone)]
enum ColumnType {
Integer,
Float,
Text,
Blob,
}
#[derive(Debug, PartialEq)]
enum Value {
Null,
Integer(i64),
Float(f64),
Text(String),
Blob(Vec<u8>),
}
#[allow(clippy::arc_with_non_send_sync)]
fn main() {
let _ = env_logger::try_init();
@@ -88,7 +66,7 @@ fn main() {
};
let opts = SimulatorOpts {
ticks: rng.gen_range(0..4096),
ticks: rng.gen_range(0..10240),
max_connections: 1, // TODO: for now let's use one connection as we didn't implement
// correct transaction processing
max_tables: rng.gen_range(0..128),
@@ -96,6 +74,7 @@ fn main() {
write_percent,
delete_percent,
page_size: 4096, // TODO: randomize this too
max_interactions: rng.gen_range(0..10240),
};
let io = Arc::new(SimulatorIO::new(seed, opts.page_size).unwrap());
@@ -121,106 +100,103 @@ fn main() {
println!("Initial opts {:?}", env.opts);
for _ in 0..env.opts.ticks {
let connection_index = env.rng.gen_range(0..env.opts.max_connections);
let mut connection = env.connections[connection_index].clone();
log::info!("Generating database interaction plan...");
let mut plans = (1..=env.opts.max_connections)
.map(|_| InteractionPlan::arbitrary_from(&mut env.rng.clone(), &env))
.collect::<Vec<_>>();
match &mut connection {
SimConnection::Connected(conn) => {
let disconnect = env.rng.gen_ratio(1, 100);
if disconnect {
log::info!("disconnecting {}", connection_index);
let _ = conn.close();
env.connections[connection_index] = SimConnection::Disconnected;
} else {
match process_connection(&mut env, conn) {
Ok(_) => {}
Err(err) => {
log::error!("error {}", err);
break;
}
}
}
}
SimConnection::Disconnected => {
log::info!("disconnecting {}", connection_index);
env.connections[connection_index] = SimConnection::Connected(env.db.connect());
}
}
log::info!("{}", plans[0].stats());
log::info!("Executing database interaction plan...");
let result = execute_plans(&mut env, &mut plans);
if result.is_err() {
log::error!("error executing plans: {:?}", result.err());
}
log::info!("db is at {:?}", path);
let mut path = TempDir::new().unwrap().into_path();
path.push("simulator.plan");
let mut f = std::fs::File::create(path.clone()).unwrap();
f.write(plans[0].to_string().as_bytes()).unwrap();
log::info!("plan saved at {:?}", path);
log::info!("seed was {}", seed);
env.io.print_stats();
}
fn process_connection(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<()> {
let management = env.rng.gen_ratio(1, 100);
if management {
// for now create table only
maybe_add_table(env, conn)?;
} else if env.tables.is_empty() {
maybe_add_table(env, conn)?;
fn execute_plans(env: &mut SimulatorEnv, plans: &mut Vec<InteractionPlan>) -> Result<()> {
// todo: add history here by recording which interaction was executed at which tick
for _tick in 0..env.opts.ticks {
// Pick the connection to interact with
let connection_index = pick_index(env.connections.len(), &mut env.rng);
// Execute the interaction for the selected connection
execute_plan(env, connection_index, plans)?;
}
Ok(())
}
fn execute_plan(
env: &mut SimulatorEnv,
connection_index: usize,
plans: &mut Vec<InteractionPlan>,
) -> Result<()> {
let connection = &env.connections[connection_index];
let plan = &mut plans[connection_index];
if plan.interaction_pointer >= plan.plan.len() {
return Ok(());
}
let interaction = &plan.plan[plan.interaction_pointer];
if let SimConnection::Disconnected = connection {
log::info!("connecting {}", connection_index);
env.connections[connection_index] = SimConnection::Connected(env.db.connect());
} else {
let roll = env.rng.gen_range(0..100);
if roll < env.opts.read_percent {
// read
do_select(env, conn)?;
} else if roll < env.opts.read_percent + env.opts.write_percent {
// write
do_write(env, conn)?;
} else {
// delete
// TODO
match execute_interaction(env, connection_index, interaction, &mut plan.stack) {
Ok(_) => {
log::debug!("connection {} processed", connection_index);
plan.interaction_pointer += 1;
}
Err(err) => {
log::error!("error {}", err);
return Err(err);
}
}
}
Ok(())
}
fn do_select(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<()> {
let table = env.rng.gen_range(0..env.tables.len());
let table_name = {
let table = &env.tables[table];
table.name.clone()
};
let rows = get_all_rows(env, conn, format!("SELECT * FROM {}", table_name).as_str())?;
fn execute_interaction(
env: &mut SimulatorEnv,
connection_index: usize,
interaction: &Interaction,
stack: &mut Vec<ResultSet>,
) -> Result<()> {
log::info!("executing: {}", interaction);
match interaction {
generation::plan::Interaction::Query(_) => {
let conn = match &mut env.connections[connection_index] {
SimConnection::Connected(conn) => conn,
SimConnection::Disconnected => unreachable!(),
};
let table = &env.tables[table];
compare_equal_rows(&table.rows, &rows);
Ok(())
}
fn do_write(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<()> {
let mut query = String::new();
let table = env.rng.gen_range(0..env.tables.len());
{
let table = &env.tables[table];
query.push_str(format!("INSERT INTO {} VALUES (", table.name).as_str());
log::debug!("{}", interaction);
let results = interaction.execute_query(conn)?;
log::debug!("{:?}", results);
stack.push(results);
}
generation::plan::Interaction::Assertion(_) => {
interaction.execute_assertion(stack)?;
stack.clear();
}
Interaction::Fault(_) => {
interaction.execute_fault(env, connection_index)?;
}
}
let columns = env.tables[table].columns.clone();
let mut row = Vec::new();
// gen insert query
for column in &columns {
let value = match column.column_type {
ColumnType::Integer => Value::Integer(env.rng.gen_range(i64::MIN..i64::MAX)),
ColumnType::Float => Value::Float(env.rng.gen_range(-1e10..1e10)),
ColumnType::Text => Value::Text(gen_random_text(env)),
ColumnType::Blob => Value::Blob(gen_random_text(env).as_bytes().to_vec()),
};
query.push_str(value.to_string().as_str());
query.push(',');
row.push(value);
}
let table = &mut env.tables[table];
table.rows.push(row);
query.pop();
query.push_str(");");
let _ = get_all_rows(env, conn, query.as_str())?;
Ok(())
}
@@ -237,10 +213,15 @@ fn maybe_add_table(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<
if env.tables.len() < env.opts.max_tables {
let table = Table {
rows: Vec::new(),
name: gen_random_name(env),
columns: gen_columns(env),
name: Name::arbitrary(&mut env.rng).0,
columns: (1..env.rng.gen_range(1..128))
.map(|_| Column::arbitrary(&mut env.rng))
.collect(),
};
let rows = get_all_rows(env, conn, table.to_create_str().as_str())?;
let query = Query::Create(Create {
table: table.clone(),
});
let rows = get_all_rows(env, conn, query.to_string().as_str())?;
log::debug!("{:?}", rows);
let rows = get_all_rows(
env,
@@ -258,7 +239,7 @@ fn maybe_add_table(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<
_ => unreachable!(),
};
assert!(
*as_text != table.to_create_str(),
*as_text != query.to_string(),
"table was not inserted correctly"
);
env.tables.push(table);
@@ -266,50 +247,6 @@ fn maybe_add_table(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) -> Result<
Ok(())
}
fn gen_random_name(env: &mut SimulatorEnv) -> String {
let name = readable_name_custom("_", &mut env.rng);
name.replace("-", "_")
}
fn gen_random_text(env: &mut SimulatorEnv) -> String {
let big_text = env.rng.gen_ratio(1, 1000);
if big_text {
let max_size: u64 = 2 * 1024 * 1024 * 1024;
let size = env.rng.gen_range(1024..max_size);
let mut name = String::new();
for i in 0..size {
name.push(((i % 26) as u8 + b'A') as char);
}
name
} else {
let name = readable_name_custom("_", &mut env.rng);
name.replace("-", "_")
}
}
fn gen_columns(env: &mut SimulatorEnv) -> Vec<Column> {
let mut column_range = env.rng.gen_range(1..128);
let mut columns = Vec::new();
while column_range > 0 {
let column_type = match env.rng.gen_range(0..4) {
0 => ColumnType::Integer,
1 => ColumnType::Float,
2 => ColumnType::Text,
3 => ColumnType::Blob,
_ => unreachable!(),
};
let column = Column {
name: gen_random_name(env),
column_type,
primary: false,
unique: false,
};
columns.push(column);
column_range -= 1;
}
columns
}
fn get_all_rows(
env: &mut SimulatorEnv,
conn: &mut Rc<Connection>,
@@ -538,49 +475,3 @@ impl Drop for SimulatorFile {
self.inner.unlock_file().expect("Failed to unlock file");
}
}
impl ColumnType {
pub fn as_str(&self) -> &str {
match self {
ColumnType::Integer => "INTEGER",
ColumnType::Float => "FLOAT",
ColumnType::Text => "TEXT",
ColumnType::Blob => "BLOB",
}
}
}
impl Table {
pub fn to_create_str(&self) -> String {
let mut out = String::new();
out.push_str(format!("CREATE TABLE {} (", self.name).as_str());
assert!(!self.columns.is_empty());
for column in &self.columns {
out.push_str(format!("{} {},", column.name, column.column_type.as_str()).as_str());
}
// remove last comma
out.pop();
out.push_str(");");
out
}
}
impl Value {
pub fn to_string(&self) -> String {
match self {
Value::Null => "NULL".to_string(),
Value::Integer(i) => i.to_string(),
Value::Float(f) => f.to_string(),
Value::Text(t) => format!("'{}'", t.clone()),
Value::Blob(vec) => to_sqlite_blob(vec),
}
}
}
fn to_sqlite_blob(bytes: &[u8]) -> String {
let hex: String = bytes.iter().map(|b| format!("{:02X}", b)).collect();
format!("X'{}'", hex)
}

2
simulator/model.rs Normal file

@@ -0,0 +1,2 @@
pub mod query;
pub mod table;

122
simulator/model/query.rs Normal file

@@ -0,0 +1,122 @@
use std::fmt::Display;
use crate::model::table::{Table, Value};
#[derive(Clone, Debug, PartialEq)]
pub(crate) enum Predicate {
And(Vec<Predicate>), // p1 AND p2 AND p3... AND pn
Or(Vec<Predicate>), // p1 OR p2 OR p3... OR pn
Eq(String, Value), // column = Value
Neq(String, Value), // column != Value
Gt(String, Value), // column > Value
Lt(String, Value), // column < Value
}
impl Display for Predicate {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Predicate::And(predicates) => {
if predicates.is_empty() {
// An empty conjunction is vacuously true, so it renders as TRUE.
write!(f, "TRUE")
} else {
write!(f, "(")?;
for (i, p) in predicates.iter().enumerate() {
if i != 0 {
write!(f, " AND ")?;
}
write!(f, "{}", p)?;
}
write!(f, ")")
}
}
Predicate::Or(predicates) => {
if predicates.is_empty() {
write!(f, "FALSE")
} else {
write!(f, "(")?;
for (i, p) in predicates.iter().enumerate() {
if i != 0 {
write!(f, " OR ")?;
}
write!(f, "{}", p)?;
}
write!(f, ")")
}
}
Predicate::Eq(name, value) => write!(f, "{} = {}", name, value),
Predicate::Neq(name, value) => write!(f, "{} != {}", name, value),
Predicate::Gt(name, value) => write!(f, "{} > {}", name, value),
Predicate::Lt(name, value) => write!(f, "{} < {}", name, value),
}
}
}
// This type represents the potential queries on the database.
#[derive(Debug)]
pub(crate) enum Query {
Create(Create),
Select(Select),
Insert(Insert),
Delete(Delete),
}
#[derive(Debug)]
pub(crate) struct Create {
pub(crate) table: Table,
}
#[derive(Clone, Debug, PartialEq)]
pub(crate) struct Select {
pub(crate) table: String,
pub(crate) predicate: Predicate,
}
#[derive(Clone, Debug, PartialEq)]
pub(crate) struct Insert {
pub(crate) table: String,
pub(crate) values: Vec<Value>,
}
#[derive(Clone, Debug, PartialEq)]
pub(crate) struct Delete {
pub(crate) table: String,
pub(crate) predicate: Predicate,
}
impl Display for Query {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Query::Create(Create { table }) => {
write!(f, "CREATE TABLE {} (", table.name)?;
for (i, column) in table.columns.iter().enumerate() {
if i != 0 {
write!(f, ",")?;
}
write!(f, "{} {}", column.name, column.column_type)?;
}
write!(f, ")")
}
Query::Select(Select {
table,
predicate: guard,
}) => write!(f, "SELECT * FROM {} WHERE {}", table, guard),
Query::Insert(Insert { table, values }) => {
write!(f, "INSERT INTO {} VALUES (", table)?;
for (i, v) in values.iter().enumerate() {
if i != 0 {
write!(f, ", ")?;
}
write!(f, "{}", v)?;
}
write!(f, ")")
}
Query::Delete(Delete {
table,
predicate: guard,
}) => write!(f, "DELETE FROM {} WHERE {}", table, guard),
}
}
}
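// Sketch, not part of this commit: how nested predicates render into a WHERE
// clause. `Pred`, `join`, and `select_sql` are hypothetical, simplified stand-ins
// for `Predicate` and `Query::Select` above (values are plain i64 here). The empty
// `And`/`Or` cases follow the TRUE/FALSE convention used in the Display impl.

```rust
use std::fmt::{self, Display};

// Simplified stand-in for `Predicate`; only four variants, i64 values.
#[derive(Clone, Debug, PartialEq)]
pub enum Pred {
    And(Vec<Pred>),
    Or(Vec<Pred>),
    Eq(String, i64),
    Gt(String, i64),
}

// Writes `items` joined by `sep` inside parentheses,
// or the literal `empty` when there are no items.
fn join(f: &mut fmt::Formatter<'_>, items: &[Pred], sep: &str, empty: &str) -> fmt::Result {
    if items.is_empty() {
        return write!(f, "{}", empty);
    }
    write!(f, "(")?;
    for (i, p) in items.iter().enumerate() {
        if i != 0 {
            write!(f, "{}", sep)?;
        }
        write!(f, "{}", p)?;
    }
    write!(f, ")")
}

impl Display for Pred {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Pred::And(ps) => join(f, ps, " AND ", "TRUE"),
            Pred::Or(ps) => join(f, ps, " OR ", "FALSE"),
            Pred::Eq(c, v) => write!(f, "{} = {}", c, v),
            Pred::Gt(c, v) => write!(f, "{} > {}", c, v),
        }
    }
}

// Renders a SELECT in the same shape as `Query::Select` above.
pub fn select_sql(table: &str, p: &Pred) -> String {
    format!("SELECT * FROM {} WHERE {}", table, p)
}
```

// With this sketch, `select_sql("t1", &Pred::And(vec![]))` selects every row,
// which is the pattern `property_select_all` relies on.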

71
simulator/model/table.rs Normal file

@@ -0,0 +1,71 @@
use std::{fmt::Display, ops::Deref};
pub(crate) struct Name(pub(crate) String);
impl Deref for Name {
type Target = str;
fn deref(&self) -> &Self::Target {
&self.0
}
}
#[derive(Debug, Clone)]
pub(crate) struct Table {
pub(crate) rows: Vec<Vec<Value>>,
pub(crate) name: String,
pub(crate) columns: Vec<Column>,
}
#[derive(Debug, Clone)]
pub(crate) struct Column {
pub(crate) name: String,
pub(crate) column_type: ColumnType,
pub(crate) primary: bool,
pub(crate) unique: bool,
}
#[derive(Debug, Clone)]
pub(crate) enum ColumnType {
Integer,
Float,
Text,
Blob,
}
impl Display for ColumnType {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
ColumnType::Integer => write!(f, "INTEGER"),
ColumnType::Float => write!(f, "REAL"),
ColumnType::Text => write!(f, "TEXT"),
ColumnType::Blob => write!(f, "BLOB"),
}
}
}
#[derive(Clone, Debug, PartialEq)]
pub(crate) enum Value {
Null,
Integer(i64),
Float(f64),
Text(String),
Blob(Vec<u8>),
}
fn to_sqlite_blob(bytes: &[u8]) -> String {
let hex: String = bytes.iter().map(|b| format!("{:02X}", b)).collect();
format!("X'{}'", hex)
}
impl Display for Value {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Value::Null => write!(f, "NULL"),
Value::Integer(i) => write!(f, "{}", i),
Value::Float(fl) => write!(f, "{}", fl),
Value::Text(t) => write!(f, "'{}'", t),
Value::Blob(b) => write!(f, "{}", to_sqlite_blob(b)),
}
}
}
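// Sketch, not part of this commit: rendering values as SQL literals.
// `blob_literal` mirrors `to_sqlite_blob` above. `text_literal` is a hypothetical
// helper: `Value::Text` above is written as '{}' without escaping, which is fine
// as long as generated text never contains a single quote. If it could, doubling
// the quote is the standard SQL way to keep the literal well-formed.

```rust
// Hex-encodes bytes as a blob literal such as X'00FF', like `to_sqlite_blob`.
pub fn blob_literal(bytes: &[u8]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02X}", b)).collect();
    format!("X'{}'", hex)
}

// Hypothetical helper (not in the diff): wraps text in single quotes and
// doubles every embedded single quote.
pub fn text_literal(text: &str) -> String {
    format!("'{}'", text.replace('\'', "''"))
}
```

// Text without quotes renders exactly as it does in `Display for Value`, so the
// helper would only change behavior for inputs that contain a single quote.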

78
simulator/properties.rs Normal file

@@ -0,0 +1,78 @@
use std::rc::Rc;
use limbo_core::Connection;
use rand::Rng;
use crate::{
compare_equal_rows,
generation::ArbitraryFrom,
get_all_rows,
model::{
query::{Insert, Predicate, Query, Select},
table::Value,
},
SimulatorEnv,
};
pub fn property_insert_select(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) {
// Get a random table
let table = env.rng.gen_range(0..env.tables.len());
// Pick a random column
let column_index = env.rng.gen_range(0..env.tables[table].columns.len());
let column = &env.tables[table].columns[column_index].clone();
let mut rng = env.rng.clone();
// Generate a random value of the column type
let value = Value::arbitrary_from(&mut rng, &column.column_type);
// Create a whole new row
let mut row = Vec::new();
for (i, column) in env.tables[table].columns.iter().enumerate() {
if i == column_index {
row.push(value.clone());
} else {
let value = Value::arbitrary_from(&mut rng, &column.column_type);
row.push(value);
}
}
// Insert the row
let query = Query::Insert(Insert {
table: env.tables[table].name.clone(),
values: row.clone(),
});
let _ = get_all_rows(env, conn, query.to_string().as_str()).unwrap();
// Shadow operation on the table
env.tables[table].rows.push(row.clone());
// Create a query that selects the row
let query = Query::Select(Select {
table: env.tables[table].name.clone(),
predicate: Predicate::Eq(column.name.clone(), value),
});
// Get all rows
let rows = get_all_rows(env, conn, query.to_string().as_str()).unwrap();
// Check that the row is there
assert!(rows.iter().any(|r| r == &row));
}
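// Sketch, not part of this commit: what cloning a generator implies.
// `property_insert_select` draws its values from `env.rng.clone()`. Assuming the
// generator is a deterministic PRNG whose `Clone` copies its state, the original
// is not advanced by draws made on the clone, so its next draws repeat them.
// `Lcg` is a hypothetical toy generator used only to show the effect; it is not
// the generator the simulator uses.

```rust
// Toy linear congruential generator; the whole state is one u64.
#[derive(Clone, Debug)]
pub struct Lcg(pub u64);

impl Lcg {
    // Advances the state and returns it as the next value.
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
}

// Draws `n` values from a clone and leaves `rng` itself untouched.
pub fn draw_from_clone(rng: &Lcg, n: usize) -> Vec<u64> {
    let mut copy = rng.clone();
    (0..n).map(|_| copy.next_u64()).collect()
}
```

// If independent streams are wanted, one option is to pass `&mut env.rng`
// instead of a clone, so every draw advances the shared generator.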
pub fn property_select_all(env: &mut SimulatorEnv, conn: &mut Rc<Connection>) {
// Get a random table
let table = env.rng.gen_range(0..env.tables.len());
// Create a query that selects all rows
let query = Query::Select(Select {
table: env.tables[table].name.clone(),
predicate: Predicate::And(Vec::new()),
});
// Get all rows
let rows = get_all_rows(env, conn, query.to_string().as_str()).unwrap();
// Make sure the rows are the same
compare_equal_rows(&rows, &env.tables[table].rows);
}
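// Sketch, not part of this commit: the insert-then-select property checked
// against an in-memory shadow table only. `Row`, `ShadowTable`, and
// `insert_then_select_holds` are hypothetical names; rows are Vec<i64> here,
// whereas the simulator stores Vec<Value> and also runs the queries against
// the database.

```rust
// Stand-in row type for this sketch.
pub type Row = Vec<i64>;

// Minimal shadow table: only the stored rows matter for this property.
#[derive(Debug, Default)]
pub struct ShadowTable {
    pub rows: Vec<Row>,
}

impl ShadowTable {
    // Mirrors the "shadow operation" step: remember the inserted row.
    pub fn insert(&mut self, row: Row) {
        self.rows.push(row);
    }

    // Model of `SELECT * FROM t WHERE <column> = <value>`.
    pub fn select_eq(&self, column: usize, value: i64) -> Vec<Row> {
        self.rows
            .iter()
            .filter(|r| r.get(column) == Some(&value))
            .cloned()
            .collect()
    }
}

// The property: after inserting `row`, selecting on the chosen column's
// value returns a result set that contains `row`.
// Returns false without inserting when `column` is out of range.
pub fn insert_then_select_holds(table: &mut ShadowTable, row: Row, column: usize) -> bool {
    let key = match row.get(column) {
        Some(v) => *v,
        None => return false,
    };
    table.insert(row.clone());
    table.select_eq(column, key).iter().any(|r| r == &row)
}
```

// The check uses `any` rather than equality on the whole result, matching the
// assertion above: other rows may share the selected value.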