MVCC is like the annoying younger cousin (I know because I was him) that
needs to be treated differently. MVCC requires us to use root_pages that
might not be allocated yet, and the plan is to use negative root_pages
for that case. Therefore, we need i64 in order to fit this change.
We have used i64 before because that is the size of an integer in
SQLite. However, I believe that for large enough databases, the chances
of collision here are just too high. The effect of a collision is the
database silently returning incorrect data in the materialized view.
So now that everything else is working, we should move to i128.
The Materialized View code had custom serialization written so we could
move this code forward. Now that we have many operators and the views
work, replace it with something saner.
The main insight is that if we transform the AggregateState into Values
before the serialization, we are able to just use standard SQLite
serialization for the values. We then just have to add sizes, codes for
the functions, etc (which are also represented as Values).
This PR improves the DBSP circuit so that it handles the JOIN operator.
The JOIN operator exposes a weakness of our current model: we usually
pass a list of columns between operators, and find the right column by
name when needed.
But with JOINs, many tables can have the same columns. The operators
will then find the wrong column (same name, different table), and
produce incorrect results.
To fix this, we must do two things:
1) Change the Logical Plan. It needs to track table provenance.
2) Fix the aggregators: it needs to operate on indexes, not names.
For the aggregators, note that table provenance is the wrong
abstraction. The aggregator is likely working with a logical table that
is the result of previous nodes in the circuit. So we just need to be
able to tell it which index in the column array it should use.