In case an ORDER BY column exactly matches a result column in the SELECT,
the insertion of the result column into the ORDER BY sorter can be skipped
because it's already necessarily inserted as a sorting column.
For this reason we have a mapping to know what index a given result column
has in the order by sorter.
This commit makes that mapping much simpler.
Previously, the logic for collecting non-aggregate columns was duplicated
across multiple locations and implemented inconsistently. This caused a
bug that was revealed by the refactoring in this commit (see the added
test).
Currently we have some usages of LIMIT where the actual limit counter
is initialized next to the DecrJumpZero instruction, and then
`program.mark_last_insn_constant()` is used to hoist the counter
initialization to the beginning of the program.
This is very fragile, and already FROM clause subquery handling works
around this with a hack (removed in this PR), and (upcoming) WHERE clause
subqueries would also run into problems because of this, because the LIMIT
might need to be initialized once for every iteration of the subquery.
This PR removes those usages for LIMIT, and LIMIT processing is now more
intuitive:
- limit counter is now initialized at the start of the query processing
- a function init_limit() is extracted to do this for select/update/delete
Previously the Operation enum consisted of:
- Operation::Scan
- Operation::Search
- Operation::Subquery
Which was always a dumb hack because what we really are doing is
an Operation::Scan on a "virtual"/"pseudo" table (overloaded names...)
derived from a subquery appearing in the FROM clause.
Hence, refactor the relevant data structures so that the Table enum
now contains a new variant:
Table::FromClauseSubquery
And the Operation enum only consists of Scan and Search.
No functional changes (intended, at least!)
We've run into trouble in multiple places due to the fact that
we delete terms from the where clause (e.g. when a constant condition
is removed, or the term becomes part of an index seek key).
A simpler solution is to add a flag indicating that the term is
consumed (used), so that it is not translated in the main loop
anymore when WHERE clause terms are evaluated.
TLDR: no need to call either of:
program.emit_insn_with_label_dependency() -> just call program.emit_insn()
program.defer_label_resolution() -> just call program.resolve_label()
Changes:
- make BranchOffset an explicit enum (Label, Offset, Placeholder)
- remove program.emit_insn_with_label_dependency() - label dependency is automatically detected
- for label to offset mapping, use a hashmap from label(negative i32) to offset (positive u32)
- resolve all labels in program.build()
- remove program.defer_label_resolution() - all labels are resolved in build()