Octocontrabass wrote:mikegonta wrote:the-ideom (which is written in ideom) is only a small simple single pass (concept oriented) imperative compiler
mikegonta wrote:the-ideom can re-compile itself in less than 1 second (on my Celeron NUC) and requires 4 passes to generate the code.
I see your compiler has gone from four passes to one. What changed?
Terminology.
the-ideom does a single pass over the entire source (including any included files).
It is during this pass that the high level ideom abstract concepts are extracted.
These include type and global definitions, global bodies, routine headers and routine bodies.
These can be used and defined in any order, as well as used before defined.
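To illustrate how a single pass can allow use before definition, here is a minimal Python sketch. It is not the-ideom's code; the "def NAME" and "call NAME" syntax and the function names are hypothetical, chosen only to show that nothing needs resolving until the whole source has been seen.

```python
def collect_definitions(source_lines):
    """Single pass: record every definition and every use, resolving nothing yet."""
    defined = {}   # name -> line number of its definition
    uses = []      # (name, line number) pairs, possibly seen before the definition
    for lineno, line in enumerate(source_lines, start=1):
        words = line.split()
        if not words:
            continue
        if words[0] == "def":        # hypothetical syntax: "def NAME"
            defined[words[1]] = lineno
        elif words[0] == "call":     # hypothetical syntax: "call NAME"
            uses.append((words[1], lineno))
    return defined, uses


def undefined_uses(defined, uses):
    """After the pass, each use is checked regardless of source order."""
    return [(name, lineno) for name, lineno in uses if name not in defined]


# "helper" is called on line 1 but only defined on line 3.
program = ["call helper", "def main", "def helper", "call missing"]
defined, uses = collect_definitions(program)
print(undefined_uses(defined, uses))
```

Only "missing" is reported, because the check runs after every definition has been collected.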
During the analysis stage the-ideom reduces these to their conceptual common denominators and
represents them in a flat, list-like internal representation (IR).
A routine's fragments represent the statements and structure of the routine.
During the transmogrifying (code generating) stage a series of consolidations takes care of
what are normally referred to as simple optimizations. Other consolidations, such as constant folding,
operator precedence and parentheses, are actually taken care of during the initial analysis and are
already transformed by the time they reach the IR.
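As a rough illustration of handling precedence, parentheses and constant folding while analysing an expression, here is a small Python sketch using precedence climbing. The technique, the postfix output and the names (to_flat_ir, PRECEDENCE, FOLD) are my assumptions for the example, not a description of how the-ideom does it. It supports only +, - and * on non-negative integer literals and identifiers.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2}
FOLD = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def tokenize(text):
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*()]", text)


def to_flat_ir(text):
    """Return a flat postfix list; constant subexpressions are folded on the way."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = expression(1)
            pos += 1  # skip the closing ")"
            return items
        return [int(tok)] if tok.isdigit() else [tok]

    def expression(min_prec):
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            right = expression(PRECEDENCE[op] + 1)  # left associative
            both_constant = (len(left) == 1 and len(right) == 1
                             and isinstance(left[0], int)
                             and isinstance(right[0], int))
            if both_constant:
                left = [FOLD[op](left[0], right[0])]   # fold at analysis time
            else:
                left = left + right + [op]             # emit postfix
        return left

    return expression(1)


print(to_flat_ir("2 * (3 + 4)"))
print(to_flat_ir("x + 2 * 3"))
```

The first call folds to a single constant, and the second leaves only the addition with x in the flat list.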
There follow 4 distinct and well-defined passes over all (and then some) of the IR during the outputting of x86 machine code.
The first such pass generates the machine code and the addresses of the individual fragments.
This is an "optimistic" process in that the shortest form of the Intel instructions is initially assumed.
This applies to relative jumps. There are also short-form instructions which use an offset: those that access the EBP-based
frame pointer local variables. Prior to code generation, the-ideom has sorted these local variables by usage
so that the most frequently used are located closer to the start of the frame.
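The reason sorting helps is that an EBP-relative access can use a one-byte signed displacement (-128 to 127) instead of a four-byte one. A minimal Python sketch follows, assuming every local occupies one 4-byte slot at a negative offset from EBP. The function names and the tie-breaking by name are hypothetical, not taken from the-ideom.

```python
from collections import Counter


def assign_frame_offsets(uses, slot_size=4):
    """Give the most frequently used locals the offsets nearest to EBP."""
    counts = Counter(uses)
    # Most used first; ties broken by name so the layout is deterministic.
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: -slot_size * (i + 1) for i, name in enumerate(ordered)}


def fits_disp8(offset):
    """True if [EBP+offset] can be encoded with a one-byte displacement."""
    return -128 <= offset <= 127


# Each entry is one reference to a local somewhere in the routine body.
uses = ["i", "total", "i", "tmp", "i", "total"]
offsets = assign_frame_offsets(uses)
print(offsets)
```

Under this assumption the first 32 slots (EBP-4 down to EBP-128) get the short form, so it pays to put the busiest locals there.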
The next pass resolves the obvious differences in instruction size for those relative jump instructions
in which the actual distance is greater than the initial optimistic assumption.
The third pass resolves the formerly short-distance relative jumps which are now longer due to the increased
lengths from the previous pass. The three passes guarantee generation of the shortest possible instructions.
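The optimistic approach can be sketched in Python as follows. This is a generic "start short, widen until nothing changes" loop rather than the-ideom's fixed sequence of passes, and the fragment tuples are a hypothetical stand-in for the IR. Only unconditional jumps are modelled: 2 bytes for the short form (EB rel8) and 5 bytes for the near form (E9 rel32).

```python
SHORT_JMP, NEAR_JMP = 2, 5  # bytes: EB rel8 and E9 rel32


def relax(fragments):
    """fragments: list of ("code", size), ("label", name) or ("jmp", target)."""
    sizes = [SHORT_JMP if kind == "jmp" else (arg if kind == "code" else 0)
             for kind, arg in fragments]
    passes = 0
    while True:
        passes += 1
        # Lay out addresses using the current size assumptions.
        addresses, labels, pc = [], {}, 0
        for (kind, arg), size in zip(fragments, sizes):
            addresses.append(pc)
            if kind == "label":
                labels[arg] = pc
            pc += size
        # Widen any short jump whose target is out of rel8 range.
        # The displacement is measured from the end of the jump instruction.
        changed = False
        for i, (kind, arg) in enumerate(fragments):
            if kind == "jmp" and sizes[i] == SHORT_JMP:
                displacement = labels[arg] - (addresses[i] + SHORT_JMP)
                if not -128 <= displacement <= 127:
                    sizes[i] = NEAR_JMP
                    changed = True
        if not changed:
            return sizes, passes


fragments = [
    ("jmp", "near"),   # fits in rel8 at first (displacement 126)
    ("code", 60),
    ("jmp", "far"),    # obviously too far for rel8
    ("code", 64),
    ("label", "near"),
    ("code", 200),
    ("label", "far"),
]
sizes, passes = relax(fragments)
print(sizes, passes)
```

In this example the first jump fits on the initial layout, but widening the second jump by 3 bytes pushes its target to a displacement of 129, so it has to be widened on the following pass.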
There is, however, a small group of alternate instructions, shorter in some cases, which the-ideom does
not currently employ.
After these three passes a virtual address for the data can be calculated. The data sections are then generated.
The fourth pass is more like a "linking" pass than code generation (the-ideom currently generates the PE executable
directly). Now that the data has been addressed, these addresses, as well as the final addresses of the routines, are
filled in.
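A "linking" pass of this kind amounts to patching placeholder bytes once the final addresses are known. Here is a minimal Python sketch, assuming 32-bit little-endian absolute addresses. The fixup list, the symbol name and the address 0x00403000 are made up for the example and say nothing about the-ideom's actual data structures or PE layout.

```python
import struct


def apply_fixups(code, fixups, addresses):
    """Write each symbol's final 32-bit address over its 4-byte placeholder."""
    patched = bytearray(code)
    for offset, symbol in fixups:
        struct.pack_into("<I", patched, offset, addresses[symbol])
    return bytes(patched)


# mov eax, [counter] ; ret   (A1 moffs32, C3) with a zeroed address placeholder
code = bytes([0xA1, 0, 0, 0, 0, 0xC3])
fixups = [(1, "counter")]               # placeholder starts at byte offset 1
addresses = {"counter": 0x00403000}     # hypothetical final data address
patched = apply_fixups(code, fixups, addresses)
print(patched.hex())
```

The instruction sizes are already final at this point, so patching cannot move anything; only the placeholder contents change.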