
Assembler for an abstract machine

Posted: Wed Apr 21, 2010 9:02 am
by robheus
Most assemblers I'm aware of compile the assembly code natively into machine instructions for the host machine/CPU. But do assemblers exist (and are they useful) that compile into instructions for some (well chosen/designed) abstract machine, so that the same compiled code could be used on any machine/platform? It would of course necessitate a secondary compilation pass to generate code for a specific machine (in most cases the native/host machine).

Re: Assembler for an abstract machine

Posted: Wed Apr 21, 2010 9:21 am
by Combuster
Google for llvm, cil/msil and java bytecode.

Re: Assembler for an abstract machine

Posted: Wed Apr 21, 2010 9:40 am
by robheus
Combuster wrote:Google for llvm, cil/msil and java bytecode.
Java bytecode isn't assembling, and cil/msil I wouldn't call exactly platform independent.

llvm... will investigate.

Re: Assembler for an abstract machine

Posted: Wed Apr 21, 2010 9:56 am
by Solar
Tao intent was such a system. They compiled / assembled the source (C, C++, or Java) for the "VP" (virtual processor), which was platform-independent. On the specific hardware, the VP code could then be compiled to native code, either at installation (AOT) or at runtime (JIT).

The VP code loader / translator was said to translate faster on average than the VP code could be loaded from disk.

Quite a nice technology, but their primary aim was the embedded market. An involvement with Amiga Inc. was their bid for the desktop (and that's how I came to know of them), but we all know what became of that particular.

Re: Assembler for an abstract machine

Posted: Wed Apr 21, 2010 11:38 am
by qw
What would such an assembly language look like?

Re: Assembler for an abstract machine

Posted: Wed Apr 21, 2010 11:55 am
by JamesM
Yes - Transitive Ltd., before they were acquired by IBM, did exactly that to convert assembly code from one architecture to another, along with JIT compilation and static analysis.

As to what such a language would look like, LLVM bitcode.

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 12:14 am
by Love4Boobies
robheus wrote:cil/msil I wouldn't call exactly platform independent.
Yes it is. Many people think "CIL, C#, Microsoft; it must be for Windows" but that is quite false. Mono is the living proof - it uses CIL and has ports for several operating systems as well as CPU architectures.

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 12:36 am
by Combuster
robheus wrote:Java bytecode isn't assembling
So .class files have no textual representation other than their corresponding .java files? Object files can't be made from assembly but only C? :wink:

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 12:49 am
by Solar
Hobbes wrote:What would such an assembly language look like?
There was an article on OSNews back when the Tao / Amiga connection became public, including a screenshot of VP source.

If you are really interested in details, try to hunt down someone who still has one of the AmigaDE SDKs lying around. (I could possibly assist you there, I still have pretty decent contacts in the Amiga "scene".) That SDK included documentation on the VP technology.

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 2:13 am
by qw
JamesM wrote:As to what such a language would look like, LLVM bitcode.
IMHO it's just another Intermediate Language, except that its syntax is assembly-like. I wouldn't call anything assembly language if it's not for a specific processor, though I admit that it is a matter of definition.

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 5:12 am
by qw
Hobbes wrote:What would such an assembly language look like?
After giving the subject some more thought, I think my real question is: what would be the benefit of such a language over a higher level language? It doesn't give you control over the hardware because it's for a virtual machine, and I don't think it's much use for optimization either.

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 5:51 am
by gravaera
I think the main benefits are that the compiling end of the build process is independent of any architecture, and it produces transparent code. Then since the produced assembly-like language is close to assembly, and it is already the optimized output of the compiler, when the translating assembler passes through it, it can now optimize this already optimized code even further.

I never looked at LLVM's bitcode, but I have been fostering an idea of similar merits. The compilers would output pseudo instructions in the assembly-like language, expressing each sequence in the most efficient way it can be done. Then, when the real assembly takes place, the assembler would be able to choose the most efficient way to translate that code into its own machine language.

It does help with optimization in that sense.
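
To make that a bit more concrete, here is a minimal Python sketch of a translator choosing a native sequence for one pseudo instruction. Everything in it is an assumption for illustration: the function name select_multiply, the pseudo "multiply by constant" operation, and the target mnemonics "shl" and "mul" are hypothetical and not taken from LLVM, Tao's VP, or any real assembler.

```python
def select_multiply(dest, constant):
    """Pick a native sequence for 'dest = dest * constant' on a hypothetical target.

    The pseudo instruction only says *what* to compute; the translator
    decides *how*, based on what is cheap on its own machine.
    """
    # A positive power of two has exactly one bit set, so a shift suffices.
    if constant > 0 and constant & (constant - 1) == 0:
        shift = constant.bit_length() - 1
        return [f"shl {dest}, {shift}"]
    # Otherwise fall back to the general (assumed slower) multiply.
    return [f"mul {dest}, {constant}"]


if __name__ == "__main__":
    print(select_multiply("r0", 8))
    print(select_multiply("r0", 10))
```

A different target could plug in a different rule for the same pseudo instruction, which is the point of deferring the choice to the final translation step.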

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 7:13 am
by Solar
gravaera more or less named it.

The compiler already did the bulk work of parsing the high-level language (using one of multiple frontends), as well as the brunt of the optimizing work. Unreachable code has been eliminated, return value optimizations have been done, all the stuff that takes up clock cycles.

The loader / translator only has to map the virtual registers to physical registers or memory, and translate the virtual opcodes to native ones - simple lookup work, which can be done with reasonable efficiency even on low-on-horsepower embedded systems.
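
As a rough illustration of that "simple lookup work", here is a small Python sketch of a table-driven translator. The virtual opcodes (vmov, vadd, vsub), the virtual register names (v0, v1, ...), the target registers (r0 to r3) and the "[sp+N]" spill syntax are all made up for this example; they are not the actual VP instruction set.

```python
# Hypothetical target: four physical registers, everything else spills to the stack.
PHYSICAL_REGS = ["r0", "r1", "r2", "r3"]

# Hypothetical one-to-one mapping from virtual opcodes to native mnemonics.
OPCODE_TABLE = {"vmov": "mov", "vadd": "add", "vsub": "sub"}


def map_registers(program):
    """Assign each virtual register a physical register, or a stack slot once those run out."""
    mapping = {}
    for _, operands in program:
        for operand in operands:
            if operand.startswith("v") and operand not in mapping:
                index = len(mapping)
                if index < len(PHYSICAL_REGS):
                    mapping[operand] = PHYSICAL_REGS[index]
                else:
                    offset = (index - len(PHYSICAL_REGS)) * 4
                    mapping[operand] = f"[sp+{offset}]"
    return mapping


def translate(program):
    """Translate (opcode, operands) pairs into native assembly text by table lookup."""
    mapping = map_registers(program)
    native = []
    for opcode, operands in program:
        args = ", ".join(mapping.get(operand, operand) for operand in operands)
        native.append(f"{OPCODE_TABLE[opcode]} {args}")
    return native


if __name__ == "__main__":
    for line in translate([("vmov", ["v0", "5"]), ("vadd", ["v0", "v1"])]):
        print(line)
```

This first-come register assignment is deliberately naive: a real translator would also have to respect the target's operand restrictions (for instance, many machines do not allow two memory operands in one instruction) and would likely use a proper register allocator.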

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 8:23 am
by qw
Gravaera, Solar,
I don't know much about code optimization, but isn't this true for every Intermediate Language? It doesn't need to have an assembly-like syntax, or am I missing something?

Re: Assembler for an abstract machine

Posted: Thu Apr 22, 2010 10:17 am
by Selenic
Hobbes wrote:Gravaera, Solar,
I don't know much about code optimization, but isn't this true for every Intermediate Language? It doesn't need to have an assembly-like syntax, or am I missing something?
Basically, the less difference between source and target, the less effort it takes to translate (this is obviously true in natural languages too). LLVM bitcode is pretty low-level (hence the name), so most instructions require little translation effort. For example, compare the effort required to compile a Python 'add' bytecode, which could end up trying two separate method calls, with LLVM's 'add i32' instruction, which just adds two integer registers and stores the result in a third.
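
The contrast can be modelled in a few lines of Python. This is only a simplified sketch: dynamic_add and add_i32 are hypothetical helper names, dynamic_add approximates what a dynamically typed add has to do (the real interpreter has extra rules, e.g. for subclasses), and add_i32 models a plain wrapping 32-bit integer add rather than LLVM itself.

```python
def dynamic_add(a, b):
    """Simplified model of a dynamically typed add: up to two method lookups and calls."""
    add = getattr(type(a), "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result
    # The left operand declined, so try the reflected method on the right operand.
    radd = getattr(type(b), "__radd__", None)
    if radd is not None:
        result = radd(b, a)
        if result is not NotImplemented:
            return result
    raise TypeError("unsupported operand types for add")


def add_i32(a, b):
    """Model of a typed 32-bit add: no dispatch, the result wraps modulo 2**32."""
    return (a + b) & 0xFFFFFFFF


if __name__ == "__main__":
    print(dynamic_add(1, 2.5))       # int declines, float's reflected add handles it
    print(add_i32(0xFFFFFFFF, 1))    # wraps around
```

The typed version maps almost directly onto a single machine instruction, while the dynamic one needs type lookups and calls before any arithmetic happens.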