A simple question

MajickTek
Member
Posts: 101
Joined: Sat Dec 17, 2016 6:58 am
Libera.chat IRC: MajickTek
Location: The Internet

Post by MajickTek »

Is it possible, say, to "convert" ASM to other architectures? So like changing the syntax/registers/etc?

Even better would be a cross-compiler for ASM.

No, GAS does not count. I like NASM syntax.
Everyone should know how to program a computer, because it teaches you how to think! -Steve Jobs

while ( ! ( succeed = try() ) ); 
tongko
Member
Posts: 26
Joined: Wed Nov 07, 2012 2:40 am
Location: Petaling Jaya, Malaysia

Re: A simple question

Post by tongko »

What would you expect the outcome to be if the target architecture doesn't have general-purpose registers? Or if it doesn't support CPUID? Or has no interrupts? And so on...
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: A simple question

Post by Brendan »

Hi,
MajickTek wrote: Is it possible, say, to "convert" ASM to other architectures? So like changing the syntax/registers/etc?

Even better would be a cross-compiler for ASM.

No, GAS does not count. I like NASM syntax.
It's entirely possible (not necessarily easy, but possible). However, the result is likely to be significantly slower (and uglier) than native assembly would have been (and also slower and uglier than a higher-level language like C would've been), and (depending on the target architecture) it might require a special run-time to handle some of the trickier aspects.

For a simple example, consider an 80x86 "div dword [foo]" instruction (which divides the 64-bit value in EDX:EAX by a 32-bit value from memory) being converted to ARMv8. ARMv8 doesn't allow operands to be in memory (it's mostly "load/store"), so you'd need to start by loading the value from "[foo]" into a temporary register (and that means finding a free register, which probably means pushing a register and then popping it afterwards). It doesn't have an "unsigned division of a 64-bit value by a 32-bit value" instruction (the closest is "udiv", which divides a 64-bit value by a 64-bit value, which means the value in the temporary register has to have been zero extended). Division on ARM doesn't generate a "divide error" exception when it overflows or when the divisor is zero (so you'd have to use comparisons and some sort of trap to some sort of run-time to emulate a divide error exception). And division on ARM doesn't give you a remainder either (you'd have to calculate one yourself using multiplication and subtraction afterwards - "UMSUBL" I think).

Essentially, that single "div dword [foo]" might end up being 12 instructions (and if the cross-assembler is very good at optimising, the remainder isn't used, there was an unused register, and so on, it might be able to reduce that to maybe 6 instructions if you're very lucky). In native assembly, though, the code wouldn't have been written to assume that the CPU generates a remainder or that "divide error" exceptions exist, and it wouldn't have needed to find a spare register to use; it probably would've been at least 5 times faster.
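
To make the steps above concrete, here is a minimal Python sketch of the semantics a translated sequence would have to reproduce. It is only a model of the behaviour, not output from any real translator; the names `emulate_div_dword` and `DivideError` are made up for illustration, and the comments naming ARM instructions only indicate which step each line stands in for.

```python
MASK32 = 0xFFFFFFFF


class DivideError(Exception):
    """Hypothetical stand-in for the x86 divide error a run-time would emulate."""


def emulate_div_dword(edx, eax, divisor):
    """Model x86 `div dword [foo]`; returns (quotient, remainder) for EAX, EDX."""
    # Load the memory operand into a temporary, zero extended to 64 bits.
    divisor &= MASK32
    # Combine EDX:EAX into a single 64-bit dividend.
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    # ARM's udiv returns 0 for a zero divisor instead of trapping,
    # so the check has to be explicit.
    if divisor == 0:
        raise DivideError("divide by zero")
    quotient = dividend // divisor  # the udiv step
    # x86 raises a divide error if the quotient does not fit in 32 bits.
    if quotient > MASK32:
        raise DivideError("quotient overflow")
    # No remainder from udiv: multiply and subtract afterwards.
    remainder = dividend - quotient * divisor
    return quotient, remainder


print(emulate_div_dword(0, 100, 7))  # (14, 2)
```

Every commented step corresponds to at least one extra instruction in the translated code, which is where the instruction count comes from.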


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Octocontrabass
Member
Posts: 5521
Joined: Mon Mar 25, 2013 7:01 pm

Re: A simple question

Post by Octocontrabass »

I don't know of any tools to do this for assembly source, but after you assemble your code it's certainly possible to translate from one CPU architecture to another.

With that said, you're still better off writing your code in some high-level language that can be compiled for all of your target CPUs.
TightCoderEx
Member
Posts: 90
Joined: Sun Jan 13, 2013 6:24 pm
Location: Grande Prairie AB

Re: A simple question

Post by TightCoderEx »

Wow, a lot of time has gone by, but close to 50 years ago developers were facing that very conundrum. Hence "C" was born and, as @Octocontrabass has pointed out, if there is a need to design for different architectures, an HLL is a viable alternative, or maybe the only alternative.

IA32/64, PIC, ARM and AVR are the ones I dabble with the most, but I have found that, due to the scope of each device, they have very little in common that would make some sort of cross compilation worthwhile. Based on @Brendan's response, I think you'll agree: technically, yes, it's possible, but is it practical? Rewriting, or using that ubiquitous cross compiler called "C", is probably the only way to go.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: A simple question

Post by Schol-R-LEA »

The term you want[1] is binary translation. While static translation is certainly possible - as shown by the ARM translation of Starcraft, among other things - dynamic binary translation, while less efficient, is far more common. It is basically the same process that JIT 'compilers' use to translate bytecode, which is the foundation of both Java and .NET.

It has been used for emulators going back to at least the 1980s, and was used by the PowerPC Macintoshen to run M68K code, and later by x86 Macs to run PPC code. Most modern cross-architecture emulators (e.g., QEMU) perform binary translation rather than instruction-by-instruction interpretation; as with any JIT translation, it slows down the loading process but drastically reduces the total emulation overhead. It is usually done piecemeal rather than by translating a whole system or program at once, with the translated code being cached/memoized to reduce repeated translations, but with the least recently used (LRU) code being discarded if memory is needed. If the caching isn't tuned well, this can lead to re-translations.
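
The caching behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how QEMU or any particular emulator is implemented: `TranslationCache`, its `translate` callback, and the block addresses are all hypothetical, and a "translated block" here is just a string.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy cache of translated blocks with least-recently-used eviction."""

    def __init__(self, translate, capacity):
        self.translate = translate   # hypothetical: guest address -> host code
        self.capacity = capacity     # maximum number of cached blocks
        self.blocks = OrderedDict()  # ordered oldest use -> newest use
        self.translations = 0        # how many times we had to translate

    def lookup(self, guest_pc):
        if guest_pc in self.blocks:
            # Hit: mark the block as most recently used and reuse it.
            self.blocks.move_to_end(guest_pc)
            return self.blocks[guest_pc]
        # Miss: translate the block now and remember the result.
        host_code = self.translate(guest_pc)
        self.translations += 1
        self.blocks[guest_pc] = host_code
        if len(self.blocks) > self.capacity:
            # Out of room: discard the least recently used block.
            self.blocks.popitem(last=False)
        return host_code


cache = TranslationCache(lambda pc: f"host_block_{pc:#x}", capacity=2)
for pc in (0x1000, 0x2000, 0x1000, 0x3000, 0x2000):
    cache.lookup(pc)
print(cache.translations)  # 4
```

The block at 0x2000 is translated twice because it was evicted to make room for 0x3000, which is the re-translation cost of a cache that is too small for the working set.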

This approach also makes it easy to handle trapping system calls and such in a consistent manner - rather than having separate methods for virtualizing native code versus emulating non-native code, they simply convert the non-native code to native and virtualize it the same way.

Hardware dynamic binary translation is fairly common as well, and has largely displaced microcode as the means for managing complex instructions. While the original Transmeta Crusoe chip[2] was a failure, the technique has found its way into all of the later Intel x86 CPUs as part of their instruction decoding.

Aside from the advantage of being able to batch translate instructions (just as with emulators), it also means that the CPU can manipulate and optimize the micro-operations, rather than just the x86 code. One of the things often missed in discussions about register renaming, out of order instruction processing, etc. is that the CPU isn't doing that with the x86 registers and instructions, but with the micro-instructions and a huge anonymous register file with (IIUC) up to 1024 registers.
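
As a rough illustration of renaming onto an anonymous register file, here is a deliberately simplified Python sketch. Everything in it is an assumption made for illustration (the `RenameTable` class, the register names, the file size of 8), and it never returns physical registers to the free list, which real hardware does when instructions retire.

```python
class RenameTable:
    """Toy map from architectural register names to physical register numbers."""

    def __init__(self, arch_regs, num_physical):
        # Start with each architectural register bound to its own physical one.
        self.mapping = {name: i for i, name in enumerate(arch_regs)}
        # Every remaining physical register is free for allocation.
        self.free = list(range(len(arch_regs), num_physical))

    def rename(self, dest, sources):
        """Rename one micro-op; returns (physical dest, physical sources)."""
        # Sources read whichever physical register currently holds the value.
        phys_sources = [self.mapping[s] for s in sources]
        # Each write gets a fresh physical register, so a later write to the
        # same architectural name cannot clobber a value still in use.
        phys_dest = self.free.pop(0)
        self.mapping[dest] = phys_dest
        return phys_dest, phys_sources


table = RenameTable(["eax", "ebx", "ecx"], num_physical=8)
print(table.rename("eax", ["ebx"]))  # (3, [1])
print(table.rename("eax", ["ecx"]))  # (4, [2])
print(table.rename("ebx", ["eax"]))  # (5, [4])
```

The two writes to "eax" land in different physical registers, so they no longer have to happen in program order; only the third operation, which actually reads the second result, depends on it.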

Internally, a modern i7 has more in common with an UltraSPARC than an 8088. Their advantage over (other) RISC designs is entirely in the extremely late binding of the optimizations - it is, in effect, using Massalin's code synthesis approach at the hardware level. In principle, these techniques could be applied just as easily to a hardware implementation of JVM or .NET, or even bypass bytecode entirely and implement a hardware translation of a high-level language a la the Burroughs mainframes and the LispMs - but since what users[3] actually want is not to have to change anything, they use it to keep a long-dead[4] design going in emulation decade after decade, and will probably continue to do so long after everyone on the forum has turned to dust.
  1. Which I knew existed but couldn't recall - I managed to find it by looking up Transmeta on Wikipedia.
  2. Where it was called "Code Morphing Software™" - same thing, just with a trademarked name.
  3. On both the consumer and commercial levels. Consumers don't want to have to learn anything new, while businesses don't want to pay for anything new. If anything commercial users are even more conservative than home users - the biggest would just as soon still be using Big Iron if they could, despite the fact that just running the mainframes was ruinously expensive, and few want to spend money even if it means saving money later. Anyone who thinks business decisions are rational or profit-focused hasn't actually observed business decisions being made.
  4. This isn't hyperbole. The x86 as an architecture hasn't actually existed since the 1990s, any more than the System/360 has - we may be running x86 software, but it isn't on x86 hardware, no matter what Intel and AMD claim. Admittedly, this could be seen as nitpicking, and one could even say that, by this standard, since the older x86 systems used microcode, there never was such a thing as x86 in hardware, so make of this what you will.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
igorov70
Posts: 11
Joined: Sun Mar 25, 2018 10:35 am
Libera.chat IRC: i dont have
Location: Exchange student in Moscow

Re: A simple question

Post by igorov70 »

MajickTek wrote: Is it possible, say, to "convert" ASM to other architectures? So like changing the syntax/registers/etc?

Even better would be a cross-compiler for ASM.

No, GAS does not count. I like NASM syntax.
This has already happened in history (an 8080 to 8086 assembly translator for DOS).
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: A simple question

Post by Schol-R-LEA »

igorov70 wrote:
MajickTek wrote: Is it possible, say, to "convert" ASM to other architectures? So like changing the syntax/registers/etc?

Even better would be a cross-compiler for ASM.

No, GAS does not count. I like NASM syntax.
This has already happened in history (an 8080 to 8086 assembly translator for DOS).
Setting aside the fact that this was thread necromancy (given that the last post in the thread before yours was four months ago), you might want to read the rest of the thread before answering a question that was answered far more exhaustively already by several others.