x86 vs ARM (android) integer representation glitch?

Posted: Thu Jun 22, 2017 12:21 pm
by Geri
i am attempting to port my subleq emulator to ARM (Android).

the following simple code snippet does a dawn-compatible loop (64 bit big endian subleq cpu cycle).

Code: Select all

unsigned long emulalproc(unsigned long eip){
	for(unsigned int w=0;w<4096;w++){ // doing actually only 32 bit. makes no sense to do long long in a 32 bit arm binary. this limits the maximum ram to 3,9 gbyte.
		unsigned int * utasitaskezd=(unsigned int*)&RAM[eip];
		unsigned int A=bswap32(utasitaskezd[1]);
		unsigned int B=bswap32(utasitaskezd[3]);
		long long B_ertek=bswap64(*(unsigned long long*)&RAM[B])-bswap64(*(unsigned long long*)&RAM[A]);
		if(B_ertek<=0){
		    eip=bswap32(utasitaskezd[5]);
		}else{
		    eip+=24;
		}
		*(long long*)&RAM[B]=bswap64(B_ertek);
		// you may do more than one emulation cycle here, or you may use other optimization tricks.
		// i dont know arm, so i will not do any spectacular here. you may do arm assembly (so you also can have big endian).
	}
	return eip;
}
parts of the code are limited to 32 bit addressing to save some cycles, as this code will be compiled to a 32 bit binary for now.

bswap64 is defined as __bswap_64 on x86 and as __builtin_bswap64 on arm.
bswap32 is defined as __bswap_32 on x86 and as __builtin_bswap32 on arm.

the format of the internal bit representation is big endian:
64 63 62 61 (........ ) 4 3 2 1

on x86 windows / linux, when compiled with gcc, the code is fine.

however, when i attempt to run the code on arm (android and gcc), i get a crash.

-i dont know when or why i get the crash, as i cant do printf on android (maybe i will try sprintf later to log into a file to see what is going on)

-arm is a bi-endian cpu, but it is in little endian mode by default (i checked the defines to make sure)

-i am starting to worry that arm gcc's 64 bit integer representation is messed up, and that the built-in bswap64 will convert it to some different format. is this possible?

-or is it possible that 32 bit arm gcc will not work with 64 bit numbers when comparing, due to an old compiler glitch that was fixed eons ago elsewhere (but can still exist on arm)?

-or am i overlooking something completely different about the arm architecture?

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 12:58 am
by alexfru
Geri wrote:i am attempting to port my subleq emulator to ARM (Android).
Android is little-endian these days on x86, ARM and MIPS, 32-bit and 64-bit.

Two things to try. Disable optimization and see if anything changes. Make sure you're reading/writing properly aligned 32-bit and 64-bit ints. I vaguely recall there used to be some alignment requirements on ARM. Quick links on alignment:
Understanding x86 vs ARM Memory Alignment on Android
How does the ARM Compiler support unaligned accesses?
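
A minimal sketch of the alignment-safe approach, assuming the instruction layout implied by the snippet above (three 64-bit big-endian operands, of which only the low 32 bits are used as addresses). The names `load_be32`, `load_be64`, `store_be64` and `subleq_step` are hypothetical, and `ram` stands in for the original `RAM` array. Assembling values byte by byte avoids the pointer casts and does not depend on host endianness:

```c
#include <stdint.h>

/* Hypothetical helpers: read and write big-endian values one byte at a
   time, so no access ever needs more than byte alignment. */
uint32_t load_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

uint64_t load_be64(const unsigned char *p) {
    return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

void store_be64(unsigned char *p, uint64_t v) {
    for (int i = 7; i >= 0; i--) {
        p[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}

/* One subleq cycle: mem[B] -= mem[A]; jump to C if the result is <= 0.
   Only the low 32 bits of each operand are used as an address,
   mirroring the original snippet. No bounds checking is done here. */
uint32_t subleq_step(unsigned char *ram, uint32_t eip) {
    uint32_t a = load_be32(ram + eip + 4);
    uint32_t b = load_be32(ram + eip + 12);
    uint64_t diff = load_be64(ram + b) - load_be64(ram + a);
    store_be64(ram + b, diff);
    /* "<= 0" in two's complement: zero, or sign bit set. */
    if (diff == 0 || (diff >> 63) != 0)
        return load_be32(ram + eip + 20);
    return eip + 24;
}
```

Calling `subleq_step(RAM, eip)` 4096 times in a loop would stand in for the body of `emulalproc`. Whether the byte-wise version is fast enough on a given ARM core is something to measure rather than assume.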

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 1:53 am
by ~
Why don't you use automatically-selecting size data types, for example wideip instead of eip?

It would probably make your life easier, as it would let you port the exact same low-level code much more directly to any processor type: 16, 32, 64, any bit size.

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 5:24 am
by Geri
alexfru wrote:Make sure you're reading/writing properly aligned 32-bit and 64-bit ints. I vaguely recall there used to be some alignment requirements on ARM. Quick links on alignment
so...
arm cpus cant do unaligned access from c.

jesus christ, i never thought i would need such things on this great architecture

i wonder how my other software is even able to work, as i was never aware of this

where is my axe
~ wrote:Why don't you use automatically-selecting size data types
their existence is just as unpredictable as the size of the general data types.
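
For reference, a small sketch of what width-explicit types could look like with `<stdint.h>`. The type and function names are hypothetical. Note that the C standard makes both `uint64_t` and `uintptr_t` optional (only the `uint_least64_t` family is guaranteed), which is roughly the concern raised here, although they exist on every common x86 and ARM toolchain:

```c
#include <stdint.h>
#include <limits.h>

/* Hypothetical names. The guest (emulated) machine is 64-bit regardless
   of the host, so its word type is fixed-width; the host-side index into
   the RAM array can follow the host's pointer width instead. */
typedef uint64_t  guest_word_t;   /* exactly 64 bits where it exists */
typedef uintptr_t host_index_t;   /* wide enough to hold a host pointer */

/* Fail the build, not the run, if the assumption does not hold. */
_Static_assert(sizeof(guest_word_t) * CHAR_BIT == 64,
               "guest word must be exactly 64 bits");

/* Truncate a guest address to what the host can index. */
host_index_t guest_to_host_index(guest_word_t addr) {
    return (host_index_t)addr;
}
```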

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 7:18 pm
by dozniak
Geri wrote:arm cpus cant do unaligned access from c.
how is this even remotely related to C?

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 7:21 pm
by alexfru
dozniak wrote:
Geri wrote:arm cpus cant do unaligned access from c.
how is this even remotely related to C?
Undefined behavior?

Re: x86 vs ARM (android) integer representation glitch?

Posted: Fri Jun 23, 2017 8:00 pm
by Geri
dozniak wrote:how is this even remotely related to C?
you lack the basics so we cant explain (and if you think this was a personal offense, you may rethink it and count to 100 before answering)
alexfru wrote:Undefined behavior?
btw i tried fprintf to see what's going on, and also added __packed. it seems it collapses after a few thousand cycles. it is very hard to figure out what is going on, as it takes 10 minutes to copy the newly compiled app package between the phones (compile, export, shutdown, start, cable in, copy, cable out, other phone, cable in, delete, install, pray), so i am creeping along extremely slowly. the crash happens due to the disintegration of memory areas needed for the "pointer walk"; the bug will slowly reveal itself as i investigate the results.

newer versions of gcc with newer arm targets support unaligned access without any magic (they somehow hacked it in), but that will probably not help in my situation, as i am stuck with old potato-compatible devtools. the linux kernel itself also supports catching unaligned accesses on arm (as the kernel does this too), but on android it is simply disabled.

also, you are the best low-level expert i have ever seen, not just on this forum but overall. i wish i could give you advice as good as the advice you are giving me.
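
For anyone trying the packed route mentioned above, a sketch of how it is commonly spelled with GCC or Clang, assuming `__packed` is meant as a packed type for the accessed value. The struct and function names are hypothetical, and `__attribute__((packed, may_alias))` is a compiler extension, not standard C:

```c
#include <stdint.h>

/* GCC/Clang extension: a packed struct has alignment 1, so the compiler
   has to emit code that is valid for any address; may_alias exempts the
   type from strict-aliasing assumptions. */
struct unaligned_u64 {
    uint64_t v;
} __attribute__((packed, may_alias));

/* Read a host-endian 64-bit value from an arbitrarily aligned address. */
uint64_t load_unaligned_u64(const void *p) {
    return ((const struct unaligned_u64 *)p)->v;
}

/* Write a host-endian 64-bit value to an arbitrarily aligned address. */
void store_unaligned_u64(void *p, uint64_t v) {
    ((struct unaligned_u64 *)p)->v = v;
}
```

The values are still in host byte order, so the byte swap would be applied on top, for example `bswap64(load_unaligned_u64(&RAM[B]))`.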

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sat Jun 24, 2017 3:01 pm
by Korona
He is right, unaligned memory access is UB, even on x86.

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sat Jun 24, 2017 3:46 pm
by zaval
unaligned access is defined for armv7 and higher, but the SCTLR.A flag must be cleared. then the ordinary half-word (16 bit) and word (32 bit) load/store instructions can do unaligned access. these are:

Code: Select all

LDRH, LDRHT, LDRSH, LDRSHT, STRH, STRHT, TBH
LDR, LDRT, STR, STRT
plus there are also byte load stores. they obviously don't care about alignment at all.
of course I have no idea whether android lets one clear that flag. :)

instructions on arm are (happily) always little-endian.
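
Whether the compiler itself assumes that unaligned loads and stores are usable can be checked at build time. A small sketch, assuming a GCC or Clang toolchain that implements the ARM C Language Extensions macro `__ARM_FEATURE_UNALIGNED`; the function name is hypothetical:

```c
/* __ARM_FEATURE_UNALIGNED is predefined by the compiler when it targets
   an ARM core with hardware unaligned access and is allowed to use it.
   It describes what the compiler assumes, not what the OS has actually
   configured in SCTLR.A. */
int compiler_assumes_unaligned_ok(void) {
#if defined(__ARM_FEATURE_UNALIGNED)
    return 1;
#else
    return 0;
#endif
}
```

On GCC for ARM the related switches are `-munaligned-access` and `-mno-unaligned-access`. A build targeting armv5 would be expected to return 0 here.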

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sat Jun 24, 2017 4:38 pm
by Geri
zaval: i am pretty sure this compiler generates code for armv5, so this will MAYBE be one of the problems, but other problems are probably present too. i guess i will start hunting them down next week

Korona: unaligned memory access is not undefined behavior on x86, as it's supported by the hardware

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sat Jun 24, 2017 6:08 pm
by dozniak
alexfru wrote:
dozniak wrote:
Geri wrote:arm cpus cant do unaligned access from c.
how is this even remotely related to C?
Undefined behavior?
It's a hardware specific thing, on arm unaligned access could be enabled on some cpus. Still not related to C - you can get an exception if you do unaligned access from raw asm.

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sat Jun 24, 2017 10:16 pm
by alexfru
dozniak wrote:
alexfru wrote:
dozniak wrote:
how is this even remotely related to C?
Undefined behavior?
It's a hardware specific thing, on arm unaligned access could be enabled on some cpus. Still not related to C - you can get an exception if you do unaligned access from raw asm.
C99, §6.3.2.3 (Pointers), clause 7 wrote:A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined.
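
To make the clause concrete, a minimal sketch (function names are hypothetical; `_Alignof` requires C11 or later). The first helper tests whether a cast to `uint64_t *` would be correctly aligned; the second sidesteps the question by copying bytes, which is defined for any address:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if p is suitably aligned for a uint64_t access, 0 otherwise. */
int is_aligned_u64(const void *p) {
    return ((uintptr_t)p % _Alignof(uint64_t)) == 0;
}

/* Defined for any address: memcpy copies bytes, so no possibly
   misaligned uint64_t pointer is ever formed or dereferenced. */
uint64_t load_u64_any(const void *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

In the snippet under discussion, `*(unsigned long long*)&RAM[B]` is the cast the clause is about, because `B` comes from guest memory and can be any value.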

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sun Jun 25, 2017 8:47 am
by Korona
Geri wrote:Korona: unaligned memory access is not an undefined behavior on x86, as its supported by the hardware
Sure it's not undefined behavior in x86 machine code. However you're writing C and not x86 assembly and unaligned access is undefined behavior in the C abstract machine. The C standard defines an abstract machine against which you code your program. If the standard says that the behavior is undefined you cannot rely on it even if some architectures/compilers may support it while the stars align correctly.

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sun Jun 25, 2017 9:20 am
by Geri
Korona: as in practice the coders themselves, and real-world data and file formats, do not care about alignment at all, i had only one option: allow it and rely on it.

Re: x86 vs ARM (android) integer representation glitch?

Posted: Sun Jun 25, 2017 9:31 am
by Korona
In practice, even on x86 unaligned access is slow. Natural alignment, for example, implies that data cannot cross page or cache line boundaries, which makes things much easier to handle at the hardware level. Everyone who cares about performance has to ensure that their data is properly aligned, even if the CPU allows unaligned access.
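
The boundary argument can be checked with a few lines of arithmetic. A sketch with a hypothetical helper; the 64-byte cache line and 4096-byte page used in the examples are typical values, not guarantees:

```c
#include <stdint.h>
#include <stddef.h>

/* Returns 1 if the byte range [addr, addr + size) spans more than one
   block of `boundary` bytes. boundary must be a power of two and size
   must be greater than zero. */
int crosses_boundary(uintptr_t addr, size_t size, uintptr_t boundary) {
    uintptr_t first = addr & ~(boundary - 1);
    uintptr_t last  = (addr + size - 1) & ~(boundary - 1);
    return first != last;
}
```

An 8-byte access at address 56 stays inside one 64-byte line, while the same access at address 60 touches two lines. A naturally aligned 8-byte access (address divisible by 8) can never straddle either kind of boundary.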

Why do you think C enforces alignment? To make life harder for you? Try building your code with -fsanitize=undefined and running it to see what happens.