Hi
What is the best way to split an integer into two shorts? Or to split an integer into 4 chars? This is in C.
Thanks
srg
Splitting a Integer
Re:Splitting a Integer
By using pointers, or bit masking and bit shifting. With pointers, the order in which the pieces come out depends on the machine you're working with (big endian/little endian); with masking and shifting it does not. For instance:
It really all depends on exactly what you want to do with it.
Code: Select all
// Assuming:
// int is defined as a 32-bit integer
// short is defined as a 16-bit integer
// char is defined as an 8-bit integer
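int someInteger=0x12345678;
// Note: reading an int through a short* breaks C's strict-aliasing rule,
// so an optimizing compiler may not do what you expect; the char* view is always permitted.
short *someShort=(short *)&someInteger;
char *someChar=(char *)&someInteger;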
// Big Endian:
// someShort[0] = 0x1234
// someShort[1] = 0x5678
// someChar[0] = 0x12
// someChar[1] = 0x34
// someChar[2] = 0x56
// someChar[3] = 0x78
//
// Little Endian:
// someShort[0] = 0x5678
// someShort[1] = 0x1234
// someChar[0] = 0x78
// someChar[1] = 0x56
// someChar[2] = 0x34
// someChar[3] = 0x12
short shortValue[2];
char charValue[4];
shortValue[0]=(short)(someInteger & 0xFFFF);
shortValue[1]=(short)((someInteger >> 16) & 0xFFFF);
charValue[0]=(char)(someInteger & 0xFF);
charValue[1]=(char)((someInteger >> 8) & 0xFF);
charValue[2]=(char)((someInteger >> 16) & 0xFF);
charValue[3]=(char)((someInteger >> 24 ) & 0xFF);
// Both Big Endian and Little Endian:
// shortValue[0] = 0x5678
// shortValue[1] = 0x1234
// charValue[0] = 0x78
// charValue[1] = 0x56
// charValue[2] = 0x34
// charValue[3] = 0x12
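The masking and shifting above relies on the stated assumption that int is 32 bits and short is 16. A minimal sketch that drops that assumption by using the fixed-width types from <stdint.h> (C99) could look like this. The helper names low_half, high_half and byte_at are hypothetical and not part of the original post:

```c
#include <stdint.h>

/* Hypothetical helpers. Unsigned types keep the right shift well defined
   for every input value. */
static uint16_t low_half(uint32_t v)
{
    return (uint16_t)(v & 0xFFFFu);
}

static uint16_t high_half(uint32_t v)
{
    return (uint16_t)(v >> 16);
}

/* n = 0 selects the least significant byte, n = 3 the most significant. */
static uint8_t byte_at(uint32_t v, unsigned n)
{
    return (uint8_t)((v >> (8u * n)) & 0xFFu);
}
```

With the value 0x12345678 from the example, low_half gives 0x5678, high_half gives 0x1234, and byte_at with n = 0 gives 0x78, on both big-endian and little-endian machines.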
Re:Splitting a Integer
Actually, there is a simpler way to do this, which works in both C and assembly: just overlap the variables. In C, you can do this with a union, like this:
Note that Intel x86 systems are little-endian, which is why I have BYTE_SEX set to 1 in this example. If you know that the code will only run on an x86 system (a bad assumption unless it is system code), you can do away with the macros and just use:
In x86 assembly it is even easier: just define two adjacent labelled words, then save the whole dword value at the first of them. You can then access the LSW with the first label and the MSW with the second. In NASM, this code will have the same result as the code above:
which leaves you with BX == LSW(x) and AX == MSW(x).
An even easier, and probably faster (I haven't tested it), way to get the same result is to load the variable into a 32-bit register, copy its lower half (which is the LSW) into a different 16-bit register, then shift the upper half (the MSW) down into the lower half of the first register. In NASM, this would be:
The downside of this is that if you need to save the value to memory, you need to take two extra steps, whereas with the former approach, once it is saved you can access the two half values or the whole value directly from memory whenever you wish.
Any of these approaches can be used to extract individual bytes as well, just as long as you keep the byte order in mind.
Code: Select all
#define BYTE_SEX 1 /* 0 == big-endian, 1 == little-endian */
/* the most significant word is the upper part of the integer, and the least sig. word is the lower part */
#if BYTE_SEX == 0
#define LSW(x) ((x).half[1])
#define MSW(x) ((x).half[0])
#else
#define LSW(x) ((x).half[0])
#define MSW(x) ((x).half[1])
#endif
union
{
int whole;
short half[2];
} split_int;
short upper, lower;
split_int.whole = 0x12345678;
upper = MSW(split_int);
lower = LSW(split_int);
Code: Select all
upper = split_int.half[1];
lower = split_int.half[0];
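If the byte order is not known when the code is compiled, one option is to test it at run time instead of hard-coding BYTE_SEX. The following is only a sketch under some assumptions: the names is_little_endian, split_u32, lsw and msw are hypothetical, and the union uses the fixed-width types from <stdint.h> rather than the int and short of the original.

```c
#include <stdint.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one.
   Inspecting an object through an unsigned char pointer is always
   permitted in C. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Same overlay idea as the union above, with fixed-width members. */
union split_u32 {
    uint32_t whole;
    uint16_t half[2];
};

/* Pick the array index from the detected byte order. */
static uint16_t lsw(union split_u32 s)
{
    return s.half[is_little_endian() ? 0 : 1];
}

static uint16_t msw(union split_u32 s)
{
    return s.half[is_little_endian() ? 1 : 0];
}
```

After storing 0x12345678 in the whole member, lsw returns 0x5678 and msw returns 0x1234 regardless of the machine's byte order. The price is a small run-time check that the macro version avoids.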
Code: Select all
[section .code]
mov ebx, [x]     ; load the 32-bit value stored at x
mov [LSW], ebx   ; store all 32 bits; the upper half lands in MSW
mov ax, [MSW]
mov bx, [LSW] ; not strictly necessary here,
; but demonstrates the way you could do it otherwise
[section .data]
LSW dw 0
MSW dw 0
Code: Select all
mov eax, [x]     ; load the 32-bit value stored at x
mov bx, ax       ; BX = LSW
shr eax, 16      ; AX = MSW
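To make the point about byte order concrete for individual bytes, here is a rough sketch in C. It assumes the fixed-width types from <stdint.h>, and the function names bytes_in_memory_order and join_lsb_first are hypothetical. The first function shows the bytes as they sit in memory, which is where the byte order matters; the second rebuilds the value with shifts, which does not depend on it.

```c
#include <stdint.h>
#include <string.h>

/* Copies the four bytes of v in memory order. On a little-endian machine
   out[0] is the least significant byte; on a big-endian machine it is the
   most significant one. */
static void bytes_in_memory_order(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, sizeof v);
}

/* Rebuilds a value from bytes given least significant byte first.
   Shifts operate on values, so this works the same on any machine. */
static uint32_t join_lsb_first(const unsigned char in[4])
{
    uint32_t v = 0;
    unsigned i;
    for (i = 0; i < 4; i++)
        v |= (uint32_t)in[i] << (8u * i);
    return v;
}
```

For 0x12345678, bytes_in_memory_order yields 78 56 34 12 on x86 and 12 34 56 78 on a big-endian machine, while join_lsb_first applied to 78 56 34 12 returns 0x12345678 everywhere.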