OSDev.org • nasm -> gas

nasm -> gas

Posted: Fri Mar 23, 2007 10:59 am

by ehird

Code: Select all

    push byte 0
    push byte 1
etc

^ bonafide kernel dev tutorial

GAS only accepts "push something". What does nasm do here, and how would I do it in gas?

thanks

Posted: Fri Mar 23, 2007 11:38 am

by XCHG

The x86 architecture does not allow single bytes to be pushed onto the stack. You can manipulate individual bytes that are already on the stack, but you cannot push them one at a time. I don't know who wrote that code, but I doubt it was written by a programmer who knows Assembly well.

However, NASM will accept it and generate a PUSH opcode anyway.

Posted: Fri Mar 23, 2007 11:44 am

by ehird

It's from http://www.osdever.net/bkerndev/index.php.

Posted: Fri Mar 23, 2007 1:04 pm

by turtling

If you are converting from nasm to gas, use pushl instead of push byte. You will also probably run into a problem where the handler address is loaded into eax and then eax is called. Just call the handler directly instead of going through eax.

edit: Putting the handler into eax is meant to save the eip register, but I don't know how to do that in gas.

So removing it is a hack solution.

Posted: Fri Mar 23, 2007 1:20 pm

by ehird

Well, gas doesn't complain when I "call *%eax", but since I can't test it yet...

Posted: Fri Mar 23, 2007 2:08 pm

by mystran

Yeah, "call *%eax" is how you do an indirect call in GAS syntax.

Posted: Fri Mar 23, 2007 2:22 pm

by ehird

Interestingly, when testing in bochs my exception handling code just isn't running - triple fault and reboot straight away. The current asm is this:

Code: Select all

isr0: # divide by zero
	cli
	pushl 0
	pushl 1
	jmp isr_common_stub

.extern fault_handler

isr_common_stub:
	pusha
	push %ds
	push %es
	push %fs
	push %gs
	mov %ax, 0x10
	mov %ds, %ax
	mov %es, %ax
	mov %fs, %ax
	mov %gs, %ax
	mov %eax, %esp
	push %eax
	mov %eax, fault_handler
	call *%eax
	pop %eax
	pop %gs
	pop %fs
	pop %es
	pop %ds
	popa
	add %esp, 8
	iret

which as far as I know is a direct translation. Anybody see anything obviously wrong? The C code is basically as in the tutorial:

Code: Select all

#include <kernel.h>

/* Function prototypes. */
void isr0();

void isrs_install(void)
{
	idt_set_gate(0, (unsigned)isr0, 0x08, 0x8E);
}

char *exceptions[] = {
	"Division by zero"
};

void fault_handler(struct regs *r)
{
	if (r->int_no < 32) { /* Exception */
		puts("Exception: ");
		puts(exceptions[r->int_no]);
		puts(". System halted!");
		for (;;);
	}
}

Posted: Fri Mar 23, 2007 2:50 pm

by turtling

You need to reverse the operand order: gas uses AT&T syntax, where the source comes first and the destination second. Also, I tried call *%eax in my code, and it doesn't work.

Posted: Fri Mar 23, 2007 3:21 pm

by ehird

Ah. Just the movs, or the other calls too?

Edit; yes, other calls. bochs is giving unexpected errors here:

Code: Select all

00018114804e[CPU0 ] fetch_raw_descriptor: GDT: index (ff57)1fea > limit (17)
00018114804e[CPU0 ] interrupt(): gate descriptor is not valid sys seg
00018114804e[CPU0 ] interrupt(): gate descriptor is not valid sys seg

The first one, according to Google, means the GDT is hosed. Some digging in the code shows it's my asm messing up, but in a different part. Very odd...
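As an aid to reading that first Bochs line, here is a minimal C sketch of the descriptor-table limit check. It assumes the two hex numbers in the message are the offset of the last byte of the requested descriptor (index*8 + 7) followed by the index itself, which is consistent with 0x1fea*8 + 7 = 0xff57. The helper names are made up for illustration and are not part of Bochs or the tutorial.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of the last byte of descriptor `index`.
 * Every GDT descriptor is 8 bytes long. */
uint32_t descriptor_last_byte(uint32_t index)
{
    assert(index <= 0x1fffu); /* a selector carries a 13-bit index */
    return index * 8u + 7u;
}

/* Hypothetical helper: number of whole descriptors a table holds.
 * The limit is the table size in bytes minus one. */
uint32_t descriptor_count(uint32_t limit)
{
    return (limit + 1u) / 8u;
}

/* A descriptor is usable only if all 8 of its bytes lie within the limit. */
int descriptor_in_table(uint32_t index, uint32_t limit)
{
    return descriptor_last_byte(index) <= limit;
}
```

With the values from the log, the limit of 0x17 leaves room for three descriptors (indexes 0 to 2), while the requested index is 0x1fea. That suggests a garbage value reached a segment register, rather than the table itself being too small.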

Posted: Fri Mar 23, 2007 6:33 pm

by Happy Egghead

All Intel CPUs from the 286 upwards support pushing immediate values directly.
There are three instructions PUSH imm8, PUSH imm16 and PUSH imm32.
Not sure about 64 bit processors though, but I assume that they support a PUSH imm64 in long mode.
Saves using a register when you just want to shove things onto the stack.

Posted: Sat Mar 24, 2007 4:04 am

by XCHG

The whole "BYTE" size specifier is tricky. When you push with the BYTE size specified, the CPU sign-extends the imm8 you are pushing and stores the result on the stack. The tricky part is that ESP is not decremented by one byte; it is decremented by four bytes, as if you had pushed a DWORD. So basically, you are not really pushing a BYTE.
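The behaviour described above can be modelled in a few lines of C. This is only a toy model under stated assumptions: a 64-byte array stands in for the stack and an offset stands in for ESP, and every name in it is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a 32-bit stack: a byte array plus an ESP-like offset.
 * All names here are hypothetical. */
struct toy_stack {
    uint8_t  mem[64];
    uint32_t esp; /* offset into mem; the stack grows downwards */
};

void toy_stack_init(struct toy_stack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = sizeof s->mem; /* empty stack: ESP sits just past the top */
}

/* Model of "push byte imm8" assembled for 32-bit code. */
void push_imm8(struct toy_stack *s, int8_t imm)
{
    uint32_t value = (uint32_t)(int32_t)imm; /* sign-extend 8 -> 32 bits */

    assert(s->esp >= 4);
    s->esp -= 4;                        /* ESP moves by a full DWORD */
    memcpy(&s->mem[s->esp], &value, 4); /* and a full DWORD is stored */
}

/* Read back the DWORD at the top of the stack. */
uint32_t top_dword(const struct toy_stack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);
    return value;
}
```

Pushing 1 leaves 0x00000001 on top and pushing -1 leaves 0xFFFFFFFF, and in both cases the model's ESP moves by 4.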

Posted: Sat Mar 24, 2007 4:41 am

by Happy Egghead

I've just looked through the code in the tutorial's "start.asm" and I believe it may be an ASM error in the basic kernel.
The code posted by ehird is

Code: Select all

  push byte 0
  push byte 1
etc

however in the system.h file used by the kernel the regs field is stated this way -

Code: Select all

/* This defines what the stack looks like after an ISR was running */
struct regs
{
    unsigned int ds, es, fs, gs;
    unsigned int edi, esi, ebp, esp, ebx, edx, ecx, eax;
    unsigned int int_no, err_code;
    unsigned int eip, cs, eflags, useresp, ss;    
};

The int_no and err_code fields are declared as unsigned ints (1 dword = 32 bits).
So while the code does push a byte, as XCHG has pointed out, ESP is decremented by 4 in 32-bit protected mode / unreal mode. The mismatch between the instruction and the declaration causes no error because 'unsigned int' is the correct data type for what actually lands on the stack: a sign-extended 32-bit value.
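To make the layout argument concrete, here is a small C sketch that checks where int_no and err_code sit in the frame. It copies the field order of the struct quoted above but uses uint32_t so the 4-byte field size is explicit rather than assumed. The struct name regs32 and both helpers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the struct quoted above, with explicit 32-bit
 * fields. The name regs32 is hypothetical. */
struct regs32 {
    uint32_t ds, es, fs, gs;
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;
    uint32_t int_no, err_code;
    uint32_t eip, cs, eflags, useresp, ss;
};

/* Hypothetical helper: how many 4-byte stack slots precede a field. */
size_t slots_before(size_t field_offset)
{
    assert(field_offset % 4 == 0);
    return field_offset / 4;
}

/* Hypothetical helper: read the interrupt number from a saved frame,
 * the way fault_handler does through its struct pointer. */
uint32_t frame_int_no(const struct regs32 *r)
{
    return r->int_no;
}
```

The struct has 19 fields of 4 bytes each, so int_no sits at offset 48 and err_code at offset 52. The fields only line up if each of the two immediate pushes occupies exactly one 4-byte slot, which is what a sign-extended push does in 32-bit code.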

The code should just read

Code: Select all

PUSH whatever
PUSH whatever

removing the explicit 'byte' and letting the assembler/compiler handle the alignment issues. I suspect GAS only accepts a 'push something' to avoid potential problems like this.

My thought is that all such pushes should be 32-bit anyway (being in a 32-bit code segment and all). I suspect that even C compilers, when faced with a 'signed int', will align that 'int' to an appropriate boundary or, in the case of 'packed' structures, simply abstract away the difference with slightly less optimized code, i.e. load a dword, unpack the 'int' from it, work on the 'int' and then pack it back in when finished. Perhaps someone who has delved more deeply into these issues could confirm this.

Anyway, in 16-bit mode the push imm forms should work fine, because the processor tolerates a misaligned stack, as in 8088/8086 real mode.

Posted: Sat Mar 24, 2007 5:02 am

by ~

This is what I get in NASM:

Code: Select all

bits 16
 push  byte 1    ;Opcode: 1 byte  --- Value: 1 byte
 push  word 1    ;Opcode: 1 byte  --- Value: 2 bytes
 push dword 1    ;Opcode: 2 bytes --- Value: 4 bytes


bits 32
 push byte 1     ;Opcode: 1 byte  --- Value: 1 byte
 push word 1     ;Opcode: 2 bytes --- Value: 2 bytes
 push dword 1    ;Opcode: 1 byte  --- Value: 4 bytes

As you can see, the instructions do have different sizes. The only thing to note is that in a 16-bit environment you should pop WORDs, and in a 32-bit environment you should pop DWORDs, because the values will be aligned to 2 and 4 bytes for 16-bit and 32-bit modes respectively. I know that for sure; I use those tricks to minimize the space used in my boot sector:

Code: Select all

bits 16
 push byte 0
 pop ds

Posted: Sat Mar 24, 2007 5:36 am

by XCHG

~:

I think you should be more careful about using the adjective "aligned" because when you do this:

Code: Select all

PUSH    BYTE 0

The stack segment wouldn't get "aligned". Instead, the sign bit of the pushed byte is extended into the other three bytes of the DWORD, and a DWORD is pushed onto the stack. If your stack was unaligned before, it will still be unaligned afterwards; the push does not align anything.

About the opcodes generated: in no mode of the CPU can an instruction that carries an immediate be only 1 byte long in total. The opcode itself is one byte long and the value is appended to it. In 32-bit mode, for example, the code above assembles to 2 bytes. You can check 16-bit mode and see that it is still 2 bytes long.

The same goes with your WORD and DWORD versions of the PUSH instruction. The below code:

Code: Select all

PUSH    DWORD 0

Generates a 5-byte-long opcode in 32-bit mode and a 6-byte-long opcode in 16-bit. You can create a dummy file and test all these for yourself.
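The byte counts quoted here can be reproduced with a short C sketch of the encoding rules. It assumes the assembler emits exactly the immediate size written in the source (an optimizing assembler may shrink a small word or dword immediate to the imm8 form), and it uses the standard opcodes 0x6A for PUSH imm8, 0x68 for PUSH imm16/imm32 and 0x66 for the operand-size prefix. The function name and interface are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode "push imm" into out[] and return the number of bytes written.
 *   mode_bits: 16 or 32, the default operand size of the code segment
 *   imm_bits:  8, 16 or 32, the size written in the source
 * Hypothetical helper; assumes no immediate-size optimization. */
size_t encode_push_imm(int mode_bits, int imm_bits, uint32_t imm, uint8_t out[6])
{
    size_t n = 0;

    assert(mode_bits == 16 || mode_bits == 32);
    assert(imm_bits == 8 || imm_bits == 16 || imm_bits == 32);

    if (imm_bits == 8) {
        out[n++] = 0x6A;         /* PUSH imm8 */
        out[n++] = (uint8_t)imm;
        return n;
    }

    if (imm_bits != mode_bits)
        out[n++] = 0x66;         /* operand-size override prefix */

    out[n++] = 0x68;             /* PUSH imm16 / imm32 */
    for (int i = 0; i < imm_bits / 8; i++)
        out[n++] = (uint8_t)(imm >> (8 * i)); /* little-endian immediate */

    return n;
}
```

Under those assumptions the lengths come out as 2, 3 and 6 bytes for byte, word and dword in 16-bit mode, and 2, 4 and 5 bytes in 32-bit mode.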

As for alignment in NASM, it is better to use ALIGN like this:

Code: Select all

  ALIGN 0x02, DB 0x00
  __MyAlignedProcedure:
    ;~

That aligns the [__MyAlignedProcedure] procedure on a WORD boundary and fills the code segment with null bytes (0x00) until it is aligned. However, filling the code segment with null bytes for the sake of alignment is not a good idea: null bytes are not no-ops (0x00 0x00 decodes as an ADD to memory), so if EIP is accidentally set onto them you can get an exception or silently corrupted memory, and then you will have to search for days to find the cause of the error. You can also use NOP:

Code: Select all

  ALIGN 0x04, NOP
  __MyAlignedProcedure:
    ;~
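For a sense of how many fill bytes such a directive has to emit, here is a minimal C sketch of the padding arithmetic. It assumes the boundary is a power of two; the helper names are hypothetical and are not NASM features.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fill bytes needed so that the next address is a multiple of
 * `boundary` (which must be a power of two). Hypothetical helper. */
uint32_t align_padding(uint32_t address, uint32_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (boundary - (address & (boundary - 1))) & (boundary - 1);
}

/* Address of the first aligned location at or after `address`. */
uint32_t align_up(uint32_t address, uint32_t boundary)
{
    return address + align_padding(address, boundary);
}
```

For example, code that ends at 0x1001 needs 1 fill byte to reach a WORD boundary and 3 fill bytes to reach a DWORD boundary, while code that already ends on the boundary needs none.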

Posted: Sat Mar 24, 2007 6:26 am

by ~

XCHG wrote:

~:

I think you should be more careful about using the adjective "aligned" because when you do this:

Code: Select all

PUSH    BYTE 0

The stack segment wouldn't get "aligned". Instead, the sign bit of the pushed byte is extended into the other three bytes of the DWORD, and a DWORD is pushed onto the stack. If your stack was unaligned before, it will still be unaligned afterwards; the push does not align anything.

That would happen in 32-bit mode.

I am aware that if you push something like a byte in 16-bit mode, it really pushes 2 bytes and you still have to pop 2 bytes (no problem there).

The only concern I see is when the AMD manual says "Pushing an odd number of 16-bit operands (i.e., 1, 3, 5... WORDs) when the stack address-size attribute is 32 results in a misaligned stack pointer".

I interpret that as meaning there is only one case in which a stack slot doesn't keep its default size, and it happens in 32-bit mode if you do something like:

Code: Select all

bits 32
 push ax  ;Pushes a WORD, doesn't extend into a DWORD

In that case, you either have to push another WORD so you can pop 1 DWORD, or pop the same WORD value, like:

Code: Select all

bits 32
 pop ax    ;Pops the previous WORD

In any other mode or instruction, the stack stays aligned and the operand is extended to 2, 4 or 8 bytes for 16, 32 and 64-bit modes respectively. Just be careful about pushing an odd number of WORDs in protected mode, because that will make you lose your stack pointer's 4-byte alignment (16-bit compatibility thing?).
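The odd-number-of-WORDs case can be checked with a few lines of C. This sketch only tracks the stack pointer value for 32-bit code, under the assumption described above that a 16-bit operand push such as "push ax" moves ESP by 2 and every other push moves it by 4. All function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ESP after one push of `operand_bytes` bytes in
 * 32-bit code (2 for a WORD operand such as "push ax", otherwise 4). */
uint32_t esp_after_push(uint32_t esp, unsigned operand_bytes)
{
    assert(operand_bytes == 2 || operand_bytes == 4);
    return esp - operand_bytes;
}

/* Hypothetical helper: ESP after `count` consecutive WORD pushes. */
uint32_t esp_after_word_pushes(uint32_t esp, unsigned count)
{
    while (count-- > 0)
        esp = esp_after_push(esp, 2);
    return esp;
}

/* True when ESP is a multiple of 4. */
int esp_dword_aligned(uint32_t esp)
{
    return (esp & 3u) == 0;
}
```

Starting from an aligned ESP, one or three WORD pushes leave it misaligned, and two or four bring it back to a 4-byte boundary.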

That's how I interpret that.

As for the instruction sizes, I should clarify:

Code: Select all

bits 16
 push  byte 1    ;Opcode: 1 byte  --- Value: 1 byte  --- Total: 2 bytes
 push  word 1    ;Opcode: 1 byte  --- Value: 2 bytes --- Total: 3 bytes
 push dword 1    ;Opcode: 2 bytes --- Value: 4 bytes --- Total: 6 bytes


bits 32
 push  byte 1    ;Opcode: 1 byte  --- Value: 1 byte  --- Total: 2 bytes
 push  word 1    ;Opcode: 2 bytes --- Value: 2 bytes --- Total: 4 bytes
 push dword 1    ;Opcode: 1 byte  --- Value: 4 bytes --- Total: 5 bytes