Page 1 of 1

James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 11:41 am
by Synon
I think I've found a bug in James Molloy's tutorial.

In the ISR and IRQ handlers, we see the code

Code: Select all

 %macro ISR_NOERRCODE 1  ; define a macro, taking one parameter
  [GLOBAL isr%1]        ; %1 accesses the first parameter.
  isr%1:
    cli
    push byte 0
    push byte %1
    jmp isr_common_stub
%endmacro

%macro ISR_ERRCODE 1
  [GLOBAL isr%1]
  isr%1:
    cli
    push byte %1
    jmp isr_common_stub
%endmacro 
However, later, we see this structure:

Code: Select all

typedef struct registers
{
   u32int ds;                  // Data segment selector
   u32int edi, esi, ebp, esp, ebx, edx, ecx, eax; // Pushed by pusha.
   u32int int_no, err_code;    // Interrupt number and error code (if applicable)
   u32int eip, cs, eflags, useresp, ss; // Pushed by the processor automatically.
} registers_t; 
Here, int_no and err_code are 32 bits wide, but previously we pushed 8 bits! I can definitively say it's the macro which is wrong, because the CPU pushes 32-bit error codes, so the "dummy" error code should be 32-bits too. For consistency's sake we may as well keep the ISR number as 32 bits.

This is clearly a mistake because later in isr_common_stub, we see:

Code: Select all

add esp, 8     ; Cleans up the pushed error code and pushed ISR number
So in the macros, you must change "push byte" to "push dword".

The same bug is present in the next lesson, IRQs and the PIT.

I can't add it to the Known Bugs wiki article because I'm not in the appropriate group and I don't know what group to join or how.

Re: James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 12:10 pm
by max
Synon wrote:I can't add it to the Known Bugs wiki article because I'm not in the appropriate group and I don't know what group to join or how.
Add yourself to the wiki group in the user control panel. But try to only edit information if you are 100% sure that it's wrong and someone else confirmed that. ;)

Btw.. the byte/dword is just to tell the assembler the size of the operand. PUSH on x86 always moves the stack pointer by 4 bytes..
EDIT: ah alexfru was faster.. check what he says.^^

Re: James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 12:31 pm
by alexfru
Synon wrote: I think I've found a bug in James Molloy's tutorial.
You are unsure?
Synon wrote: In the ISR and IRQ handlers, we see the code
...
However, later, we see this structure:
...
Here, int_no and err_code are 32 bits wide, but previously we pushed 8 bits!
Are you sure about 8 bits? Did you check?
Synon wrote: I can definitively say it's the macro which is wrong, because the CPU pushes 32-bit error codes, so the "dummy" error code should be 32-bits too.
Is it not 32-bit? Did you check?
Synon wrote: For consistency's sake we may as well keep the ISR number as 32 bits.
That would be less confusing, certainly.
Synon wrote: This is clearly a mistake because later in isr_common_stub, we see:

Code: Select all

add esp, 8     ; Cleans up the pushed error code and pushed ISR number
So in the macros, you must change "push byte" to "push dword".
You sound pretty sure by now. But it doesn't look like you really observed the bug by running the code nor by checking the documentation on the CPU (the PUSH instruction, specifically) and the assembler (I believe, this is NASM code).

Pull out your favorite CPU manual and look at the description of the PUSH instruction. What do you see there?

Write a tiny test application with the "push byte 0" instruction and single-step this instruction in the debugger. How does the register and memory state change when this instruction executes?

But don't die of shame just yet. We do make mistakes like this. I made this exact one last week at work and stole several minutes of time from a colleague.

Re: James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 12:57 pm
by Synon
Well, I just checked

Code: Select all

push byte 0
with Qemu in real mode.

And ESP changed by 2 rather than 1, so presumably in protected mode it would change by 4 i.e. push 4 bytes.

So the 'byte' qualifier just tells the CPU how many bytes to read, but it always pushes a word (= 16 bits in real mode, 32 bits in protected mode and 64 bits in long mode)?

I suppose the decision to use 'byte' over 'dword' was to save space in the binary?

Re: James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 5:51 pm
by FallenAvatar

Re: James Molloy's Tutorial Known Bugs

Posted: Sat Oct 04, 2014 5:59 pm
by alexfru
Why do you "suppose" when you can just read the manual?

Code: Select all

PUSH—Push Word, Doubleword orQuadword Onto the Stack
Opcode* Instruction Op/En 64-Bit Mode Compat/Leg Mode Description
...
6A PUSH imm8 C Valid Valid Push imm8.
...

Operand size. The D flag in the current code-segment descriptor determines the 
default operand size; it may be overridden by instruction prefixes (66H or 
REX.W).

The operand size (16, 32, or 64 bits) determines the amount by which the stack
pointer is decremented (2, 4 or 8).

If the source operand is an immediate and its size is less than the operand size,
a sign-extended value is pushed on the stack.
...

Code: Select all

Operation
...
ELSE IF SRC is immediate byte
    THEN TEMP ←SignExtend(SRC); (* extend to operand size *)
ELSE IF SRC is immediate word (* operand size is 16 *)
    THEN TEMP ←SRC
ELSE IF SRC is immediate doubleword (* operand size is 32 or 64 *)
    THEN
        IF operand size = 32
            THEN TEMP ←SRC;
            ELSE TEMP ←SignExtend(SRC); (* extend to operand size of 64 *)
        FI;
...
ELSE IF stack-address size = 32
    THEN
        IF operand size = 32
            THEN
                ESP ← ESP −4;
                Memory[SS:ESP] ←TEMP; (* Push doubleword *)
            ELSE (* operand size = 16 *)
                ESP ← ESP −2;
                Memory[SS:ESP] ←TEMP; (* Push word *)
        FI;
    ELSE  (* stack-address size = 16 *)
        IF operand size = 32
            THEN
                SP ← SP− 4;
                Memory[SS:SP] ←TEMP; (* Push doubleword *)
            ELSE (* operand size = 16 *)
                SP ←SP − 2;
                Memory[SS:SP] ←TEMP; (* Push word *)
        FI;
FI;
Can you read that and make any sense of it?