Bootloader problems :(

Posted: Mon Jun 26, 2006 7:44 am
by 0Scoder
Hi, I am currently writing a two-stage bootloader for my OS. I have managed to get the first stage to work fine (it finds the second stage on a FAT12 floppy disk, then runs it), but after testing in Bochs the second stage fails miserably.

I managed to trace the error to the following piece of code, using dots printed to the screen:

Code: Select all

;Find the file we want
 mov cx, WORD [MaxRootEntries]   ;load loop counter
 mov di, 0x0200               ;locate first root entry
 .loop:
  push cx
  mov cx, 0x000B   ;eleven character name
  push di
  rep cmpsb         ;test for entry match
  pop di
  je file_found
  pop cx
  add di, 0x0020   ;queue next directory entry
 loop .loop
 jmp ERROR
 file_found:

 mov si, msg_dot   ;write tracing dot
 call WRITE
At which point Bochs gives the error:

Code: Select all

prefetch: RIP > CS.limit
The code in wider context:

Code: Select all

LOAD_KERNEL:
 mov si, kernel_image   ;filename
 mov ax, p_kernel   ;memory offset
 mov bx, ps_kernel   ;memory segment
 call LOAD_IMAGE
 mov si, msg_dot
 call WRITE    ;print dot -> so we can trace errors
RET

LOAD_IMAGE:   ;Loads a file from a fat12 disk (bx:ax - where to load | si - filename)
;Push input registers to the stack
 push ax   ;offset
 push bx   ;segment

;Get the root directory into memory
 call GET_ROOT_DIRECTORY
 mov si, msg_dot
 call WRITE

;Find the file we want
 mov cx, WORD [MaxRootEntries]   ;load loop counter
 mov di, 0x0200               ;locate first root entry
 .loop:
  push cx
  mov cx, 0x000B   ;eleven character name
  push di
  rep cmpsb         ;test for entry match
  pop di
  je file_found
  pop cx
  add di, 0x0020   ;queue next directory entry
 loop .loop
 jmp ERROR
 file_found:

 mov si, msg_dot   ;write tracing dot
 call WRITE
The values of the various variables:

Code: Select all

 kernel_image   db "KERNEL  BIN"
 p_kernel   dw 0x1100
 ps_kernel   dw 0x0000
 p_rootdir   dw 0x900
Unfortunately I can't get the Bochs debugger to trace call instructions, and using the 'trace_on' command takes an age going through the BIOS code. I do however have a CPU dump from before the LOAD_IMAGE call:

Code: Select all

eax:0xe0e00
ebx:0x7
ecx:0xe0006
edx:0xfff
ebp:0x0
esi:0x59c
edi:0x5
esp:0xfff8
eflags:0x246
eip:0x5b5
cs:s=0x0, dl=0xffff, dh=0x9b00, valid=1
ss:s=0x0, dl=0xffff, dh=0x9300, valid=7
ds:s=0x0, dl=0xffff, dh=0x9300, valid=3
es:s=0x0, dl=0xffff, dh=0x9300, valid=1
fs:s=0x0, dl=0xffff, dh=0x9300, valid=1
gs:s=0x0, dl=0xffff, dh=0x9300, valid=1
ldtr:s=0x0, dl=0x0, dh=0x0, valid=0
tr:s=0x0, dl=0x0, dh=0x0, valid=0
gdtr:base=0x0, limit=0x0
idtr:base=0x0, limit=0x3ff
dr0:0x0
dr1:0x0
dr2:0x0
dr3:0x0
dr6:0xffff0ff0
dr7:0x400
tr3:0x0
tr4:0x0
tr5:0x0
tr6:0x0
tr7:0x0
cr0:0x60000010
cr1:0x0
cr2:0x0
cr3:0x0
cr4:0x0
inhibit_mask:0
done

Is there anyone who can see what is wrong here? (or even explain what the error means?)

Thanks for your help in advance,
OScoder

Re:Bootloader problems :(

Posted: Mon Jun 26, 2006 8:15 am
by Pype.Clicker
A small trick I use to make "trace-on" more helpful is
"pbreak 0x7c00 ; c ; trace-on" -- which at least runs the BIOS initialization without tracing.

You may also want to insert a break on "int 13h" where you turn off tracing, and a second break after the "int 13h" where you turn it on again ...

Or just a break after the "int 13h" call, so that as soon as you enter the BIOS you just continue until the next break ...

Re:Bootloader problems :(

Posted: Mon Jun 26, 2006 8:18 am
by Pype.Clicker
"RIP > CS.limit" likely means that Bochs has tried to jump to an address over 64K in a real-mode segment ... (RIP is the 64-bit name of 'IP', as you might guess).

Re:Bootloader problems :(

Posted: Mon Jun 26, 2006 8:47 am
by Ryu
My guess from the Bochs error is that it's a corrupt stack. Here's what I found:
OScoder wrote:
;Find the file we want
mov cx, WORD [MaxRootEntries]   ;load loop counter
mov di, 0x0200               ;locate first root entry
.loop:
push cx ;<--- +2
mov cx, 0x000B   ;eleven character name
push di ;<---- +4
rep cmpsb         ;test for entry match
pop di ;<---- +2
je file_found ; <--- if jumped stack is not fully restored
pop cx ;<---- 0
add di, 0x0020   ;queue next directory entry
loop .loop
jmp ERROR
file_found:

mov si, msg_dot   ;write tracing dot
call WRITE
So you can replace it with:

Code: Select all

;Find the file we want
 mov cx, WORD [MaxRootEntries]   ;load loop counter
 mov di, 0x0200               ;locate first root entry
 .loop:
  push cx ;<--- +2
  push di ;<---- +4
  mov cx, 0x000B   ;eleven character name
  repe cmpsb         ;continue comparing while equal
  pop di ;<---- +2
  pop cx ;<---- 0 (push/pop do not affect the flags)
  je file_found ; <--- if jumped stack is fully restored
  add di, 0x0020   ;queue next directory entry
 loop .loop
 jmp ERROR
 file_found:

 mov si, msg_dot   ;write tracing dot
 call WRITE
Edit: Just to be sure, I've replaced rep cmpsb with repe cmpsb.
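
One more thing that may be worth checking in the loop above: cmpsb advances SI as well as DI, and only DI is saved and restored, so after the first non-matching entry SI would no longer point at the start of the filename. Below is a minimal sketch that also preserves SI. It assumes DS:SI points at the 11-byte name, ES:DI at the first root entry, and that the labels are the ones from the original post; the cld is added here only as a precaution and is not in the original code.

```nasm
;Find the file we want (sketch: SI is saved as well as DI)
 cld                             ;make cmpsb step upwards through memory
 mov cx, WORD [MaxRootEntries]   ;load loop counter
 mov di, 0x0200                  ;locate first root entry
 .loop:
  push cx
  push si                        ;cmpsb advances SI too
  push di
  mov cx, 0x000B                 ;eleven character name
  repe cmpsb                     ;ZF=1 afterwards only if all 11 bytes matched
  pop di
  pop si                         ;back to the start of the filename
  pop cx                         ;pops leave the flags untouched
  je file_found                  ;stack is balanced on both paths
  add di, 0x0020                 ;queue next directory entry
 loop .loop
 jmp ERROR
 file_found:
```

As far as the assembler is concerned, rep and repe are the same prefix byte (0xF3), so that change only makes the intent clearer; the stack balance and the SI restore are what change the behaviour.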

Re:Bootloader problems :(

Posted: Mon Jun 26, 2006 11:31 am
by Combuster
Getting RIP > CS.limit in realmode usually means that your code is running somewhere it shouldn't.

Likely reasons:
- Missing or incorrect ORG directive (or CS value)
- You jump to a piece of 'loaded' code (next stage) and you either didn't put code there or you mistyped the address
- You have a bug in your stackframe code and you do a (I)RET.
- Your code runs off the end of your asm file (or you didn't load all sectors of the second stage)
- Forgot to load SS/SP
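
To illustrate the first and last items, here is a rough sketch of how a second stage might state its load address and set up its own segments and stack. The load offset of 0x0500 is only a guess based on the eip of 0x5b5 in the dump, the stack address is an arbitrary choice, and stage2_start is a hypothetical label; none of these are confirmed in the thread.

```nasm
bits 16
org 0x0500            ;ASSUMED load offset; must match where stage one puts this code

stage2_start:         ;hypothetical entry label
 cli                  ;no interrupts while SS:SP is being changed
 xor ax, ax
 mov ds, ax           ;segment 0, to match the ORG above
 mov es, ax
 mov ss, ax
 mov sp, 0xFFF0       ;ASSUMED free area; pick one nothing else gets loaded over
 sti
```

If the ORG value and the real load address disagree, every absolute reference (such as mov si, msg_dot) points at the wrong bytes even though relative jumps and calls still work.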

On another note, if that's the exact Bochs dump, I suggest you get a more recent version.

Re:Bootloader problems :(

Posted: Tue Jun 27, 2006 1:45 am
by Pype.Clicker
Someone correct me if I'm wrong, but AFAIK:

- if you try to execute an instruction after 0xffff (I mean e.g. you have a NOP in CS:FFFF), the CPU should wrap back to CS:0x0000 -- no exception whatsoever.

- same if you have garbage on your stack: it's just a value between 0x0000 and 0xFFFF (for both segment and offset), and the CPU will be happy with any combination of the two.

- if you ever send your code to wastelands of non-initialized memory, Bochs complains with "running in bogus memory" (I guess OScoder would have told us, then).

- same if you accidentally miss a part of your 2nd stage or have a wrong IRET or whatever happens with the stack (all of those are good advice and things to be checked, indeed).

Now I can figure out one thing that Bochs could dislike (and possibly real CPUs as well). Imagine that rather than having a 'nop' at 0xFFFF you have the start of "mov ax, 0x1234" -- a 4-byte (or maybe just 3?) instruction. I'm unsure whether the CPU will try to load the 'extra' bytes from CS:0x0000 or CS:0x10000, but I fear it'll rather be the second one.

Re:Bootloader problems :(

Posted: Tue Jun 27, 2006 3:54 am
by Midas
Pype.Clicker wrote:
or maybe just 3
I think, assuming I can follow the Intel manuals right, that it's only a 3-byte instruction. MOV AX, imm16 is 0xB8 followed by the operand, so I'm guessing 0xB8 0x34 0x12.

Maybe I'm wrong, and I realise that this isn't the point, but it's just some information.

Re:Bootloader problems :(

Posted: Tue Jun 27, 2006 6:26 am
by blip
Midas, it is a 4-byte instruction when in 32-bit mode, because there you need a 0x66 operand-size prefix. In 16-bit modes, though, you are correct.
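
A small sketch of the encodings being discussed, with the bytes NASM should emit noted in the comments (NASM adds the 0x66 prefix by itself in a bits 32 section; the file is only meant to be assembled and inspected, not run):

```nasm
bits 16
 mov ax, 0x1234       ;B8 34 12        (3 bytes)

bits 32
 mov ax, 0x1234       ;66 B8 34 12     (4 bytes, operand-size prefix added)
 mov eax, 0x1234      ;B8 34 12 00 00  (5 bytes)
```

Assembling with a listing file (nasm -f bin with the -l option) shows the generated bytes next to each line, which is a quick way to check this kind of guess.
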
Pype.Clicker wrote:
- if you try to execute an instruction after 0xffff (I mean e.g. you have a NOP in CS:FFFF), the CPU should wrap back to CS:0x0000 -- no exception whatsoever.
I'm pretty sure you would get an exception, but I can't be certain. Just did a test and a GPF is generated. If the handler just does an IRET then it will seem to wrap around, because the offset pushed is only 16 bits and is therefore 0000h.
Pype.Clicker wrote:
Now I can figure out one thing that Bochs could dislike (and possibly real CPUs as well). Imagine that rather than having a 'nop' at 0xFFFF you have the start of "mov ax, 0x1234" -- a 4-byte (or maybe just 3?) instruction. I'm unsure whether the CPU will try to load the 'extra' bytes from CS:0x0000 or CS:0x10000, but I fear it'll rather be the second one.
A 286+ CPU would generate an exception here; the 8086/8 did the former and 80186/8 did the latter, IIRC.

Re:Bootloader problems :(

Posted: Wed Jun 28, 2006 12:31 am
by Ryu
Pype.Clicker wrote:
Now I can figure out one thing that Bochs could dislike (and possibly real CPUs as well). Imagine that rather than having a 'nop' at 0xFFFF you have the start of "mov ax, 0x1234" -- a 4-byte (or maybe just 3?) instruction. I'm unsure whether the CPU will try to load the 'extra' bytes from CS:0x0000 or CS:0x10000, but I fear it'll rather be the second one.
Basically I only know 386 assembly. As blip already said, it will generate an exception when the offset goes beyond the 64KB segment. My guess is that the exception comes from the hidden part of the segment descriptor: the address has gone beyond the limit stored there, which by default is 0FFFFh (correct me if I'm wrong; I see Bochs outputs a 0FFFFFh limit for CS only, and I'm a bit confused by that). If this is the case, I am curious whether in unreal mode, with a 2^32 limit, this condition would still cause an exception or would wrap around within the segment.

Another thing to keep in mind: it's possible to address above 1MB with segment:offset, and such an address may wrap around in linear address space if the A20 line is disabled.
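
A short worked example of that last point, as a sketch (the register choice is arbitrary; what the read returns depends on the machine's A20 state):

```nasm
bits 16
 mov ax, 0xFFFF
 mov es, ax
 mov al, [es:0x0010]  ;linear address = 0xFFFF * 16 + 0x0010 = 0x100000
                      ;A20 enabled:  reads the first byte above 1MB
                      ;A20 disabled: address bit 20 is forced to 0,
                      ;              so this reads linear address 0x000000
```

The highest address reachable this way is FFFF:FFFF, which is linear 0x10FFEF, so just under 64KB above the 1MB mark.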

Re:Bootloader problems :(

Posted: Wed Jun 28, 2006 12:04 pm
by mystran
Exception for AMD; about Intel I'm not quite sure. One Xbox hack worked because Intel wraps... but I don't remember the details, whether the latest processors still do it, or... well, much of anything else about that.

Anyway, don't depend on the exception, but assume that you'll get it on some processors.