Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 11:16 am
by Richy
Hi. I'm building a second-stage bootloader. I want it to display a "hello world" message. My problem is that it doesn't. I've made two versions of the program. The first displays the message with some garbage characters before it, the second does not display it at all. Both programs are fairly simple and straightforward, so I can't see what's going wrong.

Here's the first version, which displays garbage before the Hello World message:

Code: Select all

; Update the segment registers
mov ax, cs
mov ds, ax
mov es, ax

HelloString db "Hello World",0
MOV SI, HelloString ;Store string pointer to SI

print:
LODSB		;AL=memory contents at DS:SI
OR AL, AL	;Check if value in AL is zero (end of string)
JZ loop 	;If end then return

MOV AH, 0x0E	;Tell BIOS that we need to print one character on screen.
MOV BH, 0x00	;Page no.
MOV BL, 0x07	;Text attribute 0x07 is lightgrey font on black background
INT 0x10	;Call video interrupt
JMP print       ; Print next character

loop:
MOV AL, 'Z'	;I'll print a char to see that it ended
MOV AH, 0x0E	;Tell BIOS that we need to print one character on screen.
MOV BH, 0x00	;Page no.
MOV BL, 0x07	;Text attribute 0x07 is lightgrey font on black background
INT 0x10	;Call video interrupt
JMP $ 		;Infinite loop
And here's the second version, that doesn't display the message at all (it does display the final 'Z' char though).

Code: Select all

JMP beginningofprogram

print:
LODSB		;AL=memory contents at DS:SI
OR AL, AL	;Check if value in AL is zero (end of string)
JZ loop 	;If end then return

MOV AH, 0x0E	;Tell BIOS that we need to print one character on screen.
MOV BH, 0x00	;Page no.
MOV BL, 0x07	;Text attribute 0x07 is lightgrey font on black background
INT 0x10	;Call video interrupt
JMP print       ; Print next character


beginningofprogram:

; Update the segment registers
mov ax, cs
mov ds, ax
mov es, ax

MOV SI, HelloString ;Store string pointer to SI
JMP print       ; Print next character

loop:
MOV AL, 'Z'	;I'll print a char to see that it ended
MOV AH, 0x0E	;Tell BIOS that we need to print one character on screen.
MOV BH, 0x00	;Page no.
MOV BL, 0x07	;Text attribute 0x07 is lightgrey font on black background
INT 0x10	;Call video interrupt
JMP $ 		;Infinite loop

HelloString db "Hello World",0
The first-stage bootloader itself is fairly simple. And it works: In both cases the program is loaded and displays the final Z character, and I've also used it to load another hello world program that displayed the message character by character without using strings and SI, and that one worked fine. So I figure the bug should be in this code.

Can anyone pick out the bug in either one of these programs? Thanks!

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 11:52 am
by M-Saunders
Richy wrote:

Code: Select all

; Update the segment registers
mov ax, cs
mov ds, ax
mov es, ax

HelloString db "Hello World",0
MOV SI, HelloString ;Store string pointer to SI
Think of what the CPU is doing here: it executes the first three mov instructions, and then you insert the string "Hello World" in the path of execution. So the CPU attempts to execute that string, which could be all manner of opcodes, hence (at least part of) the problem! So don't insert strings into the middle of executable code, or at least jmp past them.
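A minimal sketch of the "jmp past them" option, for illustration only. The label name after_data is made up here, and the sketch deliberately says nothing about ORG or segment values; it only keeps the string bytes out of the path of execution.

```nasm
bits 16
; No ORG here on purpose: how labels are addressed is a separate question.
; This sketch only deals with the execution path.

    mov ax, cs
    mov ds, ax
    mov es, ax

    jmp short after_data    ; hop over the data bytes so they are never executed

HelloString db "Hello World", 0

after_data:                 ; hypothetical label name
    mov si, HelloString     ; SI = offset of the string as the assembler sees it
    ; ... the print loop from the original post would go here ...
    jmp $                   ; hang
```

Moving the db line after the final jmp $ has the same effect without needing the extra jump.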

M

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 12:11 pm
by Richy
Thanks. I tried putting the string creation outside the executable code. That was part of the idea behind the second version of the program I posted. I've tried it again now, to be sure, in two different ways: by putting a jmp command to skip that line and by moving the line to the end of the executable code, after the infinite loop. In both cases, the program didn't display the string at all, and only displays the final Z.

It's almost like there's a 0 at the beginning of the string that makes the program skip to the end right from the first character...

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 12:21 pm
by Combuster
CS is not guaranteed to always have the same value - it is usually either 0x0000 or 0x07c0, but a little paranoia doesn't hurt, so assume it can be essentially anything.
Which means that in some cases the processor tries to find that string at 0x7c0 * 16 + offset, and in other cases at 0x0 * 16 + offset.

Then there's the assembler. It could assume that the start of the file will be at 0x0, or some other place (especially if you use a linker). So you might be accessing your string at any of the following locations:

random_segment * 16 + random_file_offset + offset_to_string.

How likely do you think that it is that it will magically "guess" the right values? :wink:

hints:
- org statement
- far jump

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 12:56 pm
by Richy
I can't put in an org statement. I already have one in the first-stage bootloader. I'm assembling both the first and second stage into a single binary*, and NASM won't accept two org statements.

So I've tried replacing the cs with the right value:

Code: Select all

mov ax, 1000h
Since that's where the first-stage bootloader loads the code to**. But no luck. Still doesn't display the string.

What did you mean by the "far jump" hint?


*This version is meant to be the simplest possible two-stage bootloader, with no file system support, so I need to have both bootloaders in the same binary with the second one right after the first one, so that I know exactly where it is copied on the disk and I can load it with Int 13h.

**Here's the first-stage bootloader's code:

Code: Select all

    [BITS 16]
    [ORG 0x7C00]	

    ; Update the segment registers
    mov ax,  cs
    mov ds, ax
    mov es, ax

    reset:                      ; Reset the floppy drive
            mov ax, 0           ;
            mov dl, 0           ; Drive=0 (=A)
            int 13h             ;
            jc reset            ; ERROR => reset again

    read:
            mov ax, 1000h       ; ES:BX = 1000:0000
            mov es, ax          ;
            mov bx, 0           ;

            mov ah, 2           ; Load disk data to ES:BX
            mov al, 1          ; Load 1 sector
            mov ch, 0           ; Cylinder=0
            mov cl, 2           ; Sector=2
            mov dh, 0           ; Head=0
            mov dl, 0           ; Drive=0
            int 13h             ; Read!

            jc read             ; ERROR => Try again
            jmp 1000h:0000      ; Jump to the program

    times 510-($-$$) db 0
    dw 0AA55h

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 1:03 pm
by Combuster
lemme see
random_segment * 16 + random_file_offset + offset_to_string.
becomes
0x1000 * 16 + 0x7c00 + offset_to_string
= 0x17exx
while it's actually loaded at:
0x10000-0x101ff

Now go fix the math.

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 1:29 pm
by Richy
Thanks for the help, Combuster. But I'm sorry, I'm a bit rusty at assembly - before this program, the last time I used this language was almost a decade ago. So please slow down and let me catch up! :oops:

So my program is loaded by the first-stage bootloader at 0x1000 to 0x11ff (one sector's worth, though I actually use less than that - and why are your numbers 10 times bigger?). That's the value I need to specify instead of cs in my second-stage bootloader, right? If I set the address to 0x1000 in the first bootloader, why do I need to take the 7c00 into account - won't I be at 1000h regardless?

Also, what's the math you mention I need to fix?

Sorry if those are somewhat basic questions...

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 2:59 pm
by kmtdk
Combuster has already given you what you need,
but let's go over it again:

Combuster said:
random_segment * 16 + random_file_offset
I get the idea that you don't know what a linear address is, so I will just show you.
But first, how can you access 1 MB if your registers only allow 64 KB? Think of it as a hint.

A linear address is the "real" address.
The way you calculate the linear address is:

Code: Select all

(segment * 16) + offset 
Very simple, but what does it say?
It says that if we are at the address
cs:ip
and cs = 0x1000 and ip = 0x0000,
then the "real" address is not 0x1000, but:

Code: Select all

(0x1000 * 16) + 0 = 0x10000 
where 0x1000 is cs and 0 is ip.
This is the way it works for every seg:offset memory access you do; it is called segmented addressing. (And it is the answer to my "little" question :twisted: )
-------------

So when you say that your second stage is executing at 0x1000, that is WRONG:
it is at 0x10000.

But your jmp is not wrong,
because 0x1000:0x0000 = 0x10000.
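
The same arithmetic can be written out as a small routine. This is only a sketch to make the formula concrete; the routine name linear_address and the register choices are made up for the example. Note that the immediate shifts need a 186 or later CPU.

```nasm
bits 16
; Sketch: compute the linear address for a segment:offset pair.
; In:  BX = segment, SI = offset
; Out: DX:AX = (BX * 16) + SI
linear_address:             ; hypothetical helper name
    mov ax, bx
    shl ax, 4               ; low 16 bits of segment * 16
    mov dx, bx
    shr dx, 12              ; the 4 bits of segment * 16 that fall above bit 15
    add ax, si              ; add the offset
    adc dx, 0               ; propagate a carry into the high word
    ret
```

With BX = 0x1000 and SI = 0 this leaves DX = 0x0001 and AX = 0x0000, i.e. 0x10000, matching the calculation above.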

The reason why you don't see anything from the print is the same problem.
"LODSB" reads the byte at DS:SI and stores it in AL.
So if you are at 0x1000:0x0000
and the string is at 0x1000:0x0100,
then you will have to load DS with 0x1000
and SI with 0x0100.


KMT dk

Re: Second-Stage Bootloader Bug

Posted: Thu Dec 11, 2008 3:29 pm
by Richy
Thank you kmtdk, for the segment:offset explanation. That explains that part.

Unfortunately, I'm still not seeing what's wrong with my program :oops: Say I take the second one, the one that doesn't have the string definition in the middle of the executable code, and I modify it as such:

Code: Select all

mov ax, 0x1000
mov ds, ax
mov es, ax
mov si, HelloString 
Now ds is certain to contain the right memory segment (as opposed to before where it depended on whether cs was correct) and si points to my string (sparing me the need to know exactly at which offset it is). Shouldn't that work? Because it doesn't, it's still not displaying the string. What am I missing?

Re: Second-Stage Bootloader Bug

Posted: Fri Dec 12, 2008 12:41 pm
by Owen
But NASM still thinks your code is at some random location past the end of your stage one bootloader.
Put your second stage in its own bin file, and concatenate them another way.
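
A rough sketch of what that could look like, assuming the second stage lives in a file called stage2.asm and is still loaded at 1000:0000 by the first stage from earlier in the thread. The file names and the build commands in the comments are assumptions (cat is the Unix way; other systems have their own binary-concatenation command).

```nasm
; stage2.asm (hypothetical file name)
; Possible build steps, with assumed file names:
;   nasm -f bin stage1.asm -o stage1.bin
;   nasm -f bin stage2.asm -o stage2.bin
;   cat stage1.bin stage2.bin > boot.img
bits 16
org 0                       ; stage 1 loads this file at 1000:0000, so offsets start at 0

    mov ax, cs              ; CS is 0x1000 after stage 1's far jump
    mov ds, ax
    mov es, ax
    cld                     ; make LODSB walk forward through the string
    mov si, HelloString     ; now a small offset from the start of this file

print:
    lodsb                   ; AL = byte at DS:SI, SI advances
    or al, al               ; zero terminator?
    jz done
    mov ah, 0x0E            ; BIOS teletype output
    mov bh, 0x00            ; page number
    mov bl, 0x07            ; light grey on black
    int 0x10
    jmp print

done:                       ; named "done" because LOOP is also an instruction mnemonic
    jmp $

HelloString db "Hello World", 0

times 512-($-$$) db 0       ; pad to a full sector so the one-sector read has data
```

Because each file gets its own ORG, the single-ORG limitation in the combined source no longer applies.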

Re: Second-Stage Bootloader Bug

Posted: Fri Dec 12, 2008 5:23 pm
by Troy Martin
And, if you don't know about far jumps, it's basically like this:

Code: Select all

jmp 0x2000:0xCAFE
where, in this case, the 0x2000 is the new code segment and 0xCAFE is the offset to jump to. Very useful.
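
One common use in a boot sector, sketched here under the assumption that the code is assembled with ORG 0x7C00, is a far jump to the next instruction so that CS holds a known value no matter which segment:offset pair the BIOS used. The label name start is made up for the example.

```nasm
bits 16
org 0x7C00

    jmp 0x0000:start        ; far jump: CS becomes 0x0000, IP becomes the offset of start

start:                      ; hypothetical label name
    xor ax, ax              ; CS is now known to be 0, so use 0 for the data segments too
    mov ds, ax
    mov es, ax
    jmp $                   ; hang

times 510-($-$$) db 0
dw 0xAA55                   ; boot signature
```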

Re: Second-Stage Bootloader Bug

Posted: Fri Dec 12, 2008 5:47 pm
by Combuster
When I said "fix the math", I meant you should check whether what you did would actually work.

the location where you are accessing the string is still as follows:
random_segment * 16 + random_file_offset + offset_to_string.

random_segment is the segment you are using to access the string (i.e. most commonly DS, but occasionally SS or ES)
you set DS to 0x1000, so the random segment is known.
interesting note, mov ax, cs; mov ds, ax doesn't change this at all as CS was already 0x1000 (because you set it with the far jump in stage 1)

random_file_offset is where you told the assembler that your code would be.
the org statement reads ORG 0x7c00, hence the file offset is known.

offset_to_string I won't compute by hand, but we can assume that it lies within a sector's range. Since it's the second sector, the value lies somewhere between 0x200 and 0x3ff

so adding things together, you are locating a string at 0x1000 * 16 + 0x7c00 + 0x2xx physical
= 0x17exx

Now, where did the code actually go:
you told the bios to read a sector to ES:BX (and you set the registers to ES=0x1000, BX=0)
So a sector gets loaded to segment * 16 + offset
= 0x1000 * 16 + 0
= 0x10000
which isn't even close to
0x17e00

Now you have two equations, and you know which parameters are located where.

Homework:
a) determine which values you can and can not change
b) adjust either equation so that it matches the other.
Since you call yourself a programmer, this should be a piece of cake.
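
For later readers, here is one possible way to make the two equations agree while keeping a single source file and a single ORG. It is a sketch, not necessarily the answer intended above: the label stage2_start is made up, and the first stage is replaced by a zero-filled stand-in so the example assembles on its own. The idea is to subtract the start of the second stage from the label, so SI holds the offset within the loaded sector.

```nasm
bits 16
org 0x7C00

; Stand-in for the 512-byte first stage; the real loader code would sit here.
times 510 db 0
dw 0xAA55

stage2_start:                           ; hypothetical label; file offset 0x200, loaded at 1000:0000
    mov ax, 0x1000
    mov ds, ax
    mov es, ax
    cld                                 ; make LODSB walk forward
    mov si, HelloString - stage2_start  ; offset within stage 2 instead of 0x7Exx

print:
    lodsb
    or al, al
    jz done
    mov ah, 0x0E                        ; BIOS teletype output
    mov bh, 0x00
    mov bl, 0x07
    int 0x10
    jmp print                           ; relative jump, unaffected by ORG

done:
    jmp $

HelloString db "Hello World", 0

times 1024-($-$$) db 0                  ; pad the second sector
```

Another way to balance the same equation would be to leave SI alone and load DS with 0x0820 instead, since 0x0820 * 16 + 0x7E00 = 0x10000.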

Re: Second-Stage Bootloader Bug

Posted: Fri Sep 07, 2012 6:18 pm
by Rolice
Hello.
I am very new to the forum. I decided to go the hard way and write my own bootloader, in order to understand how things actually happen. :shock:

I spent a whole day (10-12h) reading around the web about second stages, checking whether the disk operations were failing or why the call instruction failed... Then I realized that the data was somehow displaced, so the NASM-built binary just could not find it. The code itself was fine and got executed as expected; I cannot say the same for the data. :|

I have read this topic more than twice (seeing the same problem, but not a clear solution), and then my eye caught Combuster's odd-looking remark about 0x0000 or 0x07c0, which I read more than 5 times, sensing something useful in it but not exactly sure what... :lol:

So I found my solution: I assemble my second-stage loader with ORG 0h and set ds, es, ss to 1000h, instead of ORG 1000h with ds, es, ss set to 0h. That fixed the problem. But I am unsure why. :roll:

I assume the BIOS of the VM loads the first boot loader at 0000:7c00 instead of 7c00:0000, but isn't that supposed to be transparent, i.e. shouldn't both lead to the same place? How would this affect the inverted setup (ORG 1000h with ds, es, ss zeroed) in a way that leads to the problem discussed here? :?

Thanks

Re: Second-Stage Bootloader Bug

Posted: Fri Sep 07, 2012 6:37 pm
by Love4Boobies
Note: I've only read the post before my own.
Rolice wrote: I assume the BIOS of the VM loads the first boot loader at 0000:7c00 instead of 7c00:0000, but isn't that supposed to be transparent, i.e. shouldn't both lead to the same place? How would this affect the inverted setup (ORG 1000h with ds, es, ss zeroed) in a way that leads to the problem discussed here? :?
Hi and welcome to the forum. First of all, you mean 0000:7C00 vs. 07C0:0000, not vs. 7C00:0000. Next, the problem you are having is caused by instructions that combine the offset you provide with the segment held in one of the segment registers. For example, if your entry point is 07C0:0000 instead of 0000:7C00 (which the BIOS ought to expect), then although they mean the same thing, if you pass the package offset 0050h rather than 7C50h to INT 13h, the resulting addresses won't resolve to the same physical address (07C0:0050 vs. 0000:0050). The same can apply to jumps and other things (although this is usually less visible in a boot sector, as jumps can also be short, in other words relative to the current instruction pointer).
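
Applied to the ORG 0h / 1000h question, and assuming the second stage is loaded at 1000:0000 (the quoted post implies this but does not say so), the rule of thumb is that ORG should equal the offset part of the load address and DS the segment part. ORG 1000h with zeroed segments would describe linear 0x1000, which only matches a load at 0000:1000. A sketch that keeps both values in one place follows; the constant and label names are made up.

```nasm
bits 16

LOAD_SEG equ 0x1000         ; assumed: segment part of where stage 1 loads this code
LOAD_OFF equ 0x0000         ; assumed: offset part of that load address

org LOAD_OFF                ; labels are assembled relative to the load offset

start:                      ; hypothetical label name
    mov ax, LOAD_SEG
    mov ds, ax
    mov es, ax
    mov si, message         ; DS:SI = LOAD_SEG:(LOAD_OFF + position in file)
    jmp $                   ; hang; a print loop would go here

message db "Hello World", 0
```

Changing the load address then means changing the two constants together, which keeps ORG and the segment registers from drifting apart.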

Re: Second-Stage Bootloader Bug

Posted: Fri Sep 07, 2012 6:46 pm
by Rolice
Thank you for your explanation. Now I have verified the source of the problem. Yes, in the second boot loader I have some short jumps, which is possibly why the code did not break. :)