Problems with a beginner bootloader.
Hi,
First, some background on what I'm trying to do:
I'm writing a very stripped-down 2-stage bootloader as a learning exercise. The first stage loads the second stage from disk. The second stage loads the kernel from disk, switches to protected mode, copies the loaded kernel to the 1 MB mark, and then jumps to it to execute the kernel.
To keep things simple for now, I am using an unformatted disk image to test the bootloader, so I am not dealing with the BPB, FAT loading, etc. (Once the basics work, I plan to extend it to do all of those things.) What I've done instead is this:
1. Use 'dd' to write the first stage (raw binary) to the first sector of the image 'hd.img'. The first stage is exactly 512 bytes, obviously. It prints the string "Loading Stage 2" using int 0x10, loads the stage 2 code from disk, and then jumps to it. (Stage 2 is on sector 2 of the disk.)
2. Since I'm not handling disks with a filesystem on them (it's a raw, unformatted disk image), the second stage also fits in exactly 512 bytes, and it is written to the second sector of 'hd.img' using 'dd' with the bs=512 and seek=1 options. Stage 2 prints the string "Loading OS" using int 0x10 and then loads the kernel from disk to an address below 1 MB in real mode. After this, stage 2 switches to protected mode, copies the loaded kernel to the 1 MB mark using rep movsb, and jumps to the kernel code. Before loading the kernel from disk, stage 2 enables the A20 line and installs the GDT. (The kernel is on sector 3 of the disk.)
3. The kernel, for now, is also 512 bytes. All it does is call a 32-bit print routine to print "Kernel loaded". The kernel's raw binary is written to the third sector of 'hd.img' using 'dd' with bs=512 and seek=2.
So the layout is sector 1: stage 1, sector 2: stage 2, sector 3: kernel. Stage 1 loads sector 2, and stage 2 loads sector 3 from hd.img and then executes the code that was on sector 3 (the kernel).
The problem:
All of the above is working fine with Bochs. I get the proper outputs in sequence: Loading Stage 2 -> Loading OS -> Kernel loaded.
But when I do this on actual hardware, it fails at the kernel execution point. That is, I wrote the bootloader to a USB stick (raw, unformatted) using 'dd', just as with hd.img. Then, on booting the PC (32-bit x86) from USB, I get only this: Loading Stage 2 -> Loading OS.
The same code that loads and executes the kernel at the 1 MB mark in protected mode in Bochs fails on actual hardware. It also fails in Qemu. To summarize, the bootloader works fine in Bochs, but not on the actual machine or in Qemu (and Qemu does not have a debugger).
What does Bochs do differently that Qemu and the actual processor don't? I've coded this based on tutorials found on osdev and other sites. Do new systems follow a different boot procedure? Or does the problem lie in the fact that I'm using an unformatted disk with no partitions (so no BPB, MBR, etc.)?
The kernel copy (to 1 MB) is done in protected mode, not in flat mode as many sources suggest. I tried doing it in flat mode, but it didn't work, so I switched to doing it in protected mode.
(I preferred to explain the algorithm instead of posting the code, so as not to bloat the thread. I would, however, be glad to post the code here if needed.)
Thanks.
Sanjeev M K.
Sanjeev Mk
Re: Problems with a beginner bootloader.
We can only guess. So I'll ask: which method of enabling A20 are you using?
If a trainstation is where trains stop, what is a workstation ?
Re: Problems with a beginner bootloader.
Enabling it by writing to the keyboard output port. The standard method.
Okay, probably a noob doubt: I know we're writing to the output port via the keyboard controller. Since most of the bootloader methods are kind of 'historic', and since we are guessing things, is there a chance of this going wrong if you have a USB keyboard?
(Yeah, but I get your point, I'll try playing with other methods.)
Also, as most tutorials suggest, my code and data segments in the GDT overlap; that is, both have the same base and limit. So is it possible that the kernel code copied to the 1 MB mark is being treated as data and not as executable?
Sanjeev Mk
- beyondsociety
- Member
- Posts: 39
- Joined: Tue Oct 17, 2006 10:35 pm
- Location: Eagle, ID (USA)
- Contact:
Re: Problems with a beginner bootloader.
It's a possibility, but it seems more likely that memory isn't zeroed out or that you're overwriting some location in memory when you're copying the kernel. I would run it through the Bochs debugger and single-step through the code that may be causing the problem to see exactly what's going on.
"I think it may be time for some guru meditation"
"Barbarians don't do advanced wizardry"
-
- Member
- Posts: 255
- Joined: Tue Jun 15, 2010 9:27 am
- Location: Flyover State, United States
- Contact:
Re: Problems with a beginner bootloader.
It doesn't matter if the GDT segments overlap, as long as the segment selector in CS is a valid executable code segment. If you're running in real mode and setting up a GDT there, it should be a 16-bit segment, and then you want to set the PE bit in cr0, and jump to a 32-bit code segment.
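The sequence described above can be sketched in NASM roughly as follows. This is a minimal sketch under stated assumptions, not anyone's actual code from this thread: the label names are hypothetical, the load address of 0x500 and the stack top of 0x7C00 are assumed, DS is assumed to be 0, and the selectors 0x08 and 0x10 are simply the offsets of the second and third entries of a three-entry flat GDT. Interrupts are turned off before PE is set because no protected-mode IDT exists at that point.

```nasm
org 0x500                       ; assumed load address, with DS = 0
bits 16
enter_pmode:
    cli                         ; no protected-mode IDT yet, so keep interrupts off
    lgdt [gdt_descriptor]       ; load GDTR with the flat GDT below
    mov eax, cr0
    or eax, 1                   ; set CR0.PE
    mov cr0, eax
    jmp 0x08:pmode_entry        ; far jump reloads CS with the 32-bit code selector

bits 32
pmode_entry:
    mov ax, 0x10                ; flat data selector
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    mov esp, 0x7C00             ; assumed stack top in conventional memory
    jmp $                       ; placeholder: 32-bit code would continue here

align 8
gdt_start:
    dq 0                        ; null descriptor
    dw 0xFFFF, 0x0000           ; code: limit 15:0, base 15:0
    db 0x00, 10011010b, 11001111b, 0x00   ; base 23:16, access, flags+limit 19:16, base 31:24
    dw 0xFFFF, 0x0000           ; data: same base and limit as code
    db 0x00, 10010010b, 11001111b, 0x00
gdt_end:
gdt_descriptor:
    dw gdt_end - gdt_start - 1  ; GDT size minus one
    dd gdt_start                ; linear address of the GDT
```

The far jump is assembled in 16-bit mode, so its target offset has to fit in 16 bits; that holds here only because the code is assumed to sit below 64 KB.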
How are you treating the image in Bochs? Is it set up as a hard disk or floppy disk? And are you sure that your BIOS supports emulation of a USB drive as a hard drive?
As for the A20 problems, there are known issues with enabling it. Try methods other than the keyboard controller, such as the BIOS, if it supports it. Here is some code that has multiple methods to enable the A20 line.
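For illustration only (this is not the code the post links to), two commonly used alternatives to the keyboard-controller method can be sketched like this. It assumes real mode, a BIOS that may or may not implement INT 15h AX=2401h, and a chipset that may or may not have the "fast A20" bit on port 0x92; neither method is guaranteed to exist on every machine, and the label names are hypothetical.

```nasm
bits 16
; Try the BIOS first, then fall back to the "fast A20" port.
enable_a20_alt:
    mov ax, 0x2401              ; INT 15h, AX=2401h: ask the BIOS to enable A20
    int 0x15
    jnc .done                   ; carry clear: the BIOS reports success
    in al, 0x92                 ; System Control Port A
    test al, 2
    jnz .done                   ; A20 bit is already set, nothing to write
    or al, 2                    ; bit 1 enables A20
    and al, 0xFE                ; keep bit 0 clear: writing 1 there resets the CPU
    out 0x92, al
.done:
    ret
```

Whichever method is used, the result is worth verifying with a wraparound test rather than assumed.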
Re: Problems with a beginner bootloader.
Hi,
It may also be possible that your USB stick is set up for CHS addressing rather than LBA addressing. Format the stick with FAT32 and check with "Disk Utility" in Ubuntu whether the partition type is 0x0B (CHS) or 0x0C (LBA). Your bootloader may be trying to load the rest of the kernel from completely different addresses after address translation.
EDIT: Qemu can be configured to use the gdb debugger. Qemu also comes with the Qemu monitor, which lets you check the registers, etc.
Last edited by Muneer on Wed Nov 30, 2011 4:21 am, edited 1 time in total.
Even the smallest person could change the course of the future - Lord Of The Rings.
In the end all that matters is what you have done - Alexander.
Even after a decade oh god those still gives me the shivers.
Re: Problems with a beginner bootloader.
I have not read the thread fully but:
We could help you a lot more if you posted the code.
Get back to work!
Github
Re: Problems with a beginner bootloader.
Okay. Weird thing. I tested the bootloader on another system (AMD 64) with my USB. It worked perfectly. And, on this system, it worked on Qemu also. The expected sequence of output was:
Loading Stage 2
Loading OS
Kernel Loaded (this is printed in the kernel)
Now there are 2 systems, one Intel 32-bit, and another AMD 64-bit.
On the Intel computer (32 bit, Intel Dual Core processor)
The bootloader gives the above expected output in Bochs. But, prints only the first 2 strings in Qemu and the actual hardware, using USB booting. And yes, this system supports USB booting as it does boot from USB and prints the first 2 strings. Also, I've booted Linux from USB on this system many a time. I also tried various other methods of enabling A20, but that also doesn't work.
Possible failure points:
1. Kernel not copied to the 1 MB mark, or
2. Copied to 1 MB, but couldn't jump there and execute.
On the AMD computer (64-bit, AMD Athlon x6)
The bootloader gives the expected output in Bochs, and also in Qemu and on the actual hardware using USB boot. And here, it works with all the different A20 methods.
If the problem is with USB addressing (CHS or LBA), then it shouldn't work on the latter system as well.
Does Bochs not emulate the host system accurately?
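One way to narrow down the two failure points without a debugger is to have stage 2 report what it sees directly on screen. The following is a minimal sketch, assuming 32-bit protected mode with flat data segments, a VGA text screen at 0xB8000, and that the memory at 0x7C00 (the old stage 1 code) can be used as scratch; the routine names are hypothetical.

```nasm
bits 32
; Wraparound test: with A20 off, 0x107C00 aliases 0x007C00.
check_a20:
    mov dword [0x007C00], 0x12345678
    mov dword [0x107C00], 0x87654321   ; overwrites 0x007C00 if A20 is off
    cmp dword [0x007C00], 0x12345678
    mov ax, 0x2F59              ; 'Y', white on green: low value intact, A20 on
    je .show
    mov ax, 0x4F4E              ; 'N', white on red: address wrapped, A20 off
.show:
    mov [0xB8000], ax           ; first cell of the text screen
    ret

; Compare the 512 bytes at the load address with the copy at 1 MB.
verify_copy:
    mov esi, 0x30000            ; where the kernel was loaded in real mode
    mov edi, 0x100000           ; where it was copied to
    mov ecx, 0x200              ; 512 bytes
    cld
    repe cmpsb                  ; stops at the first mismatch, ZF set if all equal
    mov ax, 0x2F59              ; 'Y': copy matches
    je .show
    mov ax, 0x4F4E              ; 'N': mismatch
.show:
    mov [0xB8002], ax           ; second cell of the text screen
    ret
```

Note that verify_copy on its own cannot detect a disabled A20 line, because both the copy and the read-back would hit the same aliased addresses, so check_a20 should run first. A match also says nothing about whether the disk read delivered the right sector.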
Sanjeev Mk
Re: Problems with a beginner bootloader.
ACcurrent wrote: We could help you a lot more if you posted the code.
I agree with you, twice.
If you have seen bad English in my words, tell me what's wrong, please.
Re: Problems with a beginner bootloader.
I am not sure that addressing really is the problem.
ACcurrent wrote: If the problem is with USB addressing (CHS or LBA), then it shouldn't work on the latter system as well.
Can't guarantee that. There are more possibilities: your Intel PC may be configured to emulate the USB stick as a floppy, which means CHS, and your AMD64 machine may be configured to boot USB as a hard disk (LBA). Of course, there may be internal address translation by the BIOS, though I am not sure about that.
sanjeevmk wrote: Does Bochs not emulate the host system accurately?
I think not. IIRC it just emulates a particular PC stated in its configuration file.
Even the smallest person could change the course of the future - Lord Of The Rings.
In the end all that matters is what you have done - Alexander.
Even after a decade oh god those still gives me the shivers.
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: Problems with a beginner bootloader.
Looks like people missed asking the obvious question here: do you actually check the return value of any BIOS call?
sanjeevmk wrote: Does Bochs not emulate the host system accurately?
No emulator and no virtual machine will accurately pretend to be the host system. Virtual machines will actually give you the host processor to run on, but you will not get the hardware and you will not get the firmware. In some cases an emulation feature can allow you to talk with an actual plug-in device, but not anything relevant in the bootloader stage. VMs are in that regard exactly what the name says, a machine not existing in real life.
Re: Problems with a beginner bootloader.
Here's the whole code.
First, the routines I call from the code. The routines are followed by each of the stages and the kernel.
I have not posted the 32-bit printing routines, as they do their job correctly and are not really relevant here. These routines are called from the kernel, and the problem is in jumping to the kernel code anyway.
I'm just working with 3 sectors, so I haven't done any address translation and stuff. Just using CHS addressing.
Enable A20:
EnableA20:
cli
pusha
call wait_input
mov al,0xAD
out 0x64,al ; disable keyboard
call wait_input
mov al,0xD0
out 0x64,al ; tell controller to read output port
call wait_output
in al,0x60
push eax ; get output port data and store it
call wait_input
mov al,0xD1
out 0x64,al ; tell controller to write output port
call wait_input
pop eax
or al,2 ; set bit 1 (enable a20)
out 0x60,al ; write out data back to the output port
call wait_input
mov al,0xAE ; enable keyboard
out 0x64,al
call wait_input
popa
sti
ret
; wait for input buffer to be clear
wait_input:
in al,0x64
test al,2
jnz wait_input
ret
; wait for output buffer to be clear
wait_output:
in al,0x64
test al,1
jz wait_output
ret
GDT and Install GDT:
gdt:
NULL: equ $-gdt
dd 0
dd 0
CODE: equ $-gdt
dw 0xFFFF
dw 0
db 0
db 10011010b
db 11001111b
db 0
DATA: equ $-gdt
dw 0xFFFF
dw 0
db 0
db 10010010b
db 11001111b
db 0
gdt_end:
pointer:
dw gdt_end - gdt - 1
dd gdt
install_gdt:
pusha
cli
lgdt [pointer]
sti
popa
ret
Stage1.asm
org 0x0
bits 16
;set up segment registers base to 0x07C0
stage1:
mov ax,0x07C0
mov ds,ax
mov es,ax
mov ax,0x0
mov ss,ax
mov sp,0xFFFF
;reset disk, so we start at 0x0 of disk.
resetdisk:
mov ah,0x0
mov dl,0x80
int 0x13
jc resetdisk
;es:bx = 0x50:0x0. This where we load the 2nd stage. Physical -> 0x500
mov bx,0x0050
mov es,bx
mov bx,0x0
;load stage 2 from sector 2. stage 2 is 512 bytes, so load just the 1 sector.
load_stage2:
mov ah,0x02
mov al,0x1 ;just the one sector for stage 2, 512 bytes
mov ch,0x0
mov cl,0x2 ;stage 2 is on 2nd sector of disk
mov dh,0x0
mov dl,0x80
int 0x13
jc load_stage2
;print "Loading Stage 2"
mov si,msg_stage1
print:
lodsb
or al,al
jz here
mov ah,0x0E
int 0x10
jmp print
;Now do a far jump to 0x50:0x0, which is where stage 2 is loaded.
here:
jmp 0x50:0x0
msg_stage1 db "Loading Stage 2",13,10,0
times 510-($-$$) db 0
dw 0xAA55
Stage2.asm
org 0x500
jmp stage2
%include "GDT.inc"
%include "A20.inc"
bits 16
%define KERNEL 0x100000
;set up segment registers
stage2:
xor ax,ax
mov ds,ax
mov es,ax
mov ax,0x9000
mov ss,ax
mov sp,0xFFFF
;print "Loading OS" , as that is what we're doing
mov si,msg_stage2
print:
lodsb
or al,al
jz do_stuff
mov ah,0x0E
int 0x10
jmp print
;enable A20 and load GDTR.
do_stuff:
call EnableA20
call install_gdt
;reset the disk so accesses start from 0x0
resetdisk:
mov ah,0x0
mov dl,0x80
int 0x13
jc resetdisk
;es:bx = 0x3000:0x0. This is where we load the kernel, at first.
mov bx,0x3000
mov es,bx
mov bx,0x0
;load the kernel from sector 3 of harddisk. Kernel is of 512 bytes, so just load
;the one sector
loadkernel:
mov ah,0x2
mov al,0x1 ;just the one sector to load
mov ch,0x0
mov cl,0x3 ;sector 3, where the kernel binary is located
mov dh,0x0
mov dl,0x80
int 0x13
jc loadkernel
;shift to protected mode
protected_mode:
mov eax,cr0
or al,1
mov cr0,eax
jmp CODE:stage2_5 ;reload segment registers
bits 32
stage2_5:
mov ax,DATA ;the data selector of GDT
mov ds,ax
mov es,ax
mov fs,ax
mov gs,ax
mov ss,ax
mov esp,0x90000
;copy kernel loaded at 0x3000:0x0 in real mode, to the 1 MB mark now.
copykernel:
mov esi,0x30000 ;physical address of where the kernel is loaded
mov edi,0x100000 ;physical address of where it is to be copied
mov ecx,0x200 ;the kernel is of size 512 bytes
cld
rep movsb
jmp CODE:KERNEL ;jump to the kernel.
msg_stage2 db "Loading OS",13,10,0
times 512-($-$$) db 0
Kernel.asm
org 0x100000
bits 32
jmp main
%include "stdio.inc"
%include "GDT.inc"
;load segment registers
main:
mov ax,DATA
mov ds,ax
mov es,ax
mov ss,ax
mov esp,0x90000
call ClrScr32 ;clear the screen
mov ebx,msg_kernel
call Puts32 ;print "Kernel Loaded"
msg_kernel db "Kernel Loaded",13,10,0
times 512-($-$$) db 0
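As posted, nothing follows `call Puts32` in main, so once the call returns the CPU carries on into the bytes of msg_kernel and executes them as instructions. A minimal sketch of an idle loop that main could jump to after the last call (the label name is hypothetical, and this is not claimed to be the cause of the hardware failure):

```nasm
bits 32
kernel_idle:
    cli                         ; no IDT is installed, so keep maskable interrupts off
.halt:
    hlt                         ; stop until the next interrupt
    jmp .halt                   ; an NMI or SMI can still wake the CPU, so loop
```

With this in place, `jmp kernel_idle` right after `call Puts32` keeps execution out of the data.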
Sanjeev Mk
Re: Problems with a beginner bootloader.
In my view, the addressing (LBA/CHS) is not the problem here, as the Intel system (which is where things are not working) successfully loads stage 2 from the disk and executes it.
Sanjeev Mk
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: Problems with a beginner bootloader.
If you read the clues, DL need not be 0x80, CHS accesses may fail on a harddisk and LBA accesses may fail on a floppy, and you don't actually check for errors.
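The points above can be sketched as a read routine that uses the drive number the BIOS handed over and gives up after a few tries instead of looping forever. This is a minimal sketch under assumptions: save_drive is called at the very start of the boot sector while DL is still untouched, DS points at this code's data, and the names boot_drive, save_drive and read_sector are hypothetical.

```nasm
bits 16
; Call first thing in stage 1, before DL is modified.
save_drive:
    mov [boot_drive], dl        ; the BIOS passes the boot drive in DL
    ret

; Read one sector from cylinder 0, head 0 into ES:BX.
; In:  CL = sector number (1-based), ES:BX = destination
; Out: CF clear on success; CF set and AH = BIOS status after 3 failed tries
; Clobbers AX, DX, DI.
read_sector:
    mov di, 3                   ; retry budget
.retry:
    push cx                     ; keep the sector number across the BIOS call
    mov ah, 0x02                ; INT 13h AH=02h: read sectors
    mov al, 1                   ; one sector
    mov ch, 0                   ; cylinder 0
    mov dh, 0                   ; head 0
    mov dl, [boot_drive]
    int 0x13
    pop cx
    jnc .ok                     ; carry clear: sector is in memory
    dec di
    jz .fail                    ; out of retries, AH still holds the read status
    xor ah, ah                  ; INT 13h AH=00h: reset the disk system
    mov dl, [boot_drive]
    int 0x13
    jmp .retry
.fail:
    stc                         ; report the error to the caller
.ok:
    ret

boot_drive db 0
```

A caller would load CL with 2 or 3, call read_sector, and branch on carry to a handler that prints the status in AH, which makes a failing read visible instead of silent.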
Other than that, don't try to overwrite the BIOS:
mov ax,0x9000
mov ss,ax
mov sp,0xFFFF
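With SS:SP = 0x9000:0xFFFF the stack top is at physical 0x9FFFF, which on many machines falls inside the Extended BIOS Data Area just below 0xA0000. A more conservative placement could look like the sketch below; it assumes the memory between the end of stage 2 and 0x7C00 is otherwise unused, which matches the layout in this thread but is still an assumption.

```nasm
bits 16
setup_stack:
    xor ax, ax
    mov ss, ax                  ; the CPU holds off interrupts for one instruction after mov ss
    mov sp, 0x7C00              ; stack grows down from just below the boot sector
```

An even value for SP also keeps 16-bit pushes aligned, unlike 0xFFFF.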
Re: Problems with a beginner bootloader.
Yeah, I should add error checking.
But if disk accesses were indeed the problem, the 2nd stage (on 2nd sector) should not load and execute. I'm using CHS addressing to load the 2nd stage. I don't see why that should fail for loading the kernel.
Sanjeev Mk