Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 8:24 am
by Love4Boobies
I didn't really read everything, I just happened to randomly notice another bug as I was scrolling down:
sanjeevmk wrote:

Code: Select all

org 0x0

bits 16

;set up segment registers base to 0x07C0
stage1:
   mov ax,0x07C0
   mov ds,ax
   mov es,ax
The BIOS Boot Specification requires that BIOS firmware use 0000:7C00, not 07C0:0000, so if you want to be compliant with the specification (and use a more efficient encoding---you are restricted in terms of code size, are you not?), you ought to go with that. However, considering that there are some BIOSes which only care about the physical address, the best way to go about avoiding bugs is something like the following:

Code: Select all

org 7C00h

xor ax, ax
mov ds, ax
mov es, ax
jmp 0000h:next

next:

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 8:34 am
by egos

Code: Select all

EnableA20:
  ...
Didn't see.

Code: Select all

	NULL: equ $-gdt
What syntax is this?

Code: Select all

install_gdt:
	pusha
	cli
	lgdt [pointer]
	sti
	popa
	ret
Cool :mrgreen:

Code: Select all

 
	mov ax,0x0	
	mov ss,ax
	mov sp,0xFFFF
Bad code: 0xFFFF leaves the stack pointer misaligned. xor sp,sp would be better; mov sp,7C00h would be good.
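A minimal sketch of the second suggestion, assuming the code runs with the boot sector at physical 0x7C00 so that the memory just below it is free for the stack. The label name is only illustrative.

```nasm
bits 16

setup_stack:                ; hypothetical label, not from the original code
    xor ax, ax
    mov ss, ax              ; loading SS inhibits interrupts for one instruction,
    mov sp, 0x7C00          ; so SS:SP is never observed half-updated
    ; the stack now grows downward from 0000:7C00, just below the boot sector
```

SP stays even here, so every 16-bit push lands on an aligned word.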

Code: Select all

	mov dl,0x80
	int 0x13
Some BIOSes assign a different boot drive number to a USB stick or a second hard disk, so don't hard-code 0x80. Additionally, make sure that USB-FDD and similar emulation options are disabled in BIOS Setup.

Code: Select all

load_stage2:
	mov ah,0x02
	mov al,0x1  ;just the one sector for stage 2, 512 bytes
	mov ch,0x0
	mov cl,0x2  ;stage 2 is on 2nd sector of disk
	mov dh,0x0
	mov dl,0x80
	int 0x13
	jc load_stage2
Some BIOSes do not support the old disk service for some disks (especially large ones). Retrying the read forever while the user wonders what is happening is not a good idea.
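One way to address both of the last two points is sketched below, as an illustration rather than a drop-in fix. It assumes that DL was saved into a hypothetical `boot_drive` variable at entry, that DS can address that variable, and that the caller has already pointed ES:BX at the destination buffer. The retry budget of 3 is an arbitrary choice.

```nasm
bits 16

load_stage2:
    mov di, 3                   ; retry budget (arbitrary)
.retry:
    mov ax, 0x0201              ; AH=02h read sectors, AL=1 sector
    mov cx, 0x0002              ; CH=0 (cylinder), CL=2 (sector)
    mov dh, 0                   ; head 0
    mov dl, [boot_drive]        ; drive number the BIOS passed in DL at boot
    int 0x13
    jnc .ok                     ; CF clear: sector is at ES:BX
    xor ah, ah                  ; AH=00h reset disk system before retrying
    mov dl, [boot_drive]
    int 0x13
    dec di
    jnz .retry
    stc                         ; out of retries: report failure to the caller
.ok:
    ret

boot_drive: db 0                ; hypothetical: filled with DL at stage 1 entry
```

The caller checks CF after the call and can print an error instead of looping silently.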

Code: Select all

	mov si,msg_stage1
print:
	lodsb
Are you sure that the direction flag is clear? lodsb only advances SI when DF = 0.
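A minimal sketch of a print loop that does not depend on the direction flag it inherits, assuming a zero-terminated string at DS:SI and the BIOS teletype service. The label names are illustrative only.

```nasm
bits 16

print_string:                   ; in: DS:SI -> zero-terminated string
    cld                         ; DF = 0 so lodsb walks forward through the string
.next:
    lodsb                       ; AL = [DS:SI], SI += 1
    test al, al
    jz .done                    ; stop at the terminating zero
    mov ah, 0x0E                ; INT 10h, AH=0Eh: teletype output
    mov bx, 0x0007              ; BH = page 0, BL = colour (graphics modes only)
    int 0x10
    jmp .next
.done:
    ret
```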

Code: Select all

	mov ax,0x9000
	mov ss,ax
	mov sp,0xFFFF
Do you know what the EBDA is?

Code: Select all

	call install_gdt ; sti
...
;shift to protected mode
protected_mode:
	mov eax,cr0
	or al,1
	mov cr0,eax
Disable interrupts first.
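A sketch of the switch with interrupts kept off, under some assumptions that are not in the original code: DS = 0 and an `org` that matches the load address (so labels equal linear addresses), a flat GDT with the code selector at 0x08 and the data selector at 0x10, and an arbitrary stack top.

```nasm
bits 16

enter_protected_mode:
    cli                         ; no protected-mode IDT exists yet, so keep IF = 0
    lgdt [gdt_ptr]
    mov eax, cr0
    or  eax, 1                  ; set CR0.PE
    mov cr0, eax
    jmp 0x08:pm_entry           ; far jump loads CS with the code selector

bits 32
pm_entry:
    mov ax, 0x10                ; flat data selector
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov esp, 0x7C00             ; arbitrary stack top for this sketch
.halt:
    hlt
    jmp .halt

align 8
gdt:
    dq 0                        ; null descriptor
    dw 0xFFFF, 0x0000           ; code: limit 15:0, base 15:0
    db 0x00, 0x9A, 0xCF, 0x00   ; base 23:16, access, flags + limit 19:16, base 31:24
    dw 0xFFFF, 0x0000           ; data: limit 15:0, base 15:0
    db 0x00, 0x92, 0xCF, 0x00
gdt_end:

gdt_ptr:
    dw gdt_end - gdt - 1        ; GDT limit
    dd gdt                      ; GDT linear base (valid under the org assumption)
```

Interrupts stay disabled from the cli until an IDT is installed, so no sti appears between loading the GDT and the far jump.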

Code: Select all

copykernel:
	mov esi,0x30000   ;physical address of where the kernel is loaded
	mov edi,0x100000  ;physical address of where it is to be copied
	mov ecx,0x200     ;the kernel is of size 512 bytes
	cld
	rep movsb
First check that the memory is available. rep movsd is more suitable for moving large blocks of data.

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 9:35 am
by Combuster
Love4Boobies wrote:The BIOS Boot Specification requires that BIOS firmware use 0000:7C00, not 07C0:0000, so if you want to be compliant with the specification (and use a more efficient encoding---you are restricted in terms of code size, are you not?), you ought to go with that. However, considering that there are some BIOSes which only care about the physical address, the best way to go about avoiding bugs is something like the following:

(snip)
If you care about size, you'd skip the far jump. All int/ret and all jmp/jxx/call label instructions are relative and do not care about what value CS is (and you don't want to use CS prefixes for size either). The written value for DS/ES is the only thing that needs to relate to whatever the org directive says.

The existing implementation works.

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 12:35 pm
by Love4Boobies
Combuster wrote:If you care about size, you'd skip the far jump.
My point was that his decision was the worst of all:
  • If he wants resilient code, he should use the jump.
  • If he wants reduced code size, he should follow the spec.
Combuster wrote:All int/ret and all jmp/jxx/call label instructions are relative and do not care about what value CS is (and you don't want to use CS prefixes for size either). The written value for DS/ES is the only thing that needs to relate to whatever the org directive says.
You're leaving out near, absolute indirect jumps and calls. I didn't read his code so I don't know whether he uses them but a lot of mine surely does, not to mention that it's a good idea for maintainability---later on he might forget he's not allowed to use them. All it takes is to know what value CS has. Here's a very obvious optimization that I use all the time in order to avoid using CALL/RET (notice that the routine can be "called" from multiple locations):

Code: Select all

mov ebx, next
jmp routine
next:
.
.
.

routine:
; Do useful stuff but preserve EBX
jmp ebx
Don't forget that a lot of people do chain loading. If you REP STOS(B/W/D/Q) with DF = 1, then you'll be left with the offset in SI so you can use "jmp si" since its encoding is more efficient.

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 2:22 pm
by sanjeevmk
I have now changed the kernel to 16-bit code, so I can execute it in real mode and check whether the kernel binary is being loaded from disk below the 1 MB mark. I am trying to load it to 0x3000:0x0. The only thing it does now is print "Kernel Loaded". Then, as before, I burnt this to sector 3 of the USB stick with dd.
Sector 1 -> Stage 1 , Sector 2 -> Stage 2 , Sector 3 -> Kernel (or rather just a print routine).

So this is how it should work:
1. Stage 1 prints "Loading Stage 2", then loads Stage 2 from Sector 2, to physical address 0x500 (0x50:0x0), and then jumps to it.
2. Stage 2 prints "Loading OS", then loads Kernel from Sector 3, to physical address 0x30000 (0x3000:0x0), and then jumps to it. (No copying to 1 MB).
I am doing this just to check whether the kernel binary actually gets loaded from disk at 0x3000:0x0.
3. Kernel prints "Kernel loaded". This is a 16-bit code, at address 0x3000:0x0.

But this is what is happening:
As before, I burnt the code (3 sectors) to the USB stick using dd and booted the Intel PC with it. Again, it prints only the first 2 strings: "Loading Stage 2", followed by "Loading OS".
I had expected this to work, since I cut out all the A20 and protected-mode parts and am just running the kernel in real mode at an address below 1 MB. But this also failed.

So the only possible point of failure I can see is loading sector 3 from the disk (I can't expect the jmp to 0x3000:0x0 to go wrong). I took it for granted that sector 3 would get loaded, because sector 2 (stage 2) is being correctly loaded and executed by stage 1. I can't understand why loading sector 3 can fail, though. I am using the same method to load sector 3 that I used for sector 2 (CHS-based int 0x13).

I'll try doing this by LBA now (int 0x13 extended read). I need a little help with the disk geometry though. I don't know how to calculate the Heads Per Cylinder (HPC) that is required by LBA method. I ran fdisk on my USB device. This is what fdisk has to say about the geometry:

Code: Select all

Disk /dev/sdb: 4005 MB, 4005560320 bytes
124 heads, 62 sectors/track, 1017 cylinders
Units = cylinders of 7688 * 512 = 3936256 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
How do I calculate HPC from this? Could not find any reference to this on the internet. LBA wikipedia page says HPC is typically 16, but can I bank on this?

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 3:09 pm
by sanjeevmk
sanjeevmk wrote:This is what fdisk has to say about the geometry:

Code: Select all

Disk /dev/sdb: 4005 MB, 4005560320 bytes
124 heads, 62 sectors/track, 1017 cylinders
Units = cylinders of 7688 * 512 = 3936256 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
How do I calculate HPC from this? Could not find any reference to this on the internet. LBA wikipedia page says HPC is typically 16, but can I bank on this?
Okay, I found by size calculations that the number 124 shown by fdisk is actually the HPC. (I thought it was the total number of cylinders.)
The calculation I used was this:

Code: Select all

(size in bytes) 4005560320 = 1 disk * (1017 cylinders / disk) * (HPC) * (1 track / head) *(62 sectors / track) * (512 bytes / sector)
From this, HPC comes out to about 124.07, i.e. 124. (The product is not exact because fdisk truncates the cylinder count to 1017.)

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 4:19 pm
by egos
If the extended read function is supported, you should use it; otherwise, get SPT and HPC through function 8 (int 0x13, AH=08h) and translate LBA to CHS for the legacy read function.
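A rough sketch of the first branch follows: check for the INT 13h extensions, then read by LBA. It assumes DS can address the data below, that a hypothetical `boot_drive` variable holds the DL value the BIOS passed at entry, and that the kernel sits in the third sector (LBA 2) and goes to 0x3000:0x0000, as described earlier in the thread.

```nasm
bits 16

read_kernel_lba:
    mov ah, 0x41                ; INT 13h, AH=41h: extensions installation check
    mov bx, 0x55AA
    mov dl, [boot_drive]
    int 0x13
    jc  .no_ext                 ; CF set: extensions not present
    cmp bx, 0xAA55              ; signature comes back swapped on success
    jne .no_ext
    test cl, 1                  ; bit 0: packet access (AH=42h) supported
    jz  .no_ext

    mov ah, 0x42                ; extended read
    mov dl, [boot_drive]
    mov si, dap                 ; DS:SI -> disk address packet
    int 0x13
    ret                         ; CF set on error, AH = status code

.no_ext:
    stc                         ; caller should fall back to the CHS path
    ret

boot_drive: db 0                ; hypothetical: filled with DL at boot

align 4
dap:
    db 0x10                     ; packet size in bytes
    db 0                        ; reserved
    dw 1                        ; number of sectors to read
    dw 0x0000                   ; destination offset
    dw 0x3000                   ; destination segment
    dq 2                        ; starting LBA (third sector on the disk)
```

For the fallback, function 8 returns the maximum head number in DH (so HPC = DH + 1) and the maximum sector number in the low 6 bits of CL (SPT). The translation is then C = LBA / (HPC * SPT), H = (LBA / SPT) mod HPC, S = (LBA mod SPT) + 1.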

Re: Problems with a beginner bootloader.

Posted: Wed Nov 30, 2011 5:48 pm
by Combuster
Can you at least try fixing all the issues? There is a reason why the second read fails:
Combuster wrote:Other than that, don't try to overwrite the BIOS:

Code: Select all

   mov ax,0x9000
   mov ss,ax
   mov sp,0xFFFF
egos wrote:

Code: Select all

   mov ax,0x9000
   mov ss,ax
   mov sp,0xFFFF
Do you know what the EBDA is?

Re: Problems with a beginner bootloader.

Posted: Thu Dec 01, 2011 11:35 am
by sanjeevmk
Hey, I corrected everything as you guys suggested and everything just clicked! Thanks a lot guys, especially Combuster and egos. As the two of you had pointed out, the problem was that I was overwriting the BIOS (the EBDA part) with the stack. I was referring to the memory map on another website, which explicitly stated that 0x90000 to 0x9FFFF was unused and made no mention of the EBDA. It was only after egos mentioned the EBDA that I checked the memory map given on osdev: http://wiki.osdev.org/Memory_Map_(x86)

Previous code

Code: Select all

mov ax,0x9000
mov ss,ax
mov sp,0xFFFF
This set up the stack from 0x90000 to 0x9FFFF, which is where the EBDA lives. So, the new code:

Code: Select all

mov ax,0x600
mov ss,ax
mov sp,0x0FFF
This sets up the stack from 0x6000 to 0x6FFF, a safe area to play with. After this change, everything went smoothly, except for 2 more minor bugs:

Bug 1:

Code: Select all

protected_mode:
mov eax,cr0
or al,1  ; buggy line
mov cr0,eax
Instead of doing "or eax,1", I was doing "or al,1", which was causing the PC to restart. Rookie mistake.

Bug 2:
My kernel code did not end with the instructions "cli hlt" (to stop execution), and I had declared all my data variables at the end of the regular code. Because there was no "cli hlt", execution must have run on into the data, which was causing the PC to restart.

Also, in order to find the place where the PC is restarting, I had to always insert this piece of code after critical or doubtful sections, and then reboot:

Code: Select all

hang:
       jmp hang
This would pause execution at that point. So if the PC just hung, I would know that the triple fault had not yet been reached, and if the PC restarted, I would know that the triple fault occurred before the "hang" loop. Is there a cleaner and easier way to do this? I had to restart the system several times during this.

Lessons learnt:
1. Code carefully.
2. Gather data from various sources (in this case, data being the memory map)
3. If it works on the emulator, it might still fail on a real system.

Thanks again, people, couldn't have completed this without your help! I sure could not have figured out on my own that I was overwriting something, let alone BIOS or EBDA :)