Qemu error at boot

Posted: Tue Mar 10, 2020 6:47 am
by Onelio
Hi, first of all, I'd like to apologize, since I'm sure issues like this one have been posted before, but I've been working on this for some time and still cannot find a solution.
A few weeks ago I started writing a bootloader for fun. It boots from a FAT16 file system on an HDD, and you can find it in this repo: https://github.com/Onelio/SysBoot. I got to the point where it enters the second stage without problems after enabling the A20 line. (I don't have a second stage yet, so it essentially boots a dummy file called "kernel.bin". I will leave the Protected Mode setup to the second stage, so the first stage just has to enable A20 and load the file to 0xF7C0:8400.) Everything works fine under Bochs, but whenever I try it on Qemu it fails with the error "Trying to execute code outside RAM or ROM at 0x545136dc".

After a lot of searching and testing I found out that the problem is with enabling the A20 line here (https://github.com/Onelio/SysBoot/blob/ ... io.asm#L83), which does not seem to work on Qemu, so I came here to ask for guidance and maybe advice. I know that I'm missing the keyboard-controller method, but aside from the BIOS method, which is only supported by some BIOSes, the other one should work properly from what I've read.

Here is how I call Qemu:

qemu-system-i386 bin/system.iso -monitor stdio
Thanks for your time.

Re: Qemu error at boot

Posted: Tue Mar 10, 2020 8:45 am
by Octocontrabass
Why not load the second stage to a lower address?

How did you figure out that it's A20 causing problems and not something else?

Re: Qemu error at boot

Posted: Tue Mar 10, 2020 10:27 am
by bzt
Onelio wrote:the first stage just has to load the file and enable the a20 to load it into 0x100.000
I'm not sure what that address is supposed to be, but if A20 needs to be enabled for it, then I'm pretty sure you can't load there using the BIOS. If it's conventional memory, then why not load at the lowest address possible so that you can utilize more of the memory below 640k?
Onelio wrote:I will relay the Protected Mode setup to the second stage
Why don't you enable the A20 there then? Enabling A20 is only relevant for protected mode because you simply can't really address memory above 1M in real mode.
Onelio wrote:After a lot of searching and testing I found out that the problem is with the enabling of the A20 line here
Like Octocontrabass said, I'm not sure either that the problem is in this function. However, it's worth noting that not all PCs support the fast A20 method, but every BIOS should support the enable-A20 function (this is especially true for VMs; on real hardware you'd probably have the fast method anyway). Nonetheless, even if the fast method is not supported, there should be no "Trying to execute code outside RAM" error, of that I'm sure. I'd suggest adding "-s -S" to qemu's command line and connecting gdb to the VM to debug the issue.
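
To make the two methods mentioned above concrete, here is a minimal sketch that tries the BIOS call first and falls back to the fast A20 port. The label names are made up for illustration and are not taken from SysBoot, and the routine does not verify that A20 actually ended up enabled.

```nasm
; Sketch only: enable A20 via the BIOS, falling back to "fast A20" (port 0x92).
; enable_a20 / .fast / .done are hypothetical labels, not SysBoot code.
bits 16

enable_a20:
    mov     ax, 0x2401      ; INT 15h, AX=2401h: ask the BIOS to enable A20
    int     0x15
    jnc     .done           ; CF clear means the BIOS reported success
.fast:
    in      al, 0x92        ; read System Control Port A
    test    al, 2           ; bit 1 set means A20 is already enabled
    jnz     .done
    or      al, 2           ; set bit 1 (A20 gate)
    and     al, 0xFE        ; keep bit 0 clear, writing 1 there resets the CPU
    out     0x92, al
.done:
    ret
```

Since neither method is guaranteed to exist on every machine, a real loader would normally test A20 after each attempt instead of trusting the return status.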
Onelio wrote:I came here to ask for guidance and maybe advice.
My first advice would be to use fasm instead of nasm, because it's much much better. But this is somewhat a personal preference thing.

For guidance I can give you these:

imgrecv.asm has a boot sector that loads the kernel over serial (not important for you). What is important is that it sets up protected mode, and, if the kernel is an ELF64, long mode too, all in the boot sector. Everything is implemented in a single sector, with no 2nd stage. You can replace the code that loads the kernel over serial in lines 122-155 with code using LBA calls. If you're not interested in 64 bit and protected mode is enough for you, then you can remove lines 169-208.

bootboot 1st stage is a boot sector that is capable of loading a 2nd stage using LBA packets. It supports CDROMs too (with 2048 bytes sectors instead of 512 bytes), and RAID mirrors.

bootboot 2nd stage can be loaded in many ways (Grub, BBS ROM, as Linux kernel etc., not relevant) and also by its 1st stage boot sector (line 258). It properly detects if the CPU can do protected mode (line 300), and it has an example of how to load sectors above the 1MB mark (line 882, hint: it uses a temp buffer in low memory which can be accessed by the real mode BIOS, and then in protmode it copies the sector to its final position in memory). It also has a proper A20 enabling sequence (line 364). It reads the GPT, locates the ESP (line 1034), and then it reads a file from that partition regardless of whether it's FAT16 or FAT32 (line 1112, about 99% of the code is shared between FAT16 and FAT32).

alexfru's boot16.asm is a pretty neat boot sector that loads the 2nd stage from a FAT16 file (I believe this might very well be exactly what you need).

I hope these might help.

Cheers,
bzt

Re: Qemu error at boot

Posted: Tue Mar 10, 2020 11:45 am
by Onelio
Hi guys, thanks for the answers.

The reason why I load it there is that I want to use the lower addresses to store data structures from the dummy kernel.bin file (which I want to use as the main program for my project, skipping the 2nd bootloader and using only the 512-byte one).
About how I found out that the problem was with the A20 line... well... while testing I saw that there was no problem when jumping to a lower address like 0x07C0:8400, and it does not happen in Bochs, so I guessed that it may be caused by Qemu not enabling it.

Anyway, thanks again for the fast answers and for the piece of advice, bzt. I appreciate it a lot. I'm going to take a deep look at them (especially the 64bit stuff) and try to use gdb to see exactly where it breaks.
Maybe it is as you guys said and I'm breaking it somewhere else, but it seems odd to me that only Qemu seems to give this problem.


Bye.

Re: Qemu error at boot

Posted: Tue Mar 10, 2020 2:46 pm
by bzt
Onelio wrote:Hi guys, thanks for the answers.
Welcome!
Onelio wrote:The reason why I load it there is that I want to use the lower addresses to store data structures from the dummy kernel.bin file (which I want to use as the main program for my project, skipping the 2nd bootloader and using only the 512-byte one).
It's not going to be easy and you'll have to do some tricks to squeeze the code into 512 bytes, but it can be done. Examples: in imgrecv.asm, I've overlapped the GPT value's offset with the 64 bit GDT's null descriptor to save a few bytes. Using "xor ax,ax" requires fewer bytes to encode than "mov ax,0" and they both zero out the ax register, etc.
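
As a small illustration of the size difference mentioned above, these are the 16 bit encodings as I'd expect NASM to emit them. The side effect on FLAGS is the one thing to keep in mind when swapping one for the other.

```nasm
; Sketch: two ways to zero AX in 16 bit code, with their encodings.
bits 16

    mov     ax, 0           ; B8 00 00   3 bytes, FLAGS untouched
    xor     ax, ax          ; 31 C0      2 bytes, but FLAGS are modified (ZF set, CF cleared)
```

Assembling with a listing file (for example "nasm -f bin -l boot.lst boot.asm", where the file names are placeholders) shows the emitted bytes next to each instruction, which makes it easy to see where the 512 bytes go.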
Onelio wrote:About how I found out that the problem was with the A20 Line... well... testing I saw that there was no problem when jumping to a lower address like 0x10.000 and it does not happen in Bochs so I guessed that it may be caused by Qemu not enabling it.
I think a little explanation is appropriate here. I don't get what you mean by 0x100.000 or 0x10.000, as those are not real mode addresses. See segmentation for more explanation. In a nutshell, under real mode you have a 20 bit address space which is addressed by a 16 bit segment base and a 16 bit offset, written like 0x0000:0000. The base and offset overlap by 12 bits (16+16-12=20), and the linear address is calculated as base * 16 + offset. So for example 0x07C0:0000 and 0x0000:7C00 are the same linear address, 0x07C00. In real mode with 20 bits, you can only address 2^20 = 1 MB, 0x00000 to 0xFFFFF. Linear addresses 0x100000 and above are only accessible from protected mode, and only if A20 is enabled (which actually does extend the memory address beyond 20 bits, hence the name, A20 gate). I know it's difficult at first, but please keep counting the digits in the addresses; it will help you understand what I'm saying.

Now, you can't use the full address space, as there's a ROM too which occupies some of the addresses, and the CPU and the BIOS need some RAM too. You can only use 0x00500 up to 0xA0000, which is roughly 640k in total. Linear address 0xA0000 can be accessed by the 0xA000:0000 segment+offset combination, and that's the "top" of the usable memory; everything above (up to 1M) is device memory.
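
The base * 16 + offset rule can be checked at assemble time with the NASM preprocessor. This is only a sketch: LINEAR is a hypothetical helper macro, and the addresses are the ones that appear in this thread.

```nasm
; Sketch: assemble-time checks of real mode address arithmetic.
; LINEAR is a made-up helper macro, not part of anyone's code in this thread.
%define LINEAR(seg, off) (((seg) << 4) + (off))

%if LINEAR(0x07C0, 0x0000) != LINEAR(0x0000, 0x7C00)
  %error "0x07C0:0000 and 0x0000:7C00 should be the same linear address"
%endif

%if LINEAR(0x07C0, 0x8400) != 0x10000
  %error "0x07C0:8400 should be linear 0x10000"
%endif

%if LINEAR(0xA000, 0x0000) != 0xA0000
  %error "0xA000:0000 should be linear 0xA0000"
%endif

%if LINEAR(0xF7C0, 0x8400) != 0x100000
  %error "0xF7C0:8400 should be linear 0x100000"
%endif
```

If any of the sums were wrong, NASM would stop with the corresponding message. The last check shows that 0xF7C0:8400 lands exactly on the 1 MB mark, one byte past 0xFFFFF.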
Onelio wrote:Anyway, thanks again for the fast answers and for the piece of advice, bzt. I appreciate it a lot. I'm going to take a deep look at them (especially the 64bit stuff) and try to use gdb to see exactly where it breaks.
Maybe it is as you guys said and I'm breaking it somewhere else, but it seems odd to me that only Qemu seems to give this problem.
Again, welcome! If your code fails in any one of bochs, qemu or VB, then you can be sure there's something wrong with your code, and you were just lucky with the others. I'd recommend trying your code on real hardware only after you can run it successfully in all three.

Good luck!
bzt

Re: Qemu error at boot

Posted: Tue Mar 10, 2020 3:45 pm
by Onelio
Hi again,

I apologize for the formatting of the address values. I used it to try to make them easier to understand, but it seems it went the other way around, he he... (We use that weird format where I study, so I never thought about the possibility that it could be just a local thing.)
I've updated my other two messages to use the Real Mode format SEGMENT:OFFSET.

I thought that addressing 0x100000 was possible with just the A20 Gate enabled. This is the first time I've heard about the need for protected mode and the limit of 0xA0000. I'll work on my project with that in mind.
By the way, I found out using the debugger that the problem happens whenever I try to read to 0x100000 (0xF7C0:8400) in Qemu. It seems to work in Bochs, maybe because it has stuff enabled by default (I looked at the physical memory at that position and everything is OK), but as you said, I need to make sure that it will work on various devices.

Thanks once more for the help and advice,
Bye.

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 1:14 am
by Octocontrabass
Onelio wrote:I thought that addressing 0x100000 was possible with just the A20 Gate enabled.
It is. I don't know what bzt is talking about, there would be no need for the A20 gate if those addresses weren't accessible in real mode.

The problem is that you're passing the address 0xf7c0:8400 to INT 0x13, which may need to translate it into a physical address for a DMA controller. Things can fall apart when you do that: should the translation assume A20 is enabled, or disabled? Does the DMA controller support addresses above 0xA0000? There are no definite answers, so the results can be different between emulators (as you've noticed already).
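
One way to sidestep the question is to let INT 0x13 read into a buffer in low memory and then copy the data up with the CPU. The following is only a sketch under assumptions that are not from SysBoot: DS addresses the packet, DL holds the BIOS drive number, A20 is already enabled, and the buffer address, LBA and label names are placeholders.

```nasm
; Sketch: read one sector to a low "bounce buffer", then copy it to 0x100000.
; Assumptions (hypothetical): DS:dap is addressable, DL = drive, A20 enabled.
bits 16

read_high:
    mov     si, dap         ; DS:SI -> disk address packet
    mov     ah, 0x42        ; INT 13h extended read
    int     0x13
    jc      .fail           ; CF set means the BIOS reported an error

    push    ds
    push    es
    mov     ax, 0x1000
    mov     ds, ax
    xor     si, si          ; DS:SI = 0x1000:0000 = linear 0x10000 (source)
    mov     ax, 0xFFFF
    mov     es, ax
    mov     di, 0x0010      ; ES:DI = 0xFFFF:0010 = linear 0x100000 (destination)
    mov     cx, 256         ; 512 bytes = 256 words
    cld
    rep movsw               ; CPU copy, no BIOS or DMA involved
    pop     es
    pop     ds
    clc                     ; report success
.fail:
    ret

dap:
    db      0x10            ; packet size
    db      0               ; reserved
    dw      1               ; number of sectors to read
    dw      0x0000          ; buffer offset
    dw      0x1000          ; buffer segment (0x1000:0000 = linear 0x10000)
    dq      1               ; starting LBA (placeholder value)
```

With the 0xFFFF segment the copy can only reach up to linear 0x10FFEF, so this covers a little under 64 KiB above the 1 MB mark. Anything larger needs protected mode or unreal mode.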

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 4:59 am
by Onelio
Hi again,

I would like to share a little update.
After taking into account everything said here, I debugged everything again, and it seems the problem is with the read interrupt.

First, I saw with "info registers" that A20 is already active. (I also wrote a "Loop: jmp Loop" (0xFEEB) to 0x100000, and it can be seen there with xp /3x 0x100000.)

Second, I tried setting a breakpoint after the return from the int (on the ret instruction I placed after the int), and it seems to work without crashing. The only issue is that it does not load anything into memory. (Checked with xp /20x 0x100000)

Read:
    mov     ah, 0x42        ; INT 13h extended read, DS:SI -> disk address packet
    int     0x13            ; CF is set on error
    ret
PS: The int does not set CF when returning, so apparently everything worked "fine".


Finally, I found out that even though the read itself does not crash, the crash is still caused by it: the following instruction (the ret) is the one that jumps to that weird address when the read targets 0x100000. Note that I got there with a call and return without pushing anything in between, so the stack itself should not be the issue.

I'll continue debugging to see if I can notice something else.
Bye.

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 7:23 am
by bzt
Octocontrabass wrote:It is. I don't know what bzt is talking about, there would be no need for the A20 gate if those addresses weren't accessible in real mode.
You misunderstood. Under real mode the addressing is supposed to wrap around after 0xFFFFF. Unfortunately, a lot of code, devices and some BIOSes, among others, expect this; that's the reason why there's an A20 gate (engineers could have simply expanded the address bus width, but they didn't, to avoid incompatibility). Just because the CPU can use more than 20 bits in addresses doesn't mean the rest of the machine can too (the 8086/8088 can't; the 80286 could access 24 bits tops). On the other hand, in 32 bit protected mode the addressing is expected not to wrap around.

Note: using unreal mode on 386 and above, it is possible to abuse segmentation to access 4G at once in 16 bit mode. But should you do that, and is it a good idea to rely on an undocumented feature? I don't think so.
Octocontrabass wrote:Things can fall apart when you do that: should the translation assume A20 is enabled, or disabled? Does the DMA controller support addresses above 0xA0000? There are no definite answers, so the results can be different between emulators (as you've noticed already).
That's exactly what I was talking about.

Cheers,
bzt

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 9:19 am
by Octocontrabass
bzt wrote:Under real mode the addressing is supposed to wrap around after 0xFFFFF.
How do you figure? Intel documents this as one of the ways real mode differs from an actual 8086/8088.
bzt wrote:Note: using unreal mode on 386 and above, it is possible to abuse segmentation to access 4G at once in 16 bit mode. But should you do that, and is it a good idea to rely on an undocumented feature? I don't think so.
The Intel SDM Volume 3A section 9.9.2 explicitly states that segment descriptor attributes loaded in protected mode will continue to be used in real mode, so I'm not sure that counts as "undocumented". (And documented or not, Microsoft used it in HIMEM.SYS, so it will continue to work as long as real mode exists.)
bzt wrote:That's exactly what I was talking about.
Unless I've misread your post, you were talking only about the CPU. There are no problems there: enable the A20 line and the CPU can access physical addresses between 0x100000 and 0x10FFEF in real mode. It's the BIOS or the hardware that can't handle it.
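
Whether the CPU really sees distinct memory above 1 MB can be tested directly with the classic wraparound check. This is a sketch, not code from anyone's project in this thread; the label name and the probe addresses (0x0000:0500 and 0xFFFF:0510) are just a common choice.

```nasm
; Sketch: returns AX = 1 if A20 is enabled, AX = 0 if addresses wrap at 1 MiB.
; check_a20 is a hypothetical label; the probed bytes are saved and restored.
bits 16

check_a20:
    pushf
    push    ds
    push    es
    push    si
    push    di
    cli                     ; no interrupts while the probe bytes are modified

    xor     ax, ax
    mov     ds, ax
    mov     si, 0x0500      ; DS:SI = 0x0000:0500 = linear 0x000500
    not     ax              ; AX = 0xFFFF
    mov     es, ax
    mov     di, 0x0510      ; ES:DI = 0xFFFF:0510 = linear 0x100500

    mov     al, [ds:si]
    push    ax              ; save the original low byte
    mov     al, [es:di]
    push    ax              ; save the original high byte

    mov     byte [ds:si], 0x00
    mov     byte [es:di], 0xFF
    cmp     byte [ds:si], 0xFF  ; equal only if the high write hit the low byte

    pop     ax
    mov     [es:di], al     ; restore the high byte (pop and mov keep FLAGS)
    pop     ax
    mov     [ds:si], al     ; restore the low byte

    mov     ax, 0           ; mov instead of xor, so ZF from cmp survives
    je      .done           ; wrapped around: A20 is disabled
    mov     ax, 1           ; distinct locations: A20 is enabled
.done:
    pop     di
    pop     si
    pop     es
    pop     ds
    popf                    ; also restores the previous interrupt flag
    ret
```

Note that this only tells you what the CPU sees. As said above, the BIOS or the disk hardware may still be unable to deliver data to those addresses.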

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 1:15 pm
by Onelio
Hi again,

I think I'm giving up, so this thread can now be closed or archived.
It seems that everything over 0xA000:0000 ends in a crash, even though I've set up the A20 line, Qemu tells me that it is enabled, and I've tested it manually.

I don't know if it's a problem with Qemu or with my code (probably mine), but I've only been able to reproduce this weird behavior here, so I'll just do as you guys said and load it at a lower address (maybe 0x10000 (0x07C0:8400)) and move on to the next task.

Thanks for your time and advice. It will be very helpful for building the next steps of my project,
Bye

Re: Qemu error at boot

Posted: Wed Mar 11, 2020 2:13 pm
by PeterX
Onelio wrote:Hi again,

I think I'm giving up, so this thread can now be closed or archived.
It seems that everything over 0xA000:0000 ends in a crash, even though I've set up the A20 line, Qemu tells me that it is enabled, and I've tested it manually.
Read the following, please:
bzt wrote:You can only use 0x00500 up to 0xA0000, which is roughly 640k in total. Linear address 0xA0000 can be accessed by the 0xA000:0000 segment+offset combination, and that's the "top" of the usable memory; everything above (up to 1M) is device memory.
Happy hacking
Peter