Calling the Kernel on Real Hardware

Octocontrabass
Member
Posts: 5586
Joined: Mon Mar 25, 2013 7:01 pm

Re: Calling the Kernel on Real Hardware

Post by Octocontrabass »

human00731582 wrote:I'm certain that the DD program is writing to the disk at sector 0
Sure, but is it sector 0 of the whole disk, or just sector 0 of the partition? Windows tends to hide the difference between those two. I like to use HxD to make sure everything ends up where I want it. (It's also useful for manually editing the disk.)
human00731582 wrote:Interesting, my interrupts were disabled when I called it.
Are you sure?
human00731582
Member
Posts: 38
Joined: Wed Jul 19, 2017 9:46 pm

Re: Calling the Kernel on Real Hardware

Post by human00731582 »

Octocontrabass wrote:Sure, but is it sector 0 of the whole disk, or just sector 0 of the partition? Windows tends to hide the difference between those two. I like to use HxD to make sure everything ends up where I want it. (It's also useful for manually editing the disk.)
I took a look at my disk in my editor (HexEdit), and sure enough, sector 0 has my MBR on it. I also know this was the case because while editing the BPB, Windows started to recognize my drive as FAT-formatted once I wrote the new BPB code into the first sector.
hex-disk-read.png
Octocontrabass wrote: Are you sure?
That is an updated file. I am absolutely positive a CLI was executed right before any A20 operation. Regardless, the A20 is unimportant at this time for the particular hardware I'd like to run on, so I'm gonna JMP over that issue for now. 8)

Starting to get kind of desperate to find a solution. :? It has to be something uninitialized on the hardware -- something the emulator takes care of that hardware just faults on. I already know that it is absolutely 100% that instruction (the CALL/JMP one) that is faulting. Am I missing a special CS / EIP error? Does the stack need to be relocated from 0x90000? Is there something VIA-specific that I am not understanding?

I just don't know at this point.


EDIT: Just went further in the hex editor. It looks like I have plenty of old HTML/C++ files from a long time ago still lingering on the disk, though Windows says the drive is 'empty'. Interesting. I'm going to zero everything on the disk out, as that may be the problem, and I will report back (most likely with another edit/update on this post) with the results of the cleaning. Not expecting anything new, so I'm going to continue sanity-checking myself in the meantime... Having old data on the disk should have nothing to do with the operation of my kernel, because it's self-contained and has no places where the IP will trail off into miscellaneous data. This is likely an irrelevant problem.
2024-05-07: Returning from a 7-year disappearing act; please be kind.
~
Member
Posts: 1228
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Re: Calling the Kernel on Real Hardware

Post by ~ »

In the long run having garbage around will help you ensure you have good code.

Distinguishing garbage from actual data is not really difficult.
LtG
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: Calling the Kernel on Real Hardware

Post by LtG »

Have you tried disassembling the code to see if everything is doing what you'd expect? Especially as it comes to pointers and jump/call destinations?

If you can't find any issues, then you could post the disassembly here, I might have a chance tomorrow to try to look thru your code and the disasm.


edit. I don't think you answered whether or not you have attempted to print what is at the KERNEL_OFFSET location (the place you jump to) after reading the disk, just to confirm that it contains what you expect. You could, for example, print it as hex, then compare that against HexEdit or the disassembly, or run the same test in QEMU and see what values it gives.
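One way to make that comparison easier is to dump the first bytes of the kernel image on the development machine and check them against what the real hardware prints. A minimal host-side sketch in Python, assuming the kernel is a flat binary; `hexdump` and the sample bytes are made up for illustration and are not part of the project under discussion:

```python
def hexdump(data: bytes, base: int = 0, width: int = 16) -> list:
    """Format bytes as lines of '<address>  <hex bytes>'."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{base + off:08X}  {hex_part}")
    return lines


# Illustrative bytes only; in practice read them from the kernel image file.
sample = bytes([0xFA, 0x31, 0xC0, 0x8E, 0xD8])
for line in hexdump(sample, base=0x10000):
    print(line)  # 00010000  FA 31 C0 8E D8
```

With `base` set to the load address, the lines can be compared by eye against the dump shown on the real machine and in QEMU.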

edit2. Am I reading your code right? Is your data_S in the GDT intentionally 16-bit?

Post by human00731582 »

LtG wrote:edit2. Am I reading your code right? Is your data_S in the GDT intentionally 16-bit?
I think I'm going to have an aneurysm, haha. ](*,) I swear I reviewed the GDT for a whole week, and my code was still tricky. I did change it, however there was still no difference. Thanks, though!
LtG wrote:Have you tried disassembling the code to see if everything is doing what you'd expect? Especially as it comes to pointers and jump/call destinations? If you can't find any issues, then you could post the disassembly here, I might have a chance tomorrow to try to look thru your code and the disasm.
edit. I don't think you answered whether or not you have attempted to print what is at the KERNEL_OFFSET location (the place you jump to) after reading the disk, just to confirm that it contains what you expect.
I hadn't implemented any kind of memdump/HexToASCII code yet, but I just expediently wrote some. With your suggestion, I made a hex dump for my memory (gonna have to add it as a command later too -- it seems like a standard beginner-kernel thing, lol) at phys addr 0x10000. We've gotten somewhere now! The output on my VGA monitor is worlds apart from the output from QEMU! Check out these screenshots...
memdump-PHYSICAL.jpg
memdump-QEMU.png
It's those DAMNED BIOS functions again! I had the same problem with hardware INTs getting the A20 status-check to even work!
The problem is that neither of the read functions is working -- I can go back to the old one if I change the loading destination back to 0x1000, but that INT 13h call didn't work either. So, does this mean I'm dealing with some kind of automatic CR0 switch from the BIOS that disables its own interrupts or secures them somehow before it calls a bootloader? Is there such a thing?

Thanks again so so so much for the help, and taking your time to view my source-code.

EDIT: Try not to mind the myriad of spaces, NASM likes to drop every bit of data in my entire kernel at the top of the assembled image, I suppose! That includes my ridiculous amounts of spaces for my gloriously pointless splash screen! :twisted:
~ wrote:In the long run having garbage around will help you ensure you have good code. Distinguishing garbage from actual data is not really difficult.
It may not be difficult, but I personally prefer not to keep it around. :mrgreen:

Post by Octocontrabass »

human00731582 wrote:I took a look at my disk in my editor (HexEdit), and sure enough, sector 0 has my MBR on it. I also know this was the case because while editing the BPB, Windows started to recognize my drive as FAT-formatted once I wrote the new BPB code into the first sector.
Your screenshot shows you editing sector 0 of the logical volume, which may or may not be sector 0 of the physical disk. You might need to run HexEdit as an administrator for the physical disk to show up in the "open special" dialog.

Post by human00731582 »

Octocontrabass wrote:Your screenshot shows you editing sector 0 of the logical volume, which may or may not be sector 0 of the physical disk. You might need to run HexEdit as an administrator for the physical disk to show up in the "open special" dialog.
Alright, my curiosity has been 100% piqued. I took a look as an administrator -- unfortunately, in my frantic rummaging earlier today, I managed to not notice the HexEdit program had me selecting the drives/volumes/partitions and not the physical disks themselves. Interestingly, the physical location of my bootloader on the disk is at exactly the physical location in memory I'd like to load my kernel, though that seems to be pure coincidence.

Here is my loader, sitting patiently at 64 KiB...
hmmm.png
Now, I did try to partition/clean my flash drive with diskpart earlier. Without any partitions, of course my computer didn't even notice there was a drive connected. I tried to run DD as an administrator and that did not add it to physical address 0. How do I force a partition to remain at the actual physical start of the USB's memory? Perhaps I have to write to it directly with my hex editor or some other binary write-capable application. OR I could just load the data from 0x10200 and see if that works.
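The offsets mentioned here translate to sector numbers by plain division. A small sketch of the arithmetic, assuming 512-byte sectors (common for USB flash drives, but an assumption here); the function name is made up for illustration:

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def byte_offset_to_lba(offset: int) -> int:
    """Convert a byte offset on the disk to an LBA sector number."""
    lba, remainder = divmod(offset, SECTOR_SIZE)
    if remainder:
        raise ValueError("offset is not sector-aligned")
    return lba


print(byte_offset_to_lba(0x10000))  # 128: where the loader was found
print(byte_offset_to_lba(0x10200))  # 129: the sector right after it
```

So a loader sitting at 64 KiB is in LBA 128 of the physical disk, not LBA 0.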

On top of all of this, I assume the emulator works because it assumes the base of the partition is the physical base or something. But it begs the question: if I am able to load a bootloader with the actual PC while the MBR is starting at physical storage location 0x10000 as shown in the screenshot, why is it not recognizing the next sector loading if it's using the first as the MBR (or physical sector 0)? Does INT 13h read RAW sectors from physical addy 0?

Getting more and more confused, but this is getting closer to a breakthrough. Hopefully I'm making sense. :lol:

Perhaps your wisdom comes not only from experience, but also from these trite beginner mistakes we all make. :-k :)

Post by Octocontrabass »

human00731582 wrote:Now, I did try to partition/clean my flash drive with diskpart earlier.
Here's the trick: diskpart always writes a MBR with a partition table to the disk. What you want is a disk with no partition table at all, and you can't do that with diskpart.
human00731582 wrote:How do I force a partition to remain at the actual physical start of the USB's memory?
I use a hex editor to clear out the MBR, unplug the disk, and plug it back in. Windows will prompt to format the unreadable disk, and when it does, it will format the entire disk like a partition. There will be no MBR or partition table, and the filesystem will start in the first sector of the disk.
human00731582 wrote:But it begs the question: if I am able to load a bootloader with the actual PC while the MBR is starting at physical storage location 0x10000 as shown in the screenshot,
If it's not at LBA 0, it's not a Master Boot Record. If a disk has a MBR, it will always be in LBA 0. There are no exceptions. What you're looking at is the Volume Boot Record. The VBR can be either at the start of the disk (if the disk is unpartitioned) or the start of the partition.
human00731582 wrote:why is it not recognizing the next sector loading if it's using the first as the MBR (or physical sector 0)?
Windows is kind enough to install a MBR that reads the partition table and loads the VBR for you automatically. That's why your bootloader is running even though it's not at LBA 0.
human00731582 wrote:Does INT 13h read RAW sectors from physical addy 0?
Yes.
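
For anyone following along, the MBR/VBR relationship described above can be checked on a saved copy of a sector. A rough sketch, assuming the classic MBR layout (four 16-byte entries at offset 446, boot signature 55 AA at offset 510); the function name and the synthetic sector are illustrative only. A FAT VBR also ends in 55 AA, so the signature alone does not tell an MBR from a VBR:

```python
import struct

TABLE_OFFSET = 446  # start of the partition table inside the MBR
ENTRY_SIZE = 16     # four entries of 16 bytes each


def parse_partitions(sector: bytes) -> list:
    """Return the non-empty partition entries of a 512-byte MBR."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xAA":
        raise ValueError("not a 512-byte sector with a 55 AA signature")
    parts = []
    for i in range(4):
        start = TABLE_OFFSET + ENTRY_SIZE * i
        entry = sector[start:start + ENTRY_SIZE]
        # status, (CHS skipped), type, (CHS skipped), start LBA, sector count
        status, ptype, lba_start, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:
            parts.append({"index": i, "active": status == 0x80,
                          "type": ptype, "lba_start": lba_start,
                          "sectors": count})
    return parts


# Synthetic MBR: one active partition of type 0x0C starting at LBA 128.
mbr = bytearray(512)
mbr[446:462] = struct.pack("<B3xB3xII", 0x80, 0x0C, 128, 2048)
mbr[510:512] = b"\x55\xAA"

active = [p for p in parse_partitions(bytes(mbr)) if p["active"]]
print(f"{active[0]['lba_start'] * 512:#x}")  # 0x10000
```

The start LBA of the active entry, multiplied by the sector size, is the byte offset where the VBR lives.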

Post by human00731582 »

Octocontrabass wrote:Here's the trick: diskpart always writes a MBR with a partition table to the disk. What you want is a disk with no partition table at all, and you can't do that with diskpart.
Windows is kind enough to install a MBR that reads the partition table and loads the VBR for you automatically. That's why your bootloader is running even though it's not at LBA 0.
This is very good information. With your help, I managed to find a page over at Microsoft explaining this as well:

The master boot code performs the following activities:
   1. Scans the partition table for the active partition.
   2. Finds the starting sector of the active partition.
   3. Loads a copy of the boot sector from the active partition into memory.
   4. Transfers control to the executable code in the boot sector.
...
If the boot device is on a hard disk, the BIOS loads the MBR. The master boot code in the MBR loads the boot sector of the active partition, and transfers CPU execution to that memory address. On computers that are running Windows 2000, the executable boot code in the boot sector finds Ntldr, loads it into memory, and transfers execution to that file.
Unfortunately, I've been making the simple mistake of believing that Windows would actually give you the utility to format a disk 100%, without initializing some sort of FS, and planting its own MBR there. So this whole time, I've been running a "two-stage" loader without even knowing it! Thanks, Microsoft...! I do understand why it's useful if you wanted to multiboot, or had separate partitions for completely separate data, but geez what a headache. :roll:

------------------------------------------------

Here is my process to get the MBR on the actual first sector; for you guys to see, and for any OSDever in the future. Perhaps I can add this to a Wiki page if it seems valuable enough, that way it isn't buried in the forums.
  • Insert USB, start DISKPART.
  • Select the USB disk in DISKPART and use the CLEAN command on it.
  • Start DD for Windows in the Administrative command line, using "DD --list" to get a list of all active partitions/drives/removable-media.
  • Copy the target media into a local file, to make sure that it is the device we want to write to. Be careful to add the --size directive! Otherwise, DD will malfunction and you will end up with a file 4x bigger than your USB drive at least!
  • Open the local file/binary in a hex editor -- verify it's the physical disk that you're targeting (not a volume on it) by looking at the first sector.
  • Now, write your image over the beginning of the disk (sector 0) with DD (see attached screenshot).
  • It worked! Verify it with your hex editor (screenshot below)!
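
The last two steps (write, then verify) can be sketched without DD. This is only an illustration in Python: it uses an ordinary file as a stand-in for the device, since the path of a raw disk and the privileges needed to open it are platform-specific and not shown here. `write_boot_image` is a hypothetical helper name:

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def write_boot_image(target_path: str, image: bytes) -> bool:
    """Write image at offset 0 of target_path and read it back to verify."""
    if len(image) % SECTOR_SIZE:
        raise ValueError("image must be a whole number of sectors")
    with open(target_path, "r+b") as f:
        f.seek(0)
        f.write(image)
        f.flush()
        os.fsync(f.fileno())  # push the data out before reading it back
        f.seek(0)
        return f.read(len(image)) == image


# Demo on a 4-sector scratch file standing in for the USB drive.
with tempfile.TemporaryDirectory() as tmp:
    disk = os.path.join(tmp, "disk.img")
    with open(disk, "wb") as f:
        f.write(b"\xE5" * (4 * SECTOR_SIZE))
    boot_sector = bytes(510) + b"\x55\xAA"  # empty sector with signature
    print(write_boot_image(disk, boot_sector))  # True
```

The read-back check does the same job as opening the disk in a hex editor afterwards: it confirms the bytes landed at the very start of the target.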
THE GOOD NEWS: It works! You guys are awesome! I wish there was a karma system on this site so I could give back at least somehow. But once I'm more experienced I'll be in your shoes helping newer members with their problems, I'm sure. :)

Now it's on to figuring out why IRQ 15 (47 after remapping) is firing even though there's a mask on everything but 0&1 (timer&keyboard)! It's halting my program, with no indication as to why. Off I go! \:D/

Post by LtG »

The MBR isn't part of any FS, it's part of the "bootability" of a disk. And it's not just MS, it's the industry standard these days for BIOS. While BIOS itself is agnostic about it (AFAIK BIOS does not require there to be a partition table), it's the expected "formatting" of a disk. It's what all disk tools rely on.

As for your multipart setup, couldn't you make it simpler by just using a hexeditor to replace the MBR? Just do it to the _disk_ not a partition on the disk.

Note also, that the correct thing would be to _NOT_ replace the MBR, instead put your bootsector on one of the volumes and let the MBR load your bootsector and off you go. Normally you shouldn't really mess with the MBR, except possibly adding/changing partitions (as per users wishes) and marking your partition active so the MBR loads your bootsector.
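
Marking a partition active comes down to one status byte per table entry. A minimal sketch, assuming the classic MBR layout (entries at offset 446, 16 bytes each, status byte first, type byte at offset 4 of the entry); `mark_active` is a made-up name and it works on an in-memory copy of the sector rather than the disk itself:

```python
def mark_active(mbr: bytes, index: int) -> bytes:
    """Return a copy of the MBR with only partition `index` flagged active."""
    if len(mbr) != 512 or mbr[510:512] != b"\x55\xAA":
        raise ValueError("not a 512-byte sector with a 55 AA signature")
    if not 0 <= index < 4:
        raise ValueError("partition index must be 0..3")
    if mbr[446 + 16 * index + 4] == 0:
        raise ValueError("partition entry is empty (type 0)")
    out = bytearray(mbr)
    for i in range(4):
        # 0x80 = active/bootable, 0x00 = inactive
        out[446 + 16 * i] = 0x80 if i == index else 0x00
    return bytes(out)


# Synthetic table with two inactive partitions (types 0x0C and 0x83).
sector = bytearray(512)
sector[446 + 4] = 0x0C
sector[462 + 4] = 0x83
sector[510:512] = b"\x55\xAA"

updated = mark_active(bytes(sector), 1)
print(updated[446], updated[462])  # 0 128
```

The boot code and the rest of the table are left untouched, which is the point of keeping the existing MBR.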

As for IRQ15, have you checked for spurious IRQs?
http://wiki.osdev.org/8259_PIC#Spurious_IRQs

Post by human00731582 »

LtG wrote:Note also, that the correct thing would be to _NOT_ replace the MBR, instead put your bootsector on one of the volumes and let the MBR load your bootsector and off you go. Normally you shouldn't really mess with the MBR, except possibly adding/changing partitions (as per users wishes) and marking your partition active so the MBR loads your bootsector.
Wonderful suggestion. Once I get a more dynamic loading system going (soon :) ), I'm going to make sure I maintain the original MBR that Windows drops in there, so I can actually have a partition table. Then, after writing a filesystem setup for the drive, I could make my own DISKPART utility to mark active partitions as well. Of course, this is long down the road and I'm staying level-headed about it -- but hey, a guy can dream right? O:)
LtG wrote:As for IRQ15, have you checked for spurious IRQs?
http://wiki.osdev.org/8259_PIC#Spurious_IRQs

szSATAINT db "Secondary ATA...", 0
szSpurious db "Spurious", 0
szRealIRQ db "Real IRQ", 0
ISR_SecondaryATA:
	mov bl, 0x02
	mov esi, szSATAINT
	call _screenWrite
	
	; Check for spurious IRQ
	call PIC_getISR		; AH = slave ISR, AL = master ISR
	test ah, 10000000b	; Is bit 7 (IRQ15) in service? TEST ignores the other ISR bits, CMP would not.
	jnz .realIRQ
	mov dl, 0		; spurious: only send the master an EOI.
	call PIC_sendEOI
	mov esi, szSpurious
	jmp .leaveFunc
 .realIRQ:	
	mov dl, 15		; if it was real, send the EOI to both the slave and the master.
	call PIC_sendEOI
	mov esi, szRealIRQ
	
 .leaveFunc:
	call _screenWrite
	ret
This is of course going to change to actually handle the interrupt later; for now it is only here to debug the interrupt. As the attached screenshot shows, it's definitely spurious. However, I realized my Master PIC had been masking the cascade channel (IRQ2) since the inception of my software, so I enabled that and TA-DA, no more spurious IRQs and everything works! =D>
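
For reference, the mask values and the EOI decision can be modelled off-target. This is a Python sketch of the logic only, not driver code: it assumes the 8259 convention that a set mask bit disables a line, and an in-service value laid out as (slave ISR << 8) | master ISR, matching the AH/AL comment in the handler. Both function names are hypothetical:

```python
def pic_masks(enabled_irqs) -> tuple:
    """Return (master_mask, slave_mask); a set bit means the IRQ is masked."""
    mask = 0xFFFF  # start with everything masked
    for irq in enabled_irqs:
        if not 0 <= irq <= 15:
            raise ValueError("IRQ must be 0..15")
        mask &= ~(1 << irq)
    return mask & 0xFF, mask >> 8


def eoi_targets(irq: int, isr: int) -> list:
    """Which PICs get an EOI; isr = (slave ISR << 8) | master ISR."""
    in_service = bool(isr & (1 << irq))
    if irq == 7 and not in_service:
        return []                     # spurious on master: no EOI at all
    if irq == 15 and not in_service:
        return ["master"]             # spurious on slave: master only
    return ["slave", "master"] if irq >= 8 else ["master"]


print([hex(m) for m in pic_masks([0, 1])])     # ['0xfc', '0xff']
print([hex(m) for m in pic_masks([0, 1, 2])])  # ['0xf8', '0xff']
print(eoi_targets(15, 0x0000))                 # ['master']
print(eoi_targets(15, 0x8000))                 # ['slave', 'master']
```

The first line is the original setup (timer and keyboard only); the second adds the cascade line, which is the change that was made here.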