OSDev.org

Posted: **Wed Jul 13, 2022 4:07 am**

I have some working code that I developed on my Debian machine, and it works there.
When I build and run the same code on an Arch machine, the disk read fails. If I build the image on Debian I can run it on Arch, so the problem isn't QEMU.
It looks like some compiler weirdness, but I'm using a cross compiler from within a Debian Docker container, so it should be exactly the same on both machines...
Right before int 13h I have:

Code:

AH=0x42, DL=0x80, DS=0, SI=0x7920
x /b 0x7920+0 -> 0x10
x /b 0x7920+1 -> 0x00
x /h 0x7920+2 -> 0x01
x /w 0x7920+4 -> 0x10880
x /q 0x7920+8 -> 0x01

and right after it I have:

Code:

CF=0
AH=0

which should mean that the read was successful, except the memory area is

Code:

x /5b 0x10880 -> 0x41 0x0 0x42 0x0 0x43

with this alternating growing pattern.

I don't know if the code is relevant, but here it is:
https://github.com/Bonfra04/BonsOS/blob ... disk.c#L34
(yes, it's heavily "inspired" by Limine)

Posted: **Wed Jul 13, 2022 7:33 am**

What is the status of the packet after the transfer? Size and number of blocks transferred should get modified after the call.
When you disassemble the read_sector() and rm_int() routines and compare them, do they differ between the Debian build and the Arch build?

Posted: **Wed Jul 13, 2022 10:51 am**

mtbro wrote: What is the status of the packet after the transfer? Size and number of blocks transferred should get modified after the call.

They are left untouched; the entire packet is the same before and after the interrupt.

mtbro wrote: When you disassemble the read_sector() and rm_int() routine and compare it, does it differ? One from Debian and one compiled on Arch.

This is the diff of the read sector function. It is the only one compiled from C; rm_int is assembled directly and I highly doubt that would change (checking the first instructions manually, it seems to be the same).
It seems like the working one uses the 64-bit version of some instructions? That doesn't make a lot of sense, since this is a 32-bit binary.

Posted: **Wed Jul 13, 2022 12:58 pm**

The buffer address in the DAP is a 16-bit far pointer, not a 32-bit linear pointer.

Bonfra wrote:They are left untouched, the entire packet is the same before and after the interrupt

What are the register values after the interrupt? (Including EFLAGS.)

Bonfra wrote:Seems like it uses the 64-bit version of some instructions in the working one? Which doesn't make a lot of sense since this is a 32-bit binary

Your disassembler doesn't know those are 32-bit instructions so it's disassembling them as 64-bit. Tell your disassembler to disassemble 32-bit code and you should get results that make sense.

Posted: **Wed Jul 13, 2022 1:11 pm**

Octocontrabass wrote:The buffer address in the DAP is a 16-bit far pointer, not a 32-bit linear pointer.

But here it states that the transfer buffer is a 4-byte dword. I do believe you, in fact I'm hopelessly trying to update the code right now, but why does it say this here? Also, why would this packet work in the (let's call it) Debian case but not in the Arch one?

Octocontrabass wrote: What are the register values after the interrupt? (Including EFLAGS.)

AH is set to zero and EFLAGS = [ IF ZF PF ]

Octocontrabass wrote: Your disassembler doesn't know those are 32-bit instructions so it's disassembling them as 64-bit. Tell your disassembler to disassemble 32-bit code and you should get results that make sense.

Yea you're right it makes sense now XD

Posted: **Wed Jul 13, 2022 1:23 pm**

Bonfra wrote: But here it states that the transfer buffer is a 4-byte dword. I do believe you, in fact I'm hopelessly trying to update the code right now, but why does it say this here?

The BIOS EDD specification says it's a 4-byte dword, but also says that the dword contains a 16-bit far pointer instead of a 32-bit linear pointer. Whoever wrote that part of RBIL only copied the size of the data and not its type.

Bonfra wrote: Also, why would this packet work in the (let's call it) Debian case but not in the Arch one?

I would guess it doesn't work in either case. What data do you see in memory in the "working" version? Does that data actually match the contents of your disk?

Bonfra wrote: AH is set to zero and EFLAGS = [ IF ZF PF ]

The read is successful, but the data ends up in the wrong location in memory.
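
To make the "wrong location" concrete, here is a minimal sketch of how the dword from the first post (0x10880) would be interpreted if the field is read as segment:offset, as the EDD layout described above implies. The helper name far_dword_to_linear is made up for illustration; it is not part of any BIOS or library API.

```python
def far_dword_to_linear(dword):
    """Split a 32-bit DAP buffer field into (segment, offset, linear address).

    Assumes the real-mode far pointer layout: offset in the low word,
    segment in the high word, linear = segment * 16 + offset.
    """
    offset = dword & 0xFFFF            # low 16 bits
    segment = (dword >> 16) & 0xFFFF   # high 16 bits
    return segment, offset, (segment << 4) + offset


seg, off, lin = far_dword_to_linear(0x10880)
print("%04x:%04x -> 0x%x" % (seg, off, lin))  # prints "0001:0880 -> 0x890"
```

Under that reading the sector would land at 0x890 rather than 0x10880. For any linear address below 0x10000 the high word is zero, so the two interpretations give the same address; whether that has anything to do with the Debian build is not established in this thread.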

Posted: **Wed Jul 13, 2022 1:36 pm**

Octocontrabass wrote: The BIOS EDD specification says it's a 4-byte dword, but also says that the dword contains a 16-bit far pointer instead of a 32-bit linear pointer. Whoever wrote that part of RBIL only copied the size of the data and not its type.

So 16 of the 32 bits of this field are for the far pointer; what are the other 16 bits used for? Maybe a dumb question, but it's been a while since I did anything outside 64-bit land.
This is the formula I used for 32-bit far pointers:

Code:

#define LIN_TO_FAR_ADDR(linaddr) (((linaddr >> 4) << 16) | (linaddr & 0xf))

I guess I can easily convert it to get a 16-bit far pointer like so(?):

Code:

#define LIN_TO_FAR_ADDR(linaddr) (((linaddr >> 2) << 8) | (linaddr & 0xf))

Octocontrabass wrote: I would guess it doesn't work in either case. What data do you see in memory in the "working" version? Does that data actually match the contents of your disk?

Well, yes: my entire kernel is loaded by this function, so unless QEMU, VMware, VBox, and the motherboard I use for testing are all doing some crazy stuff with the image generated on Debian (and not with the Arch one), it does work.

Octocontrabass wrote: The read is successful, but the data ends up in the wrong location in memory.

yea it makes total sense

Posted: **Wed Jul 13, 2022 1:46 pm**

While it's possible to peek around in the QEMU console, I personally prefer gdb and use QEMU's gdb stub to connect to it. Real mode in gdb is a pain, but you can use gdb's macros and/or Python to make things easier. The last time I had a hidden mistake of my own, I wrote this gdb command in Python to ease the debugging:

Code:

#!/usr/bin/python
# Meant to be loaded from inside gdb; the gdb module only exists there.
import gdb

def b2int(raw_bytes):
    """Convert raw little-endian bytes into a number."""
    return int.from_bytes(raw_bytes, byteorder='little')

def peek_qword(addr_start):
    """Read 8 bytes of inferior memory at addr_start as one integer."""
    return b2int(gdb.selected_inferior().read_memory(addr_start, 8))

class lba_dap(gdb.Command):
    """Dump an int 13h disk address packet (DAP)."""

    def __init__(self):
        super(lba_dap, self).__init__("lba_dap", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        if len(arg) == 0:
            print("Arguments required (address of the DAP).")
            return

        # parse the arguments
        try:
            addr_dap = int(arg.split(" ")[0], 16)
        except ValueError:
            print("Failed to parse arguments. Arguments need to be in hex format.")
            return

        t = peek_qword(addr_dap)       # size, reserved, block count, buffer
        t2 = peek_qword(addr_dap + 8)  # starting LBA

        # The buffer field is a far pointer: offset in the low word, segment in the high word.
        dap_buf_seg = (t >> 48) & 0xffff
        dap_buf_ofst = (t >> 32) & 0xffff
        dap_linear = 0x10 * dap_buf_seg + dap_buf_ofst

        print("DAP: %x" % addr_dap + "\nsize:\t0x%x" % (t & 0xff) + "\nblocks:\t%d" % ((t >> 16) & 0xffff)
              + "\nbuf:\t%04x:" % dap_buf_seg + "%04x" % dap_buf_ofst + " ( 0x%x )" % dap_linear + "\nLBA:\t0x%x" % t2)

lba_dap()

Load it in gdb with the source command, then use lba_dap with the address of the packet (in hex) to dump its contents.

Posted: **Wed Jul 13, 2022 1:54 pm**

Bonfra wrote: So 16 of the 32 bits of this field are for the far pointer; what are the other 16 bits used for? Maybe a dumb question, but it's been a while since I did anything outside 64-bit land.

The lower 16 bits are the offset, the upper 16 bits are the segment. It's a far pointer in 16-bit mode.
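
As a rough sketch of that layout, the snippet below converts a linear address under 1 MiB into a segment:offset dword and packs a 16-byte packet with the same fields as the dump in the first post (size, reserved byte, block count, buffer, LBA). The names linear_to_far and build_dap are hypothetical. The conversion picks the normalized pair (offset 0 to 15), which is what the first LIN_TO_FAR_ADDR formula posted earlier in the thread computes; other segment:offset pairs for the same address are equally valid.

```python
import struct


def linear_to_far(linaddr):
    """Return a dword with the segment in the high word and the offset in the low word."""
    if not 0 <= linaddr <= 0xFFFFF:
        raise ValueError("normalized real-mode far pointers only cover the first 1 MiB")
    segment = linaddr >> 4   # 16-byte paragraph number
    offset = linaddr & 0xF   # remainder within the paragraph
    return (segment << 16) | offset


def build_dap(linaddr, lba, count=1):
    """Pack a 16-byte packet: size, reserved, block count, far pointer, 64-bit LBA."""
    return struct.pack("<BBHIQ", 0x10, 0, count, linear_to_far(linaddr), lba)


dap = build_dap(0x10880, lba=1)
print("buffer field: 0x%08x" % linear_to_far(0x10880))  # prints "buffer field: 0x10880000"
print(dap.hex())
```

With this encoding the buffer field for 0x10880 becomes 1088:0000 instead of the raw linear value.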

Bonfra wrote: Well, yes: my entire kernel is loaded by this function, so unless QEMU, VMware, VBox, and the motherboard I use for testing are all doing some crazy stuff with the image generated on Debian (and not with the Arch one), it does work.

But does the data in memory match the data on the disk? It's entirely possible some coincidence (or bug) happens to result in your kernel running despite the incorrect load address.

Posted: **Wed Jul 13, 2022 2:11 pm**

mtbro wrote: While it's possible to peek around in the QEMU console, I personally prefer gdb and use QEMU's gdb stub to connect to it. Real mode in gdb is a pain, but you can use gdb's macros and/or Python to make things easier.

I... don't see anything weird about gdb in 16 bits; I've always used it with QEMU just fine. Anyway, this is a pretty cool utility, and I'm surely adding it to my debugging tools :)

Octocontrabass wrote: But does the data in memory match the data on the disk? It's entirely possible some coincidence (or bug) happens to result in your kernel running despite the incorrect load address.

Weirdly enough, yes. All 486 KB of my kernel were loaded correctly.
Anyhow, I changed that address to be a far pointer and it now boots on the Arch machine too! Thanks a lot! I really need to find a better reference before I continue writing code XD

Posted: **Wed Jul 13, 2022 2:27 pm**

Bonfra wrote: I... don't see anything weird about gdb in 16 bits

Well, gdb is not aware of segmented addresses, so you need to do the translation yourself. Setting the target architecture alone is not enough either: disassembling and single-stepping can yield bogus output. Using set tdesc with a proper XML description file helps. This was also asked here once.
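
A small sketch of doing that translation by hand: the helpers below (hypothetical names, plain Python, no gdb module needed) turn a segment and offset taken from info registers into the flat address gdb expects, and build an x command string to paste into the gdb prompt. A20 wraparound is not modelled.

```python
def seg_off_to_linear(segment, offset):
    """Real-mode translation: segment * 16 + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


def gdb_examine(segment, offset, count=8, fmt="xb"):
    """Build an 'x' command for the linear address behind segment:offset."""
    return "x/%d%s 0x%x" % (count, fmt, seg_off_to_linear(segment, offset))


# Five bytes at 0001:0880, and five instructions at 07c0:0000.
print(gdb_examine(0x0001, 0x0880, 5, "xb"))  # prints "x/5xb 0x890"
print(gdb_examine(0x07C0, 0x0000, 5, "i"))   # prints "x/5i 0x7c00"
```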

int 13h/ah=42h fails if compiled on Arch
