IDE controller booting issues
Posted: Fri Jan 24, 2014 4:23 pm
Hello all,
This is my first post on OSDev.org. If this community is not the best place for this post, could someone please suggest a better-suited forum?
My colleagues and I are developing an FPGA-based PCIe (legacy endpoint) IDE controller with a custom expansion ROM. We implemented the expansion ROM according to the BIOS Enhanced Disk Drive Specification-4 (EDD-4), and we have confirmed that its INT 0x13 responses are similar to those returned by a commercial IDE controller.
Our expansion ROM is set up with a Boot Connection Vector (BCV). The BCV hooks INT 0x13 so that we can implement the functions specific to our controller. During boot, BIOS makes INT 0x13 calls and we handle them per EDD-4. BIOS loads the MBR and jumps into it, passing control to the bootloader and then the kernel. Linux makes a variety of calls through our INT 0x13 handler, and all of them complete successfully. It is not until the kernel modules begin to load and initialize that boot fails.
We are testing on Ubuntu 12.04 LTS (kernel 3.2.24). We have three scenarios to present, with the hard drives defined as follows:
Disk A has Ubuntu installed behind GRUB (installed using mobo controller).
Disk B is formatted as ext4, has no OS installed, and has a few test files on it.
Scenario 1: (Please see attached: dmesg.txt and lspci.txt.)
For this scenario, we have Disk A connected to the mobo controller and Disk B connected to our controller. Disk A is set as the primary boot device in BIOS. Disk A boots successfully, but Disk B is not accessible within Linux. Our controller is connected to PCI bus 1 and appears as the Xilinx IDE Interface in lspci.txt. From dmesg.txt, it is apparent that the Linux driver pata_acpi disables our PCI interrupt (time index: 1.980056), and lspci.txt indicates that bus mastering (which is supported: prog-if 0x85) is being disabled. One other thing to note: the proper driver for our controller is ata_piix, not pata_acpi.
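For anyone reading along without the attachments, here is a minimal Python sketch of how we read the two PCI fields mentioned above. The bit positions come from the PCI IDE controller programming-interface byte and the standard PCI command register. The command-register value 0x0401 is a made-up example, not the value from our lspci.txt, and the helper name decode is ours.

```python
# Bit layout of the PCI IDE programming-interface byte
# (class 0x01, subclass 0x01).
PROG_IF_BITS = {
    0: "primary channel in native-PCI mode",
    1: "primary channel mode is switchable",
    2: "secondary channel in native-PCI mode",
    3: "secondary channel mode is switchable",
    7: "bus-master DMA capable",
}

# Selected bits of the 16-bit PCI command register (config offset 0x04).
COMMAND_BITS = {
    0: "I/O space enabled",
    1: "memory space enabled",
    2: "bus master enabled",
    10: "INTx interrupt disabled",
}


def decode(value, layout):
    """Return the names of all bits in `layout` that are set in `value`."""
    return [name for bit, name in sorted(layout.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # prog-if value reported by our controller.
    for name in decode(0x85, PROG_IF_BITS):
        print("prog-if:", name)

    # Hypothetical command-register value: I/O decoding on, bus master
    # off, INTx masked. Substitute the value from your own lspci output.
    for name in decode(0x0401, COMMAND_BITS):
        print("command:", name)
```

With prog-if 0x85 this reports both channels in native-PCI mode plus bus-master capability, which is why we expected bus mastering to stay enabled.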
Scenario 2:
For this scenario, we have Disk A connected to the mobo controller and Disk B connected to our controller. Disk A is set as the primary boot device in BIOS. Scenario 2 differs from scenario 1 in that we configure our controller to falsely report the status register (0x7F, no drive attached) the first time BIOS interrogates it; on subsequent reads we return the status register accurately. (BIOS only seems to interrogate our controller once during POST.) What we are seeing is that BIOS interrogates the controller within milliseconds after power-on, sees that no drive is attached, and then continues to boot Linux from Disk A. Linux is able to set up our controller successfully (using ata_piix) and we are able to access Disk B within Linux. So in essence, initially advertising that no drive is attached forces BIOS to bypass our controller early in POST, but still allows Linux to interact with the controller once a healthy status is reported a few seconds later.
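To make the 0x7F trick concrete, here is a small Python sketch that decodes an ATA status byte. The bit names follow the classic ATA status register layout (bits 1 and 2 are obsolete in newer ATA revisions). Treating 0x7F or 0xFF as "no drive attached" is an assumption based on what our BIOS appears to do, and the function names are ours.

```python
# Classic ATA status register bit names, bit 0 first.
ATA_STATUS_BITS = {
    0: "ERR",   # error
    1: "IDX",   # index (obsolete)
    2: "CORR",  # corrected data (obsolete)
    3: "DRQ",   # data request
    4: "DSC",   # drive seek complete
    5: "DF",    # device fault
    6: "DRDY",  # device ready
    7: "BSY",   # busy
}


def decode_status(status):
    """Return the names of the status bits set in `status`."""
    return [name for bit, name in sorted(ATA_STATUS_BITS.items())
            if status & (1 << bit)]


def looks_absent(status):
    """Assumption: firmware treats 0x7F or 0xFF as 'no drive attached'."""
    return status in (0x7F, 0xFF)


if __name__ == "__main__":
    # 0x7F is the value we fake on the first read; 0x50 (DRDY + DSC) is
    # a typical idle status from a healthy drive.
    for value in (0x7F, 0x50):
        print(hex(value), decode_status(value), "absent:", looks_absent(value))
```

A status with ERR, DRQ, DF and DRDY all set at once is not something a healthy drive would report, which is presumably why BIOS gives up on the channel when it sees 0x7F.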
Scenario 3:
For this scenario, we have Disk A connected to our controller with no other hard drives connected to the system. Disk A is set as the primary boot device in BIOS. With this setup, the system gets through BIOS POST, then executes the MBR, which in turn loads GRUB, which then loads the kernel. Boot fails when the kernel tries to mount the root file system, with the error message: "Gave up waiting for root device. ALERT! /dev/disk/by-uuid/... does not exist".
We are currently searching the Linux sources for any indication as to why the pata_acpi driver is being assigned (as opposed to ata_piix) and why our PCI interrupt is being disabled. If anyone in the community has any hints as to why this is happening, we would greatly appreciate it.
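As a starting point for that search, here is a rough Python sketch of how we understand the Linux PCI core to match a device against a driver's ID table: a vendor/device comparison with a wildcard, plus a masked class-code comparison. The two table entries and the device's vendor and device IDs are placeholders for illustration only. They are not copied from the ata_piix or pata_acpi sources or from our lspci.txt.

```python
from collections import namedtuple

PCI_ANY_ID = 0xFFFFFFFF  # wildcard value used in Linux PCI ID tables

# One ID-table entry: vendor, device, 24-bit class code, and class mask.
PciId = namedtuple("PciId", "vendor device class_code class_mask")


def entry_matches(entry, vendor, device, class_code):
    """Mirror the vendor/device/class test the PCI core applies per entry."""
    return (entry.vendor in (PCI_ANY_ID, vendor)
            and entry.device in (PCI_ANY_ID, device)
            and ((entry.class_code ^ class_code) & entry.class_mask) == 0)


# Hypothetical tables: one driver keyed on specific vendor/device IDs,
# one keyed only on the IDE class code (base 0x01, subclass 0x01).
ID_BASED_DRIVER = [PciId(0x8086, 0x7010, 0, 0)]
CLASS_BASED_DRIVER = [PciId(PCI_ANY_ID, PCI_ANY_ID, 0x010100, 0xFFFF00)]

if __name__ == "__main__":
    # Placeholder identity for a controller with class 0x0101, prog-if 0x85.
    vendor, device, class_code = 0x10EE, 0x0007, 0x010185
    for name, table in (("id-based", ID_BASED_DRIVER),
                        ("class-based", CLASS_BASED_DRIVER)):
        hit = any(entry_matches(e, vendor, device, class_code) for e in table)
        print(name, "driver matches:", hit)
```

If this reading is right, a driver keyed only on the IDE class code would claim any device the ID-based drivers do not list, so the vendor and device IDs our FPGA reports may matter more than we assumed.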
Regards,
Eric