PCI device scan algorithm hangs sometimes

sevobal
Member
Posts: 63
Joined: Sun Oct 22, 2006 7:11 am

PCI device scan algorithm hangs sometimes

Post by sevobal »

Hi everyone!

I've written a simple algorithm that scans all possible PCI buses, devices and device functions of a system and stores the address of every device it detects. This works like a charm on emulators and on some PCs here, but some other computers hang, so I guess there are a few aspects where my code could be improved:

Code: Select all

scan_pci_bus:
    push r9
    push rax
    push rbx
    push rcx
    push rdx

    mov r9, 0x104F00              ; Memory address where we store the address of all devices we found
    xor rax, rax                  ; This register will address the bus
    xor rbx, rbx                  ; This register will address the device on the bus
    xor rcx, rcx                  ; This register will address the function of a device

.bus_loop:
    cmp eax, 0xFF
    je .end

.device_loop:
    cmp ebx, 0x0B
    je .end_device_loop

.function_loop:
    cmp ecx, 0x09
    je .end_function_loop

    push rax
    xor rdx, rdx                  ; We need offset 0
    call pci_config_read          ; ( eax = bus, ebx = device, ecx = function, edx = offset);

    cmp eax, 0xFFFF0000
    jae .continue

    pop rax
    push rax
    push rbx
    push rcx

    shl eax, 16
    or eax, 0x80000000
    shl ebx, 11
    shl ecx, 8
    or eax, ebx
    or eax, ecx

    mov [r9], rax
    add r9, 0x08

    pop rcx
    pop rbx

.continue:
    pop rax
    inc ecx
    jmp .function_loop

.end_function_loop:
    xor ecx, ecx
    inc ebx
    jmp .device_loop

.end_device_loop:
    xor ebx, ebx
    inc eax
    jmp .bus_loop

.end:
    mov rax, 0xFFFFFFFFFFFFFFFF
    mov [r9], rax
    pop rdx
    pop rcx
    pop rbx
    pop rax
    pop r9
    ret

;*
;
; eax  => bus
; ebx  => device
; ecx  => function
; edx  => offset
; eax  <= retval
;
;*
pci_config_read:
    cli
    push rbx
    push rcx
    push rdx

    shl eax, 16
    or eax, 0x80000000
    shl ebx, 11
    shl ecx, 8
    and edx, 0xFC
    or eax, ebx
    or eax, ecx
    or eax, edx

    mov dx, 0xCF8                 ; PCI config address / register
    out dx, eax

    mov dx, 0xCFC                 ; PCI config data / register
    in eax, dx

    pop rdx
    pop rcx
    pop rbx
    sti
    ret
So maybe there is a big mistake I haven't figured out yet...

Thanks for any hints!

Re: PCI device scan algorithm hangs sometimes

Post by sevobal »

Can nobody help me? I've debugged the code a little and it seems that all machines hang in the bus_loop. Some machines hang after 5 bus_loop runs, some hang after 10. There is no pattern between them.

I'm a little bit puzzled by this.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: PCI device scan algorithm hangs sometimes

Post by Brendan »

Hi,
sevobal wrote: I've written a simple algorithm that scans all possible PCI buses, devices and device functions of a system and stores the address of every device it detects. This works like a charm on emulators and on some PCs here, but some other computers hang, so I guess there are a few aspects where my code could be improved:
I took a quick look and didn't see anything that'd cause it to lock up.

There's plenty of "wrongness" though. You're searching bus 0x00 to bus 0xFE (stopping before you check bus 0xFF), devices 0x00 to 0x0A and functions 0x00 to 0x08; but there are up to 256 buses, up to 32 devices per bus and up to 8 functions per device. Basically you're checking a function that can't exist (function 0x08), and only doing 25245 checks (255 * 11 * 9) instead of doing 65536 checks (256 * 32 * 8).
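
To make the "function that can't exist" point concrete, here's a minimal C sketch. It assumes the address is built with the same shifts as the posted pci_config_read and with no masking; the helper names are made up for illustration. Because the function field is only 3 bits wide, function 0x08 spills into the device field, so probing "device 0, function 8" actually reads device 1, function 0.

```c
#include <stdint.h>

/* Limits implied by the field widths of the 0xCF8 address register. */
enum { PCI_MAX_BUSES = 256, PCI_MAX_DEVICES = 32, PCI_MAX_FUNCTIONS = 8 };

/* Hypothetical helper: builds the config address exactly like the posted
   assembly does (shift and OR, no masking of device or function). */
static uint32_t pci_config_address(uint32_t bus, uint32_t device,
                                   uint32_t function, uint32_t offset)
{
    return 0x80000000u | (bus << 16) | (device << 11) | (function << 8)
           | (offset & 0xFCu);
}

/* Number of probes a full brute-force scan has to do. */
static uint32_t pci_brute_force_checks(void)
{
    return (uint32_t)PCI_MAX_BUSES * PCI_MAX_DEVICES * PCI_MAX_FUNCTIONS;
}
```

With these helpers, pci_config_address(0, 0, 8, 0) and pci_config_address(0, 1, 0, 0) come out identical, so the original loop can store the same device twice under two different addresses.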

Next, scanning the PCI buses like this is slow (and it could be several hundred times faster). There are flags in a device's PCI header that determine if it has more than one function or not; so for a single-function device you can check the first function then skip the other 7 functions. Also, if/when you find a "PCI to PCI bridge" you can get the "secondary bus" number from its header to determine the number of the bus on the other side of the bridge; which means you only need to scan PCI buses that actually exist and don't need to scan all possible bus numbers.
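
As a rough sketch of where those fields live, assuming you read configuration space one dword at a time like the posted pci_config_read does (the function names here are made up): the header type is the third byte of the dword at offset 0x0C and its top bit is the multifunction flag, the class/subclass pair is the top half of the dword at offset 0x08 (0x06/0x04 means PCI to PCI bridge), and the secondary bus number is the second byte of the dword at offset 0x18 in a bridge's header.

```c
#include <stdbool.h>
#include <stdint.h>

/* dword at offset 0x0C: bits 16-23 are the header type,
   and bit 7 of the header type is the multifunction flag. */
static bool pci_is_multifunction(uint32_t dword_0c)
{
    return ((dword_0c >> 16) & 0x80u) != 0;
}

/* dword at offset 0x08: bits 24-31 are the class code, bits 16-23 the
   subclass. Class 0x06 with subclass 0x04 is a PCI to PCI bridge. */
static bool pci_is_pci_to_pci_bridge(uint32_t dword_08)
{
    return (dword_08 >> 16) == 0x0604u;
}

/* dword at offset 0x18 of a bridge header: bits 0-7 primary bus,
   bits 8-15 secondary bus, bits 16-23 subordinate bus. */
static uint8_t pci_secondary_bus(uint32_t dword_18)
{
    return (uint8_t)(dword_18 >> 8);
}
```

Each helper takes the raw dword you already got back from the config read, so none of them touches the I/O ports itself.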

The easiest way to do this is with a recursive routine (e.g. a routine that scans one bus; that calls itself if/when it finds a PCI to PCI bridge). To keep the code clean/maintainable I'd also use a separate routine for checking one device, and checking one function. For example:

Code: Select all

checkBus:
    for(device = 0; device < 32; device++) {
        checkDevice(bus, device);
    }
    ret


checkDevice:
    if(isDeviceMultifunction(bus, device) == TRUE) {
        for(function = 0; function < 8; function++) {
            checkFunction(bus, device, function);
        }
    } else {
        checkFunction(bus, device, 0x00);
    }
    ret


checkFunction:
    type = getFunctionType(bus, device, function);
    switch(type) {
        case PCItoPCIbridge:
            secondaryBus = getSecondaryBusForBridge(bus, device, function);
            if(secondaryBus != 0x00) checkBus(secondaryBus);
            break;
        default:
            break;
    }
    ret
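
Here's the same idea as a small C sketch that can run outside a kernel. Everything hardware-related is an assumption for illustration: pci_config_read32 reads from a made-up table instead of ports 0xCF8/0xCFC, and the vendor/device IDs in that table are arbitrary. It also adds a vendor ID check (0xFFFF means nothing is there), which the pseudocode above leaves out.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_NO_DEVICE 0xFFFFFFFFu
#define MAX_FOUND 32

/* One entry of the mock configuration space (hypothetical test data). */
struct mock_function {
    uint8_t bus, device, function;
    uint32_t id;        /* offset 0x00: device ID << 16 | vendor ID */
    uint32_t class_rev; /* offset 0x08: class, subclass, prog IF, revision */
    uint32_t header;    /* offset 0x0C: header type in bits 16-23 */
    uint32_t buses;     /* offset 0x18: bus numbers (bridges only) */
};

static const struct mock_function mock_space[] = {
    {0, 0, 0, 0x00011234u, 0x06000000u, 0x00000000u, 0},           /* host bridge */
    {0, 1, 0, 0x00021234u, 0x06040000u, 0x00010000u, 0x00010100u}, /* bridge to bus 1 */
    {0, 2, 0, 0x00031234u, 0x02000000u, 0x00800000u, 0},           /* multifunction */
    {0, 2, 3, 0x00041234u, 0x0C030000u, 0x00800000u, 0},
    {1, 4, 0, 0x00051234u, 0x01000000u, 0x00000000u, 0},           /* behind the bridge */
};

/* Stand-in for the real port I/O read; returns all ones if nothing answers. */
static uint32_t pci_config_read32(uint8_t bus, uint8_t device,
                                  uint8_t function, uint8_t offset)
{
    for (size_t i = 0; i < sizeof mock_space / sizeof mock_space[0]; i++) {
        const struct mock_function *f = &mock_space[i];
        if (f->bus != bus || f->device != device || f->function != function)
            continue;
        switch (offset & 0xFC) {
        case 0x00: return f->id;
        case 0x08: return f->class_rev;
        case 0x0C: return f->header;
        case 0x18: return f->buses;
        default:   return 0;
        }
    }
    return PCI_NO_DEVICE;
}

static uint32_t found[MAX_FOUND]; /* config addresses of detected functions */
static int found_count;

static void check_bus(uint8_t bus);

static int function_exists(uint8_t bus, uint8_t device, uint8_t function)
{
    return (pci_config_read32(bus, device, function, 0x00) & 0xFFFFu) != 0xFFFFu;
}

static void check_function(uint8_t bus, uint8_t device, uint8_t function)
{
    if (found_count < MAX_FOUND)
        found[found_count++] = 0x80000000u | ((uint32_t)bus << 16)
                             | ((uint32_t)device << 11) | ((uint32_t)function << 8);

    /* Class 0x06, subclass 0x04: PCI to PCI bridge, so scan the bus behind it. */
    if ((pci_config_read32(bus, device, function, 0x08) >> 16) == 0x0604u) {
        uint8_t secondary = (uint8_t)(pci_config_read32(bus, device, function, 0x18) >> 8);
        if (secondary != 0x00)
            check_bus(secondary);
    }
}

static void check_device(uint8_t bus, uint8_t device)
{
    if (!function_exists(bus, device, 0))
        return;
    check_function(bus, device, 0);

    /* Only probe functions 1-7 if the multifunction flag is set. */
    if (pci_config_read32(bus, device, 0, 0x0C) & 0x00800000u) {
        for (uint8_t function = 1; function < 8; function++)
            if (function_exists(bus, device, function))
                check_function(bus, device, function);
    }
}

static void check_bus(uint8_t bus)
{
    for (uint8_t device = 0; device < 32; device++)
        check_device(bus, device);
}

/* Scans bus 0 and every bus reachable through bridges; returns the count. */
static int pci_scan(void)
{
    found_count = 0;
    check_bus(0);
    return found_count;
}
```

Calling pci_scan() on the mock table visits only bus 0 and bus 1, and the bus behind the bridge gets scanned as soon as the bridge is found. On real hardware you'd replace pci_config_read32 with the port I/O version and probably want to guard against a misconfigured bridge whose secondary bus points back at a bus you've already scanned.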

Basically what I'm saying is that regardless of whether your code works or not, it's probably better to rewrite it than fix it.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Re: PCI device scan algorithm hangs sometimes

Post by sevobal »

Hi,
thanks for your reply. I modified my code and now it only scans bus 0 and the buses that are connected to it through a PCI-to-PCI bridge. Now the algorithm works like a charm in emulators AND on real machines (and it is much faster than the original one). Thank you!