Weird behaviour of PCIe controller on RPI4

pvc
Member
Posts: 201
Joined: Mon Jan 15, 2018 2:27 pm

Weird behaviour of PCIe controller on RPI4

Post by pvc »

I want to get the PCIe controller running on the RPi4. It's almost working, but not quite right. There is a weird problem with accessing configuration space. The USB controller (which is, I believe, the only PCIe device in the RPi4 apart from the root complex) responds to every even device number on bus 0. Odd device numbers return all FFs in their configuration space (which is expected). But the weirdest part is that I get SError exceptions (basically a bus error in AArch64) when accessing an odd-numbered device AFTER accessing an even-numbered device, or when accessing an odd-numbered device twice. I can access even-numbered devices as much as I want without generating an exception. There is also a delay (about 12 seconds) between the bad access and the exception being signalled.

Here is my initialization code

Code: Select all

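// Read-modify-write a single field of a 32-bit MMIO register.
static void writeField(ulong addr, u32 mask, unsigned shift, u32 val)
{
    u32 tmp = CPU::MMIORead32(addr);
    tmp &= ~mask;
    tmp |= (val << shift) & mask;
    CPU::MMIOWrite32(addr, tmp); // write back the merged value, not the raw field value
}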

static void setGen(unsigned gen)
{
    u32 lnkcap = CPU::MMIORead32(pcieBase + CAP_REGS + PCI_EXP_LNKCAP);
    u16 lnkctl2 = CPU::MMIORead16(pcieBase + CAP_REGS + PCI_EXP_LNKCTL2);

    lnkcap = (lnkcap & ~PCI_EXP_LNKCAP_SLS) | gen;
    CPU::MMIOWrite32(pcieBase + CAP_REGS + PCI_EXP_LNKCAP, lnkcap);

    lnkctl2 = (lnkctl2 & ~0xFu) | static_cast<u16>(gen);
    CPU::MMIOWrite16(pcieBase + CAP_REGS + PCI_EXP_LNKCTL2, lnkctl2);
}

static inline unsigned makeIdx(unsigned bus, unsigned dev, unsigned fun)
{
    bus &= BUS_MASK; dev &= DEV_MASK; fun &= FUN_MASK;
    return bus << 20 | dev << 15 | fun << 12;
}

errcode PCI::Initialize()
{
    errcode stat = ESUCCESS;
    pcieBase = Paging::MapMMIO(&stat, PCIE_BASE, PCIE_SIZE);
    if(stat != ESUCCESS)
    {
        pcieBase = 0;
        return stat;
    }

    // assert bridge reset
    writeField(pcieBase + RGR1_SW_INIT_1, RGR1_SW_INIT_1_INIT_MASK, RGR1_SW_INIT_1_INIT_SHIFT, 1);
    Time::Sleep(1, false);

    // assert fundamental reset
    writeField(pcieBase + RGR1_SW_INIT_1, RGR1_SW_INIT_1_PERST_MASK, RGR1_SW_INIT_1_PERST_SHIFT, 1);
    Time::Sleep(1, false);

    // deassert bridge reset
    writeField(pcieBase + RGR1_SW_INIT_1, RGR1_SW_INIT_1_INIT_MASK, RGR1_SW_INIT_1_INIT_SHIFT, 0);
    Time::Sleep(1, false);

    // enable serdes
    writeField(pcieBase + HARD_PCIE_HARD_DEBUG, HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK,
               HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_SHIFT, 0);
    Time::Sleep(1, false);

    // get hardware revision
    hwRev = CPU::MMIORead32(pcieBase + PCIE_REVISION) & 0xFFFFu;

    // disable and clear any pending interrupts
    CPU::MMIOWrite32(pcieBase + MSI_INTR2_BASE + INTR_CLR, 0xFFFFFFFFu);
    CPU::MMIOWrite32(pcieBase + MSI_INTR2_BASE + INTR_MASK_SET, 0xFFFFFFFFu);

    // what does this exactly do ???
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_LO, 0);
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_HI, 0);
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_BASE_LIMIT, 0);
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_BASE_HI, 0);
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_LIMIT_HI, 0);

    // initialize SCB_MAX_BURST_SIZE, CFG_READ_UR_MODE, SCB_ACCESS_EN and CTRL_SCB0_SIZE
    CPU::MMIOWrite32(pcieBase + MISC_CTRL,
                     CTRL_SCB0_SIZE(LOG2DMASIZE - 15) |
                     MAX_BURST_SIZE(BURST_SIZE_128) |
                     CFG_READ_UR_MODE(1) |
                     SCB_ACCESS_EN(1));

    // setup inbound memory view
    CPU::MMIOWrite32(pcieBase + RC_BAR2_CONFIG_LO, (LOG2DMASIZE - 15));
    CPU::MMIOWrite32(pcieBase + RC_BAR2_CONFIG_HI, 0);

    // disable PCIe->GISB and PCIe->SCB
    CPU::MMIOWrite32(pcieBase + RC_BAR1_CONFIG_LO, 0);
    CPU::MMIOWrite32(pcieBase + RC_BAR3_CONFIG_LO, 0);

    // setup MSIs
    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_LO, (MSI_TARGET_ADDR & 0xFFFFFFFFu) | 1);
    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_HI, MSI_TARGET_ADDR >> 32);
    CPU::MMIOWrite32(pcieBase + MSI_DATA_CONFIG, hwRev >= HW_REV_33 ? 0xFFE06540 : 0xFFF86540);
    // TODO: add MSI handler registration here

    // cap controller to Gen2
    setGen(2);

    // deassert fundamental reset
    writeField(pcieBase + RGR1_SW_INIT_1, RGR1_SW_INIT_1_PERST_MASK, RGR1_SW_INIT_1_PERST_SHIFT, 0);
    for(unsigned int i = 0; i < 10; ++i)
    {
        if((CPU::MMIORead32(pcieBase + PCIE_STATUS) & 0x30) == 0x30)
            break;
        Time::Sleep(100, false);
    }

    // check if link is up
    if((CPU::MMIORead32(pcieBase + PCIE_STATUS) & 0x30) != 0x30)
    {
        Debug::PutFmt("PCIe link is down\n");
        return EERROR;
    }
    Debug::PutFmt("PCIe link is up\n");

    // check if controller is running in root complex mode
    if((CPU::MMIORead32(pcieBase + PCIE_STATUS) & 0x80) != 0x80)
    {
        Debug::PutFmt("PCIe controller is not running in root complex mode\n");
        return EERROR;
    }

    // set proper class Id
    CPU::MMIOWrite32(pcieBase + RC_CFG_PRIV1_ID_VAL3, 0x060400);

    // set proper endian
    writeField(pcieBase + RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1,
               RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1_ENDIAN_MODE_BAR2_MASK,
               RC_CFG_VENDOR_VENDOR_SPECIFIC_REG1_ENDIAN_MODE_BAR2_SHIFT,
               DATA_ENDIAN);

    // set debug mode
    writeField(pcieBase + HARD_PCIE_HARD_DEBUG, HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK,
               HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_SHIFT, 1);

    return ESUCCESS;
}
I don't fully understand all of it because, as always with Broadcom devices, there is no hint of documentation to be found anywhere. This code is based on the only two pieces of Broadcom PCIe code I found on the Internet: one is the Linux driver and the other is the Plan 9 driver.

I don't fully understand the terminology used either. What are MDIO, GISB, SCB, and inbound and outbound memory?

And here is how I access configuration space

Code: Select all

u32 PCI::ConfigRead32(unsigned bus, unsigned dev, unsigned fun, unsigned reg)
{
    cfgLock.Lock();
    u32 res = 0;
    if(!bus && !dev)
        res = CPU::MMIORead32(pcieBase + reg);
    else
    {
        CPU::MMIOWrite32(pcieBase + EXT_CFG_INDEX, makeIdx(bus, dev, fun));
        res = CPU::MMIORead32(pcieBase + EXT_CFG_DATA + reg);
    }
    cfgLock.Unlock();
    return res;
}

void PCI::ConfigWrite32(unsigned bus, unsigned dev, unsigned fun, unsigned reg, u32 val)
{
    cfgLock.Lock();
    if(!bus && !dev)
        CPU::MMIOWrite32(pcieBase + reg, val);
    else
    {
        CPU::MMIOWrite32(pcieBase + EXT_CFG_INDEX, makeIdx(bus, dev, fun));
        CPU::MMIOWrite32(pcieBase + EXT_CFG_DATA + reg, val);
    }
    cfgLock.Unlock();
}
All controller MMIO space (0xFE500000 to 0xFE50A000) is configured as nGnRE device memory. Caches and MMU are enabled.
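For readers following along, here is a minimal host-side sketch of the index that ConfigRead32/ConfigWrite32 write to EXT_CFG_INDEX. The mask values are assumptions based on the standard PCI field widths (8-bit bus, 5-bit device, 3-bit function); the post does not show them. cfgDataOffset is a hypothetical helper that only checks that a register offset fits the 4 KiB per-function window implied by the `fun << 12` shift.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values: standard PCI field widths. The names mirror the post.
constexpr std::uint32_t BUS_MASK = 0xFF;
constexpr std::uint32_t DEV_MASK = 0x1F;
constexpr std::uint32_t FUN_MASK = 0x07;

// Pack bus/device/function into the value written to EXT_CFG_INDEX.
inline std::uint32_t makeIdx(std::uint32_t bus, std::uint32_t dev, std::uint32_t fun)
{
    return ((bus & BUS_MASK) << 20) | ((dev & DEV_MASK) << 15) | ((fun & FUN_MASK) << 12);
}

// Hypothetical helper: validate an offset into the 4 KiB EXT_CFG_DATA window.
inline std::uint32_t cfgDataOffset(std::uint32_t reg)
{
    assert((reg & 3u) == 0);   // 32-bit accesses must be 4-byte aligned
    assert(reg < 0x1000u);     // one function's config space is 4 KiB
    return reg;
}
```

With these masks, device 1.0.0 selects index 0x100000 and device 0.1.0 selects 0x8000.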
Last edited by pvc on Thu Jan 30, 2020 2:16 pm, edited 2 times in total.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Weird behaviour of PCIe controller on RPI4

Post by bzt »

Hi,

I haven't had the chance to play with the RPi4 yet, so I'm just guessing here. As far as I know, there should be at least a network controller on the bus too (unlike RPi3, it's not connected to the USB any more, rather connected directly to the PCIe).

Your problem sounds like a cache-related mapping issue, I think you're on the right track. Have you mapped the MMIO area with outer shareable, accessed, no execute and attridx pointing to a device nGnRE in MAIR? I mean, I know you wanted to use that, but can you confirm it is really what's configured? I think what you need for PCIe is 0x04 in MAIR, and (1UL<<54)|(1<<10)|(2<<8)|(attrIdx<<2) in the page table attributes.
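The attribute formula above can be sketched as follows. This is a host-side illustration only; the constant and function names are made up for the example, and the bit positions are the AArch64 stage 1 descriptor ones (bit 54 is UXN, bit 10 the access flag, bits 9:8 shareability, bits 4:2 the MAIR index).

```cpp
#include <cassert>
#include <cstdint>

// MAIR_ELx holds one 8-bit attribute per index; 0x04 encodes Device-nGnRE.
constexpr std::uint64_t MAIR_DEVICE_nGnRE = 0x04;

constexpr std::uint64_t PTE_UXN      = 1ULL << 54; // unprivileged execute never
constexpr std::uint64_t PTE_AF       = 1ULL << 10; // access flag
constexpr std::uint64_t PTE_SH_OUTER = 2ULL << 8;  // outer shareable

// Attribute bits for a device mapping that uses MAIR slot attrIdx.
inline std::uint64_t deviceAttrs(unsigned attrIdx)
{
    assert(attrIdx < 8);
    return PTE_UXN | PTE_AF | PTE_SH_OUTER | (static_cast<std::uint64_t>(attrIdx) << 2);
}

// Return mair with the 8-bit attribute attr placed into slot attrIdx.
inline std::uint64_t mairWithAttr(std::uint64_t mair, unsigned attrIdx, std::uint64_t attr)
{
    assert(attrIdx < 8 && attr <= 0xFF);
    const unsigned shift = attrIdx * 8;
    return (mair & ~(0xFFULL << shift)) | (attr << shift);
}
```

These are only the attribute bits; the output address and the low valid/type bits still have to be ORed in when the descriptor is built.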

If you issue a data barrier (before read and after write), does that solve the issue?

This "CPU::MMIOWrite16" looks peculiar to me. As far as I know, Broadcom SoCs only allow 32-bit reads and writes at 32-bit aligned addresses in the MMIO region, so throwing a bus error exception otherwise would be correct behaviour (at least that was the case for BCM2835 - BCM2837, not sure about BCM2711). What if you replace those reads with "CPU::MMIORead32() & 0xFFFF" and the writes with "CPU::MMIOWrite32((CPU::MMIORead32() & 0xFFFF0000) | (newvalue & 0xFFFF))"? Again, just a guess. Sorry, I lack empirical experience here, so I can't provide an exact solution, but hopefully I could give you some pointers.
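A minimal sketch of that idea, generalised so that it also handles the upper half of a word. The mmioRead32/mmioWrite32 functions and the fakeRegs array are stand-ins invented for this example so it runs on a host; on hardware they would be the poster's CPU::MMIORead32/CPU::MMIOWrite32. Little-endian register layout is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for a small MMIO region (hypothetical, for illustration).
static std::uint32_t fakeRegs[4] = {};

static std::uint32_t mmioRead32(std::uint32_t off) { return fakeRegs[off / 4]; }
static void mmioWrite32(std::uint32_t off, std::uint32_t v) { fakeRegs[off / 4] = v; }

// Read a 16-bit register using only an aligned 32-bit access.
static std::uint16_t read16via32(std::uint32_t off)
{
    assert((off & 1u) == 0);                // must be 16-bit aligned
    const unsigned shift = (off & 2u) * 8;  // 0 for the low half, 16 for the high half
    return static_cast<std::uint16_t>(mmioRead32(off & ~3u) >> shift);
}

// Write a 16-bit register as a 32-bit read-modify-write.
static void write16via32(std::uint32_t off, std::uint16_t val)
{
    assert((off & 1u) == 0);
    const unsigned shift = (off & 2u) * 8;
    std::uint32_t word = mmioRead32(off & ~3u);
    word &= ~(0xFFFFu << shift);                        // clear the target half
    word |= static_cast<std::uint32_t>(val) << shift;   // insert the new value
    mmioWrite32(off & ~3u, word);
}
```

One caveat with this approach: the write puts back whatever was read from the neighbouring half, so if that half contains write-1-to-clear status bits they may be cleared as a side effect.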

Further tips: check out Circle++, it runs on RPi4, written in C++ and has a PCIe driver. It also defines a different memory range for the PCIe than yours (another definition here, also out of your range). Could this mean you don't actually map the entire PCIe region as device memory?

Good luck!
bzt
pvc
Member
Posts: 201
Joined: Mon Jan 15, 2018 2:27 pm

Re: Weird behaviour of PCIe controller on RPI4

Post by pvc »

My MAIR_EL1 is set to 0x44444444444404FF.

PCIe MMIO range mappings are:

Code: Select all

0xffffffd000011000: 0x00000000fd500607
...
0xffffffd00001a000: 0x00000000fd509607
So: outer shareable, accessed, attridx = 1. The XN and PXN bits didn't make any difference (as expected).
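As a cross-check, the descriptor and MAIR values above can be decoded with a small host-side sketch. It assumes a 4 KiB granule and a 48-bit output address; the struct and function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Fields of interest from an AArch64 stage 1 page descriptor.
struct PteFields {
    bool valid;
    unsigned attrIdx;          // index into MAIR_EL1
    unsigned sh;               // 0 = non-shareable, 2 = outer, 3 = inner
    bool af;                   // access flag
    std::uint64_t outputAddr;  // physical address of the page
};

inline PteFields decodePte(std::uint64_t pte)
{
    PteFields f;
    f.valid      = (pte & 1u) != 0;
    f.attrIdx    = static_cast<unsigned>((pte >> 2) & 0x7);
    f.sh         = static_cast<unsigned>((pte >> 8) & 0x3);
    f.af         = ((pte >> 10) & 1u) != 0;
    f.outputAddr = pte & 0x0000FFFFFFFFF000ULL;  // assumes 4 KiB pages, 48-bit OA
    return f;
}

// Extract the 8-bit attribute stored in slot idx of a MAIR value.
inline unsigned mairAttr(std::uint64_t mair, unsigned idx)
{
    assert(idx < 8);
    return static_cast<unsigned>((mair >> (idx * 8)) & 0xFF);
}
```

Decoding 0x00000000fd500607 this way gives a valid page at physical 0xfd500000 with attridx 1, outer shareable, access flag set, and slot 1 of 0x44444444444404FF is 0x04 (Device-nGnRE).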

I changed these 16-bit accesses to 32-bit ones, tried adding memory barriers to my MMIORead and MMIOWrite routines (just to be sure), and sprinkled in a good amount of cache flushes, but there is still no difference.
The extra ranges present in Circle++ (and also in the Linux .dtb, for that matter) may be worth investigating. From what I understand, though, these are for DMA and device accesses. Then again, I may be completely wrong.

Thanks for letting me know about Circle++. Maybe I can get some more information from it.
bzt
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm

Re: Weird behaviour of PCIe controller on RPI4

Post by bzt »

That looks okay to me. That's exactly what I would have set up :-( It must be the range then?
pvc wrote:Thanks for letting me know about Circle++. Maybe I can get some more information from it.
Welcome! You can ask the author on the RPi forums if you have questions, goes by the name "rst".

I'm sorry I couldn't help more. I'm usually very confident about things as I usually have empirical experience on the matter. Well, this is definitely not the case with RPi4, I'm just guessing and I'm only trying to help because I know not many forum members have written code for ARM yet (but their numbers are growing, with you for example :-)). Keep up the good work!

Cheers,
bzt
pvc
Member
Posts: 201
Joined: Mon Jan 15, 2018 2:27 pm

Re: Weird behaviour of PCIe controller on RPI4

Post by pvc »

I am still having configuration space issues. But now I can, at least, access USB controller over PCIe. To do that I had to enable root bridge (just like any other PCI-PCI bridge). That moved USB controller to PCI address 1.0.0 (instead of 0.0.0), so it doesn't interfere with root complex (which sits at 0.0.0 by design). For now I just hardcoded USB controller address and I am not using automatic PCI bus enumeration. Now I can peek and poke USB controller registers just fine.
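The post does not show how the root bridge was enabled. A minimal sketch of what that usually involves for a PCI-PCI bridge, using the standard type 1 header layout, might look like this. The cfgRead32/cfgWrite32 helpers and the fakeCfg array are host-side stand-ins invented for the example, and the bus numbers 0/1/1 are an assumption chosen to match the 1.0.0 address mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Standard PCI type 1 (bridge) header offsets and command bits.
constexpr std::uint32_t PCI_COMMAND     = 0x04; // command in bits 15:0, status in 31:16
constexpr std::uint32_t PCI_PRIMARY_BUS = 0x18; // primary | secondary << 8 | subordinate << 16
constexpr std::uint32_t PCI_CMD_MEMORY  = 1u << 1;
constexpr std::uint32_t PCI_CMD_MASTER  = 1u << 2;

// Host-side stand-in for the bridge's own 256-byte config header (hypothetical).
static std::uint32_t fakeCfg[64] = {};

static std::uint32_t cfgRead32(std::uint32_t reg)
{
    assert(reg < 256 && (reg & 3u) == 0);
    return fakeCfg[reg / 4];
}

static void cfgWrite32(std::uint32_t reg, std::uint32_t val)
{
    assert(reg < 256 && (reg & 3u) == 0);
    fakeCfg[reg / 4] = val;
}

// Program the bus numbers and let the bridge forward memory cycles and DMA.
static void enableBridge(std::uint8_t primary, std::uint8_t secondary, std::uint8_t subordinate)
{
    std::uint32_t buses = cfgRead32(PCI_PRIMARY_BUS) & 0xFF000000u; // keep the top byte
    buses |= static_cast<std::uint32_t>(primary)
           | (static_cast<std::uint32_t>(secondary) << 8)
           | (static_cast<std::uint32_t>(subordinate) << 16);
    cfgWrite32(PCI_PRIMARY_BUS, buses);

    // Write zeros to the status half so its write-1-to-clear bits are untouched.
    const std::uint32_t cmd = cfgRead32(PCI_COMMAND) & 0xFFFFu;
    cfgWrite32(PCI_COMMAND, cmd | PCI_CMD_MEMORY | PCI_CMD_MASTER);
}
```

Calling enableBridge(0, 1, 1) leaves 0x00010100 in the bus number register, after which the device behind the bridge is addressed on bus 1.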

I also had to set up an outbound memory range at base = 0x600000000.

Code: Select all

    u64 limit = base + size - 1;

    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_LO, base & 0xFFFFFFFFu);
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_HI, base >> 32);

    base >>= 20;
    limit >>= 20;

    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_BASE_LIMIT, static_cast<u32>((base & 0xFFF) << 4 | (limit & 0xFFF) << 20));
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_BASE_HI, static_cast<u32>(base >> 12));
    CPU::MMIOWrite32(pcieBase + CPU_2_PCIE_MEM_WIN0_LIMIT_HI, static_cast<u32>(limit >> 12));
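To make the bit packing easier to check, here is a host-side sketch that computes the five register values without touching hardware. It mirrors the snippet above, including its use of the same base for the LO/HI pair. The struct name, the function name and the 64 MiB window size in the note are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values destined for the CPU_2_PCIE_MEM_WIN0_* registers.
struct OutboundWindowRegs {
    std::uint32_t lo, hi, baseLimit, baseHi, limitHi;
};

inline OutboundWindowRegs encodeOutboundWindow(std::uint64_t base, std::uint64_t size)
{
    // The >> 20 below implies 1 MiB granularity for base and size.
    assert(size != 0 && (base & 0xFFFFFu) == 0 && (size & 0xFFFFFu) == 0);
    const std::uint64_t limit = base + size - 1;

    OutboundWindowRegs r;
    r.lo = static_cast<std::uint32_t>(base & 0xFFFFFFFFu);
    r.hi = static_cast<std::uint32_t>(base >> 32);

    const std::uint64_t baseMb  = base >> 20;   // addresses in MiB units
    const std::uint64_t limitMb = limit >> 20;

    // Low 12 bits of each MiB number share one register, the rest go to *_HI.
    r.baseLimit = static_cast<std::uint32_t>(((baseMb & 0xFFF) << 4) | ((limitMb & 0xFFF) << 20));
    r.baseHi    = static_cast<std::uint32_t>(baseMb >> 12);
    r.limitHi   = static_cast<std::uint32_t>(limitMb >> 12);
    return r;
}
```

For base = 0x600000000 and an assumed size of 0x04000000 this gives hi = 6, baseLimit = 0x03F00000, baseHi = 6 and limitHi = 6.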
I did set up the inbound window as well (which is for DMA, I think), covering address 0 to the end of RAM. But I have not tested that yet.

Code: Select all

    CPU::MMIOWrite32(pcieBase + RC_BAR2_CONFIG_LO, (MEM_PCIE_DMA_RANGE_PCIE_START & 0xFFFFFFFFu) | (MEM_PCIE_DMA_RANGE_LOG2SIZE - 15));
    CPU::MMIOWrite32(pcieBase + RC_BAR2_CONFIG_HI, MEM_PCIE_DMA_RANGE_PCIE_START >> 32);
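The size field in RC_BAR2_CONFIG_LO is written as log2(size) - 15 in both snippets. A minimal sketch of that encoding follows; the function names are invented, and the accepted range of 64 KiB to 32 GiB is an assumption based on my reading of the Linux driver, so treat it as unverified.

```cpp
#include <cassert>
#include <cstdint>

// Encode an inbound window size given as log2(bytes), per the post's formula.
inline std::uint32_t encodeInboundSize(unsigned log2Size)
{
    assert(log2Size >= 16 && log2Size <= 35);  // assumed valid range
    return log2Size - 15;
}

// Combine the PCIe-side start address with the size code for RC_BAR2_CONFIG_LO.
inline std::uint32_t rcBar2Lo(std::uint64_t pcieStart, unsigned log2Size)
{
    assert((pcieStart & 0x1Fu) == 0);  // low bits carry the size code
    return static_cast<std::uint32_t>(pcieStart & 0xFFFFFFFFu) | encodeInboundSize(log2Size);
}
```

For example, a 4 GiB window starting at PCIe address 0 gives a LO value of 17, and a 1 GiB window gives 15.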
There is also MSI window

Code: Select all

    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_LO, (MSI_TARGET_ADDR & 0xFFFFFFFFu) | 1);
    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_HI, MSI_TARGET_ADDR >> 32);
    CPU::MMIOWrite32(pcieBase + MSI_DATA_CONFIG, hwRev >= HW_REV_33 ? 0xFFE06540 : 0xFFF86540);
but I have no idea how MSIs work and what I am going to do with them yet.
crowston
Posts: 1
Joined: Sun May 03, 2020 8:41 am

Re: Weird behaviour of PCIe controller on RPI4

Post by crowston »

I too am writing a driver for the Rpi4 PCI-e controller, although mine is in C. So I guess this post may be off topic, if so, please delete it, no rule breaking was intended.

Anyway, I have found that the hardware has a lot of sharp edges. It is especially difficult because for licensing reasons I am very limited in what I can read in the GPL'd sources without becoming polluted with the GPL.
pvc wrote:Odd device numbers return all FFs in their configuration space (which is expected). But the weirdest part is that I get SError exceptions (basically a bus error in AArch64) when accessing an odd-numbered device AFTER accessing an even-numbered device, or when accessing an odd-numbered device twice. I can access even-numbered devices as much as I want without generating an exception. There is also a delay (about 12 seconds) between the bad access and the exception being signalled.
This was my experience as well. This problem also disturbs the JTAG link in such a way that further accesses to memory over the JTAG fail, requiring a hardware reset. It is a most irritating design.

In the end I too stopped enumerating the bus and have hardcoded the valid peripherals on the bus.
bzt wrote:As far as I know, there should be at least a network controller on the bus too (unlike RPi3, it's not connected to the USB any more, rather connected directly to the PCIe).
That is not the case. The ethernet controller is connected to the CPU over RGMII. (Indeed, in FreeBSD current, we now have working ethernet but still no functioning PCI-e.) The only end point on the PCI-e bus is the VIA USB controller. That said, it is possible to desolder the VIA chip and connect a PCI-PCI bridge in its place; folks have reported success with the Linux driver doing that.
pvc wrote:I am still having configuration space issues. But now I can, at least, access USB controller over PCIe. To do that I had to enable root bridge (just like any other PCI-PCI bridge). That moved USB controller to PCI address 1.0.0 (instead of 0.0.0), so it doesn't interfere with root complex (which sits at 0.0.0 by design). For now I just hardcoded USB controller address and I am not using automatic PCI bus enumeration. Now I can peek and poke USB controller registers just fine.
Yes, the root complex misidentifies itself at boot.
pvc wrote:There is also MSI window

Code: Select all

    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_LO, (MSI_TARGET_ADDR & 0xFFFFFFFFu) | 1);
    CPU::MMIOWrite32(pcieBase + MSI_BAR_CONFIG_HI, MSI_TARGET_ADDR >> 32);
    CPU::MMIOWrite32(pcieBase + MSI_DATA_CONFIG, hwRev >= HW_REV_33 ? 0xFFE06540 : 0xFFF86540);
but I have no idea how MSIs work and what I am going to do with them yet.
I too am stuck at this point.
pvc wrote:I don't fully understand all of it because, as always with Broadcom devices, there is no hint of documentation to be found anywhere. [...] I don't fully understand the terminology used either. What are MDIO, GISB, SCB, and inbound and outbound memory?
Tell me about it. I was able to get some very limited documentation, but I can't really share it, and it's next to useless anyway.

MDIO is a concept associated with ethernet devices attached by MII. MII is a very simple bus used to transport ethernet frames from a controller directly to a CPU, which is popular on embedded devices (and used by the rpi4 for its ethernet). MDIO, as I understand, is an out-of-band bus used to configure the interface. I don't know why it would be relevant in this context.

GISB is some kind of proprietary system bus Broadcom uses on some embedded devices. We don't need to care about it on the rpi. I'm not sure what it stands for.

SCB is system control bus. Another bus, I don't know any more details.

Inbound/outbound memory accesses. So the controller has a view of the system memory. The system has a view of the PCI-e devices. The addressing schemes on each side of the controller are not the same.

So you probably configured bus address 0xf8000000 to correspond to CPU address 0x600000000 (that's the default on Linux at least). The window also has a size. So if a memory access from the CPU arrives at the controller for address 0x600000000, the controller translates it to 0xf8000000 before forwarding it to the PCI-e bus.
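The translation described above can be sketched as a pure function. The struct and function names are invented for the example, and the 64 MiB window size used in the note is an assumption; only the two base addresses come from this thread.

```cpp
#include <cassert>
#include <cstdint>

// One outbound window: a CPU address range mapped onto a PCIe bus address range.
struct OutboundWindow {
    std::uint64_t cpuBase;
    std::uint64_t pcieBase;
    std::uint64_t size;
};

// True if cpuAddr falls inside the window.
inline bool contains(const OutboundWindow& w, std::uint64_t cpuAddr)
{
    return cpuAddr >= w.cpuBase && cpuAddr - w.cpuBase < w.size;
}

// Translate a CPU address to the bus address the controller would emit.
inline std::uint64_t cpuToPcie(const OutboundWindow& w, std::uint64_t cpuAddr)
{
    assert(contains(w, cpuAddr));
    return w.pcieBase + (cpuAddr - w.cpuBase);
}
```

With a window of {0x600000000, 0xf8000000, 0x04000000}, CPU address 0x600001000 maps to bus address 0xf8001000, and anything at or above 0x604000000 falls outside the window.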
pvc
Member
Posts: 201
Joined: Mon Jan 15, 2018 2:27 pm

Re: Weird behaviour of PCIe controller on RPI4

Post by pvc »

By trial and error I somehow got the PCIe and USB controllers to generally work, but because of all these issues with the RPi and its documentation I've decided not to bother with it anymore and to focus on PC and FPGA instead. I may try to work with ARM again later, but not with Broadcom, that's for sure. IMO the RPi line is one of the worst choices for an OSDev platform as it is today.
osdevsimon
Posts: 1
Joined: Sat Sep 12, 2020 9:01 am

Re: Weird behaviour of PCIe controller on RPI4

Post by osdevsimon »

pvc wrote:By trial and error I somehow got the PCIe and USB controllers to generally work, but because of all these issues with the RPi and its documentation I've decided not to bother with it anymore and to focus on PC and FPGA instead. I may try to work with ARM again later, but not with Broadcom, that's for sure. IMO the RPi line is one of the worst choices for an OSDev platform as it is today.
Hey pvc, any chance you could share your final solution that worked for you?

I'm walking down the same path myself right now and have hit a bit of a wall. Seeing what worked for you would be massively helpful!

Thanks