PCI Bus Mastering DMA

Posted: Wed May 11, 2022 1:30 pm
by laen
Hello everyone.
I'm kind of new to OS development, and so far I've been able to set up the IDT, the GDT, paging, the memory map, PCI device enumeration... Then I started struggling with something that doesn't seem hard, but no matter how much material I read, I don't quite understand it.
I know what DMA is; I've been using it on embedded systems such as STM32 and ESP32 boards. If I understand it correctly, it is a peripheral that copies data from one place to another without requiring CPU instructions except for its setup. A friend explained to me that bus mastering DMA is the same idea as paging/segmentation, except it's for other devices' memory.
I also understood (partially, I believe) how to remap BARs: by writing to them the new RAM locations to map to, with the right memory alignment.
Most tutorials talk about setting up bus mastering DMA, but I struggle to find a tutorial on it, so I guess it's just writing to the BARs? Am I right? If not, how do I set it up?
Have a nice day!

Re: PCI Bus Mastering DMA

Posted: Wed May 11, 2022 4:28 pm
by Octocontrabass
laen wrote:I also understood (partially, I believe) how to remap BARs: by writing to them the new RAM locations to map to, with the right memory alignment.
BARs determine the physical address for MMIO. If you program a BAR to point to RAM, the RAM and the MMIO will conflict with each other. You should only set BARs to point to unused physical addresses.
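To make the BAR layout concrete, here is a minimal sketch of decoding a raw 32-bit BAR value read from configuration space. The function names are made up for illustration; the bit positions are the standard ones (bit 0 selects I/O vs. memory space, bits 2:1 give the memory BAR type, and the low four bits of a memory BAR are flags rather than address bits).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR: 0 = memory space (MMIO), 1 = I/O space. */
static bool bar_is_mmio(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* For memory BARs, bits 2:1 give the type. 0x2 means a 64-bit BAR,
 * whose upper 32 address bits live in the next BAR slot. */
static bool bar_is_64bit(uint32_t bar)
{
    return bar_is_mmio(bar) && ((bar >> 1) & 0x3u) == 0x2u;
}

/* Base address held in a 32-bit memory BAR: the low four bits are
 * flags (space, type, prefetchable), so mask them off. */
static uint32_t bar_mmio_base32(uint32_t bar)
{
    return bar & ~UINT32_C(0xF);
}
```

The base you get back is where the device's registers appear in the physical address space, which is why it must not overlap RAM or another device.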
laen wrote:Most tutorials talk about setting up bus mastering DMA, but I struggle to find a tutorial on it, so I guess it's just writing to the BARs? Am I right? If not, how do I set it up?
There's no tutorial on it because bus mastering DMA is different for each device class. If you want to set up bus mastering DMA for a particular device, read the device's specification.

There is one thing all PCI bus mastering DMA has in common though: you must set bit 2 of the command register in the configuration space to enable bus mastering.
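As a rough sketch of that one common step: the command register is the 16-bit register at offset 0x04 of the configuration space, and bit 2 is the bus master enable bit. The helper below only computes the new register value; how you actually read and write configuration space (I/O ports 0xCF8/0xCFC, ECAM, ...) depends on your kernel, so that part is left out. The function and macro names are my own, not from any specification.

```c
#include <stdint.h>

/* Offset of the 16-bit command register in PCI configuration space. */
#define PCI_COMMAND_OFFSET      0x04u
/* Bit 2 of the command register: Bus Master Enable. */
#define PCI_COMMAND_BUS_MASTER  (1u << 2)

/* Return the command register value with bus mastering enabled.
 * Meant for a read-modify-write sequence, so the other command bits
 * (I/O space enable, memory space enable, ...) keep their values. */
static inline uint16_t pci_command_enable_bus_master(uint16_t command)
{
    return (uint16_t)(command | PCI_COMMAND_BUS_MASTER);
}
```

In a driver you would read the 16-bit value at PCI_COMMAND_OFFSET with your own config space accessor, pass it through this helper, and write the result back to the same offset.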

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 12:30 am
by laen
Thanks for your answer!
Octocontrabass wrote: There's no tutorial on it because bus mastering DMA is different for each device class. If you want to set up bus mastering DMA for a particular device, read the device's specification.

There is one thing all PCI bus mastering DMA has in common though: you must set bit 2 of the command register in the configuration space to enable bus mastering.
I guess I kind of get it now: the PCI bus mastering DMA engine is actually located on the device; it's not a peripheral embedded in the CPU chip or the PCI controller, as is the case in embedded systems, right? So I have to configure the device itself to enable the feature and tell it how to map itself and where to write the data it sends me, right? No wonder I couldn't find anything in the Intel x86 manual!!
Octocontrabass wrote: There's no tutorial on it because bus mastering DMA is different for each device class. If you want to set up bus mastering DMA for a particular device, read the device's specification.
Is it defined once for a standard device class (let's say all AHCI devices share the same configuration mechanism for PCI bus mastering), or is it defined for one particular PCI device (let's say a Seagate SSD), or something in between?
Octocontrabass wrote: BARs determine the physical address for MMIO. If you program a BAR to point to RAM, the RAM and the MMIO will conflict with each other. You should only set BARs to point to unused physical addresses.
This I understand. What I was trying to say is: in the memory hole meant for this purpose, which is marked as "reserved" by BIOS int 0x15, eax=0xE820. I guess calling it RAM is wrong, as RAM refers to physical memory.

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 12:57 am
by rdos
laen wrote: I guess I kind of get it now: the PCI bus mastering DMA engine is actually located on the device; it's not a peripheral embedded in the CPU chip or the PCI controller, as is the case in embedded systems, right? So I have to configure the device itself to enable the feature and tell it how to map itself and where to write the data it sends me, right? No wonder I couldn't find anything in the Intel x86 manual!!
You should think of it as a program in the PCI device that can access physical memory in your computer. In the typical case, the device defines some method for communicating with the operating system (through a device driver), and BARs are the entry points for this. When you access a BAR, you actually access memory in the device and not your own physical memory. This is typically slow, so many PCI devices let the operating system / device driver build memory-based schedules in physical memory. It's these schedules that the PCI device accesses with bus mastering. However, the PCI device can access any memory location it likes (and without asking the operating system for permission). A PCI device can also trigger MSI interrupts by writing to the APIC interrupt controller address, and it can communicate with other PCI devices. In fact, a hostile PCI device can take over the operating system by sending boot messages and manipulating the boot code.

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 3:37 am
by laen
rdos wrote:
You should think of it as a program in the PCI device that can access physical memory in your computer. In the typical case, the device defines some method for communicating with the operating system (through a device driver), and BARs are the entry points for this. When you access a BAR, you actually access memory in the device and not your own physical memory. This is typically slow, so many PCI devices let the operating system / device driver build memory-based schedules in physical memory. It's these schedules that the PCI device accesses with bus mastering. However, the PCI device can access any memory location it likes (and without asking the operating system for permission). A PCI device can also trigger MSI interrupts by writing to the APIC interrupt controller address, and it can communicate with other PCI devices. In fact, a hostile PCI device can take over the operating system by sending boot messages and manipulating the boot code.
So, if I get your point, the aim is to tell the PCI device: "Rather than me writing and reading your memory (through the MMIO addresses the BARs point to), which is slow and takes up a lot of CPU time, periodically read the info I write in this area of physical memory, answer in this other area of physical RAM, and when you are done answering, tell me by raising this interrupt through MSI." Did I get this right?

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 12:20 pm
by Octocontrabass
laen wrote:Is it defined once for a standard device class (let's say all AHCI devices share the same configuration mechanism for PCI bus mastering), or is it defined for one particular PCI device (let's say a Seagate SSD), or something in between?
Standard devices implement bus mastering according to the specification. For example, the AHCI specification says all AHCI controllers must implement bus mastering in the same way, so you only need to write one driver to support all AHCI controllers. On the other hand, PCI IDE controllers are not required to implement bus mastering, so a vendor could choose to add some vendor-specific bus mastering hardware instead of following the standard.

In theory, a vendor could add vendor-specific bus mastering hardware alongside the standard bus mastering hardware, but I doubt it's very common.
laen wrote:This I understand. What I was trying to say is: in the memory hole meant for this purpose, which is marked as "reserved" by BIOS int 0x15, eax=0xE820. I guess calling it RAM is wrong, as RAM refers to physical memory.
The firmware memory map doesn't list the addresses that are available for MMIO. PCI MMIO can't use reserved addresses, typically because some other device is already mapped there. Keep in mind that the firmware memory map is not complete: it doesn't include PCI or ACPI devices.
laen wrote:So, if I get your point, the aim is to tell the PCI device: "Rather than me writing and reading your memory (through the MMIO addresses the BARs point to), which is slow and takes up a lot of CPU time, periodically read the info I write in this area of physical memory, answer in this other area of physical RAM, and when you are done answering, tell me by raising this interrupt through MSI." Did I get this right?
Usually you still need to access MMIO to notify the device when you've written info into memory; otherwise that's correct.

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 12:21 pm
by rdos
laen wrote:
rdos wrote:
You should think of it as a program in the PCI device that can access physical memory in your computer. In the typical case, the device defines some method for communicating with the operating system (through a device driver), and BARs are the entry points for this. When you access a BAR, you actually access memory in the device and not your own physical memory. This is typically slow, so many PCI devices let the operating system / device driver build memory-based schedules in physical memory. It's these schedules that the PCI device accesses with bus mastering. However, the PCI device can access any memory location it likes (and without asking the operating system for permission). A PCI device can also trigger MSI interrupts by writing to the APIC interrupt controller address, and it can communicate with other PCI devices. In fact, a hostile PCI device can take over the operating system by sending boot messages and manipulating the boot code.
So, if I get your point, the aim is to tell the PCI device: "Rather than me writing and reading your memory (through the MMIO addresses the BARs point to), which is slow and takes up a lot of CPU time, periodically read the info I write in this area of physical memory, answer in this other area of physical RAM, and when you are done answering, tell me by raising this interrupt through MSI." Did I get this right?
Yes, something like that. Typically, the PCI device has a control area in the BAR where you tell the device you have some new work for it. It will know when you write to that area, since the write goes to local memory in the PCI device. It will not know when you write to ordinary physical memory, and it would waste memory bandwidth by polling schedules there, so the protocol typically requires writing to the BAR to add new work for the device. Notification from the PCI device to the operating system is through interrupts, ideally triggered with the MSI or MSI-X mechanism.
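A minimal sketch of that pattern, not modeled on any real device: the driver fills in a descriptor in ordinary RAM and then writes to a "doorbell" register in the BAR so the device knows there is new work. The descriptor layout, the flag value, the ring size and the doorbell semantics are all invented for illustration; a real device's specification defines them. The fence is an assumption as well; some architectures or memory types need a stronger barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u  /* hypothetical ring size */

/* Hypothetical descriptor that the device fetches from RAM by bus mastering. */
struct dma_desc {
    uint64_t buffer_phys;  /* physical address of the data buffer */
    uint32_t length;       /* buffer length in bytes */
    uint32_t flags;        /* hypothetical: 1 = owned by the device */
};

struct dma_ring {
    struct dma_desc desc[RING_ENTRIES]; /* lives in ordinary physical memory */
    uint32_t tail;                      /* next free slot, driver side */
    volatile uint32_t *doorbell;        /* hypothetical register inside a BAR */
};

/* Queue one buffer and notify the device. A real driver would also
 * check that the ring is not full before reusing a slot. */
static void ring_submit(struct dma_ring *ring, uint64_t buffer_phys,
                        uint32_t length)
{
    struct dma_desc *d = &ring->desc[ring->tail % RING_ENTRIES];

    d->buffer_phys = buffer_phys;
    d->length = length;
    d->flags = 1u;
    ring->tail++;

    /* Make sure the descriptor stores are ordered before the doorbell write. */
    atomic_thread_fence(memory_order_release);

    /* The MMIO write is what tells the device that new work exists. */
    *ring->doorbell = ring->tail % RING_ENTRIES;
}
```

The device never sees the stores to the descriptor by itself; only the final write, which lands in the device's own memory, gets its attention.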

Re: PCI Bus Mastering DMA

Posted: Thu May 12, 2022 1:53 pm
by rdos
The best way to understand PCI is to construct & program a PCI device and then do a driver in an operating system.

Just to show the point that PCI devices can do anything you program them to do, I can give you an example from my FPGA project:

So, I use an FPGA to interface a two-channel high-speed ADC & DAC. The FPGA has 4x PCI Express lanes, which allows streaming data at close to 4GB/s by using large request sizes (128 bytes or more). Because synthesizing FPGA programs is a slow process, I defined a parameter & control area in BAR0. BAR0 allows sending commands to the clock control logic of the FPGA and the control logic of the ADC & DAC. The FPGA uses an SPI protocol internally, but the driver just writes a function number & data, and then the FPGA executes the SPI function. This way, different configurations & sampling frequencies can be set up. It's also possible to set up test patterns from the ADC to verify that the streaming works from the source (ADC) to the physical memory of the PC. BAR1 contains pointers to 2MB physical memory areas that the ADC streams data to. The ADC is started by setting a bit in the control register in BAR0. The driver can set up an MSI interrupt to be notified every time the FPGA is done with a 2MB block, and can thus keep track of the progress.

It's notable that in order to achieve high throughput on PCI, you need to use large request sizes. This is because there is considerable overhead, consisting of addresses & control info, that precedes every transfer. The CPU can normally only issue 4-byte or 8-byte accesses, so the overhead dominates. A PCI device can issue much larger requests. That's the primary reason why bus mastering by a PCI device can achieve much higher speed than the CPU accessing data from the device (through BARs).
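To put rough numbers on this, here is a small sketch that computes which fraction of the bytes on the link is actual payload. The per-transfer overhead you pass in is an assumption: the exact figure depends on the bus generation and header format, and 24 bytes is used below purely as an illustrative value.

```c
#include <stdint.h>

/* Percentage of transferred bytes that are payload, rounded down.
 * overhead_bytes is whatever fixed cost (address, control info, ...)
 * you assume for each transfer. At least one argument must be nonzero. */
static uint32_t link_efficiency_percent(uint32_t payload_bytes,
                                        uint32_t overhead_bytes)
{
    return payload_bytes * 100u / (payload_bytes + overhead_bytes);
}
```

With the assumed 24 bytes of overhead, link_efficiency_percent(8, 24) gives 25, while link_efficiency_percent(128, 24) gives 84, so the larger request moves more than three times as much useful data per byte on the wire.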

Re: PCI Bus Mastering DMA

Posted: Fri May 13, 2022 5:42 am
by laen
First of all, thanks to everyone for your help!
rdos wrote:The best way to understand PCI is to construct & program a PCI device and then do a driver in an operating system.

Just to show the point that PCI devices can do anything you program them to do, I can give you an example from my FPGA project:
I might have something like this to do at some point in the future, as my girlfriend is currently working on a PhD about algorithm/architecture adequation (not sure if this term exists in English, I'm not a native speaker) applied to grid network simulation. So she has a large amount of data to give to the [hardware] she uses. For now she is working on GPUs, but she will have to switch to FPGAs eventually, and as I'm going to have to help her with VHDL/co-design, I began wondering how I could interface with it, which is the main reason I restarted working on my OS (the issue I always had was that I had to write the storage drivers, and more specifically the USB ones, which made me lazy, but I'll do them once I have understood simpler ones).
She needs to achieve high bandwidth with her FPGA to retrieve large amounts of data, as there are a lot of nodes in the electrical grid, so I guess PCIe bus mastering DMA is the best option to implement.

But for now I'll focus on getting a good enough understanding to resume my OS development.

Once again, a thousand thanks!!