Physical mem management when a driver requests a physical ad

AndrewAPrice · Post by **AndrewAPrice** » Sun May 10, 2020 7:45 pm

I was wondering how other operating systems handle devices and drivers that request physical memory addresses?

The situation I'm wondering about is when a driver requests access to a physical address, but that memory was already allocated to another process.

In this case, I'd assume you'd have to copy that memory to another physical page, and then update the process's page table.

How do people implement this? I could build a physical address -> {pid, virtual address} map, but this might waste a lot of memory. Instead, I was thinking of scanning through every known page directory. This would be slow, but hopefully it's only done during a driver's startup.
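
For illustration, the physical address -> {pid, virtual address} map mentioned above could be sketched as a flat array indexed by page frame number. This is a minimal sketch, not a recommendation: `frame_owner`, `rmap_set`, `rmap_lookup`, and `MAX_FRAMES` are hypothetical names, and it assumes each frame is mapped by at most one process (no shared pages). On a 64-bit target each entry takes about 16 bytes per 4 KiB frame, roughly 0.4% of the RAM being tracked.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE  4096u
#define MAX_FRAMES 1024u   /* hypothetical: tracks only the first 4 MiB */

/* One entry per physical frame: who maps it, and where. */
struct frame_owner {
    uint32_t  pid;    /* 0 means "not mapped by any process" */
    uintptr_t vaddr;  /* virtual address inside that process */
};

static struct frame_owner reverse_map[MAX_FRAMES];

/* Record that `pid` maps physical address `paddr` at `vaddr`.
   Returns 0 on success, -1 if the frame is outside the tracked range. */
static int rmap_set(uint64_t paddr, uint32_t pid, uintptr_t vaddr)
{
    uint64_t frame = paddr / PAGE_SIZE;
    if (frame >= MAX_FRAMES)
        return -1;
    reverse_map[frame].pid = pid;
    reverse_map[frame].vaddr = vaddr;
    return 0;
}

/* Look up the owner of a physical address.
   Returns NULL if the frame is unmapped or out of range. */
static const struct frame_owner *rmap_lookup(uint64_t paddr)
{
    uint64_t frame = paddr / PAGE_SIZE;
    if (frame >= MAX_FRAMES || reverse_map[frame].pid == 0)
        return NULL;
    return &reverse_map[frame];
}
```

The alternative from the post, scanning every page directory, trades this fixed memory cost for a lookup whose time grows with the number of processes.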

What do other people do?

Octocontrabass · Post by **Octocontrabass** » Sun May 10, 2020 9:26 pm

AndrewAPrice wrote:The situation I'm wondering about is when a driver requests access to a physical address, but that memory was already allocated to another process.

Why would a driver ever need to do that?

nullplan · Post by **nullplan** » Sun May 10, 2020 9:33 pm

Well, Linux just trusts its drivers. There is a function called ioremap(), and it just maps whatever physical address you pass in into the kernel's virtual address space, and then the driver can go and use it. The thing is, drivers have pretty much no reason to ever request a specific physical address out of RAM. Device memory is not in RAM, and for DMA there are other calls you can make to allocate memory and translate the virtual address for the device. Therefore, what you are worried about is unlikely to happen. The exception is a buggy or malicious driver, but in that case you have lost the machine anyway, since the driver runs with the highest privileges.

MollenOS · Post by **MollenOS** » Sun May 10, 2020 11:08 pm

My device manager determines the available I/O ranges (memory, ports) that a driver can request by reading the PCI/PCIe buses; the driver won't be able to access any other physical addresses. My drivers run in userspace, so they can only be given what has been determined to be theirs.
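
A check of this kind can be sketched as a range test against the list of ranges granted to a driver. This is only an illustration of the idea, not MollenOS code: `io_range` and `range_allowed` are hypothetical names, and the ranges are assumed to have been discovered elsewhere (for example from PCI BARs).

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* One memory or port range that a driver has been granted. */
struct io_range {
    uint64_t base;
    uint64_t length;
};

/* True if [addr, addr + len) lies entirely inside one granted range. */
static bool range_allowed(const struct io_range *granted, size_t count,
                          uint64_t addr, uint64_t len)
{
    if (len == 0)
        return false;
    for (size_t i = 0; i < count; i++) {
        uint64_t base = granted[i].base;
        uint64_t size = granted[i].length;
        /* Compare offsets rather than computing addr + len,
           so a huge request cannot wrap around. */
        if (addr >= base && len <= size && addr - base <= size - len)
            return true;
    }
    return false;
}
```

A map request from a driver would be rejected whenever `range_allowed` returns false, so ordinary RAM can never be requested by physical address.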

AndrewAPrice · Post by **AndrewAPrice** » Mon May 11, 2020 4:22 am

Octocontrabass wrote:
AndrewAPrice wrote:The situation I'm wondering about is when a driver requests access to a physical address, but that memory was already allocated to another process.
Why would a driver ever need to do that?

Memory mapped IO such as PCI Express, HPET, VESA. Or do these addresses always exist beyond physical RAM?

OSwhatever · Post by **OSwhatever** » Mon May 11, 2020 6:16 am

One interesting implementation of this is the original L4 kernel with its strange recursive mapping.

http://os.inf.tu-dresden.de/L4/l4doc.html

This is of course most applicable to microkernels. A process could basically give another process its pages, and the kernel kept track of it. This design is now obsolete and is no longer used by later versions of the L4 kernel; however, it serves as an example of how you could do it.

In general, a good approach is that a physical page has one owner, be that the kernel or a user process. A page can then be given (or granted, in L4 language) to another process. I opted for a solution where physical pages are never exposed to a user space process: a process can only be given pages from another process (based on virtual location) or retrieve a physical area by name and map it with name + offset.

AndrewAPrice · Post by **AndrewAPrice** » Mon May 11, 2020 7:20 am

OSwhatever wrote:In general, a good approach is that a physical page has one owner, be that the kernel or a user process. A page can then be given (or granted, in L4 language) to another process. I opted for a solution where physical pages are never exposed to a user space process: a process can only be given pages from another process (based on virtual location) or retrieve a physical area by name and map it with name + offset.

Thanks. I understand this part. How does the reverse mapping work?

The problem I'm thinking about is your graphics driver might ask for a physical address such as 0xB8000 or perhaps 0xE0000000 because this is where the frame buffer is mapped.

If another process has this physical page mapped into its virtual address space, we'd have to copy whatever was in this page to another page and update that process's page table, so this memory is now free for the driver, no? So, unless I want to do a slow scan of all page tables, I will need to build a reverse page table (that goes from physical page -> virtual page + owner PID)?

Or, is memory mapped IO guaranteed to not overlap physical RAM? (In which case, if 0xC0000000->0xFFFFFFFF was reserved for IO, a true 32-bit system without PAE couldn't access more than 3GB of RAM?)

AndrewAPrice · Post by **AndrewAPrice** » Mon May 11, 2020 8:14 am

Doing a bit of reading, it looks like hardware mapped memory doesn't overlap available memory:

This sentence is a little concerning:

With 4 GB or more of RAM installed, and with RAM occupying a contiguous range of addresses starting at 0, some of the MMIO locations will overlap with RAM addresses.

I boot with GRUB 2. Can I trust these regions not to be tagged as 'available' in the multiboot memory map?

I just have to make sure if a driver releases memory mapped hardware pages, I don't add them back into the pool of available system memory.

(I could store this stuff in bits 9-11 of the page table entry.)
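
As a sketch of that idea: bits 9-11 of an x86 page table entry are ignored by the paging hardware and left for the OS, so one of them can mark "this frame is device memory, never return it to the free pool". The flag name `PTE_OS_DEVICE` and both helper functions are hypothetical, and the address mask assumes the 64-bit (long mode) entry layout. Cacheability attributes for MMIO are deliberately left out.

```c
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT    (1ull << 0)
#define PTE_WRITABLE   (1ull << 1)
/* Bits 9-11 are available to the OS; bit 9 is used here as a
   hypothetical "memory-mapped hardware" marker. */
#define PTE_OS_DEVICE  (1ull << 9)
/* Physical address field of a 64-bit entry (bits 12-51). */
#define PTE_ADDR_MASK  0x000FFFFFFFFFF000ull

/* Build an entry for a device page and tag it as non-reclaimable. */
static uint64_t make_mmio_pte(uint64_t paddr)
{
    return (paddr & PTE_ADDR_MASK) | PTE_PRESENT | PTE_WRITABLE
           | PTE_OS_DEVICE;
}

/* On unmap: may the frame behind this entry go back to the free pool? */
static bool frame_is_reclaimable(uint64_t pte)
{
    return (pte & PTE_PRESENT) != 0 && (pte & PTE_OS_DEVICE) == 0;
}
```

The unmap path would consult `frame_is_reclaimable` before pushing a frame onto the free-page stack.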

But, with some DMA hardware only supporting 24-bit or 32-bit memory addresses, it's possible that I will have to move a page out of the way in lower physical memory, and thus I will still need an inverted page table (so I can move this memory to another location and update that process's page table without it being aware of the move)?

neon · Post by **neon** » Mon May 11, 2020 9:42 am

Hi,

If the system has sufficient RAM to occupy the entire available address space, regions of the address space will be reserved by hardware (and thus will no longer be available memory). All this means is that the amount of available memory becomes less than the amount of RAM actually installed (you'll see this phenomenon in any 32-bit OS with 4GB of RAM installed). In other words, if the BIOS reports the region as available (or available+reclaimable), it is free for use and will not be hardware mapped.
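
In practice this means the physical allocator only ever takes pages from regions the firmware map marks as available. The following is a minimal sketch under assumptions: `mem_region` is a simplified, hypothetical stand-in for a boot memory map entry (not the exact Multiboot structure), and type value 1 is used for available RAM, as the Multiboot memory map does.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE        4096ull
#define REGION_AVAILABLE 1u   /* type 1 = usable RAM */

/* Simplified memory map entry (hypothetical layout). */
struct mem_region {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* Count the whole pages that lie inside available regions.
   Reserved and hardware-mapped regions are skipped entirely. */
static uint64_t count_usable_pages(const struct mem_region *map, size_t n)
{
    uint64_t pages = 0;
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != REGION_AVAILABLE)
            continue;
        /* Round the start up and the end down to page boundaries. */
        uint64_t start = (map[i].base + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
        uint64_t end   = (map[i].base + map[i].length) & ~(PAGE_SIZE - 1);
        if (end > start)
            pages += (end - start) / PAGE_SIZE;
    }
    return pages;
}
```

A real allocator would push each of those pages onto its free list instead of counting them; anything not marked available never enters the pool.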

AndrewAPrice · Post by **AndrewAPrice** » Mon May 11, 2020 10:05 am

neon wrote:In other words, if the BIOS reports the region as available (or available+reclaimable), it is free for use and will not be hardware mapped.

Thanks!

I do feel that an inverted page table is necessary for allocating physical memory for DMA. That way, I can implement "grab me 16 KB under 16 MB" and my kernel can copy what is currently occupied out of the way.

nullplan · Post by **nullplan** » Mon May 11, 2020 11:45 am

AndrewAPrice wrote:I do feel that an inverted page table is necessary for allocating physical memory for DMA. That way, I can implement "grab me 16 KB under 16 MB" and my kernel can copy what is currently occupied out of the way.

Alternative idea: Since lower addresses are more valuable than higher addresses, prioritize giving away the higher addresses before you move on to the lower ones. That way, constrained allocations that need to be below some limit are more likely to find free memory in the area they need. Common limits on the PC include 1MB, 16MB, and 4GB.
Moving allocations is usually a really big hassle best avoided, especially when referential data structures enter the mix. Besides, changing a major part of the OS just to support ISA DMA, a rarely used thing these days, seems overkill to me.

Of course, this means you must change your physical allocator to one that can differentiate between these memory zones. Currently you only have a single stack allocator. But even just splitting that up into four different lists would do the trick.
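
The four-list idea can be sketched as one free-frame stack per zone, with allocation starting at the highest zone the caller can accept and falling back to lower ones only when it is empty. This is a rough sketch under assumptions: the zone names, `ZONE_CAPACITY`, `free_frame`, and `alloc_frame` are hypothetical, and the stacks are tiny fixed arrays purely for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Zones split at the common PC limits: 1MB, 16MB, 4GB. */
enum zone { ZONE_1M, ZONE_16M, ZONE_4G, ZONE_HIGH, ZONE_COUNT };

#define ZONE_CAPACITY 64u          /* hypothetical, tiny on purpose */
#define NO_FRAME      UINT64_MAX

static uint64_t zone_stack[ZONE_COUNT][ZONE_CAPACITY];
static size_t   zone_top[ZONE_COUNT];

/* Which zone does a page-aligned physical address belong to? */
static enum zone zone_of(uint64_t paddr)
{
    if (paddr < 0x100000ull)    return ZONE_1M;
    if (paddr < 0x1000000ull)   return ZONE_16M;
    if (paddr < 0x100000000ull) return ZONE_4G;
    return ZONE_HIGH;
}

/* Return a frame to the free stack of its zone. */
static int free_frame(uint64_t paddr)
{
    enum zone z = zone_of(paddr);
    if (zone_top[z] == ZONE_CAPACITY)
        return -1;
    zone_stack[z][zone_top[z]++] = paddr;
    return 0;
}

/* Allocate one frame from the highest non-empty zone at or below
   `highest`, so the scarce low zones are used last. */
static uint64_t alloc_frame(enum zone highest)
{
    for (int z = (int)highest; z >= 0; z--) {
        if (zone_top[z] > 0)
            return zone_stack[z][--zone_top[z]];
    }
    return NO_FRAME;
}
```

Ordinary allocations would call `alloc_frame(ZONE_HIGH)`, while a driver for an ISA DMA device would call `alloc_frame(ZONE_16M)`.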

AndrewAPrice · Post by **AndrewAPrice** » Mon May 11, 2020 12:04 pm

nullplan wrote:
AndrewAPrice wrote:I do feel that an inverted page table is necessary for allocating physical memory for DMA. That way, I can implement "grab me 16 KB under 16 MB" and my kernel can copy what is currently occupied out of the way.
Alternative idea: Since lower addresses are more valuable than higher addresses, prioritize giving away the higher addresses before you move on to the lower ones. That way, constrained allocations that need to be below some limit are more likely to find free memory in the area they need. Common limits on the PC include 1MB, 16MB, and 4GB.
Moving allocations is usually a really big hassle best avoided, especially when referential data structures enter the mix. Besides, changing a major part of the OS just to support ISA DMA, a rarely used thing these days, seems overkill to me.

Of course, this means you must change your physical allocator to one that can differentiate between these memory zones. Currently you only have a single stack allocator. But even just splitting that up into four different lists would do the trick.

Thanks for the advice!

I'm thinking of implementing my inverted page directory (in case I do need to move something out of the way, I can find out who is using it), and also populating my stack of free pages so the first to pop off are the highest addresses.

Then, I can have a system call (for drivers only) to allocate physical memory for DMA that takes the following parameters:

The maximum address (e.g. 1MB, 16MB, 4GB) that we'll start scanning backwards from.
The number of pages we need.
Alignment (e.g. it might have to be at 4 KB, 16 KB.)
Barrier that contiguous pages can't cross (e.g. 64 KB.)

This would let me implement it in a generic microkernel way that has no concept of the 1MB/16MB/4GB address limitations of the hardware we might be talking to, and by scanning backwards we can use the highest physical memory address supported by the device.
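
The four parameters above could drive a backwards scan roughly like the following. This is a minimal sketch, not the poster's implementation: it assumes a simple per-frame "used" array rather than a free-page stack, `dma_alloc`, `frame_used`, and `NUM_FRAMES` are hypothetical names, and `align` and `boundary` are assumed to be powers of two of at least one page (`boundary` may be 0 for "no barrier").

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE        4096ull
#define NUM_FRAMES       8192u        /* hypothetical: low 32 MiB only */
#define DMA_ALLOC_FAILED UINT64_MAX

static bool frame_used[NUM_FRAMES];

/* Find `pages` contiguous free frames that end at or below `max_addr`,
   start on an `align`-byte boundary, and do not cross a `boundary`-byte
   line. Scans downwards so the highest suitable block wins. */
static uint64_t dma_alloc(uint64_t max_addr, size_t pages,
                          uint64_t align, uint64_t boundary)
{
    uint64_t limit = (uint64_t)NUM_FRAMES * PAGE_SIZE;
    if (max_addr < limit)
        limit = max_addr;
    uint64_t bytes = (uint64_t)pages * PAGE_SIZE;
    if (pages == 0 || align < PAGE_SIZE || bytes > limit)
        return DMA_ALLOC_FAILED;

    /* Highest aligned start address that still fits below the limit. */
    uint64_t start = (limit - bytes) & ~(align - 1);
    for (;;) {
        uint64_t first = start / PAGE_SIZE;
        uint64_t end = first + pages;
        bool ok = true;
        /* Reject blocks whose first and last byte fall on
           different sides of a barrier. */
        if (boundary != 0 &&
            start / boundary != (start + bytes - 1) / boundary)
            ok = false;
        for (uint64_t f = first; ok && f < end; f++)
            if (frame_used[f])
                ok = false;
        if (ok) {
            for (uint64_t f = first; f < end; f++)
                frame_used[f] = true;
            return start;
        }
        if (start < align)
            return DMA_ALLOC_FAILED;   /* reached address 0 */
        start -= align;                /* try the next block down */
    }
}
```

For example, an ISA DMA buffer would be requested as `dma_alloc(0x1000000, 4, 0x1000, 0x10000)`: below 16MB, four pages, page aligned, not crossing a 64 KB line. Nothing in the function itself knows about ISA.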

OSwhatever · Post by **OSwhatever** » Mon May 11, 2020 6:48 pm

AndrewAPrice wrote:Thanks. I understand this part. How does the reverse mapping work?

The problem I'm thinking about is your graphics driver might ask for a physical address such as 0xB8000 or perhaps 0xE0000000 because this is where the frame buffer is mapped.

If another process has this physical page mapped into its virtual address space, we'd have to copy whatever was in this page to another page and update that process's page table, so this memory is now free for the driver, no? So, unless I want to do a slow scan of all page tables, I will need to build a reverse page table (that goes from physical page -> virtual page + owner PID)?

Or, is memory mapped IO guaranteed to not overlap physical RAM? (In which case, if 0xC0000000->0xFFFFFFFF was reserved for IO, a true 32-bit system without PAE couldn't access more than 3GB of RAM?)

I'm not sure I understand the question. In the system I proposed, the driver does not ask for physical pages but for a named physical area, which hides the need to deal with physical pages at all. This requires that some process or part of the kernel goes through the physical memory tables (using whatever method the system provides, like ACPI) and gives the areas names during boot. When the kernel grants another process a certain page, it of course deals with physical pages, but the user process will not see those. The kernel just looks up which physical page it is and maps that into the other user process's page table.
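
One way to picture the name + offset scheme is a kernel-side table that resolves a name to a physical range, with the result used only to fill in the caller's page table. This is a hedged sketch, not OSwhatever's actual code: `named_area`, `resolve_area`, the area names, and the addresses in the table are all illustrative assumptions.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* A physical area that was given a name during boot. */
struct named_area {
    const char *name;
    uint64_t    base;
    uint64_t    length;
};

/* Hypothetical table; a real system would fill it from firmware
   tables such as ACPI. The addresses are examples only. */
static const struct named_area areas[] = {
    { "framebuffer", 0xE0000000ull, 0x1000000ull },
    { "hpet",        0xFED00000ull, 0x1000ull    },
};

#define AREA_NOT_FOUND UINT64_MAX

/* Kernel-side lookup: turn name + offset into a physical address.
   The result is used to build the mapping and is never handed back
   to the user process. */
static uint64_t resolve_area(const char *name, uint64_t offset, uint64_t len)
{
    for (size_t i = 0; i < sizeof areas / sizeof areas[0]; i++) {
        if (strcmp(areas[i].name, name) != 0)
            continue;
        /* The requested window must fit inside the area. */
        if (len == 0 || len > areas[i].length ||
            offset > areas[i].length - len)
            return AREA_NOT_FOUND;
        return areas[i].base + offset;
    }
    return AREA_NOT_FOUND;
}
```

A driver would then ask to map, say, "framebuffer" at offset 0 and receive only a virtual address in return.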

AndrewAPrice wrote: Thanks!

I do feel that an inverted page table is necessary for allocating physical memory for DMA. That way, I can implement "grab me 16 KB under 16 MB" and my kernel can copy what is currently occupied out of the way.

This is the infamous x86 legacy problem that DMA only works below certain physical addresses. Linux, for example, solves this by having several pools (zones) of physical pages: one pool below 16MB for old ISA devices, for example, which is held back so that the OS uses pages from the other pools first. This is really horrible, but that's what you have to do in order to support x86 legacy HW. Here the driver would just ask "map me some ISA compatible pages", so again you don't really have to ask for specific physical pages.

I have still not come across any use case where user processes or user drivers (in a microkernel design) need to know the addresses of physical pages. It once occurred to me that a user process might need to own physical pages that aren't mapped in its virtual address space, but that was just a thought that I gave up on quickly. The problem is that with demand paging/swapping the kernel can grab a page right from under the user process's nose and later insert another physical page. This also relates a little to the L4 kernel model: if a user process starts to give pages to several user processes, the kernel really needs to go through all those processes when swapping out pages. Complicated, but that's what the tracking is there for, among other things.

Speaking of inverted page tables, they seem to be a thing of the past in most modern CPU architectures. They blew up during the 90s, and after that CPU designers went back to the old forward page tables again.

OSDev.org
