Novel modular kernel design

Korona · Post by **Korona** » Sat Oct 02, 2021 12:19 pm

Solar wrote:Evil hardware could just short out and fry your machine, drivers be damned. Focus on the problems you can solve. Malicious hardware is not on the OS to protect against.

It's not too difficult to imagine situations where silently extracting data (and sending it to a 3rd party) is more useful (and "evil") than frying a machine.

rdos · Post by **rdos** » Sat Oct 02, 2021 1:02 pm

Korona wrote:
Solar wrote:Evil hardware could just short out and fry your machine, drivers be damned. Focus on the problems you can solve. Malicious hardware is not on the OS to protect against.
It's not too difficult to imagine situations where silently extracting data (and sending it to a 3rd party) is more useful (and "evil") than frying a machine.

Agreed, and particularly network cards that can do this without going through the OS. And practically all modern network cards are based on PCIe. Also remember that most WiFi solutions are closed designs with no driver source. And they often are manufactured in China.

nexos · Post by **nexos** » Sat Oct 02, 2021 1:23 pm

The problem is, the OS can do nothing to protect itself against malicious hardware. Malicious hardware has to be installed either by the motherboard manufacturer, the user, or by a computer store. Simple solution: get your hardware and computers from reputable places.

Malicious drivers, OTOH, are much more of a threat. A malicious PCIe card can't magically find itself attached to a PCIe slot on the motherboard. The user installed it, more than likely. A malicious driver, however, could be maliciously installed on Linux / Windows / MacOS if the user found a way to circumvent their security systems.

Of course, more concerning is that ring 0 software other than the kernel installs drivers these days. Look at GRUB. There's an attack surface that would be even more tricky to recover from!

Or, the firmware. I like UEFI for a lot of its stuff, but allowing drivers to be loaded from the ESP and ROM is a disaster waiting to happen.

Hence, instead of worrying about what can't be controlled, OS developers should be worried about things that can be controlled. Although it appears there is still some software out of our control nowadays.

Ethin · Post by **Ethin** » Tue Oct 05, 2021 6:03 pm

nexos wrote:The problem is, the OS can do nothing to protect itself against malicious hardware. Malicious hardware has to be installed either by the motherboard manufacturer, the user, or by a computer store. Simple solution: get your hardware and computers from reputable places.

Malicious drivers, OTOH, are much more of a threat. A malicious PCIe card can't magically find itself attached to a PCIe slot on the motherboard. The user installed it, more than likely. A malicious driver, however, could be maliciously installed on Linux / Windows / MacOS if the user found a way to circumvent their security systems.

Of course, more concerning is that ring 0 software other than the kernel installs drivers these days. Look at GRUB. There's an attack surface that would be even more tricky to recover from!

Or, the firmware. I like UEFI for a lot of its stuff, but allowing drivers to be loaded from the ESP and ROM is a disaster waiting to happen.

Hence, instead of worrying about what can't be controlled, OS developers should be worried about things that can be controlled. Although it appears there is still some software out of our control nowadays.

That's what secure boot/measured boot/verified boot is designed to prevent. Granted, it isn't perfect, but what is? The idea is that if secure boot is enabled and the platform firmware isn't compromised, and the user uses their own keys (and nobody obtains the MS private key), the firmware won't load binaries that aren't signed. So you just enroll your own keys, sign the binaries you want your firmware to load using something like hashtool (I think it's called), and then you'll be fine. If an attacker tries to install a malicious driver on the ESP, FW won't load it until you sign it, or that's how it's supposed to work. Again, it isn't perfect, but it's a good start. The ROM is a totally different can of worms.
Edit: I think that measured boot takes care of the ROM part. If I understand it right, the idea with that is that the TPM "measures" all the firmware components based on hashes stored within the TPM PCRs while the platform is being initialized, and if any of the hashes don't match, the entire initialization process is halted. But again, I might be wrong about that.
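
For illustration, here is a minimal sketch of the TPM "extend" operation that measured boot is built on. A PCR cannot be set directly; each measurement is folded in as new = SHA256(old || digest), so the final value depends on every component and on the order they were measured in. As far as I know, the TPM itself only records these values; acting on a mismatch is left to something else (for example a secret sealed to the expected PCR values, or a remote verifier). The component names below are made up, and a SHA-256 PCR bank is assumed.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR bank (assumed for this sketch)


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold one measurement into a PCR: new = SHA256(old || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


def measure_chain(components):
    """Hash each component image and extend the PCR with its digest, in order."""
    pcr = bytes(PCR_SIZE)  # a PCR starts as all zeros after reset
    for image in components:
        pcr = extend(pcr, hashlib.sha256(image).digest())
    return pcr


# Hypothetical firmware components; the byte strings stand in for real images.
good = [b"option-rom-v1", b"boot-manager", b"kernel-loader"]
tampered = [b"option-rom-EVIL", b"boot-manager", b"kernel-loader"]

print(measure_chain(good).hex())
print(measure_chain(tampered).hex())
```

The two printed values differ, which is all the scheme needs: anything bound to the "good" PCR value will not match once a measured component has changed.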

nexos · Post by **nexos** » Tue Oct 05, 2021 7:08 pm

Secure Boot in theory is good, but the key got leaked, so secure boot, which really is the best they could do, isn't foolproof. But that's not the UEFI Forum's fault.

Ethin · Post by **Ethin** » Tue Oct 05, 2021 7:31 pm

nexos wrote:Secure Boot in theory is good, but the key got leaked, so secure boot, which really is the best they could do, isn't foolproof. But that's not the UEFI Forum's fault.

I edited my post so you might want to re-read it. But no, it's not the UEFI Forum's fault, and if you don't use Microsoft's keys and sign everything yourself, you'll avoid that issue entirely. (We're also running under the assumption that MS did not reissue their keys in, say, a Windows Update. I don't think MS is that stupid, and I'm pretty sure they did, in fact, reissue their key in a Windows Update; Windows Updates can and do sometimes modify the firmware -- some updates even update the firmware itself.) That's why BZT and I always vigorously disagreed on this topic; I don't believe that, just because MS's key was leaked and Secure Boot doesn't protect the entire boot process, it's somehow a security risk or shouldn't be used. It's a very good feature, and it works exactly as intended, because it prevents unsigned binaries from being loaded by firmware protocols/boot services. It doesn't prevent those binaries from causing havoc once they've been executed, because that's pretty much impossible to do without instrumenting every loaded binary, which would be not only incredibly complicated but also very impractical (can you imagine the firmware trying to instrument something like the Windows kernel or boot manager?).

eekee · Post by **eekee** » Mon Dec 27, 2021 6:34 am

StudlyCaps wrote:This system is not without precedent, for many years now graphics drivers have contained compilers which generate machine code from shader and parallel computing languages for uploading and execution on the GPU. These use a restricted language and similarly code which cannot be expressed as operations of the GPU is not recognized as a valid program.

This is/was not enough to prevent Intel graphics hardware from being exploited to run code (presumably bitcoin mining) which continued to cripple graphics performance long after the exploiting process (a web script) had been terminated. As with signing, it's not a complete security solution.

nullplan wrote:Driver signing will exclude malicious drivers, yes, but not insecure ones.

Remember Sony's rootkit! If you accept driver code from 3rd-party vendors, you are at some degree of risk of receiving malicious code. Note that it's not necessarily possible to tell the difference. If the code you received is vulnerable, there's no way to tell if the vulnerability was an honest mistake or a deliberate backdoor.

I have a couple of positive ideas, but they're rather outside the code space. Code auditing would catch some vulnerabilities, and signing may guarantee that only audited code will run. Contracting suppliers to produce secure drivers with penalties for vulnerabilities found may also help, if you can get any suppliers to sign such a contract.

I've actually been thinking of requiring all code be distributed (and perhaps even stored) in a high level language. Compilation can be extremely fast if you don't need excessive optimization and many tasks can be done with remarkably little code. I mean to make the experiment and see how it goes. Also, it's the cheap and easy way with a native Forth.

h0bby1 wrote:Additionally you have things like vx32 or 64 bits equivalent to do machine code parsing/disassembling, to check memory access and code path, and filter instructions.

Uh... vx32 itself is specific to the x86_32 since it requires a segment register trick, but I don't think your idea requires it. Machine code parsing/disassembling may be done on any architecture. However, what happens if memory access is computed and your checks can't figure out what it might be computed to? It's basically the class of problem which gets lumped under the halting problem with the claim it won't work, but here's something which could work for this specific case: Memory address computations may be altered to clamp the results to safe values. If available, MIN and MAX instructions seem the obvious choice, but may ruin the computation if you haven't evaluated it correctly. (What happens if separate computations require subsequent addresses, all of which are clamped to a specific maximum?) Another route would be bit-wise AND as the fastest possible instruction to limit the range, and then add a base address. Out of bounds access then wraps, which may be easier to debug. This is what I was thinking of doing with all untrusted code in my Forth; modifying Forth's relatively few memory load and store instructions to do this before compiling.
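
The AND-then-add-base idea can be sketched in a few lines. This is only a model of the address arithmetic, not of any real Forth: the base address, region size, and the `Sandbox` class are all made up for illustration, and the region size has to be a power of two for the mask to work.

```python
REGION_BASE = 0x4000_0000      # hypothetical base of the untrusted region
REGION_SIZE = 0x0001_0000      # 64 KiB; must be a power of two
REGION_MASK = REGION_SIZE - 1  # low bits that survive the AND


def confine(addr: int) -> int:
    """Force any computed address into [REGION_BASE, REGION_BASE + REGION_SIZE)."""
    return REGION_BASE + (addr & REGION_MASK)


class Sandbox:
    """Toy memory whose load/store go through confine(), like a patched @ and !."""

    def __init__(self):
        self.mem = bytearray(REGION_SIZE)

    def store(self, addr: int, value: int) -> None:
        self.mem[confine(addr) - REGION_BASE] = value & 0xFF

    def load(self, addr: int) -> int:
        return self.mem[confine(addr) - REGION_BASE]


box = Sandbox()
box.store(REGION_SIZE + 5, 0xAB)  # out of range: wraps to offset 5
print(hex(confine(REGION_SIZE + 5)), hex(box.load(5)))
```

Note that the out-of-range store does not fault; it silently lands at offset 5 inside the region. That is the "wraps" behaviour: the untrusted code can only corrupt its own memory.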

nullplan wrote:
rdos wrote:Actually, you don't even need to be able to install a software driver into the system. Any PCIe device could start writing anywhere in memory, including taking over or taking down the operating system. Driver signing, JIT compilation, and user account models won't help a bit to fix this.
OK, evil hardware is a whole different can of worms. The OS is basically powerless against it, so only install trustworthy hardware! That's why Thunderbolt now has an authorization procedure. Thunderbolt is basically exporting PCIe to an external interface.

So IOMMU will do nothing to stop this? Ouch! :/

nullplan wrote:But I was talking about well behaved hardware and evil drivers. That's what a restriction on the drivers (like running them at lower privilege) is supposed to consider.

Yeah, but again, remember Sony's rootkit! Sony are a hardware manufacturer first and foremost, and they're absolutely one of the companies most people would think of as reputable. "Only install trustworthy hardware" is another non-code solution.

rdos wrote:In the area of viruses, trojans, and indeed evil drivers & hardware too, the biggest risk factor is how widespread an OS is. The creators obviously won't bother to attack a hobby OS that almost nobody uses, while Windows, Linux, and Macs will be primary targets. I also don't think building something superior in this area would make your OS popular, and so this is a poor argument. Generally speaking, all these protective measures create obstacles for users, and so are not popular. If your OS is not popular, it won't become widespread.

Mostly true, but if you have a specific enemy and/or enough money is involved, popularity won't matter.

rdos wrote:Another thing is that filesystem-based protection measures are not very hard to break if you have a mature OS. For instance, I plan to do ext and ntfs drivers that just ignore the ACLs and allow the OS to read & write anything it likes. Would be a fun way to spy on closed systems I cannot log in to.

So you as an OS supplier are planning to ship malware now?

Resist the temptation! (says a guy who eats far too much chocolate and bacon.)

rdos wrote:Besides, you can potentially stop a PCIe device from doing anything it likes on PCI (disable bus mastering), but I fear this is just a software setting that an evil device could ignore too. Then you could selectively enable bus mastering on trusted devices only.

I'd love an authoritative statement on whether the bus mastering setting is software or not.

Octocontrabass · Post by **Octocontrabass** » Mon Dec 27, 2021 1:17 pm

eekee wrote:So IOMMU will do nothing to stop this? Ouch! :/

The IOMMU is exactly how you defend against this type of attack. In Windows, it's called Kernel DMA Protection.

eekee wrote:I'd love an authoritative statement on whether the bus mastering setting is software or not.

I see no reason why malicious hardware couldn't ignore that bit. However, this setting also exists on bridge devices, so you can use it to block downstream devices (provided the bridge is not also malicious).
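
To make the bridge point concrete, here is a toy model. The Command register sits at offset 0x04 of PCI configuration space and Bus Master Enable is bit 2; on a bridge, clearing that bit stops it from forwarding requests from downstream devices toward memory. Everything else here (the `Node` class, the `honors_bme` flag) is invented for the sketch, and bridges are assumed to be honest.

```python
PCI_COMMAND_OFFSET = 0x04        # Command register offset in config space
PCI_COMMAND_MEMORY = 1 << 1      # Memory Space Enable
PCI_COMMAND_BUS_MASTER = 1 << 2  # Bus Master Enable


def clear_bus_master(command: int) -> int:
    """Return the 16-bit Command value with Bus Master Enable cleared."""
    return command & ~PCI_COMMAND_BUS_MASTER & 0xFFFF


class Node:
    """Toy device or bridge; 'honors_bme' is a made-up flag for this model."""

    def __init__(self, name, parent=None, honors_bme=True):
        self.name = name
        self.parent = parent          # upstream bridge, or None at the root
        self.honors_bme = honors_bme  # False models hardware ignoring its own bit
        self.command = PCI_COMMAND_MEMORY | PCI_COMMAND_BUS_MASTER


def dma_reaches_memory(device: Node) -> bool:
    """Would a DMA request issued by this device get past every bridge above it?"""
    # A well-behaved device stops issuing requests once its own bit is clear.
    if device.honors_bme and not (device.command & PCI_COMMAND_BUS_MASTER):
        return False
    # Each (assumed honest) bridge only forwards upstream if its bit is set.
    bridge = device.parent
    while bridge is not None:
        if not (bridge.command & PCI_COMMAND_BUS_MASTER):
            return False
        bridge = bridge.parent
    return True


root_port = Node("root-port")
evil = Node("evil-card", parent=root_port, honors_bme=False)

evil.command = clear_bus_master(evil.command)
print(dma_reaches_memory(evil))  # True: the card ignores its own bit

root_port.command = clear_bus_master(root_port.command)
print(dma_reaches_memory(evil))  # False: blocked at the bridge
```

The catch is granularity: clearing the bit on a bridge cuts off every device behind it, trusted or not.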

rdos · Post by **rdos** » Tue Dec 28, 2021 7:55 am

Octocontrabass wrote:
eekee wrote:So IOMMU will do nothing to stop this? Ouch! :/
The IOMMU is exactly how you defend against this type of attack. In Windows, it's called Kernel DMA Protection.

Kernel DMA protection? How is that supposed to work?

I'm pretty confident that Windows cannot detect that my malicious FPGA board is reading the whole Windows image, and then patch it to take control of Windows.

While the OS will typically provide memory schedules that a device is supposed to use (these could be checked), the OS cannot stop the PCIe device from reading & writing whatever it likes.

Microsoft's description of kernel DMA protection:
https://docs.microsoft.com/en-us/window ... hunderbolt

This doesn't look like anything that would stop my malicious FPGA from collecting whatever information it wants.
The solution appears to be software driver based, and therefore can do nothing about evil hardware that acts on its own.

Octocontrabass · Post by **Octocontrabass** » Tue Dec 28, 2021 12:13 pm

rdos wrote:Kernel DMA protection? How is that supposed to work?

The IOMMU controls which memory the device is allowed to access using DMA. It can be configured to block all DMA.
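
Roughly, the check looks like the toy model below. Real IOMMUs (Intel VT-d, AMD-Vi) use multi-level tables in memory, selected by the requester's bus/device/function, so treat the dictionary layout and the names `ToyIommu`, `map_page` and `translate` as simplifications made up for this sketch. The point is that the lookup happens in the chipset on every DMA request, not in a driver, so a device with no mapping gets nothing.

```python
PAGE_SHIFT = 12                      # 4 KiB pages
PAGE_OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class ToyIommu:
    """Per-device I/O page tables: device -> {io_page: (phys_page, writable)}."""

    def __init__(self):
        self.tables = {}

    def map_page(self, device, iova, phys, writable=False):
        """Allow 'device' to reach one physical page through address 'iova'."""
        table = self.tables.setdefault(device, {})
        table[iova >> PAGE_SHIFT] = (phys >> PAGE_SHIFT, writable)

    def translate(self, device, iova, write=False):
        """Return the physical address for a DMA request, or None if it is blocked."""
        entry = self.tables.get(device, {}).get(iova >> PAGE_SHIFT)
        if entry is None:
            return None              # no mapping: the request faults
        phys_page, writable = entry
        if write and not writable:
            return None              # read-only mapping: write is rejected
        return (phys_page << PAGE_SHIFT) | (iova & PAGE_OFFSET_MASK)


iommu = ToyIommu()
iommu.map_page("nic", iova=0x1000, phys=0x80000, writable=True)

print(iommu.translate("nic", 0x1234, write=True))  # inside the NIC's one mapped page
print(iommu.translate("nic", 0x2000))              # unmapped page: None
print(iommu.translate("fpga", 0x1000))             # device with no table at all: None
```

Blocking all DMA for a device is then just the case where its table is empty.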

rdos wrote:I'm pretty confident that Windows cannot detect that my malicious FPGA board is reading the whole Windows image, and then patch it to take control of Windows.

Kernel DMA Protection only applies to unauthorized devices. By default, devices are authorized when a user logs in.

It might also be limited to Thunderbolt and M.2. As far as I know, your FPGA board is an ordinary PCIe card.

rdos · Post by **rdos** » Tue Dec 28, 2021 3:19 pm

Octocontrabass wrote:
rdos wrote:Kernel DMA protection? How is that supposed to work?
The IOMMU controls which memory the device is allowed to access using DMA. It can be configured to block all DMA.

rdos wrote:I'm pretty confident that Windows cannot detect that my malicious FPGA board is reading the whole Windows image, and then patch it to take control of Windows.
Kernel DMA Protection only applies to unauthorized devices. By default, devices are authorized when a user logs in.

It might also be limited to Thunderbolt and M.2. As far as I know, your FPGA board is an ordinary PCIe card.

Yes, Thunderbolt is a kind of controller on top of PCIe, much like M.2, USB & AHCI. Therefore, the OS has a driver for Thunderbolt which can authorize devices and block their operation.
Link:
https://developer.apple.com/library/arc ... 38-CH1-SW1

The case for a PCIe device is very different. It has a vendor Id & product Id, but a PCIe device can return any Ids it likes. The devices will typically be enumerated by BIOS, but even if the OS redoes the enumeration and assigns other memory windows, it cannot know that a PCIe device is evil and refuse to enable it. As soon as the PCIe device is enabled, it can potentially start to access system memory using the PCIe bus.

I don't think Kernel DMA protection can ever fix this.

As for IOMMU, isn't this the special area assigned to the BARs? The OS will assign these to physical memory and link them to the PCIe device during enumeration, but that area is not very interesting. Few PCIe devices will use it for anything other than configuration since it is slow & burdensome to implement in devices.

Instead, the PCIe device can start its own PCIe cycles that can access any location in system memory. Maybe Thunderbolt or the PCI hardware has filters for this, although it seems to be hard to implement these in real-time given the speed of the bus. Actually, PCIe devices can also generate interrupts to CPU cores, and can use this to take over cores. It can also access other PCI devices.

Octocontrabass · Post by **Octocontrabass** » Tue Dec 28, 2021 4:57 pm

rdos wrote:Yes, Thunderbolt is a kind of controller on top of PCIe, much like M.2, USB & AHCI. Therefore, the OS has a driver for Thunderbolt which can authorize devices and block their operation.

I'm not clear on the details, but I'm pretty sure some Thunderbolt controllers allow PCIe hotplug without waiting for the OS driver to authorize the attached PCIe-over-Thunderbolt device.

rdos wrote:As for IOMMU, isn't this the special area assigned to the BARs?

No, that's MMIO.

rdos wrote:Instead, the PCIe device can start its own PCIe cycles that can access any location in system memory. Maybe Thunderbolt or the PCI hardware has filters for this, although it seems to be hard to implement these in real-time given the speed of the bus.

The IOMMU is the hardware filter.

rdos wrote:Actually, PCIe devices can also generate interrupts to CPU cores, and can use this to take over cores. It can also access other PCI devices.

Some IOMMUs can filter interrupts and peer-to-peer DMA too.

eekee · Post by **eekee** » Fri Jan 07, 2022 3:59 am

rdos wrote:the PCIe device can start its own PCIe cycles that can access any location in system memory. Maybe Thunderbolt or the PCI hardware has filters for this, although it seems to be hard to implement these in real-time given the speed of the bus.

That's what I was afraid of. From my grounding in old hardware, it seems trivial to block access to all but a limited address range, but I'm told the problem with RAM performance isn't the memory itself, it's decoding the address. Adding an extra step to that decoding would be a killer. But... what if an extra step isn't needed? What if the upper part of the address is decoded in parallel, and the result used to gate access to physical memory? Decoding only the upper part allows the safety circuit to get a match ahead of the RAM's decoding, and the only catch is a minimum region size, much like pages. But again, I'm assuming this could hook into existing gating of the chip/module select lines without incurring further delay. There's a possibility I may be wrong about that.

Octocontrabass · Post by **Octocontrabass** » Fri Jan 07, 2022 3:55 pm

eekee wrote:What if the upper part of the address is decoded in parallel, and the result used to gate access to physical memory?

You've basically described the IOTLB. Unfortunately, the IOTLB is usually very small, so it's not especially helpful unless you're frequently performing DMA to the same set of memory or performing DMA that matches the IOMMU's TLB prefetch.
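
A toy model shows why the small size matters. The sketch below assumes an LRU replacement policy and a 4-entry cache purely for illustration; real IOTLB sizes and policies vary and are often undocumented. `ToyIotlb`, `run` and the identity "page walk" are all made up.

```python
from collections import OrderedDict


class ToyIotlb:
    """Tiny translation cache with LRU replacement (policy assumed, not documented)."""

    def __init__(self, entries):
        self.entries = entries
        self.cache = OrderedDict()  # io_page -> phys_page, oldest first
        self.hits = 0
        self.misses = 0

    def lookup(self, page, walk):
        """Translate 'page', calling the slow walk() only on a miss."""
        if page in self.cache:
            self.cache.move_to_end(page)  # mark as most recently used
            self.hits += 1
            return self.cache[page]
        self.misses += 1
        frame = walk(page)                # stands in for the page-table walk
        self.cache[page] = frame
        if len(self.cache) > self.entries:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return frame


def run(tlb_entries, working_set, rounds):
    """DMA to 'working_set' pages in a loop; return (hits, misses)."""
    tlb = ToyIotlb(tlb_entries)
    for _ in range(rounds):
        for page in range(working_set):
            tlb.lookup(page, walk=lambda p: p)  # identity mapping for the demo
    return tlb.hits, tlb.misses


print(run(tlb_entries=4, working_set=4, rounds=3))  # fits in the cache
print(run(tlb_entries=4, working_set=5, rounds=3))  # one page too many
```

With a working set that fits, only the first pass misses. With one page more than the cache holds, a cyclic access pattern under LRU evicts each entry just before it is needed again, so every access pays for a full walk.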

PopcornKirin · Post by **PopcornKirin** » Mon Jun 09, 2025 1:29 am

Regarding the customized bytecode part, not sure if anyone brought it up, but there's already one called BPF and it has been used extensively in Linux.

It can communicate with the Linux kernel safely under the eBPF design, and there is already some work on using eBPF to offload or design network stacks. I think it is pretty much what you proposed. However, I really doubt that it would solve the problem, because it is very hard for any kind of static analyzer to check (or even predict) runtime values under a relaxed type system. The only way off the top of my head is to use formal proof methods, but that's off topic for OS design.
