Sik wrote:...I'm surprised we aren't encouraging programmers to optimize their code as much as possible for slower CPUs...
"We" are doing that. "They" aren't, because what's selling is moar features, moar eye candy, moar FPS, semi-transparent windows, inlined videos, ...
Also, companies are more willing to add 50 middlewares to do simple things; in their philosophy, using bloatware is the "correct way".
Think about RN (React Native): why would you want to run JavaScript through Java glue code, which in turn runs inside a JVM, on a crappy mobile processor? That would have been crazy talk back in the old days.
Well, to be quite honest, I vividly remember my first x86 laptop, purchased back in 1999, which had just enough CPU oomph (PIII @ 500MHz) to play back a DVD without stuttering -- if I didn't do anything else.
(I had Amigas before, and while being fine machines in their own right, playing back DVDs was quite outside their range except for the very latest PowerPC-driven ones.)
We do want HD quality today, going forward to 4k and 3D, and we do want it stutter-free while we're doing something else in another window.
So it's not as if all that CPU power were unnecessary if not for bloatware.
But we're getting sidetracked here. Let's keep this focussed on Meltdown / Spectre and their mitigation.
Every good solution is obvious once you've found it.
Meltdown and Spectre are probably just the beginning of a public effort to use any measurable hardware effect as a programming interface, even from something as simple as a microcontroller.
If that's the case, using paging to leak data will probably turn out to be only one of the weaker ways to gather data from a system. It arguably isn't the fault of the architecture implementation either, beyond performing temporary operations that leave measurable traces before checking that the operation is allowed (such as loading pages before verifying that the privilege level permits access to the operands). And there will probably always be ways to infer valuable things, ways that only get easier as processors become more advanced.
The real task here is dealing with hardware effects as if they were input/output gates to manipulate. If electronics always prove observable enough for software to extract any desired data, then the best choice will be to isolate a machine, one way or another, from code designed for that, because there would always be a way to measure things electronically from software.
Things like this will always be usable if privileged malware is installed.
Solar wrote:Well, to be quite honest, I vividly remember my first x86 laptop, purchased back in 1999, which had just enough CPU oomph (PIII @ 500MHz) to play back a DVD without stuttering -- if I didn't do anything else.
(I had Amigas before, and while being fine machines in their own right, playing back DVDs was quite outside their range except for the very latest PowerPC-driven ones.)
We do want HD quality today, going forward to 4k and 3D, and we do want it stutter-free while we're doing something else in another window.
So it's not as if all that CPU power were unnecessary if not for bloatware.
And GPUs are better equipped for that task nowadays (while in 1999 they were still using a fixed pipeline or very primitive shaders). For that kind of heavy task it's actually better not to use the CPU at all.
Solar wrote:But we're getting sidetracked here. Let's keep this focussed on Meltdown / Spectre and their mitigation.
It's sorta related, seeing as the big issue is that it's a side-effect of speculation and whether we could live without it if really needed.
Reminds me, I'm seeing talks about there being patches to protect against Spectre. I thought it was unfixable? What do those patches do? (besides browser updates to mitigate the javascript proof of concept)
They do what I posted on page 2: replacing indirect jumps with a sequence that loops forever (or discards speculative state) when it is reached by speculative execution. Of course, this comes at a performance cost and should only be enabled in programs that are expected to be exposed to many attacks. Here is a discussion of this "retpoline" mitigation: Click me!. Similar techniques exist to mitigate the bounds-check exploit.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
bluemoon wrote:Seems the issue not only affect OS, but drivers as well, anyone knows the detail?
Spectre affects almost all software, including drivers. KPTI (and the equivalent methods employed by Windows and other OSes) only defends against Meltdown.
bluemoon wrote:Seems the issue not only affect OS, but drivers as well, anyone knows the detail?
Spectre affects almost all software, including drivers. KPTI (and the equivalent methods employed by Windows and other OSes) only defends against Meltdown.
In reality, Spectre only affects software that does its own code isolation (e.g. Chrome isolating JS, or isolation between modules that don't have the "data pointer"). A single program can access all of its own memory anyway.
That makes me wonder: what has been isolated by the GPU drivers? Or were they just recompiled with an updated toolchain to be safe?
bluemoon wrote:Seems the issue not only affect OS, but drivers as well, anyone knows the detail?
Spectre affects almost all software, including drivers. KPTI (and the equivalent methods employed by Windows and other OSes) only defends against Meltdown.
In reality, Spectre only affects software that does its own code isolation (e.g. Chrome isolating JS, or isolation between modules that don't have the "data pointer"). A single program can access all of its own memory anyway.
That makes me wonder: what has been isolated by the GPU drivers? Or were they just recompiled with an updated toolchain to be safe?
That is not exactly true. If software can be manipulated into executing the needed instructions (which is definitely a risk for any software that takes input, particularly software that takes external input), it is possible to use external software to do the cache timing measurements.
This means that any software that takes some form of external input is vulnerable, which is problematic for things like windowing servers and kernels (including their drivers if they take input directly enough, hence the GPU driver updates), which have access to sensitive information.
From what I understand, Google has suggested and implemented some compiler tricks that make the resulting output code less likely to contain instruction sequences that can be exploited for a Spectre attack, but I am unsure of exactly how it works.
One thing I'm trying to wrap my head around is how a userspace program can even attempt to make an access to a kernel space address. All a userspace program can do is access a VA (except through syscall I guess). Does it rely on the VA translating into kernel space PA before the mismatch in address space (user VA hits a TLB entry that has a valid kernel space VA->PA translation)?
How does the userspace program attempt to access a specific PA space address?
Yeah, the kernel's virtual addresses are still mapped. They can't be properly accessed (they're limited to ring 0, so a normal access from userspace code faults), but the issue is that the CPU knows how to map them to physical addresses, and on Intel CPUs speculated code can load them, presumably on the assumption that if the access should have been forbidden it wouldn't matter, since the speculated results would just be tossed away. Except the access affects the cache state, which affects its timing, and that is what Meltdown probes to leak the data. The patches against Meltdown change the page tables so that the kernel is not mapped in even that way (at the expense of wasting more cycles).
AMD CPUs don't attempt to speculate across privilege boundaries so they're safe from that (but still vulnerable to Spectre which doesn't need a privilege violation). Other CPUs may or may not do it.
The only one I am aware of is Google's retpoline construction. It is designed to effectively inhibit branch prediction on specific jumps; details can be found here: https://support.google.com/faqs/answer/7625886.
As far as I know, 'jmp reg' is always treated as unpredicted (maybe I am wrong, but it definitely WAS in the past).
Is it possible to replace all indirect jumps with this to mitigate the issue Korona describes: