Hi,
btbx wrote:If someone want to create new operating system kernel (ie for Linux, BSD), is there any difference between Kernel source code for:
1. A motherboard with 2 sockets and 2 cores.
2. A motherboard with 2 sockets and 4 cores (2 cores + 2 cores)
3. A motherboard with 2 sockets and 3 cores (2 cores + 1 core)
4. A motherboard with 1 socket and 4 cores.
For all possible 80x86 computers (with any mixture of hyper-threading, multi-core, multi-socket and NUMA) an OS can treat all logical CPUs as identical with no special treatment. It's only when you start optimizing that things get tricky.
For NUMA, the main thing to optimise is memory accesses. For example, you might want to use processor affinity to restrict processes to NUMA domains, and then allocate physical pages for those processes that are "close" to the CPU/s in that NUMA domain. The idea is to reduce the chance of processes accessing memory that isn't "close". There are other things you can do too (e.g. load balancing, and optimizing the kernel's memory usage).
For hyper-threading, the load on one logical CPU affects the performance of other logical CPUs in the same core. Because of this you might want to optimise CPU usage. For example, if you've got 2 cores with 2 logical CPUs per core and 2 of the logical CPUs are idle, then it's better to have one logical CPU per core idle than it is to have both logical CPUs in one core idle and both logical CPUs in another core busy.
In all cases, you might prefer to run threads on the same CPU that they ran on last time, as the thread's data might still be in that CPU's cache.
For both hyper-threading and multi-core, some caches may be shared between CPUs. For example, with a multi-core CPU that supports hyper-threading the L1 caches might be shared by logical CPUs in the same core and the L3 cache might be shared by all logical CPUs in the chip. In this case you don't necessarily need to run the thread on the same CPU that it ran on last time to get the same benefits from cached data - you might get the same advantage running the thread on a different logical CPU in the same core (or any CPU in the chip). In a similar way, there may also be advantages in running different threads that belong to the same process on the same chip or core (so that several threads running on separate logical CPUs in the same chip can share data in that chip's cache). The same thinking can apply to IPC (it might be better for threads that communicate with each other to run on the same core or chip).
Of course there's other things you could do in certain situations, like not bothering to set MTRRs on all CPUs (the MTRRs are shared for hyper-threading, so you only need to set one logical CPU's MTRRs in each core and can skip the other logical CPUs), or optimizing IRQ handling.
Also, I'm not sure if any operating system actually does all of the above (I'd assume most might do some of it).
Cheers,
Brendan