SYSCALL performance on multiple cores
Posted: Wed Nov 18, 2015 6:45 pm
Hi, I have four 1.90 GHz twelve-core processors, for a total of 48 cores at 1.90 GHz.
It appears that the cores are in groups of 8, and when SYSCALL is executed at the same time on several cores within the same group, the throughput is shared between them.
I am in 64-bit long mode.
I am testing SYSCALL and SYSRET performance.
I am calling SYSCALL from user code in a loop; the kernel's SYSCALL handler increments a counter for that CPU and returns.
If I have only Core #0 running, it processes 10.6 million SYSCALLs per second.
If I have Core #0 and Core #8 running, each processes 10.6 million per second.
The same goes for Core #0, Core #8, Core #16, Core #24, Core #32 and Core #40: all run at 10.6 million per second.
That gives me around 60 million per second in total.
But if I run Core #0 and Core #1 together, both run at 5.1 million per second.
It appears as if the cores are grouped: 0..7, 8..15, 16..23, 24..31, 32..39, 40..47.
The performance drops as more cores in a group are added. If Core #0 through Core #7 are all running, each manages 700,000 per second.
For the 0-7 group this gives me around 10.2 million, very close to the single-core figure of 10.6 million.
If Core #0 through Core #7 are running as above at 700K per second and only Core #8 is running in the #8 to #15 group, Core #8 still runs at 10.6 million per second.
So it appears that the cores are grouped and that one group does not affect the others.
Any ideas on what the system is doing?
Many thanks. Alistair