[SOLVED] How IDs are assigned to cpu cores?

nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

[SOLVED] How IDs are assigned to cpu cores?

Post by nop »

Hi everyone!!!
I'll go straight to the point: are there any assumptions I can make about the way the BIOS assigns IDs to CPU cores?
Do they always increase by one, or should I expect a different assignment scheme on certain systems? For example, will a blade server with four multicore CPUs have IDs running 0, 1, 2, ..., n, or might I find something different?

This apparently innocent question can have a deep impact on code design, because I feel "unsafe" using the CPU ID as an index into a per-CPU data array...

As always thank you in advance!!!

Regards, Teo

P.S.: have nice holidays!!!
Last edited by nop on Wed Dec 26, 2012 11:42 am, edited 1 time in total.
OS development is the intelligent alternative to drugs
gravaera
Member
Posts: 737
Joined: Tue Jun 02, 2009 4:35 pm
Location: Supporting the cause: Use \tabs to indent code. NOT \x20 spaces.

Re: How IDs are assigned to cpu cores?

Post by gravaera »

Yo:

Logical CPUs do not need to be numbered sequentially. The only assumption you can make is that they will be unique (and you can panic if they are not).

--Peace out,
gravaera.
17:56 < sortie> Paging is called paging because you need to draw it on pages in your notebook to succeed at it.
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

Re: How IDs are assigned to cpu cores?

Post by nop »

gravaera wrote:Logical CPUs do not need to be numbered sequentially
This is what I expected, because I hadn't read anything to the contrary (but I hoped for a different answer :D)... Now I'll have a nice time figuring out how to map a hardware CPU ID to a logical CPU number with a constant-time algorithm...

Thank you!!!

Teo
sounds
Member
Posts: 112
Joined: Sat Feb 04, 2012 5:03 pm

Re: How IDs are assigned to cpu cores?

Post by sounds »

Logical CPU IDs can be anything, so watch out for:
  1. Do not assume every logical CPU ID is used
  2. Do not assume a certain logical CPU ID is always used. Logical CPU 0 for example is not used for some AMD Trinity chips.
On the other hand, here are some assumptions that you can start from:
  1. From the type of Local APIC you can determine the maximum logical CPU ID - for instance, 4-bit CPU IDs have a maximum of 15; 8-bit CPU IDs have a maximum of 255, etc.
  2. Logical CPU IDs tend to be in groups, though there is no guarantee. Try to write code that performs well on CPU IDs where the IDs are grouped together (e.g. 193-196 for an example 4-core system), then don't worry about performance or memory wasted for completely random CPU IDs. But do take care to write code that operates correctly on any random CPU IDs.
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

Re: How IDs are assigned to cpu cores?

Post by nop »

sounds wrote:But do take care to write code that operates correctly on any random CPU IDs.
This is exactly what I've done so far (I wake up all the APs reported by ACPI, without distinction).

The problem arises, for example, when the core #123 wants to access a per-cpu structure and there are 4 cores in the system: I need a conversion from the hardware id to a sort of logical-id which can be used for example as an index of an array.

And I need this to be fast and constant in time for all the cores: I cannot permit a cpu to access cached data faster than another (by cached data I mean, for example, array of free pages etc etc)
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: How IDs are assigned to cpu cores?

Post by bluemoon »

nop wrote:The problem arises, for example, when the core #123 wants to access a per-cpu structure and there are 4 cores in the system: I need a conversion from the hardware id to a sort of logical-id which can be used for example as an index of an array.
For a small number of cores, you may just assume there are 256 cores and only 4 are usable (the 252 non-existent cores are "faulty / not usable").
For more cores, the whole kernel should be redesigned, and this lookup is the least of your concerns.
nop wrote:And I need this to be fast and constant in time for all the cores: I cannot permit a cpu to access cached data faster than another (by cached data I mean, for example, array of free pages etc etc)
Many factors affect the actual timing, including temperature, hyper-threading, etc.; you cannot ensure the cores run at the same speed even when they are executing the same instructions.
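
To illustrate the "assume 256 cores" suggestion, here is a minimal sketch of a directly indexed table that maps a hardware APIC ID to a dense index. It assumes 8-bit (xAPIC) IDs; the names cpu_map_init, cpu_map_register and cpu_map_lookup are hypothetical and not taken from anyone's kernel in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_APIC_ID 256            /* assumption: 8-bit xAPIC IDs, 0..255 */

/* One slot per possible hardware ID; -1 marks "no such CPU". */
static int16_t apic_to_index[MAX_APIC_ID];
static unsigned cpu_count;

void cpu_map_init(void)
{
    for (unsigned i = 0; i < MAX_APIC_ID; i++)
        apic_to_index[i] = -1;
    cpu_count = 0;
}

/* Called once per CPU found at boot. Returns the dense index assigned to
 * this CPU, or -1 if the ID was already registered (IDs must be unique,
 * so a real kernel would probably panic here). */
int cpu_map_register(uint8_t apic_id)
{
    if (apic_to_index[apic_id] != -1)
        return -1;
    assert(cpu_count < MAX_APIC_ID);
    apic_to_index[apic_id] = (int16_t)cpu_count;
    return (int)cpu_count++;
}

/* One array load, whatever the ID is. Returns -1 for an absent CPU. */
int cpu_map_lookup(uint8_t apic_id)
{
    return apic_to_index[apic_id];
}
```

The table costs 512 bytes no matter how sparse the IDs are, and the returned index can then be used for any per-CPU array sized by the number of CPUs actually present.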
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: How IDs are assigned to cpu cores?

Post by Brendan »

Hi,
bluemoon wrote:
nop wrote:And I need this to be fast and constant in time for all the cores: I cannot permit a cpu to access cached data faster than another (by cached data I mean, for example, array of free pages etc etc)
Many factors affect the actual timing, including temperature, hyper-threading, etc.; you cannot ensure the cores run at the same speed even when they are executing the same instructions.
I'm wondering if the original poster meant "constant time" in the "O(1)" sense (e.g. a constant number of operations, where an operation may take a variable length of time) rather than "constant time" in the literal sense (e.g. every case as slow as the worst case).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

Re: How IDs are assigned to cpu cores?

Post by nop »

bluemoon wrote:For a small number of cores, you may just assume there are 256 cores and only 4 are usable (the 252 non-existent cores are "faulty / not usable").
For more cores, the whole kernel should be redesigned, and this lookup is the least of your concerns.
Since I make no assumptions and my LAPIC id is 32 bits wide (x2APIC mode enabled) I'm taking care of the worst case: luckily I don't have to redesign the whole kernel (in part because I don't have a whole kernel :D ) but I have two options:
  • coding a hash table (directly addressed, memory consuming but I don't think I'll have 65k processors in the system)
  • use some trick
bluemoon wrote:Many factors affect the actual timing, including temperature, hyper-threading, etc.; you cannot ensure the cores run at the same speed even when they are executing the same instructions.
You are perfectly right: what I meant was that a routine which wants to access per-cpu cached data array shouldn't be affected by different cpu id lookup times
Brendan wrote:I'm wondering if the original poster meant "constant time" in the "O(1)" sense (e.g. a constant number of operations, where an operation may take a variable length of time) rather than "constant time" in the literal sense (e.g. every case as slow as the worst case).
For the algorithm I'm thinking of, I mean "constant time" both in the O(1) sense and in the sense of constant execution time.
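
One possible "trick" for 32-bit x2APIC IDs, sketched here as an illustration rather than as what anyone in this thread implemented: search at boot for a multiplier that hashes every present ID to its own slot, so each later lookup is one multiply, one shift, one load and one compare regardless of the ID. The struct cpu_hash type and the function names are hypothetical. The sketch assumes the table has roughly (CPU count)^2 slots so that a collision-free multiplier is found quickly, and that ID 0xFFFFFFFF never belongs to a CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_BITS  12u             /* 4096 slots; assumption: ~(CPU count)^2 */
#define SLOT_COUNT (1u << SLOT_BITS)
#define SLOT_EMPTY UINT32_MAX      /* assumption: never a real CPU's ID */

struct cpu_hash {
    uint32_t multiplier;           /* chosen at boot by cpu_hash_build() */
    uint32_t key[SLOT_COUNT];      /* hardware ID stored in each slot */
    uint16_t index[SLOT_COUNT];    /* dense CPU index for that ID */
};

/* Multiplicative hash: keep the top SLOT_BITS bits of the 32-bit product. */
static uint32_t slot_of(uint32_t multiplier, uint32_t apic_id)
{
    return (uint32_t)(apic_id * multiplier) >> (32u - SLOT_BITS);
}

/* Try odd multipliers until no two IDs share a slot.
 * Returns 0 on success, -1 if none was found (e.g. duplicate IDs). */
int cpu_hash_build(struct cpu_hash *h, const uint32_t *ids, size_t n)
{
    uint32_t m = 0x9E3779B1u;      /* odd starting multiplier */

    assert(n <= UINT16_MAX);
    for (unsigned attempt = 0; attempt < 64; attempt++) {
        size_t placed = 0;

        for (size_t s = 0; s < SLOT_COUNT; s++)
            h->key[s] = SLOT_EMPTY;
        for (; placed < n; placed++) {
            uint32_t s = slot_of(m, ids[placed]);
            if (ids[placed] == SLOT_EMPTY || h->key[s] != SLOT_EMPTY)
                break;             /* unusable ID or collision: next multiplier */
            h->key[s] = ids[placed];
            h->index[s] = (uint16_t)placed;
        }
        if (placed == n) {
            h->multiplier = m;
            return 0;
        }
        m = (m * 1664525u + 1013904223u) | 1u;   /* next odd candidate */
    }
    return -1;
}

/* Same instruction sequence for every ID. Returns -1 for an unknown ID. */
int cpu_hash_lookup(const struct cpu_hash *h, uint32_t apic_id)
{
    uint32_t s = slot_of(h->multiplier, apic_id);

    if (apic_id == SLOT_EMPTY || h->key[s] != apic_id)
        return -1;
    return (int)h->index[s];
}
```

The build step is not constant time, but it runs once at boot. Cache misses can still make individual lookups differ in wall-clock time, so this is "constant" only in the number of operations.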
Velko
Member
Posts: 153
Joined: Fri Oct 03, 2008 4:13 am
Location: Ogre, Latvia, EU

Re: [SOLVED] How IDs are assigned to cpu cores?

Post by Velko »

In my kernel, CPU uses its LAPIC ID for self-identification only on startup, when it does not really matter if it's O(1), O(n) or O(whatever). It uses other means later on.

I used to store a pointer to the CPU-specific data structure at the bottom of the kernel stack, and point the kernel stack entry point (the value of the stack pointer in the TSS) just above it. Since my stack is fixed to a single 4 KiB page, it takes simple math to round the stack pointer up to the next page boundary and then use the previous __SIZEOF_POINTER__ bytes as a pointer to the CPU-specific data. It works nicely, but requires updating every time you switch stacks. That's not a big deal, since you need to update a few values (like setting the kernel stack in the TSS) before returning to user mode anyway. However, the thing I didn't like was that I had to build "one single struct" that holds all the necessary values. I don't like casting types much, so this struct required a lot of #includes, sometimes leading to circular references. But if you're not concerned about that, it's the simplest way to go.
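
A rough sketch of that stack trick, written so it can be tried outside a kernel. The struct cpu_data layout and the function names are made up for illustration, and the stack pointer is passed in as a plain integer; a real kernel would read it from the register instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KSTACK_SIZE 4096u          /* assumption: one page-aligned 4 KiB stack */

struct cpu_data {                  /* hypothetical per-CPU structure */
    uint32_t apic_id;
    uint32_t index;
};

/* Store the per-CPU pointer in the last pointer-sized word of the stack
 * page and return the initial stack pointer, i.e. the address of that
 * word. The first push lands below it, so the word is never overwritten. */
uintptr_t kstack_init(void *stack_base, struct cpu_data *cpu)
{
    uintptr_t base = (uintptr_t)stack_base;
    uintptr_t entry_sp;

    assert((base & (KSTACK_SIZE - 1u)) == 0);   /* must be page aligned */
    entry_sp = base + KSTACK_SIZE - sizeof cpu;
    memcpy((void *)entry_sp, &cpu, sizeof cpu);
    return entry_sp;
}

/* Round the stack pointer up to the next page boundary, then read the
 * pointer stored just below that boundary. Valid for any sp from the
 * stack base up to the initial stack pointer. */
struct cpu_data *cpu_data_from_sp(uintptr_t sp)
{
    uintptr_t top = (sp | (KSTACK_SIZE - 1u)) + 1u;
    struct cpu_data *cpu;

    memcpy(&cpu, (const void *)(top - sizeof cpu), sizeof cpu);
    return cpu;
}
```

The lookup is an OR, an add and a load, and it never touches the LAPIC ID. The cost is that every kernel stack has to carry the pointer, which is the "update on every stack switch" caveat mentioned above.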

Since then, however, I have moved to ABI-specific TLS (Thread Local Storage), the one provided by GCC's __thread storage-class keyword. If a (global or class-static) variable is marked with it, the compiler generates ABI-specific code in which the value is independent for each thread. For the i386 and amd64 architectures this means the variable is accessed through the GS or FS segment register. For example, accessing the first variable boils down to mov %gs:0x0,%eax (for i386) or mov %fs:0x0,%rax (for x86_64). The ARM architecture uses a function call to __aeabi_read_tp and then resolves the pointer from there (I will skip the details for now).
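
As a small illustration of what that looks like on the C side (the struct cpu_data layout, MAX_CPUS and the accessor names are hypothetical), a per-CPU variable is simply declared with __thread and used like any other global. In a hosted program the C runtime sets up the TLS block; in a kernel, setting the FS/GS base for each core is your job, as described next.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 256u              /* assumption: upper bound for this sketch */

struct cpu_data {                  /* hypothetical per-CPU structure */
    uint32_t apic_id;              /* hardware ID, read once at startup */
    uint32_t index;                /* dense index assigned by the kernel */
};

/* One instance per thread (per CPU in a kernel). On x86 the compiler
 * addresses it relative to the GS (i386) or FS (x86_64) base. */
static __thread struct cpu_data this_cpu;

/* Run once on each CPU after its TLS base has been set up. */
void this_cpu_init(uint32_t apic_id, uint32_t index)
{
    assert(index < MAX_CPUS);
    this_cpu.apic_id = apic_id;
    this_cpu.index = index;
}

/* No table lookup at all: a single segment-relative load. */
uint32_t this_cpu_index(void)
{
    return this_cpu.index;
}

uint32_t this_cpu_apic_id(void)
{
    return this_cpu.apic_id;
}
```

With this approach the sparse hardware ID is only needed during startup, which matches the point that the O(1) question then stops mattering.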

Now, it's your call how to manage the GS and FS bases. For i386 I modified the GDT and added a GS base entry for each core right after the TSS entry. This way GS can be loaded with a few instructions:

Code:

str %eax        // EAX = this CPU's TSS selector
add $8, %eax    // next GDT entry: this CPU's GS descriptor
mov %ax, %gs    // load it into GS
The x86_64 ABI uses the FS segment in a similar way. The idea, however, is to store the relevant value in the KernelGS MSR and then restore it into the necessary registers when needed.

Code:

mov $0xc0000102, %ecx // Read KernelGS
rdmsr                 // EDX:EAX = value of the MSR selected by ECX
mov $0xc0000100, %ecx // Set FS.base
wrmsr                 // MSR selected by ECX = EDX:EAX
Managing extra entries in the GDT or the CPU's MSRs may look like an extra hassle, but in the end it works out pretty well. And if you're going to support TLS in your userspace programs, it is well worth checking out.
If something looks overcomplicated, most likely it is.
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

Re: [SOLVED] How IDs are assigned to cpu cores?

Post by nop »

I had very similar ideas!!!

I'm going to use the SWAPGS instruction on kernel entry which was introduced for this purpose.