
Is per-CPU var same as thread local storage? Can I use TLS with %gs?

Posted: Sat Jul 13, 2024 7:40 pm
by songziming
I use %gs as the segment prefix for per-CPU variables. The offset to the per-CPU area is kernel-only, so I need to swapgs when switching between kernel mode and user mode.

In the code, I use a macro and inline assembly to access the current CPU's per-CPU variables; it is just a mov with a %gs prefix.

Recently I heard about thread-local storage (TLS), which uses %fs as the prefix. What's good about TLS is that compilers support it well: just mark the variable with __thread and access it, and the compiler generates code with the %fs prefix automatically.

It seems per-CPU variables are just the kernel version of TLS. Sometimes I need to access a per-CPU variable of another CPU, but most of the time I only access those of the current CPU. So is there a way to configure GCC to use %gs as the TLS self pointer? I ask because x86 only has swapgs, not swapfs.

If I can make kernel TLS use %gs, then the thiscpu() macro can be removed completely and the code will be a lot simpler.

Thanks.

Re: Is per-CPU var same as thread local storage? Can I use TLS with %gs?

Posted: Sat Jul 13, 2024 9:30 pm
by nullplan
__thread makes many assumptions about how your %fs is set up, and setting it up correctly is not simple. For example, you would need to allocate your TLS and CPU descriptor consecutively, and put your %fs between them. There is also no convenient swapfs instruction, like there is for %gs.

If you want to use fewer macros, I would suggest collecting all per-cpu variables into a structure. Call it "struct cpu". Then start that structure with a pointer to self:

Code:

struct cpu {
  struct cpu *self;
  int nr;
  ...
Then set %gs to point at the correct struct cpu, and you can use a trick like this:

Code:

static inline struct cpu *cpu_self(void) {
  struct cpu *r;
  /* Load the self pointer stored at offset 0 of the area %gs points at. */
  __asm__("mov %%gs:0, %0" : "=r"(r));
  return r;
}
And now you can use the CPU pointer like all other pointers.
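
To make the setup side concrete, here is a minimal sketch of how the per-CPU blocks and their self pointers might be initialized. MAX_CPUS, cpus, cpu_init_all and cpu_remote are hypothetical names that are not part of the code above, and the static array is only one possible way to allocate the blocks. Each CPU would still need its GS base pointed at its own entry (on x86-64 typically by writing the IA32_GS_BASE MSR), which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 4            /* hypothetical limit for this sketch */

struct cpu {
    struct cpu *self;         /* must stay first so it is found at %gs:0 */
    int nr;
    /* further per-CPU fields would follow here */
};

/* The self-pointer trick relies on this layout. */
static_assert(offsetof(struct cpu, self) == 0, "self must be at offset 0");

/* Assumption: one statically allocated block per possible CPU. */
static struct cpu cpus[MAX_CPUS];

/* Fill in the self pointer and the index of every per-CPU block. */
static void cpu_init_all(void) {
    for (int i = 0; i < MAX_CPUS; i++) {
        cpus[i].self = &cpus[i];
        cpus[i].nr = i;
    }
}

/* Another CPU's block is reached through the array, without %gs. */
static struct cpu *cpu_remote(int nr) {
    assert(nr >= 0 && nr < MAX_CPUS);
    return &cpus[nr];
}
```

With this layout, the occasional access to another CPU's variables is an ordinary pointer dereference such as cpu_remote(2)->nr, and only the current-CPU lookup needs %gs.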

Re: Is per-CPU var same as thread local storage? Can I use TLS with %gs?

Posted: Sat Jul 13, 2024 10:09 pm
by Octocontrabass
I don't know of any easy way to change which register GCC uses for thread-local storage, but maybe named address spaces are closer to what you want.

This can be combined with nullplan's suggestion of using a struct that points to itself, like so:

Code:

static inline struct cpu *cpu_self(void) {
    return ((__seg_gs struct cpu *)0)->self;
}