
Compiler generates call to memset, how to avoid?

Posted: Tue Jan 09, 2024 1:06 am
by songziming
Hi,

If I declare a struct with an initializer, clang generates a call to memset, probably to zero out the struct.

But in my kernel, the function should be kmemset, not memset.

I don't want to rename the function to memset, since that would conflict with libc in my unit test program.

I already added -fno-builtin to CFLAGS, but clang still calls memset. Is there any way to disable that? It would be even better if I could tell clang to use kmemset instead.

Code:

// Declaring a struct with an initializer: clang emits a call to memset
struct page my_page_desc = { .type = 0 };

// Declaring it without an initializer and assigning the field: no call to memset
// This is what I have to do for now, but I don't like it
struct page my_page_desc;
my_page_desc.type = 0;
Thanks in advance.

Re: Compiler generates call to memset, how to avoid?

Posted: Tue Jan 09, 2024 8:20 am
by Solar
I've been out of the loop for a while; is -ffreestanding (as per the Bare Bones tutorial) still the way to go here?

Re: Compiler generates call to memset, how to avoid?

Posted: Tue Jan 09, 2024 9:40 am
by nullplan
As per the documentation, the compiler will emit calls to memcpy(), memmove(), memset(), and memcmp() even with -ffreestanding, and expects the environment to provide those functions. Since optimizers have been known to turn C implementations of memcpy() et al. into infinitely recursive functions (by recognizing the copy loop and replacing it with a call to the very function being defined), I would simply implement those in assembler and move on. And while I see no reason to keep renamed versions of these around, if you must have them, you can just define them as wrappers around the standard functions.

Honestly, I couldn't find the clang documentation here, but the GCC documentation has been the same since forever and a day, so I'm guessing the clang people are emulating the behavior.
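
The wrapper approach mentioned above could look roughly like the following. This is only a minimal sketch: it assumes the environment provides a standard memset (declared here through <string.h>; a kernel would use its own header), and kmemset is simply the name from the original question.

```c
#include <stddef.h>
#include <string.h>  /* assumption: in a kernel this would be the kernel's own header declaring memset */

/* Kernel-prefixed name that simply forwards to the standard memset.
 * Being static inline, it normally compiles down to a plain memset call. */
static inline void *kmemset(void *dst, int ch, size_t len)
{
    return memset(dst, ch, len);
}
```

With this arrangement the compiler-generated calls and the explicit kmemset calls both end up in the same memset implementation.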

Re: Compiler generates call to memset, how to avoid?

Posted: Tue Jan 09, 2024 12:00 pm
by Octocontrabass
Clang is meant to be a (mostly) drop-in replacement for GCC, so the GCC documentation usually applies. Here's the part that lists the four functions you must provide in your freestanding environment.

I see the GCC documentation has recently changed to add a requirement: your memcpy implementation must be well-behaved when the source and destination pointers are the same. Turns out GCC and Clang have required that for a while.
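
To illustrate the same-pointer requirement, here is a sketch of a plain forward byte copy. The name copy_bytes is hypothetical, chosen so the example does not clash with libc; a freestanding kernel would export this logic as memcpy. A forward byte loop handles the src == dst case naturally, because each byte is just written back onto itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for a freestanding memcpy.
 * The pointers are deliberately not restrict-qualified, because the
 * compiler may pass identical source and destination pointers.
 * Note: if this were really named memcpy, an optimizer could recognize
 * the loop and replace it with a call to memcpy, as discussed earlier
 * in the thread. */
void *copy_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < len; i++)
        d[i] = s[i];  /* when d == s this rewrites each byte with itself */

    return dst;
}
```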

Personally, I'd use inline assembly to give LTO a chance to inline the function, but I like doing things the hard way.
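
An inline-assembly version might look something like this. It is a sketch under several assumptions: the thread never names the target architecture, so x86 is assumed here (with a plain loop as fallback elsewhere), the GCC/Clang extended asm syntax is used, and fill_bytes is a hypothetical name standing in for the kernel's memset.

```c
#include <stddef.h>

/* Hypothetical stand-in for memset, written so the compiler can inline it. */
static inline void *fill_bytes(void *dst, int ch, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
    void *d = dst;
    /* rep stosb stores AL to [DI] and decrements CX until it reaches zero.
     * Assumes the direction flag is clear, as the usual ABIs guarantee. */
    __asm__ volatile("rep stosb"
                     : "+D"(d), "+c"(len)
                     : "a"(ch)
                     : "memory");
#else
    /* Fallback for other targets: the volatile pointer keeps the
     * optimizer from turning this loop back into a memset call. */
    volatile unsigned char *d = dst;
    while (len--)
        *d++ = (unsigned char)ch;
#endif
    return dst;
}
```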

Re: Compiler generates call to memset, how to avoid?

Posted: Tue Jan 09, 2024 2:00 pm
by nullplan
Octocontrabass wrote: Personally, I'd use inline assembly to give LTO a chance to inline the function, but I like doing things the hard way.
And here I thought me using out-of-line assembler was the hard way.

Re: Compiler generates call to memset, how to avoid?

Posted: Wed Jan 10, 2024 12:39 am
by songziming
I came up with a dirty solution: I define memset as a weak symbol, then define kmemset as an alias to memset.

Code:

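#include <stddef.h>  // size_t

// Weak definition: a strong memset elsewhere takes precedence at link time
__attribute__((weak)) void *memset(void *dst, int ch, size_t len) {
    // My Implementation
}

// kmemset is a second name for the memset defined in this translation unit
void *kmemset(void *dst, int ch, size_t len) __attribute__((alias("memset")));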
In my kernel, I can use memset, which can be overridden by an arch-specific implementation.

In the unit test program, memset is overridden by libc, and I can use kmemset in test cases.

I've checked that even when memset is overridden, kmemset still references my implementation.