nullplan wrote:Aw, poor Fefe. There is some work still being done on it, the most recent commits are from a week ago. And I thank you for that invitation to a refresher on just how bad cvs is. Even finding out that much took a Google search. I still don't know what was changed, and I no longer care to know.
Ouch. Just looking at https://www.fefe.de/dietlibc/, I had no idea he was still working on it: I switched to libmusl in 2018, before the release of dietlibc 0.34. At the time, the latest release dated back to 2013 (five years old), and I didn't even consider checking out the CVS repo. I searched a little online, people were complaining about the library (bugs etc.), so I thought it was the right call to look for something else. Thanks for pointing out that he is still working on it. Libmusl is much better, but at least we know that dietlibc is not a completely abandoned project. Maybe we should suggest that he move dietlibc to GitHub?
nullplan wrote:Rich Felker was of the opinion that a single-threaded program is merely a multi-threaded program in waiting. Anyway, set_thread_area() on AMD64 should boil down to setting FS.Base. Admittedly, on x86, it is more involved.
I can't speak for AMD64, but on x86 it literally means allowing each process to own a limited set of GDT entries. The tricky part is that it also has to work in fork()-ed children, because some libc functions obviously use TLS variables after _start. That also meant implementing a ref-count mechanism for GDT entries, dynamic expansion of the GDT, etc. A hell of a lot of work. I still wonder: why doesn't set_thread_area() set an LDT entry instead of a GDT one? Actually, I wanted to implement it that way initially, but I quickly realized it's impossible, because the LDT/GDT bit is in the selector itself; there's nothing I can do about it. But why? It's an architecture-specific feature anyway, so why not just have a small LDT per process? Maybe there's some overhead in setting the LDT on every task switch?
nullplan wrote:That was changed recently (committed Nov 30), the current implementation only requires getcwd() and readlink().
That's great news! Thanks for sharing. That means the super-stable pre-built toolchains I use will pick up musl 1.2.2 at some point. But I'll have to wait, since version 1.2.2 was released on Jan 15, 2021, just a month ago.
nullplan wrote:I have only read the source code. Based on that alone, musl wins by a landslide. Newlib has so many #ifdefs all over the place, and finding what you are looking for takes ages. Whereas musl has straightforward code, is not configurable as you said, and that means it has no bloody #ifdefs if it can possibly help it, which aids readability.
Also, the choice of algorithms is better. newlib's sorting algorithm is a standard quicksort, whereas musl implements smoothsort. Newlib's malloc() is a bog-standard dlmalloc (requiring sbrk()), whereas musl has its own version. The old one I could read and understand on its own (the new one requires a bit more brain power than I have yet given it). Unfortunately, the old malloc has a tiny little race condition that can cause unconstrained heap growth in multi-threaded applications. But it could be fixed with a big malloc lock.
I'm happy to hear that you believe libmusl is better. I don't regret my choice, in particular now that I know it also has a better realpath() implementation. And yeah, I totally understand the advantages of being non-configurable: it's easier to maintain and test. In one of their FAQs I read at the time that the decision not to be configurable is motivated by the fact that even simple binary options cause an exponential (2^N) growth in the number of possible configurations. There's no way to test all 2^N configurations, and testing the N options independently is not the same thing.