Why do people say C is better than Assembly?

iansjack · Post by **iansjack** » Fri Jan 13, 2017 7:00 am

TheDev100 wrote:Imagine a boot sector in C. It would reach 1KB instead of 512 bytes.

You don't have to imagine it; it's been done. And it doesn't reach 1KB.

glauxosdever · Post by **glauxosdever** » Fri Jan 13, 2017 8:33 am

Hi,

Concerning the speed and size of executables, one of the main problems with C toolchains (and not C itself) is the necessity of an ABI. For 32-bit x86, the usual calling conventions are slow by design (the caller writes parameters to the stack and the callee reads them back from the stack). For x86_64, the System V ABI specifies that the first six integer parameters are passed in registers and the remaining ones on the stack. While this solves the problem with the x86 calling conventions, it can create new problems when some of the registers designated for parameter passing are already used for something else. In that case, the compiler has to move the values in those registers to the stack (or to other registers) and then load the parameters into them, just to comply with the ABI.
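As a rough illustration of the point about parameter passing, here is a minimal sketch. The function `sum8` and its checker are hypothetical names that do not appear in the thread, and the register assignments in the comments assume the System V x86-64 ABI; other ABIs (for example the Microsoft x64 one) assign registers differently.

```c
#include <assert.h>

/* Hypothetical example function: eight integer parameters.
 *
 * Assuming the System V x86-64 ABI, a..f arrive in the low halves of
 * rdi, rsi, rdx, rcx, r8 and r9, while g and h arrive on the stack.
 * Assuming 32-bit cdecl, all eight arrive on the stack. */
int sum8(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return a + b + c + d + e + f + g + h;
}

/* Usage: a caller that already holds live values in the argument
 * registers has to save or shuffle them before making this call. */
void check_sum8(void)
{
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
}
```

Compiling this to assembly (for instance with the `-S` option of GCC or Clang) and reading the output is one way to see where each argument actually ends up on a given target.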

However, this can be solved by using non-machine-code object files and, fortunately, many popular compilers implement this functionality. It's called Link Time Optimisation (LTO) and as far as I know it's implemented by GCC and Clang. A compiler using LTO is able to pull out the bloat caused by calling conventions, possibly inline functions and therefore not need "call" and "ret" instructions, convert code snippets that happen to do the same thing as a function into a call to that specific function, etc.

So it's not necessarily true that C itself is slow, but rather how toolchains happen to implement it.

Regards,
glauxosdever

Roman · Post by **Roman** » Fri Jan 13, 2017 9:52 am

In fact, even a C++ bootsector has been done.

Love4Boobies · Post by **Love4Boobies** » Fri Jan 13, 2017 10:06 am

glauxosdever, that is unrelated to C. You need calling conventions and to mess up your register allocation a little whenever you want to link code, even in assembly. Link-time optimization helps to some extent. For routines that don't have any external linkage, the results should just be better.
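A small sketch of the linkage distinction made above, with hypothetical function names. The `static` helper has internal linkage, so the compiler is free to inline it or pass its argument however it likes; the non-static function is visible to other object files and is normally called through the platform ABI, unless the build uses link-time optimization (for example `-flto` with GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Internal linkage: nothing outside this translation unit can call
 * square(), so the compiler does not have to honor the platform
 * calling convention for it and will typically inline it. */
static int square(int x)
{
    return x * x;
}

/* External linkage: callers in other object files reach this function
 * through the ABI's calling convention. */
int sum_of_squares(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    int total = 0;
    for (size_t i = 0; i < count; ++i)
        total += square(values[i]);
    return total;
}
```

Whether `square` is actually inlined depends on the compiler and optimization level; the point is only that nothing forces a conventional call here.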

Anyway, this thread is pretty ridiculous. A lot of people don't seem to understand where software bottlenecks usually appear. If assembly is where you think you will get your performance boost from, think again. All this sounds like worrying whether your car is aerodynamic so you can get to your destination faster without worrying which route you take. Which one do you think will make a difference?

dozniak · Post by **dozniak** » Fri Jan 13, 2017 11:06 am

Love4Boobies wrote:All this sounds like worrying whether your car is aerodynamic so you can get to your destination faster without worrying which route you take. Which one do you think will make a difference?

Beautiful analogy!

glauxosdever · Post by **glauxosdever** » Fri Jan 13, 2017 12:11 pm

Hi,

Love4Boobies wrote:You need calling conventions and to mess up your register allocation a little whenever you want to link code, even in assembly.

When linking code written in only one high-level language and one assembly language per architecture, there are only 4 cases of linking code together:

Assembly with assembly: In this case the programmer can simply decide which calling convention to use per function.

Assembly with high-level: In this case the assembler could support a special construct that isn't actually a 1:1 assembly:machine code representation and that calls a high-level function just as it would be called from high-level code. The compiler (specifically the register allocator) would put there whatever the high-level function expects as inputs and outputs, so the generated code would be the most efficient possible. I still need to think about what its exact syntax would look like, but the idea itself is good. After all, you are calling a high-level function from assembly - why couldn't the call inside the assembly code itself be a high-level construct too, if you are going to benefit from it?

High-level with assembly: In this case the assembler could support a special construct that is similar to function prototypes, except that it would specify where the assembly function expects its inputs and outputs instead of using variable names. This way, the compiler compiling a high-level function would simply pass the inputs and get the outputs in the most efficient way.

High-level with high-level: In this case the compiler can simply decide which calling convention to use per function.

However, all of the above require an Intermediate Representation that:

Preserves function prototypes for both high-level code and assembly
Represents the code in a format that's not machine code (could be LLVM's Intermediate Representation or an Abstract Syntax Tree)

However, implementing that requires a lot of work and, hopefully, I'll manage to do that.

Love4Boobies wrote:glauxosdever, that is unrelated to C.

Definitely - it applies to every compiler that compiles a high-level language without using LTO (by default).

Regards,
glauxosdever

Love4Boobies · Post by **Love4Boobies** » Sat Jan 14, 2017 12:49 am

@glauxosdever

"X with Y" and "Y with X" don't convey all the information. I had to read through the nonsense in order to understand which is the caller and which the callee, a distinction that doesn't matter but that you were trying to make nonetheless. You seem to be confusing compilation with linking. You say things like "when linking assembly with assembly, just have the programmer pick the right calling convention" or "when linking high-level code with high-level code, just let the compiler pick the right calling convention". However, neither the assembly programmer nor the compiler is involved in the linking process, and the calling convention has already been established by that point. How would the assembly programmer or compiler know which calling convention makes the most sense for the caller a priori? Not to mention that the answer would be different from caller to caller. These are also the reasons why the supposedly good idea of abstracting routine calls in assembly is meaningless: it would essentially involve a calling convention.

In fact, there are only two scenarios you need to think of:

Linking assembly with anything else (regardless of who is calling what). You don't seem to understand the trade-off being made here. We intentionally trade an unnoticeable amount of running time in order to greatly improve the maintainability of our software. If you added up all the running time ever saved by running optimal machine code, it wouldn't even come close to the time invested in writing or maintaining such low-quality code. And even in the incredibly unlikely scenario where only optimal code has acceptable performance, what do you think saves more money for the solution's implementor (the end user)? Having a fast piece of software that's full of bugs because the code is undisciplined, or investing in slightly faster hardware and running code with fewer bugs? You are trying to solve the wrong problem.
Linking "high-level" code with "high-level" code. Maintainability isn't an issue here because you have automated tools taking care of the process. Obviously, if you really want optimal machine code, you have to give up linking altogether, at the expense of increased compilation times (again, you are trying to solve an implicit problem without realizing that people make intentional trade-offs). If the whole project is written in a single language, then whole-program compilation would do the trick (e.g., in C you would use #include directives instead of extern declarations). If the project is written in multiple languages, then a common IR (which, by the way, is different from an AST) could be used so that the last two stages of a compiler (the optimizer and the code generator) can be actively involved in producing the final output.

EDIT: Fixed a couple of grammar issues.

ggodw000 · Post by **ggodw000** » Sat Jan 14, 2017 1:07 am

C has got functions like gets(); that cause viruses and worms!

Who says an equivalent of gets written in assembly wouldn't lead to viruses and worms too?

Love4Boobies · Post by **Love4Boobies** » Sat Jan 14, 2017 1:12 am

To be fair, C hasn't had a gets for about 6 years and there isn't much legacy code that uses it either.
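For context, the usual bounded replacement for `gets` is `fgets`, which takes the buffer size and therefore cannot write past the end of the buffer. A minimal sketch follows; `read_line` is a hypothetical helper name, not something from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, storing at
 * most size - 1 characters plus the terminating null character.
 * Returns buf on success, or NULL on end of file or read error.
 * An over-long line is truncated; the rest stays in the stream. */
char *read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);

    /* fgets takes an int, so clamp very large buffer sizes. */
    int limit = size > (size_t)INT_MAX ? INT_MAX : (int)size;

    if (fgets(buf, limit, in) == NULL)
        return NULL;

    buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline, if any */
    return buf;
}
```

Unlike `gets`, the amount of data written is limited by the caller's buffer size rather than by whatever arrives on the input.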

glauxosdever · Post by **glauxosdever** » Sat Jan 14, 2017 6:12 am

Hi,

I don't confuse compilation with linking. In fact, I just wish it was done differently.

I probably wasn't clear enough about "X with Y" and "Y with X". In the former case, X is the caller and Y is the callee. In the latter case, Y is the caller and X is the callee. I'll try to explain that more thoroughly in the paragraphs below.

For calling high-level code from assembly code, as I said (but did not explain well), a special construct could be used. Something like:

(eax) = my_high_level_function(ebx, edx)

could be inserted by the programmer in the caller assembly function, which tells the compiler to compile the callee high-level function as if the first input value was in ebx, the second input value in edx, and the first return value in eax.

There is, however, an issue here: someone could write multiple assembly stubs that call the high-level function in different ways, and the compiler would then need to generate multiple similar high-level functions whose only difference is their calling convention. In this case, it's up to the assembly programmer to ensure a high-level function is always called the same way.

Also, you said that abstracting calls in assembly code would essentially involve a calling convention. That's true, I agree, but it's a calling convention that definitely fits the caller assembly code, and to which the callee high-level code can adapt without issues (unless it's called in different ways each time, as explained in the previous paragraph).

For calling assembly code from high-level code, as I said (but did not explain well), a special construct similar to function prototypes in high-level code could be used. Something like:

(eax) my_assembly_function(ebx, edx) { /* Assembly code goes here */ }

could be inserted by the programmer in the callee assembly function, which tells the compiler to compile the caller high-level function so as to put the first input value in ebx, the second input value in edx, and to expect the first return value in eax.

This also essentially involves a calling convention, although it's a calling convention that definitely fits the callee assembly code, and to which the caller high-level code can adapt without issues.

For calling assembly code from assembly code, assuming someone knows the constructs described above, it should be easy to imagine how it would work.

For calling high-level code from high-level code, provided the compiler can see the whole code in the form of an Intermediate Representation, it should be easy to imagine how it would work.

As for bugs caused by unmaintainable code, I don't really see how code that adheres to these principles would be unmaintainable.

Regards,
glauxosdever

Seahorse · Post by **Seahorse** » Sat Jan 14, 2017 12:29 pm

Learning C is more than just YouTube tutorials. YouTube alone doesn't cover it. At least, not for the depth of C knowledge required here.

Octocontrabass · Post by **Octocontrabass** » Sun Jan 15, 2017 1:35 am

glauxosdever wrote:For calling assembly code from high-level code, as I said (but did not explain well), a special construct similar to function prototypes in high-level code could be used. Something like:
(eax) my_assembly_function(ebx, edx) { /* Assembly code goes here */ }
could be inserted by the programmer in the callee assembly function, which tells the compiler to compile the caller high-level function so as to put the first input value in ebx, the second input value in edx, and to expect the first return value in eax.

GCC's inline assembly can work like this, although it's really only intended for code that can be placed inline, not function calls.
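To make that concrete, here is a minimal sketch using GCC's extended asm syntax (also accepted by Clang), in which register constraints play the role of the "(eax) = f(ebx, edx)" notation proposed above. The function name `add_in_regs` is hypothetical, the body is just an addition chosen for illustration, and the asm path assumes an x86 or x86_64 target with the default AT&T assembler syntax; other targets fall back to plain C.

```c
#include <assert.h>

/* Hypothetical example: the constraints tell the compiler that the
 * asm block expects its inputs in ebx and edx and leaves its result
 * in eax. The compiler arranges the surrounding code to match. */
static inline unsigned add_in_regs(unsigned x, unsigned y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned result;
    __asm__ ("movl %1, %0\n\t"      /* eax = ebx        */
             "addl %2, %0"          /* eax = eax + edx  */
             : "=a" (result)        /* output: eax      */
             : "b" (x), "d" (y)     /* inputs: ebx, edx */
             : "cc");               /* flags are modified */
    return result;
#else
    return x + y;                   /* portable fallback */
#endif
}

/* Usage: called like any other C function. */
void check_add_in_regs(void)
{
    assert(add_in_regs(2u, 3u) == 5u);
}
```

In practice one would usually pick looser constraints such as "r" and let the register allocator choose, which is closer to what the proposal is after; the fixed registers here only mirror the notation used in the quoted post.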

rdos · Post by **rdos** » Tue Jan 17, 2017 3:01 pm

DixiumOS wrote:Another way to say that C is good is that in Assembly you can't use numbers as large as... 1.7x10^308.

Wrong. I have an arbitrary size integer package similar to bignum, entirely written in assembler, and far more efficient than any C compiler would be able to produce. That's mostly because C compilers cannot handle carry-through and similar things in a good way.
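For readers wondering what the carry issue looks like, here is a minimal sketch of multi-word addition in portable C. It is not rdos's package; `add_limbs` and its layout (32-bit limbs, least significant first) are assumptions made for illustration. Standard C has no way to name the CPU's carry flag, so the carry is recovered from a wider intermediate type, where hand-written x86 assembly would typically chain an add with add-with-carry instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds two n-limb numbers stored least
 * significant limb first, writes the n-limb result to `sum`, and
 * returns the carry out of the top limb (0 or 1). */
uint32_t add_limbs(uint32_t *sum, const uint32_t *a, const uint32_t *b,
                   size_t n)
{
    assert(sum != NULL && a != NULL && b != NULL);

    uint64_t carry = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Widen to 64 bits so the carry lands in bit 32. */
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        sum[i] = (uint32_t)t;   /* low 32 bits are the result limb */
        carry = t >> 32;        /* bit 32 is the carry into the next limb */
    }
    return (uint32_t)carry;
}
```

How close a compiler gets to a hand-written carry chain for code like this varies by compiler and version, so it is worth inspecting the generated assembly before drawing conclusions either way.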

rdos · Post by **rdos** » Tue Jan 17, 2017 3:11 pm

From my point of view, the major reason why C sucks for OS development is that it cannot handle segmentation properly, and thus any OS coded in C either relies on paging and IPC, which is horribly slow, or uses a memory model that can easily create errors that will never be detected and that lie in the kernel as latent bombs.

So, the argument really isn't if hand-coded assembly is faster than an optimizing C compiler, but if a micro-kernel design with lots of TLB shootdowns and context switches ever can beat a monolithic kernel written in assembly using segmentation, which I think it never can. So the C people continue with their flat monolithic kernels that are prone to memory corruption issues, as this is the only way they can reach decent performance.

dozniak · Post by **dozniak** » Tue Jan 17, 2017 4:04 pm

rdos wrote:but if a micro-kernel design with lots of TLB shootdowns and context switches ever can beat a monolithic kernel written in assembly using segmentation, which I think it never can. So the C people continue with their flat monolithic kernels that are prone to memory corruption issues, as this is the only way they can reach decent performance.

I guess it's a good thing that you have never heard about L4.

OSDev.org
