Assembly vs C

feare56
Member
Posts: 97
Joined: Sun Dec 23, 2012 5:48 pm

Post by feare56 »

I would like to know, while my OS is still young, which language would be better to write it in. I know I need assembly for bits and pieces, and I also know C was made for OS development, but so far I have it all in assembly and I'm wondering whether I should redo it in C or just continue down the path I am on, strictly assembly. And what are some advantages/disadvantages of assembly and C, besides C having been made for OS dev?
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Assembly vs C

Post by Brendan »

Hi,
feare56 wrote:I would like to know, while my OS is still young, which language would be better to write it in. I know I need assembly for bits and pieces, and I also know C was made for OS development, but so far I have it all in assembly and I'm wondering whether I should redo it in C or just continue down the path I am on, strictly assembly. And what are some advantages/disadvantages of assembly and C, besides C having been made for OS dev?
Advantages of C:
  • Faster to write and easier to maintain
  • Produces faster code
  • More portable
Advantages of assembly:
  • Much easier to port the toolchain/assembler to your OS later, or write your own toolchain/assembler (especially if your OS is not a boring *nix clone)
  • Produces smaller code
  • Able to make direct use of CPU features (e.g. easy to write things like CPU mode switches, task switching code, etc)
  • Easier to detect/avoid certain types of bugs (e.g. overflows in calculations)
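To illustrate the overflow point from the C side: in assembly the CPU sets the overflow flag after an "add", so a single conditional jump catches the problem. In portable C the check has to be written out before the addition, because signed overflow is undefined behaviour. The sketch below is only an illustration; the helper name checked_add is made up for this example and is not from anyone's kernel. (GCC and Clang also offer __builtin_add_overflow as a compiler extension that does the same job.)

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b, reporting overflow instead of
 * invoking undefined behaviour. In assembly the equivalent check is a
 * conditional jump on the overflow flag right after the "add". */
bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* would overflow; *result is left untouched */
    *result = a + b;
    return true;
}
```

A caller tests the return value before trusting the result, which is the C-level counterpart of branching on the flag.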

Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Assembly vs C

Post by Antti »

Brendan wrote:Much easier to port the toolchain/assembler to your OS later
It is definitely true that writing an assembler is much easier than writing a C compiler. However, I am a little bit confused about the porting issue. Are current assemblers usually written in C? If so, a self-hosting system needs a C toolchain to compile one. If the system can compile an assembler written in C, it is probably capable of compiling a C compiler too. If the assembler to be ported is itself written in assembly, this is not a problem. FASM? BASM (=Brendan's Assembler)?
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: Assembly vs C

Post by bluemoon »

An assembler usually has fewer dependencies than a giant C compiler, so I agree that porting an assembler is easier, but only by very little (fewer dependency libraries to port).
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: Assembly vs C

Post by Combuster »

You don't *need* to use something draconic like GCC (or LLVM for that matter) for the effort to port a C compiler - there are simpler alternatives :wink:
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Assembly vs C

Post by bewing »

"Porting" a compiler or assembler does not mean that it has to be self-compiling. It just means "getting a copy into your OS that runs" using whatever means necessary (cross compilers, hand editing, etc.).

I will also slightly disagree with Brendan about the speed issue. It can go either way, depending on how good you are.

And personally, I find that there are big benefits from converting from ASM to C and back again. When you write in ASM you find out what the machine can do efficiently, but you introduce a lot of very hard-to-find bugs. When you convert the code to C: you can suddenly see some smarter algorithms than the originals, you can see "cleaner logic", you can see and fix a big percentage of those old bugs without even trying, you can also add a lot of new code really fast, and do big rewrites on your data structures, the meanings of your bitfields, or your whole boot process or memory allocator or whatever. In ASM you get cemented into place, because you don't want to break 16 months of code by changing something major. But once you've written the code for a while in C and done a lot of debugging runs, you can see how inefficient some of the compiled code is -- and then it becomes nice to convert all the stable parts of the OS back into ASM.

But the process of translating both ways (especially after you have example code in both languages) isn't nearly as nasty as you might think.

Re: Assembly vs C

Post by Brendan »

Hi,
bewing wrote:I will also slightly disagree with Brendan about the speed issue. It can go either way, depending on how good you are.
In theory, the performance of assembly can always be as good as C and can typically be better than C. This is what gives people the impression that assembly is faster than C.

In practice (especially for "large" projects like a kernel), to make the assembly as good as the C takes extra time and can severely degrade the maintainability of the source code. For one example, in C you could have a simple function that the compiler will inline to avoid function call overhead, and to do the same in assembly you end up duplicating that simple function's code all over the place (so that if you need to change it slightly, you need to find and then modify each occurrence of it with no real indication of where it's been duplicated or if you've missed an occurrence). In general, (just like projects written in C) you only bother optimising a small amount of the code, and (unlike projects written in C) the remainder ends up being code that was not optimised by anything at all.
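As a rough sketch of the inlining point, here is a small helper defined once and used from two places. Everything in it is hypothetical (the PAGE_SIZE value, the helper and both callers are invented for illustration). A compiler is free to paste the helper's body into each caller, yet there is still only one definition to edit if the rounding rule ever changes.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size, for illustration only */

/* Hypothetical helper: round an address up to the next page boundary.
 * "static inline" invites the compiler to expand it at each call site. */
static inline uintptr_t page_align_up(uintptr_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
}

/* Two unrelated (made-up) callers that share the single definition. */
uintptr_t heap_start(uintptr_t kernel_end)
{
    return page_align_up(kernel_end);
}

uintptr_t stack_top(uintptr_t base, uintptr_t size)
{
    return page_align_up(base + size);
}
```

In hand-written assembly the equivalent would be either a real call (with its overhead) or the same few instructions repeated at every use.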

However, in theory it would be possible to develop an optimising assembler that does some of the optimisations that a compiler would do without creating maintainability problems (peep-hole, limited inlining, instruction scheduling, etc). Sadly, it's hard to tell how much effect an optimising assembler would have on the "performance vs. maintainability" compromise, because (as far as I know) nobody has ever bothered to actually implement an assembler that does more optimisation than just finding the best opcode/encoding for each instruction.


Cheers,

Brendan
phillid
Member
Posts: 58
Joined: Mon Jan 31, 2011 6:07 pm

Re: Assembly vs C

Post by phillid »

bewing wrote:I will also slightly disagree with Brendan about the speed issue. It can go either way, depending on how good you are.
Me too, it all depends on the programmer.

Take a look at http://www.azillionmonkeys.com/qed/asmexample.html

Brendan wrote:...you need to find and then modify each occurrence of it with no real indication of where it's been duplicated or if you've missed an occurrence
Have you not heard of macros?

But my personal preference is Assembly, I'm just taking a break from OS dev for a few months or so to do a bit more research on the theory. I've dipped my toes into C OS dev, but I don't like it so far. C's for application-level development, but I don't really like it for OS dev. Again, just a personal preference.
phillid - Newbie-ish operating system developer with a toy OS on the main burner

Re: Assembly vs C

Post by Brendan »

Hi,
phillid wrote:
Brendan wrote:...you need to find and then modify each occurrence of it with no real indication of where it's been duplicated or if you've missed an occurrence
Have you not heard of macros?
Yes, but macros won't optimise anything and just make optimising by hand harder.

For example, consider this code:

Code:

int bar(int x);    /* declared first so foo() can call it */

int foo(int x) {
    return bar(x) + 3;
}

int bar(int x) {
    return x - 3;
}
A C compiler would hopefully inline "bar()", then optimise it down to "nothing", then hopefully inline "foo" too.

Now imagine it's in assembly. The unoptimised version might be:

Code:

foo:
    call bar
    add eax,3
    ret

bar:
    sub eax,3
    ret
By using a macro for "bar" so that it's inlined, you might end up with this:

Code:

%macro BAR 1
    sub %1,3
%endmacro

foo:
    BAR eax
    add eax,3
    ret
And after macro expansion that would become:

Code:

foo:
    sub eax,3
    add eax,3
    ret
See how that doesn't get optimised?

To beat C, you can't use macros and you have to optimise manually. For example, you might do:

Code:

foo:
;    sub eax,3       (cancelled out)
;    add eax,3       (cancelled out)
    ret
Of course then if you change "bar" and want to subtract 4 you're screwed.

Worse, what if what you really want is this:

Code:

int bar(int x);    /* declared first so foo() can call it */

int foo(int x) {
    return bar(x) + MAX_NUMBER_OF_GOATS;
}

int bar(int x) {
    return x - MAX_NUMBER_OF_CHICKENS;
}
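A compilable sketch of that last case, with made-up values for the two constants (the thread never gives any). The idea is that the compiler redoes the inlining and constant folding on every build, so whether the two constants cancel or not is worked out automatically; the hand-optimised assembly would have to be revisited each time either value changes.

```c
/* Hypothetical values, chosen only so the example compiles. */
#define MAX_NUMBER_OF_GOATS    12
#define MAX_NUMBER_OF_CHICKENS 12

/* Small enough that a compiler will normally inline it. */
static inline int bar(int x)
{
    return x - MAX_NUMBER_OF_CHICKENS;
}

int foo(int x)
{
    /* With equal constants, an optimising compiler can reduce this
     * to "return x"; with different ones, to a single add or sub. */
    return bar(x) + MAX_NUMBER_OF_GOATS;
}
```

Changing either #define and rebuilding is all that is needed; no call site or comment has to be updated by hand.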
phillid wrote:But my personal preference is Assembly, I'm just taking a break from OS dev for a few months or so to do a bit more research on the theory. I've dipped my toes into C OS dev, but I don't like it so far. C's for application-level development, but I don't really like it for OS dev. Again, just a personal preference.
I've always used assembly for all of my OS's code (not including various utilities that run on Linux, to create disk images, etc for the OS - I use C for that). However, my choice has nothing to do with performance.


Cheers,

Brendan
rdos
Member
Posts: 3306
Joined: Wed Oct 01, 2008 1:55 pm

Re: Assembly vs C

Post by rdos »

When writing a complete system in assembly, you won't use the interfaces of C (like random register allocations or passing parameters on stack). You set up specific registers for specific uses. Additionally, using the CY flag for status rather than a register (like eax) saves one register. It is also faster to check for CY than to check if some variable has some value.

So comparing C vs assembly by using examples that are C-interfacable doesn't give a true picture of the possibilities of assembly.

When I use C in my kernel, I need complex #pragmas which define calling conventions for C that are non-standard. In some cases, the only option for interfacing with C is to use an assembly-stub. This is the case for syscalls that return more than one parameter in registers for instance.

Re: Assembly vs C

Post by phillid »

Brendan wrote:However, my choice has nothing to do with performance.
Same :) In this case, I was just arguing one possible advantage of assembly :P

@feare56 I would suggest trying both for a while and then deciding which you want to use. In OS dev, you're always going to get 4 types of people: those who use mostly C, those who use a mix of C and asm, those who use entirely assembly... and the n00bs who think they will be able to use python :D
IMO it's up to you to decide, really.

Re: Assembly vs C

Post by feare56 »

Ok thanks
Casm
Member
Posts: 221
Joined: Sun Oct 17, 2010 2:21 pm
Location: United Kingdom

Re: Assembly vs C

Post by Casm »

The only thing I have against doing the whole thing in assembly is that the C code looks prettier.

On the other hand, I have spent forever looking for a C compiler I like, but now, at long last, Watcom have got a 64 bit compiler planned.

Re: Assembly vs C

Post by bewing »

Just for the record:

How to beat GCC with ASM:

There are several rules that GCC mostly has to follow when compiling code. These rules make GCC binary code less efficient than it might be. The C standard currently (and probably forever into the future) requires that all immediate variables be saved on the stack. The second rule is the generic cdecl ABI. What this comes down to is that GCC uses memory a lot more than it should, and registers a lot less than it should.

The next main point is that size = speed (approximately) when it comes to binaries. If binary1 is half the size of binary2, it will take half as long to load from disk. It will get pageswapped out half as often. When it gets swapped back in, it will take half the time to swap. When it runs, a small binary will require fewer cache-line loads to get from the beginning to the end of the logic. A bigger percentage of a small binary will fit in L1 rather than overflowing into L2 cache. Etc. As Brendan already pointed out, handwritten ASM code will always be smaller than GCC code. With cleverness, that size difference can be very large.

So, Rule #1: minimize your use of memory (especially including the stack), and maximize your use of registers. Choose your algorithms to have few enough variables that you can hold almost all of them in registers. It is true that the stack is almost always in L1 cache, but that still means that a register access is 3 to 20 times faster than a stack access on any CPU. When you use a big memory access, try to access it linearly -- the CPU will try to help with automatic prefetch, if the stepsize is obvious.
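The "access it linearly" advice applies just as much in C. A minimal sketch, assuming a row-major table stored in one flat array (function names are made up for the example): both functions compute the same sum, but the first walks memory with a step of one element, which is the obvious pattern a hardware prefetcher can follow, while the second jumps a whole row at a time.

```c
#include <stddef.h>

/* Visits elements in the order they sit in memory (step of 1). */
long sum_linear(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += grid[r * cols + c];
    return total;
}

/* Same result, but each access is "cols" elements away from the last. */
long sum_strided(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += grid[r * cols + c];
    return total;
}
```

How large the difference is depends on the array size and the CPU, so it is worth measuring rather than assuming.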

Rule #2: break the ABI. The ABI exists for the purpose of allowing users to trace into your code. But you are writing an OS, and you don't WANT people tracing into your code anyway. As rdos said, use something resembling a fastcall interface. I.e. pass arguments to functions in registers (not the stack!). GCC has to assume that certain registers will be clobbered by any function call, and preserve their values if necessary on the stack. Any function call must preserve all other registers that it clobbers on the stack. As a clever ASM programmer human, you can do better than that. Preserving registers uses memory accesses, and you need to obey rule #1. So pay attention to which register values you actually USE, and which get clobbered, and only preserve the ones you must. Sometimes you can even preserve the value of an important register in another non-important register.

Rule #3: simplify the looping. In C, the do {} while(); construction is visually kludgy, and not many people use it. In ASM, it's several percent faster than any other loop. Counting your loops down from n to 0, rather than up from 0 to n is also several percent faster. Coding in ASM will encourage you to be smarter with your looping -- and anything that gets done 10 thousand times will get you a big speed bonus if you can just make a small improvement in it.
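For reference, the loop shape described in Rule #3 looks like this when written in C. This is only a sketch with a made-up function name. Note the guard for the empty case, since a do/while body always runs at least once. Whether it is actually faster than a plain for loop is something to measure: an optimising compiler may well rearrange an ordinary loop into this form by itself.

```c
#include <stddef.h>

/* Sums n elements, counting the index down from n to 0 with do/while.
 * The loop test is just "has n reached zero?". */
long sum_down(const int *v, size_t n)
{
    long total = 0;

    if (n == 0)
        return 0;           /* do/while would otherwise run once */

    do {
        total += v[--n];    /* visits v[n-1] first, v[0] last */
    } while (n != 0);

    return total;
}
```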

Rule #4: return multiple values from functions when it makes sense (in registers, of course). This is the biggest advantage that ASM has over C. Make sure to utilize it. As rdos said, you can also gain an instruction cycle or two by using the carry flag as a boolean error flag.
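For comparison, the closest plain C gets to Rule #4 is returning a small struct by value, with a flag standing in for the carry bit. The sketch below uses invented names. Whether the members really travel back in registers is up to the ABI and the compiler; on x86-64 System V a struct this small is normally returned in registers, but that is an assumption about the target, not something C promises.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: two values plus a status flag that plays
 * the role of the carry flag in the assembly version. */
struct div_result {
    uint32_t quotient;
    uint32_t remainder;
    bool     ok;
};

/* Returns quotient and remainder together; ok is false if d is zero. */
struct div_result checked_div(uint32_t n, uint32_t d)
{
    struct div_result r = {0, 0, false};

    if (d != 0) {
        r.quotient  = n / d;
        r.remainder = n % d;
        r.ok        = true;
    }
    return r;
}
```

The caller checks r.ok first, much as assembly code would branch on the carry flag straight after the call.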
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Assembly vs C

Post by Love4Boobies »

bewing wrote:There are several rules that GCC mostly has to follow when compiling code. These rules make GCC binary code less efficient than it might be. The C standard currently (and probably forever into the future) requires that all immediate variables be saved on the stack. The second rule is the generic cdecl ABI. What this comes down to is that GCC uses memory a lot more than it should, and registers a lot less than it should.
The C standard does not require anything to be saved on the stack. In fact, it doesn't even mandate stacks or heaps (these words don't even appear in the specification). As far as calling conventions go, this is the sad reality of static linking. I have talked about this on my blog and even proposed a solution. Still, I'd like to add two things:
  • If the project is not large enough that build times are an issue, one could do whole-project compilation instead.
  • Even if you do decide to use static linking, calling conventions are only required for functions that are exported from an object file. Hence, if you make them private by using the "static" keyword, compilers should produce more efficient code. I'd also look through the compiler's documentation for any relevant optimization flags.
However, the performance gains from calling functions in a more optimal manner should generally be so close to zero that they're not even worth talking about. It's only very tight loops run for insane amounts of time where this overhead would add up to anything noticeable.
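A minimal sketch of the "static" point, with made-up function names. The helper has internal linkage, so nothing outside this file can call it and the compiler is not obliged to give it the standard calling convention; it may inline it or pass arguments however it likes. Only the exported function has to follow the ABI.

```c
/* Internal linkage: invisible to other object files, so the compiler
 * may inline it or use a private calling convention. */
static int scale(int x)
{
    return x * 3;
}

/* External linkage: other object files may call this, so it has to
 * honour the platform's calling convention. */
int scaled_sum(int a, int b)
{
    return scale(a) + scale(b);
}
```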
bewing wrote:The next main point is that size = speed (approximately) when it comes to binaries.
I disagree with that statement. At most, you can say something like Po ~ c * S, where Po is the performance overhead (not performance per se since that mostly has to do with the algorithms and data structures used), c is the constant average overhead caused by machine code, and S is the code size. Once you have that, you can plug it into P = Po * A where P is the actual performance and A is a function of all algorithm running times and their inputs. Notice that this is a linear equation so Po should only be relevant for really small values of A.

Also, as I've recently mentioned on a different thread, there's an 80-20 rule, which implies that, usually, 80% of the time is spent on 20% of the code.

I didn't have the time to read the rest of your post but you seemed to be talking about assembly micro-optimizations, which I would have likely agreed about.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]