Kernel Performance
Will there be a noticeable difference in kernel performance if it is written in C++ or C as opposed to Assembly?
- Pype.Clicker
- Member
- Posts: 5964
- Joined: Wed Oct 18, 2006 2:31 am
- Location: In a galaxy, far, far away
- Contact:
Re:Kernel Performance
To my humble knowledge, C code will add some overhead for function calls, etc., that you could skip by writing your code in pure ASM (provided that you code specific pushes/pops rather than pusha/popa, which I used to rely on to avoid misordering such operations).
On the other hand, a good C compiler (gcc comes to mind) is able to produce, in a few milliseconds, machine code that pairs instructions better on Pentium CPUs than what a human being can manage in an evening...
The same compiler can (if enough optimizations are enabled) automatically inline some functions, keep frequently used variables in registers, etc.
For C++ vs. C, I didn't see major differences in generated code as long as you use classes with non-virtual methods. However, things like multiple inheritance, run-time type information (dynamic casting) or exception handling are usually implemented behind the scenes at a significant computational cost (multi-pass stack unwinding, etc.).
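As a rough sketch of that first point (the class and function names below are invented for the example), a non-virtual method call is just an ordinary function call with the object address passed as a hidden argument, so the C++ and C versions compile to essentially the same code:
[code]
#include <cstdio>

/* C++ flavour: a class with a non-virtual method.  The call below compiles
   to a direct call, with the object address passed as an implicit parameter. */
struct Timer {
    unsigned ticks;
    void advance(unsigned n) { ticks += n; }   /* no vtable, no indirection */
};

/* C flavour: same data layout, same shape of generated code. */
struct timer { unsigned ticks; };
static void timer_advance(struct timer *t, unsigned n) { t->ticks += n; }

static unsigned example(Timer &cpp_t, struct timer *c_t)
{
    cpp_t.advance(3);          /* effectively advance(&cpp_t, 3) */
    timer_advance(c_t, 3);     /* exactly the same kind of call */
    return cpp_t.ticks + c_t->ticks;
}

int main()
{
    Timer t1 = { 0 };
    struct timer t2 = { 0 };
    std::printf("%u\n", example(t1, &t2));
    return 0;
}
[/code]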
If you have to choose between the following two versions, the first written in C++ and relying on a destructor (RAII) for cleanup:
[code]
{
    resource res_hndl;       /* the constructor acquires the resource */
    hazardous_operation();
    /* normal processing goes on */
}
/* exceptions as well as normal processing will call the res_hndl dtor at
   the end of the block, so the resource is freed at the end of
   hazardous_operation() even if it fails and ends with an exception throw */
[/code]
and the second written in plain C, with explicit error handling:
[code]
{
    int errcode = FAIL;
    void *handler = get_resource();
    if ((errcode = hazardous_operation())) goto fail;
    /* normal processing continues */
    release_resource(handler);
    return <good value>;
fail:
    if (handler) release_resource(handler);
    return <error code>;
}
[/code]
The second, plain-C version will be by far the faster one. There are many other ways to implement it, but I use the "goto fail" trick because it centralizes error processing in one place in the function (this is a philosophical choice from an old ASM coder, and C prophets would call me a heretic if they knew it).
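For the record, here is a small, hypothetical sketch of how that "goto fail" trick scales to several resources (the buffer and IRQ helpers are invented for the example, with trivial stub bodies so the snippet builds on its own; a real kernel would talk to hardware instead): every error path funnels into one cleanup block at the end of the function.
[code]
#include <cstdio>
#include <cstdlib>

/* Hypothetical helpers, stubbed out so the sketch compiles and runs. */
static void *alloc_dma_buffer(void)          { return std::malloc(4096); }
static void  free_dma_buffer(void *buf)      { std::free(buf); }
static int   claim_irq_line(int /*line*/)    { return 0; }
static void  release_irq_line(int /*line*/)  { }
static int   program_device(void * /*buf*/)  { return 0; }

/* All failures jump to the single "fail:" label; flags and NULL checks
   record how far the setup got, so only what was acquired is released. */
static int setup_device(int irq_line, void **out_buf)
{
    int   err = -1;
    void *buf = 0;
    int   irq_claimed = 0;

    buf = alloc_dma_buffer();
    if (!buf) goto fail;

    if (claim_irq_line(irq_line) != 0) goto fail;
    irq_claimed = 1;

    err = program_device(buf);
    if (err) goto fail;

    *out_buf = buf;                  /* success: the caller owns the buffer */
    return 0;

fail:                                /* centralized error processing */
    if (irq_claimed) release_irq_line(irq_line);
    if (buf) free_dma_buffer(buf);
    return err;
}

int main()
{
    void *buf = 0;
    int rc = setup_device(5, &buf);
    std::printf("setup_device -> %d\n", rc);
    if (rc == 0) {                   /* tidy up after the successful run */
        release_irq_line(5);
        free_dma_buffer(buf);
    }
    return 0;
}
[/code]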
Re:Kernel Performance
Expect a 10%-40% improvement in speed using assembly hand-optimized by an expert. Don't forget to add in the "cascade effect": the "speed" of your OS will be exponentially slower using strict C code as compared to a kernel done professionally in assembly.
- Pype.Clicker
- Member
- Posts: 5964
- Joined: Wed Oct 18, 2006 2:31 am
- Location: In a galaxy, far, far away
- Contact:
Re:Kernel Performance
DynatOS wrote: Expect a 10%-40% improvement in speed using assembly hand-optimized by an expert. Don't forget to add in the "cascade effect": the "speed" of your OS will be exponentially slower using strict C code as compared to a kernel done professionally in assembly.
Certainly true, but assembly experts are hard to find. Even within a single family of processors (80286-386-486-Pentium-...), there are things that can speed up or slow down the processing depending on the CPU model (think of the lodsb or loop instructions).
Also take into account that assembly will often limit you to simple structures like arrays and lists, while B-trees, AVL trees or hash tables could speed up your operations (see the sketch at the end of this post)...
Don't forget that it's hard as hell to keep hand-optimized code maintainable: the slightest debugging change can break all your careful pipelining...
Again, choosing pure ASM kernel coding over C coding is rather a philosophical matter, but you shouldn't expect a performance gain if you're not an ASM guru...
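As a rough, hypothetical illustration of the data-structure point above (the names, hash function and table size are invented for the example), here is how little C-style code it takes to get a chained hash table instead of a linear scan; writing and maintaining the same thing by hand in assembly is considerably more painful:
[code]
#include <cstring>
#include <cstdio>

/* A tiny fixed-size, chained hash table mapping names to values. */
struct Entry {
    const char *key;
    int         value;
    Entry      *next;
};

enum { NBUCKETS = 64 };
static Entry *buckets[NBUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 5381;                 /* djb2-style string hash */
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

static void insert(Entry *e)
{
    unsigned b = hash(e->key);
    e->next = buckets[b];              /* push onto the bucket's chain */
    buckets[b] = e;
}

static Entry *lookup(const char *key)
{
    /* Only the (short) chain of one bucket is scanned, instead of the
       whole table as a linear array scan would require. */
    for (Entry *e = buckets[hash(key)]; e; e = e->next)
        if (std::strcmp(e->key, key) == 0) return e;
    return 0;
}

int main()
{
    static Entry a = { "idle", 0, 0 }, b = { "init", 1, 0 };
    insert(&a);
    insert(&b);
    Entry *found = lookup("init");
    std::printf("init -> %d\n", found ? found->value : -1);
    return 0;
}
[/code]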
Re:Kernel Performance
Thank you for the advice. I have another question: would there be speed differences between C and C++ kernel coding?
P.S. You may have already answered this, but I didn't think I followed.
- Pype.Clicker
- Member
- Posts: 5964
- Joined: Wed Oct 18, 2006 2:31 am
- Location: In a galaxy, far, far away
- Contact:
Re:Kernel Performance
C++ becomes noticeably slower once you start using multiple inheritance (IMHO), exception handling (because of the stack manipulations performed behind the scenes) or run-time type checking (dynamic casting, if I remember well): all these features require complex processing from a run-time support library and should be avoided if possible.
Beside these points, most C/C++ compilers I've seen generate similar code, and the cost of C++ virtual functions is comparable to that of C function pointers.
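To make that comparison concrete (using a hypothetical block-device interface, invented for the example): both the C++ virtual call and the C function-pointer call end up as one indirect call through a table of addresses.
[code]
#include <cstdio>

/* --- C++ flavour: virtual functions ------------------------------------ */
struct BlockDevice {
    virtual int read_sector(int n) = 0;    /* one indirect call via the vtable */
    virtual ~BlockDevice() {}
};

struct RamDisk : BlockDevice {
    int read_sector(int n) { return n * 2; }   /* made-up behaviour */
};

/* --- C flavour: an explicit table of function pointers ------------------ */
struct block_ops {
    int (*read_sector)(void *dev, int n);
};

struct ram_disk {
    const struct block_ops *ops;           /* hand-rolled "vtable" pointer */
};

static int ram_read_sector(void *dev, int n) { (void)dev; return n * 2; }
static const struct block_ops ram_ops = { ram_read_sector };

int main()
{
    RamDisk rd_cpp;
    BlockDevice *dev = &rd_cpp;
    std::printf("C++ virtual call: %d\n", dev->read_sector(21));

    struct ram_disk rd_c = { &ram_ops };
    std::printf("C function pointer: %d\n", rd_c.ops->read_sector(&rd_c, 21));
    return 0;
}
[/code]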
Another thing that must be kept under control is the implementation of the new/delete operators: the slower it is, the slower your whole code is likely to run (due to the huge number of allocation operations OOP usually introduces), so having a cell pool (i.e. a list of pre-allocated slices of memory, all of the same right size) for highly dynamic classes is a must-have optimization...
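Here is a minimal sketch of such a cell pool (single-threaded only, and the Message class is invented for the example): a per-class free list of same-sized cells makes new/delete a couple of pointer operations instead of a general-purpose heap walk.
[code]
#include <cstdio>
#include <cstdlib>
#include <cstddef>

/* A very small fixed-size cell pool: a free list of equally sized memory
   slices, refilled in batches from malloc. */
class CellPool {
public:
    explicit CellPool(std::size_t cell_size)
        : cell_size_(cell_size < sizeof(Cell) ? sizeof(Cell) : cell_size),
          free_list_(0) {}

    void *alloc()
    {
        if (!free_list_) refill();
        if (!free_list_) return 0;        /* out of memory */
        Cell *c = free_list_;             /* pop the head of the free list */
        free_list_ = c->next;
        return c;
    }

    void release(void *p)
    {
        Cell *c = static_cast<Cell *>(p); /* push the cell back, O(1) */
        c->next = free_list_;
        free_list_ = c;
    }

private:
    struct Cell { Cell *next; };

    void refill()                          /* carve a fresh batch of cells */
    {
        enum { BATCH = 32 };
        char *block = static_cast<char *>(std::malloc(BATCH * cell_size_));
        if (!block) return;
        for (int i = 0; i < BATCH; ++i)
            release(block + i * cell_size_);
    }

    std::size_t cell_size_;
    Cell       *free_list_;
};

/* A "Message" class that is allocated very often, so its new/delete go
   through the pool instead of the general heap. */
struct Message {
    int sender;
    int payload;

    static void *operator new(std::size_t)  { return pool.alloc(); }
    static void  operator delete(void *p)   { pool.release(p); }

    static CellPool pool;
};

CellPool Message::pool(sizeof(Message));

int main()
{
    Message *m = new Message;             /* served from the pool */
    m->sender = 1; m->payload = 42;
    std::printf("msg %d:%d\n", m->sender, m->payload);
    delete m;                             /* returned to the pool in O(1) */
    return 0;
}
[/code]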
Hope it's clearer like this ...
Re:Kernel Performance
Of course, you don't have to use MI, or exceptions, or RTTI or new/delete if you don't want to. Use new and delete where you would normally use malloc and free (i.e. as much as you reasonably need to); use virtual functions where normally you'd use pointers-to-functions in C. IMHO virtual functions are a lot cleaner than tables of pointers to functions anyway, and it should be the same overhead (which isn't much anyway).