better registers representation in C?
Hi all, my way of representing registers in C is to create a shadow register using a bit-field structure, for example:
struct SomeRegister {
    union {
        struct {
            WORD a: 1;
            WORD b: 7;
            WORD c: 5;
            WORD d: 3;
        } b;        /* named member, so fields are accessed as reg.b.c */
        WORD wAll;  /* the whole register as one value */
    };
};
I then read the register value into wAll, change the values in the b struct, and write the wAll value back to the register.
The advantage of this method is that it's very readable. Do you think it's better than "just" manipulating the register directly with bitwise operations?
Do you know a better approach for register handling?
Thanks!
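For comparison, here is a minimal sketch of the two styles being asked about. It assumes WORD is a 16-bit unsigned type (uint16_t stands in for it) and that the compiler packs bit-fields starting at bit 0, which the C standard does not guarantee. The helper names set_c_bitfield, get_c_bitfield and set_c_mask are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical 16-bit register layout matching the fields a, b, c, d above.
 * Bit-field order within the word is implementation-defined, so the shadow
 * layout has to be checked for each compiler and target. */
typedef union {
    struct {
        uint16_t a : 1;
        uint16_t b : 7;
        uint16_t c : 5;
        uint16_t d : 3;
    } b;
    uint16_t wAll;
} SomeRegister;

/* Bit-field style: load the raw value into the shadow, change one field,
 * and return the raw value that would be written back. */
static inline uint16_t set_c_bitfield(uint16_t raw, unsigned value)
{
    SomeRegister shadow;
    shadow.wAll = raw;
    shadow.b.c = value & 0x1Fu;
    return shadow.wAll;
}

static inline unsigned get_c_bitfield(uint16_t raw)
{
    SomeRegister shadow;
    shadow.wAll = raw;
    return shadow.b.c;
}

/* Mask style: the position of c is spelled out explicitly, assuming
 * a = bit 0, b = bits 1-7, c = bits 8-12, d = bits 13-15. */
#define REG_C_SHIFT 8u
#define REG_C_MASK  (0x1Fu << REG_C_SHIFT)

static inline uint16_t set_c_mask(uint16_t raw, unsigned value)
{
    return (uint16_t)((raw & ~REG_C_MASK) | ((value << REG_C_SHIFT) & REG_C_MASK));
}
```

The mask version states the bit positions in the source, while the bit-field version leaves them to the compiler; that difference is what the replies below argue about.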
Re: better registers representation in C?
Quote: the pros of this method is that it's very readable
I would think the opposite; bitwise operations are more readable to me. Your solution requires going back to the structure definition regularly just to make sure which bit it operates on.
Quote: do you know a better approach for registers handling?
In C you can't directly manipulate registers; you will most likely move the value to memory or the stack, and then either use the bit operators directly, or bit_set, bit_test and other bit_* macros, or compiler builtin functions.
In some cases you may also consider inline asm.
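The bit_* helpers mentioned above are not defined anywhere in the thread. A minimal sketch of what they might look like, written as inline functions on a 32-bit value rather than as macros (the names and signatures are assumptions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Return value with the given bit (0-31) set. */
static inline uint32_t bit_set(uint32_t value, unsigned bit)
{
    return value | (UINT32_C(1) << bit);
}

/* Return value with the given bit (0-31) cleared. */
static inline uint32_t bit_clear(uint32_t value, unsigned bit)
{
    return value & ~(UINT32_C(1) << bit);
}

/* Report whether the given bit (0-31) is set. */
static inline bool bit_test(uint32_t value, unsigned bit)
{
    return ((value >> bit) & 1u) != 0;
}
```

A caller would typically copy the register into a local variable, apply these helpers, and write the result back in one store.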
- xenos
- Member
- Posts: 1118
- Joined: Thu Aug 11, 2005 11:00 pm
- Libera.chat IRC: xenos1984
- Location: Tartu, Estonia
- Contact:
Re: better registers representation in C?
I use a rather similar approach for CPU feature detection:
CPU.h
CPU.cpp
The code takes the uint32 values returned from a CPUID call and saves them in a union. The bit field part of the union makes it then very easy to check for CPU features - and the compiler highly optimizes these functions. For example, testing for a single bit is translated to a single BT instruction.
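The attached files are not reproduced here, so the following is only a sketch of the idea, not the actual CPU.h. It shadows the EDX value of CPUID leaf 1 and names a few well-known feature bits. The type name cpuid1_edx_t and the helper functions are hypothetical, and the bit-field layout assumes the compiler allocates fields from the least significant bit, as GCC and Clang do on x86.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shadow of the EDX value returned by CPUID leaf 1. Unnamed bit-fields
 * skip over the flags that are not of interest here. */
typedef union {
    uint32_t raw;
    struct {
        uint32_t fpu  : 1;   /* bit 0 */
        uint32_t      : 23;  /* bits 1-23, not named in this sketch */
        uint32_t fxsr : 1;   /* bit 24: FXSAVE / FXRSTOR */
        uint32_t sse  : 1;   /* bit 25 */
        uint32_t sse2 : 1;   /* bit 26 */
        uint32_t      : 5;   /* bits 27-31 */
    } bits;
} cpuid1_edx_t;

/* Runtime check of the layout assumption: true if the first declared
 * bit-field occupies bit 0 of the raw value. */
static inline bool layout_is_lsb_first(void)
{
    cpuid1_edx_t probe;
    probe.raw = 1u;
    return probe.bits.fpu == 1u;
}

/* Feature test through the bit-field view. */
static inline bool has_fxsr_bitfield(uint32_t edx)
{
    cpuid1_edx_t v;
    v.raw = edx;
    return v.bits.fxsr != 0;
}

/* The same test written with an explicit mask, independent of layout. */
static inline bool has_fxsr_mask(uint32_t edx)
{
    return (edx & (UINT32_C(1) << 24)) != 0;
}
```

The value passed in would come from a CPUID call; that part is left out so the sketch stays independent of any particular inline assembly syntax.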
However, for hardware registers I use a different approach. These should be read / written as a whole, so I don't want any single bit fiddling in there. For example, here is my I/O APIC code:
IOApic.h
IOApic.cpp
Here the registers are simply read and written as uint32 values.
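As an illustration of the whole-register style described above (a generic sketch, not the attached I/O APIC code), registers can be accessed only through helpers that perform one full-width volatile read or write. The names mmio_read32, mmio_write32 and mmio_update32 are hypothetical.

```c
#include <stdint.h>

/* Read a 32-bit register as a whole through a volatile pointer. */
static inline uint32_t mmio_read32(const volatile uint32_t *reg)
{
    return *reg;
}

/* Write a 32-bit register as a whole through a volatile pointer. */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Read-modify-write built from exactly one whole-register read and one
 * whole-register write; the bit manipulation happens on a local copy. */
static inline void mmio_update32(volatile uint32_t *reg, uint32_t clear, uint32_t set)
{
    uint32_t value = mmio_read32(reg);
    value = (value & ~clear) | set;
    mmio_write32(reg, value);
}
```

In a kernel the pointer would refer to a mapped device address; for experimentation an ordinary volatile uint32_t variable can stand in for the register.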
- Love4Boobies
- Member
- Posts: 2111
- Joined: Fri Mar 07, 2008 5:36 pm
- Location: Bucharest, Romania
Re: better registers representation in C?
What's the point of writing C code for things that C can't actually do? Write the function in assembly, pass the appropriate structure(s) and use the linker. Ta-daa! Also, don't use things like "unsigned int" (or worse, "WORD") because not all compilers adhere to the same ABI---you will get different results with different compilers. Instead, use "uint32_t" and friends (defined in <stdint.h> (C) or <cstdint> (C++)).
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
- xenos
Re: better registers representation in C?
Love4Boobies wrote: What's the point of writing C code for things that C can't actually do? Write the function in assembly, pass the appropriate structure(s) and use the linker. Ta-daa!
Actually it seems that C(++) works quite well in these cases - with volatile pointers for reading and writing hardware registers, and a bit of inline assembly for CPUID. My primary motivation here is to encapsulate all functions that belong to a certain device or feature (such as the I/O APIC or the CPUID feature bits) in a C++ class. Besides that, I try to avoid writing whole functions in assembler or performing any manual optimizations, and instead leave this to the compiler.
Love4Boobies wrote: Also, don't use things like "unsigned int" (or worse, "WORD") because not all compilers adhere to the same ABI---you will get different results with different compilers. Instead, use "uint32_t" and friends (defined in <stdint.h> (C) or <cstdint> (C++)).
I agree with you, and this is indeed one thing I intend to change in my code. I tried to avoid the use of standard headers completely, but I guess in this case it is better to rely on fixed-size integers like uint32_t, since they are part of the C(++) standard, instead of making assumptions about the size of an int.
Re: better registers representation in C?
Love4Boobies wrote: What's the point of writing C code for things that C can't actually do? Write the function in assembly, pass the appropriate structure(s) and use the linker. Ta-daa! Also, don't use things like "unsigned int" (or worse, "WORD") because not all compilers adhere to the same ABI---you will get different results with different compilers. Instead, use "uint32_t" and friends (defined in <stdint.h> (C) or <cstdint> (C++)).
Thanks for the second tip.
I didn't quite understand the first one - what is the big difference between manipulating register bits in C and in ASM? And why do you think the second is better?
- Love4Boobies
Re: better registers representation in C?
Love4Boobies wrote: What's the point of writing C code for things that C can't actually do? Write the function in assembly, pass the appropriate structure(s) and use the linker. Ta-daa!
XenOS wrote: Actually it seems that C(++) works quite well in these cases - with volatile pointers for reading and writing hardware registers, and a bit of inline assembly for CPUID.
Using pointers like that is undefined behavior. The functionality needed to access hardware registers was added to C with TR 18037.
XenOS wrote: I tried to avoid the use of standard headers completely, but I guess in this case it is better to rely on fixed size integers like uint32_t since they are part of the C(++) standard, instead of making assumptions on the size of an int.
Why would you want to avoid the standard headers entirely? The C standard defines two types of implementations: freestanding (for GCC, use -ffreestanding, or -fno-hosted) and hosted (for GCC, use -fhosted, or -fno-freestanding). The freestanding headers that are available to you under both types of implementations are: <float.h>, <iso646.h>, <limits.h>, <stdalign.h> (C1X only), <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>.
Jonatan44 wrote: I didn't understand quite well the first one - what is the big diff between bitwising registers in C and ASM? and why do you think the second is better?
It seems like a code design flaw: you have to use assembly to do the CPU detection but you move bits of the code to C; in doing that, you're also exposing the implementation. Take a look at malloc for instance---you have no idea how it works but the API is portable and well encapsulated. CPU detection should be similar, esp. if you intend to port your code to other architectures: your build system will just link a different object file.
Isn't this solution more elegant than having a big chunk of code with #ifdef's for picking the architecture, inline assembly (which is not only optional but also implemented differently by every compiler) mixed with C (which is meant to be portable but in this case is only used for some operations that logically belong to the assembly block)? IMO, it's a matter of readability and maintainability.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: better registers representation in C?
Love4Boobies wrote: What's the point of writing C code for things that C can't actually do? Write the function in assembly, pass the appropriate structure(s) and use the linker. Ta-daa!
XenOS wrote: Actually it seems that C(++) works quite well in these cases - with volatile pointers for reading and writing hardware registers, and a bit of inline assembly for CPUID.
Love4Boobies wrote: Using pointers like that is undefined behavior. The functionality needed to access hardware registers was added to C with TR 18037.
In what way is it undefined behavior? If you're referring to the <iohw.h> component of TR 18037, that operates at a completely different level of abstraction than that which we are discussing. volatile pointers have explicitly defined behavior, and were indeed defined for access to MMIO registers. Note that <iohw.h> on its own defines nothing useful.
One place I might use IOHW would be to abstract things like PCI IO space to keep that code portable... but the lack of clean integration with the named address space support that TR also adds is a huge annoyance. I think IOHW will probably prove more useful when interacting with busses like I2C (and maybe SPI) where there is no corresponding "processor address space".
- xenos
Re: better registers representation in C?
Love4Boobies wrote: Why would you want to avoid the standard headers entirely? The C standard defines two types of implementations: freestanding (for GCC, use -ffreestanding, or -fno-hosted) and hosted (for GCC, use -fhosted, or -fno-freestanding). The freestanding headers that are available to you under both types of implementations are: <float.h>, <iso646.h>, <limits.h>, <stdalign.h> (C1X only), <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>.
I didn't know that these are available in a freestanding environment as well, so thanks for pointing that out. I guess the reason why I didn't use them before is that I'm using C++ and the C++ standard (I'm working with C++0x) defines some headers such as cstdint, but those are part of the C++ standard library and I couldn't find anything about the availability of those headers in a freestanding environment.
Quote: It seems like a code design flaw: you have to use assembly to do the CPU detection but you move bits of the code to C; in doing that, you're also exposing the implementation. Take a look at malloc for instance---you have no idea how it works but the API is portable and well encapsulated. CPU detection should be similar, esp. if you intend to port your code to other architectures: your build system will just link a different object file.
Actually the CPU detection code I posted is written solely for x86 architectures - both 32 bit and 64 bit. Of course I could write two assembler files, one for 32 bit, one for 64 bit, put the whole CPU detection in there, play around with some bits and so on. But why should I do that when all this can be done within a single file of C++ code, which can be compiled for either 32 or 64 bits, and the only assembly instruction I need to take care of is CPUID?
And indeed you are right, for a different CPU architecture I need a different CPU detection mechanism. That's a fact which is independent of using either inline assembly or separate assembler code, so it doesn't matter which of these approaches I use for feature detection of different CPUs.
Quote: Isn't this solution more elegant than having a big chunk of code with #ifdef's for picking the architecture, inline assembly (which is not only optional but also implemented differently by every compiler) mixed with C (which is meant to be portable but in this case is only used for some operations that logically belong to the assembly block)? IMO, it's a matter of readability and maintainability.
In fact, I don't need any #ifdef's here. The CPU detection code belongs to the x86 specific part of my kernel (both 32 and 64 bit) and is compiled and linked in only if the host architecture is set to x86. If I compile for, say, m68k, this file is not compiled at all, and it is not even used by any code which is not x86 specific. The reason is that I need the information my x86 CPU class provides only for the low level x86 specific routines. (For example: Is fxsave / fxrstor supported? If yes, let the CPU know we want to use it.) It would be completely useless for a m68k CPU anyway, where I need to ask completely different questions (Are CPU32 instructions available?).
Owen wrote: In what way is it undefined behavior? If you're referring to the <iohw.h> component of TR 18037, that operates at a completely different level of abstraction than that which we are discussing. volatile pointers have explicitly defined behavior, and were indeed defined for access to MMIO registers. Note that <iohw.h> on its own defines nothing useful.
That's what I thought as well - a volatile variable is some piece of memory about which the compiler cannot make any assumptions. It may change randomly, so it needs to be read and written explicitly even if the compiler "thinks" that this is not necessary.
Quote: One place I might use IOHW would be to abstract things like PCI IO space to keep that code portable... but the lack of clean integration with the named address space support that TR also adds is a huge annoyance. I think IOHW will probably prove more useful when interacting with busses like I2C (and maybe SPI) where there is no corresponding "processor address space".
Coping with different address spaces indeed is a good point. But again I see no reason why the support routines for accessing these different address spaces should be written entirely in assembly.
- Love4Boobies
Re: better registers representation in C?
Owen wrote: In what way is it undefined behavior? If you're referring to the <iohw.h> component of TR 18037, that operates at a completely different level of abstraction than that which we are discussing. volatile pointers have explicitly defined behavior, and were indeed defined for access to MMIO registers. Note that <iohw.h> on its own defines nothing useful.
Volatile pointers have explicitly defined behavior. On the other hand, volatile pointers pointing to non-objects don't (well, other than NULL and &array[sizeof array / sizeof array[0]]---but you can't read/write). What do you mean by "different level of abstraction"? I really see no incompatibilities.
XenOS wrote: I didn't know that these are available in a freestanding environment as well, so thanks for pointing that out. I guess the reason why I didn't use them before is that I'm using C++ and the C++ standard (I'm working with C++0x) defines some headers such as cstdint, but those are part of the C++ standard library and I couldn't find anything about the availability of those headers in a freestanding environment.
The freestanding C++0x headers are: <ciso646>, <cstddef>, <cfloat>, <limits>, <climits>, <cstdint>, <cstdlib>, <new>, <typeinfo>, <exception>, <initializer_list>, <cstdalign>, <cstdarg>, <cstdbool>, <type_traits>, and <atomic>.
XenOS wrote: ... several things you mentioned but which I am too lazy to individually select---read my rant and you will understand...
I guess I was trying to stress the following points:
- Using inline assembly is dangerous because it ties your code to a single compiler---you don't really want that. The only thing it is actually good at is that when using it you don't need to worry about calling conventions---but that's not such a big maintainability advantage as one might think because the rest of the ABI is not abstracted away.
- If you prefer using an external assembler and a linker (for the above reason), it makes more sense to put all the code in one place (i.e., in the assembly file) since it logically belongs together, as I've already mentioned; it's not much code anyway (I presume that your real reason was using assembly as little as possible) and it's certainly not meant to be portable so there's no advantage to doing it the other way (i.e., mixing external assembly files with C files; in fact, this would be worse than your current solution because it would make maintainability even harder).
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: better registers representation in C?
Love4Boobies wrote: The freestanding C++0x headers are: (...) cstdlib
Nope.
- xenos
Re: better registers representation in C?
Love4Boobies wrote: The freestanding C++0x headers are: <ciso646>, <cstddef>, <cfloat>, <limits>, <climits>, <cstdint>, <cstdlib>, <new>, <typeinfo>, <exception>, <initializer_list>, <cstdalign>, <cstdarg>, <cstdbool>, <type_traits>, and <atomic>.
Thanks a lot. Just one more thing I'm wondering about. I built and installed my cross compiler (actually quite a lot of them) following the Wiki article on GCC Cross-Compiler. When I compile some C code with this cross compiler using the -ffreestanding option, it correctly finds the C headers such as stdint.h, and they are indeed part of my cross compiler toolchain (i.e., they are somewhere in /usr/xxx-elf/.../include). However, if I compile some C++ code, it cannot find any of these C++ headers such as cstdint. So either I did something wrong or the steps in the Wiki do not lead to a full freestanding C++ cross compiler toolchain including the freestanding header files. (Or maybe they are not included because my GCC build does not adhere to C++0x yet.)
Quote: Using inline assembly is dangerous because it ties your code to a single compiler---you don't really want that.
Of course this is true. However, I'm really not too worried about portability across different compilers. I decided to use GCC for my project just because it fits my needs and it targets a lot of different architectures.
Quote: The only thing it is actually good at is that when using it you don't need to worry about calling conventions---but that's not such a big maintainability advantage as one might think because the rest of the ABI is not abstracted away.
That's another thing I'm not too worried about. I had to learn the calling conventions anyway for calling the C entry point of my kernel from the startup code or from interrupt handlers, both of which are written in assembly. So this is really not the point why I'm not using inline assembly.
What I like about GCC's way of handling inline assembly is that it gives me a fine-grained way of telling GCC what I want to do, without imposing too many restrictions. For example, I write some instructions, some of which require using CPU registers, but I don't need to specify which registers to use. GCC may decide which registers to use so that it fits nicely with its own register usage - it can optimize both C code and inline assembly. Simple instructions such as "in al, dx" can be placed in a single header file with inline functions, so they can simply be inlined. There is no need to do something like "push port number on the stack, call inportb(...), read port number from stack into dx, perform port read, return from inportb, clean up stack", as would be the case if "in al, dx" and the code using this function were in different compilation units.
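A small sketch of the point about register selection, assuming GCC or Clang extended asm on x86. A port read such as "in al, dx" cannot be executed from an ordinary user-space program, so the harmless BSWAP instruction is used instead; the function name swap_bytes is made up, and a portable C fallback is provided for other targets.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. On x86 with GCC or Clang the
 * work is done by one inline BSWAP; elsewhere plain C is used. */
static inline uint32_t swap_bytes(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r" lets the compiler choose any general-purpose register for x,
     * so no particular register is named in the source. */
    __asm__ ("bswap %0" : "+r"(x));
    return x;
#else
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

Because the function is static inline and lives in a header, the compiler can place the single instruction directly at the call site instead of emitting a call.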
- Love4Boobies
Re: better registers representation in C?
Love4Boobies wrote: The freestanding C++0x headers are: (...) cstdlib
Combuster wrote: Nope.
It is, according to the latest C++0x draft (see section 17.6.1.3).
I attempted to answer XenOS but reading is hard, esp. when I'm dead drunk, so I guess I'll have to do it tomorrow.
- Combuster
Re: better registers representation in C?
Nope, you only get the 20% of stdlib you don't want to use. (Specifically, you get all the functions that are worthless in a kernel context like exit(), you get none of the actually portable functions, and you obviously don't get malloc et al)
- Love4Boobies
Re: better registers representation in C?
Combuster wrote: Nope, you only get the 20% of stdlib you don't want to use. (Specifically, you get all the functions that are worthless in a kernel context like exit(), you get none of the actually portable functions, and you obviously don't get malloc et al)
Ah, that's what you meant. Yes, that's true. Also, I just noticed that <thread> was also listed there, in a previous C++0x draft.
XenOS wrote: However, if I compile some C++ code, it cannot find any of these C++ headers such as cstdint. So either I did something wrong or the steps in the Wiki do not lead to a full freestanding C++ cross compiler toolchain including the freestanding header files. (Or maybe they are not included because my GCC build does not adhere to C++0x yet.)
Freestanding C++03 (with or without TR1) implementations must provide fewer headers: <cstddef>, <limits>, <cstdlib> (see Combuster's comments), <new>, <typeinfo>, <exception>, and <cstdarg>. Perhaps passing -std=c++0x fixes things; I don't know how complete GCC's C++0x support is.
Quote: Using inline assembly is dangerous because it ties your code to a single compiler---you don't really want that.
XenOS wrote: Of course this is true. However, I'm really not too worried about portability across different compilers. I decided to use GCC for my project just because it fits my needs and it targets a lot of different architectures.
And what if you someday wish to port it to some architecture that GCC doesn't support but other compilers do? There are plenty of useful things that GCC can't output, such as EFI Byte Code (not that you'd want to port your OS to EFI, I'm just pointing out that GCC's not perfect).
Quote: The only thing it is actually good at is that when using it you don't need to worry about calling conventions---but that's not such a big maintainability advantage as one might think because the rest of the ABI is not abstracted away.
XenOS wrote: That's another thing I'm not too worried about. I had to learn the calling conventions anyway for calling the C entry point of my kernel from the startup code or from interrupt handlers, both of which are written in assembly. So this is really not the point why I'm not using inline assembly.
Learning calling conventions wasn't my point. My point was that changing the ABI at some point (e.g., by passing some argument to your compiler) can cause subtle bugs regardless of how you wish to write your assembly, in case inline assembly created some false sense of security. But I take it that wasn't it...
XenOS wrote: What I like about GCC's way of handling inline assembly is that it gives me a fine-grained way of telling GCC what I want to do, without imposing too many restrictions. For example, I write some instructions, some of which require using CPU registers, but I don't need to specify which registers to use. GCC may decide which registers to use so that it fits nicely its own register usage - it can optimize both C code and inline assembly. Simple instructions such as "in al, dx" can be placed in a single header file with inline functions, so they can simply be inlined. There is no need to do something like "push port number on the stack, call inportb(...), read port number from stack into dx, perform port read, return from inportb, clean up stack" as it would be the case if "in al, dx" and the code using this function were in different compilation units.
The CPU can do that as well, via register renaming. Anyway, this optimization only makes sense if you're going to use inline assembly in very tight loops... and that doesn't happen very often.