GCC saves a struct in a weird way
Hello. I'm having some trouble loading my GDT.
I have a GDT-register struct:
typedef struct
{
uint16_t limit;
uint32_t base;
} GDTRegister;
Weirdly, when I inspect the memory dump, I see that the limit takes up as much space as the base: 4 bytes instead of 2.
At first I thought the problem was that I was using unsigned short instead of uint16_t, but I have updated the struct and I still have the same problem.
Even weirder, if I use uint8_t the limit takes up only one byte as it should. I'm not sure whether this is an optimization issue (I think I saw an optimization setting in Visual Studio that lays out memory in blocks of 32 bits for speed's sake, but I'm not sure), or how to fix it. I hope somebody can help me!
- thepowersgang
Re: GCC saves a struct in a weird way
Always remember to tell the compiler to pack such structs (using __attribute__((packed)) for gcc, or #pragma pack for Visual C++).
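For the GDT register from the first post, a minimal sketch of the per-struct attribute could look like the following. It assumes GCC (or Clang), C11 for static_assert, and an ABI where uint32_t is 4-byte aligned, such as the i386 System V ABI. The name GDTRegisterPadded is made up here purely for comparison.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Default layout: the compiler inserts 2 bytes of padding after limit
   so that base starts on a 4-byte boundary. */
typedef struct
{
    uint16_t limit;
    uint32_t base;
} GDTRegisterPadded;  /* hypothetical name, for comparison only */

/* Packed layout: no padding, giving the 6-byte operand that lgdt
   expects in 32-bit mode. */
typedef struct
{
    uint16_t limit;
    uint32_t base;
} __attribute__((packed)) GDTRegister;

static_assert(sizeof(GDTRegisterPadded) == 8, "2 + 2 (padding) + 4");
static_assert(offsetof(GDTRegisterPadded, base) == 4, "base moved to offset 4");
static_assert(sizeof(GDTRegister) == 6, "2 + 4, no padding");
static_assert(offsetof(GDTRegister, base) == 2, "base directly after limit");
```

Keeping the static_assert lines next to the definition means the build fails if the layout ever drifts, instead of the CPU loading a garbage base.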
Kernel Development, It's the brain surgery of programming.
Acess2 OS (c) | Tifflin OS (rust) | mrustc - Rust compiler
Currently Working on: mrustc
Re: GCC saves a struct in a weird way
Oh right, thanks!
Edit: I used the -fpack-struct flag instead, I think this is cleaner
Re: GCC saves a struct in a weird way
No, certainly don't use -fpack-struct. There is a reason why structures are unpacked by default: alignment and the ABI. If you don't design all your structures perfectly (which is *very* difficult to do for large structures with other substructures and custom types that vary between architectures), this will most certainly result in worse performance and even compatibility problems if linked with code compiled with -fno-pack-struct. Very, very few structures need to be packed. In this case you are dealing with weird CPU structures with crazy fields, but normally these fields are nicely designed such that all fields have their natural alignment and the structure is packed by design. Simply use __attribute__((packed)) on those few, rare structures that need to be packed.
The reason this is happening is alignment. According to the ABI you are using, uint32_t is 4-byte aligned. This means that when you write:
Code: Select all
struct foo { uint8_t bar; uint32_t qux; };
the compiler detects that qux doesn't have its natural alignment. It then proceeds to rewrite the structure to something like this:
Code: Select all
struct foo { uint8_t bar; uint8_t padding[3]; uint32_t qux; };
which makes sure all members of the structure are neatly aligned. It is often much more efficient for the CPU to read aligned memory addresses, and it is undefined according to the C language to access variables that aren't aligned according to their type's natural alignment. You really have to go out of your way (break strict aliasing) to create badly aligned variables. This means that whenever you declare a structure as packed, the compiler may generate multiple aligned reads and a couple of shifts to simulate the unaligned read, but at least it is well-defined in this case (as opposed to the undefined behaviour it normally is).
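To make the point about well-defined unaligned reads concrete, here is a small sketch, assuming GCC or Clang and C11. The names foo_packed, read_qux, read_u32_at and unaligned_read_demo are invented for illustration. It shows two ways to read a 32-bit value that sits at an odd offset without casting a misaligned pointer: through the packed struct itself, or with memcpy from a byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packed variant of the struct above: qux sits at offset 1. */
struct foo_packed
{
    uint8_t bar;
    uint32_t qux;
} __attribute__((packed));

static_assert(offsetof(struct foo_packed, qux) == 1, "no padding before qux");
static_assert(sizeof(struct foo_packed) == 5, "1 + 4, no padding");

/* Fine: the compiler knows the member may be misaligned and emits
   whatever loads the target needs. */
uint32_t read_qux(const struct foo_packed *f)
{
    return f->qux;
}

/* Also fine: copy the bytes out instead of dereferencing a misaligned
   uint32_t pointer. */
uint32_t read_u32_at(const uint8_t *bytes, size_t offset)
{
    uint32_t value;
    memcpy(&value, bytes + offset, sizeof value);
    return value;
}

/* Small self-check showing both reads agree. */
void unaligned_read_demo(void)
{
    struct foo_packed f = { 0x11, 0xDEADBEEFu };
    uint8_t raw[sizeof f];

    memcpy(raw, &f, sizeof f);
    assert(read_qux(&f) == 0xDEADBEEFu);
    assert(read_u32_at(raw, offsetof(struct foo_packed, qux)) == 0xDEADBEEFu);
}
```

What should be avoided is taking the address of f.qux and passing it around as a plain uint32_t pointer, since the alignment information is lost at that point.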
Last edited by sortie on Sun Nov 03, 2013 6:01 am, edited 1 time in total.
Re: GCC saves a struct in a weird way
Ok, thanks for the information
I will fix it right now!
Re: GCC saves a struct in a weird way
I got my GDT and IDT working, but I ran into a couple of new problems. I have a default handler, and I want it to show a fancy screen which gives the type of exception, the segment and address of the exception, and the values of the general purpose registers.
The first problem is that I can't get the type of the error from within the default error handler. At first I thought the type of error would reside in the error code, but this doesn't seem to be the case. It makes sense, since we can assign different handlers to different exceptions, but it is a bit inconvenient.
Then I ran into trouble with the address of origin of the exception and the values of the general purpose registers. I can get the address and the values of eax, ecx, edx, ebx, esi and edi with some inline asm. This is very cumbersome though, as I want to do this in many different places (and for some reason, gcc gives me warnings if I write a function that has to be inlined) and the inline assembly is a bit messy. The error code, code selector and exception address are passed like normal arguments, so maybe you can access them in pure C? I don't have much experience with C, and I don't know how to do this.
My other problem is the values of esp and ebp. Since gcc only lets me run inline assembly after the function prologue has set up the stack frame (which changes ebp and esp), I can't get these with inline assembly. How am I supposed to do this?
I hope someone can help me with this!
Re: GCC saves a struct in a weird way
Never write interrupt handlers in C. It doesn't work that way. You should write your interrupt handlers in assembly and if they need to communicate the state of the CPU to C code, they should save all the registers on the stack in some agreed-upon format. For instance:
Code: Select all
my_interrupt_handler:
    /* ... */
    /* Some of the interrupted state was pushed by the CPU automatically, push the rest. */
    push %eax
    push %ebx
    push %ecx
    push %edx
    push %esi
    push %edi
    push %ebp
    /* Call the C interrupt handler with a pointer to the saved registers. */
    push %esp
    call my_c_interrupt_handler
    add $4, %esp
    /* Restore interrupted CPU state. */
    pop %ebp
    pop %edi
    pop %esi
    pop %edx
    pop %ecx
    pop %ebx
    pop %eax
    /* ... */
Code: Select all
#include <stdint.h>

/* Members are in reverse push order: the last register pushed is at the lowest address. */
struct interrupt_registers
{
    uint32_t ebp;
    uint32_t edi;
    uint32_t esi;
    uint32_t edx;
    uint32_t ecx;
    uint32_t ebx;
    uint32_t eax;
    /* ... */
};
void my_c_interrupt_handler(struct interrupt_registers* regs)
{
    /* ... */
}
Be sure to learn, appreciate and understand the ABI.
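On the earlier question of telling exceptions apart: one common approach (not something spelled out in this thread) is to give each vector its own tiny assembly stub that pushes its vector number before falling into shared code like the above. The sketch below assumes that convention, and also assumes no privilege-level change, so the CPU pushes only eflags, cs and eip. The fields int_no and err_code and the function exception_name are hypothetical names. Stubs for exceptions without a CPU-supplied error code would push a dummy 0 so the layout is always the same.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct interrupt_registers
{
    /* Pushed by the common assembly code (last push = lowest address). */
    uint32_t ebp, edi, esi, edx, ecx, ebx, eax;
    /* Assumed to be pushed by a per-vector stub (hypothetical convention). */
    uint32_t int_no;
    /* Pushed by the CPU for some exceptions, otherwise a dummy 0 from the stub. */
    uint32_t err_code;
    /* Always pushed by the CPU before the handler runs. */
    uint32_t eip, cs, eflags;
};

static_assert(offsetof(struct interrupt_registers, int_no) == 28, "after 7 saved registers");
static_assert(offsetof(struct interrupt_registers, eip) == 36, "eip follows err_code");
static_assert(sizeof(struct interrupt_registers) == 48, "12 fields of 4 bytes");

/* Maps a few well-known x86 exception vectors to names for the error screen. */
const char *exception_name(uint32_t int_no)
{
    switch (int_no)
    {
    case 0:  return "Divide Error";
    case 6:  return "Invalid Opcode";
    case 8:  return "Double Fault";
    case 13: return "General Protection Fault";
    case 14: return "Page Fault";
    default: return "Unknown";
    }
}
```

With this layout the C handler can print exception_name(regs->int_no) together with regs->cs, regs->eip and the saved general purpose registers, all without any inline assembly.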
Re: GCC saves a struct in a weird way
sortie wrote: Never write interrupt handlers in C. It doesn't work that way. You should write your interrupt handlers in assembly and if they need to communicate the state of the CPU to C code, they should save all the registers on the stack in some agreed-upon format. For instance:
...
Be sure to learn, appreciate and understand the ABI.
Ok, wow! This makes sense. I guess I can just use a label (after some googling I found that you have to prefix an underscore to follow the C naming convention) in the .asm file, assemble to an object file and link with my other object files? Ah, well, I'll try tomorrow!
Thanks for your fast and awesome responses
Last edited by kutkloon7 on Sun Nov 03, 2013 5:50 pm, edited 1 time in total.
Re: GCC saves a struct in a weird way
No, you do not need to prefix an underscore. That's for silly a.out platforms and the Windows ABI (I believe). As you are surely using an i586-elf cross-compiler, which follows the System V ABI, symbols in C code do not have leading prefixes. Try compiling some C code using:
Code: Select all
i586-elf-gcc -ffreestanding -S test.c -o test.s -O2
and you can see how the compiler generates assembly for your ABI. The symbols in that C file should not have any leading underscores. Note that you must use a '.global symbol_name' directive in your GNU assembler source to make the assembly symbols visible from other translation units. If you are using another assembler, you would need to do something similar. Check out the Bare Bones example if you are in doubt, but note that the underscore in the _start symbol is not because all symbols are prepended with an underscore, but rather to put the symbol in the reserved namespace.
Also, don't quote posts you respond to unless it's directly relevant in your response - for instance, it's hardly required to repeat all that example code in the quote (noting that this code is just an example of the idea, rather than something that actually works).
Mind you there is no such thing as 'C naming convention' and 'C calling convention'. This usually refers to obsolete ABI semantics for old DOS and Windows versions (perhaps even old Unix versions), but the C standard mandates absolutely nothing in this regard. It's all up to the ABI of the implementation, in your case hopefully the System V ABI, which as I stated earlier, you should learn, appreciate and understand.
Re: GCC saves a struct in a weird way
sortie wrote: No, you do not need to prefix an underscore. That's for silly a.out platforms and the Windows ABI (I believe). As you are surely using an i586-elf cross-compiler, which follows the System V ABI, symbols in C code do not have leading prefixes.
Oh, great. Must've read some Windows-specific information, then. Indeed I'm using an i586-elf cross-compiler (or even i386, I think. I will update sometime in the future because I probably do want to be able to use MMX instructions).
sortie wrote: Mind you there is no such thing as 'C naming convention' and 'C calling convention'. This usually refers to obsolete ABI semantics for old DOS and Windows versions (perhaps even old Unix versions), but the C standard mandates absolutely nothing in this regard. It's all up to the ABI of the implementation, in your case hopefully the System V ABI, which as I stated earlier, you should learn, appreciate and understand.
I came across the System V ABI earlier, but didn't really understand what it was. I downloaded a copy of the specification so I can look things up if I'm lost.
(Would you suggest I read it, skim through it, or just keep it as a reference? It seems like tough stuff to read, and I'm not really sure how else I should learn the System V ABI.)
Re: GCC saves a struct in a weird way
The System V ABI is a bit worse because there is a Core Document, which is then extended by platform-specific documents. For instance, there is an i386 Supplement to the System V ABI. Now, these documents are probably not the best way to learn the ABI, but they are a good reference when implementing it. They are invaluable when you implement a program loader for your OS, for instance. To learn the ABI, it's a matter of patience and experience, I suppose. I know of no really good introductions to the System V ABI, but I suppose you can find some out there.
Have you had a course in machine architecture and a compiler course? You should ideally be able to look at a piece of C code and "know" roughly how the compiler will understand it and what the generated assembly would look like. For that, I can recommend compiling with -S (keeping in mind that the compiler normally also emits magical assembler directives you need not know of yet) or calling i586-elf-objdump -d on your kernel. It's a good idea to learn to "read" such assembly and be able to see the equivalence to the original C code.
Additionally, it's a very good idea to learn the C programming language in detail and to understand in depth what exactly undefined behaviour is and how your compiler deals with it. For instance, signed overflow is undefined according to the C standard. Therefore GCC assumes signed overflow never occurs. It will then possibly optimize if ( INT_MAX + 1 == INT_MIN ) to if ( false ).
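As a small illustration of living with that rule, the sketch below checks for overflow before adding, so the undefined operation is never executed and the optimizer has nothing to "assume away". The names checked_add and checked_add_demo are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *out if the sum fits in an int.
   Returns false and leaves *out untouched if it would overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Small self-check showing the intended use. */
void checked_add_demo(void)
{
    int sum = 0;

    assert(checked_add(2, 3, &sum) && sum == 5);
    assert(!checked_add(INT_MAX, 1, &sum));
    assert(!checked_add(INT_MIN, -1, &sum));
    assert(sum == 5);   /* failed additions did not modify the result */
}
```

The comparisons only ever compute INT_MAX - b for positive b and INT_MIN - b for negative b, neither of which can overflow, so the check itself stays within defined behaviour.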
Re: GCC saves a struct in a weird way
I had a course in computer architecture almost 3 years ago (but IMO it sucked: the course was on computer architecture and networking, and the emphasis was on networking), but none on compiler design. However, I recently did some research on compiler design myself and I enrolled in a course called "Languages and Compilers" starting next week.
Luckily, I have a little experience with reverse engineering in OllyDBG and with debugging my 'kernel' (well, it's not really a kernel yet, but you know what I mean...) in Bochs (with optimization in GCC off, that is. Compiling with optimization gave me errors, so I decided to simply turn it off). This is mainly where my knowledge of x86 assembly comes from (together with writing my bootloader, and a short time where I tried to program the WinAPI in MASM). I know turning optimization off makes everything much easier to read, but I guess I have some idea of the correspondence between ASM and C. I have to say that my knowledge of linkers and loaders is lacking (which you can see here, I had no idea about the System V ABI and the like).
Thank you for your time and help, I appreciate it!
Re: GCC saves a struct in a weird way
Actually, turning optimization on usually produces cleaner assembly, because non-optimized assembly often contains useless sequences such as "push %eax\npop %eax\n" that do nothing - or perhaps there is some hidden side effect and it only looks like a no-op. With optimizations on it is often much clearer, though optimization can make things so small that it becomes hard to understand again.
Anyways, if your kernel doesn't build with optimizations enabled, stop what you are doing. Your kernel must at all times work at all optimization levels. If it doesn't, that suggests you are invoking undefined behaviour. It could possibly be inline assembly that has the wrong clobber list, and the optimizer is screwing things up because of your mistake.
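To show what a complete operand and clobber description looks like, here is a hedged sketch using GCC extended inline assembly on x86. The function fill_bytes and its demo are invented for illustration, and the plain C loop is only a fallback so the snippet also compiles elsewhere. The instruction rep stosb modifies the destination and count registers and writes memory, so all three facts are declared; leaving any of them out is the kind of mistake that tends to show up only once optimization is turned on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills count bytes at dst with value. */
void fill_bytes(void *dst, uint8_t value, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("rep stosb"
                      : "+D"(dst), "+c"(count)  /* both registers are read AND modified */
                      : "a"(value)              /* byte to store, passed in al */
                      : "memory");              /* memory is written behind the compiler's back */
#else
    /* Portable fallback for non-x86 targets. */
    uint8_t *p = dst;
    while (count--)
        *p++ = value;
#endif
}

/* Small self-check: only the requested range is written. */
void fill_bytes_demo(void)
{
    uint8_t buf[6] = { 0, 0, 0, 0, 0, 0 };

    fill_bytes(buf + 1, 0xAB, 4);
    assert(buf[0] == 0 && buf[5] == 0);
    for (size_t i = 1; i <= 4; i++)
        assert(buf[i] == 0xAB);
}
```

The sketch relies on the direction flag being clear, which the System V ABI guarantees on function entry. An interrupt path, where the interrupted code may have set it, would need a cld first.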
Re: GCC saves a struct in a weird way
I turned on optimization again, it works now
(I know this doesn't mean that my code doesn't use undefined behavior, so I will look at that sometime soon)
Also, my blue screen of death works like a charm now (thanks to you!)
(whoops, I switched eflags and the address)