Page 1 of 1

Overuse of Casting (was Screen Shots...)

Posted: Fri May 14, 2010 3:00 am
by Neolander
Found a major design flaw... :cry: Unless I take the time to redesign and rewrite a large chunk of my current code, it'll look like this forever...
cast.png
(Each time you see blue text, there's a cast. In my opinion, casts should *always* be avoided when other options are available)

The issue : in 32-bit, a pointer is 32-bit long. In 64-bit, a pointer is 64-bit long. Now, what happens when a 32-bit program has to send a structure including some pointers to a 64-bit program ? *facepalm for not thinking about this before* #-o
Current code treats *every* pointer in the structure as uint64_t, as an experiment. Clearly, this is not viable as a long-term approach.

Re: What does your OS look like? (Screen Shots..)

Posted: Fri May 14, 2010 5:39 am
by qw
What is the real type of kinfo->kmmap anyway?

Re: What does your OS look like? (Screen Shots..)

Posted: Fri May 14, 2010 1:25 pm
by Neolander
It used to be defined as "kernel_memory_map*", in a header shared by both 32-bit and 64-bit code. Until I realized that for 64-bit code, "kernel_memory_map*" did not meant the same thing at all. Then, I tried uint64_t, as there's no way to define a 64-bit pointer in 32-bit C afaik. The resulting code looked like the... thing... above.

Now, I consider using some magic hack : generate a 32-bit structure, work on it, and then convert it to a 64-bit one (does not require expensive code if done right) just before going in 64-bit mode.

Re: What does your OS look like? (Screen Shots..)

Posted: Fri May 14, 2010 9:32 pm
by TylerH
Why not declare pointers with void*, and use the appropriate compiler(32 bit/64 bit) to generate pointers of the right size? Declaring things with a predefined size(like using uints for pointers) when that size is likely to change between ports takes all the portability out of C, define things as what they are, not something else that happens to be the size needed for a particular port.

Casts are a great thing, they allow you to choose how data is accessed, if they bother you, use Basic :P.

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 2:13 am
by qw
I think if "kernel_memory_map *" means something different in 32- and 64-bits code, you shouldn't use it in both 32- and 64-bits code.

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 3:30 am
by Neolander
Hobbes wrote:I think if "kernel_memory_map *" means something different in 32- and 64-bits code, you shouldn't use it in both 32- and 64-bits code.
That's the point.

Let's explain the thing as a whole...
  • As everybody probably knows here, GRUB legacy is a bit too stupid to load an ELF64 binary and run it in Long Mode with identity mapping directly. Hence the need for some 32-bit code loaded and run before the kernel.
  • Since I need some 32-bit bootstrap code anyway, I thought "Hey ! Let's put all the bootloader-dependent code here ! This way, portability over multiple archs will be easier if I want to get into it later".
  • Then I wondered which kind of information I wanted for my kernel, and wrote a header where the interesting part looked like this :

    Code: Select all

    typedef struct kernel_memory_map kernel_memory_map;
    struct kernel_memory_map {
      uint64_t location;
      uint64_t size;
      unsigned char nature; //0 : Free memory
                            //1 : Reserved address range
                            //2 : Bootstrap kernel component
                            //3 : Kernel and modules
      char* name; //String used to go into more details about the area's function. For free and reserved memory it's either "Low Mem" or "High Mem".
                    //Bootstrap kernel is called "Bootstrap", its separate parts have a precise naming
                    //Kernel and modules are called by their GRUB modules names
    };
     
    typedef struct kernel_information {
      char* command_line; //char* to the kernel command line
      //Memory map
      uint32_t kmmap_size; //Number of entries in kernel memory map
      kernel_memory_map* kmmap; //Gives an insight of the full memory layout.
      arch_specific_info arch_info; //Some arch-specific information (eg startup disk on x86)
    } kernel_information;
  • Then I became incredibly stupid and included that code *both* in the kernel code and in the bootstrap code when I was writing the latter.
  • Being almost done with said bootstrap code, I realized my mistake.
  • I then tried to change all pointers to uint64_t. Structures definitions now looked like this.

    Code: Select all

    typedef struct kernel_memory_map kernel_memory_map;
    struct kernel_memory_map {
      uint64_t location;
      uint64_t size;
      unsigned char nature; //0 : Free memory
                            //1 : Reserved address range
                            //2 : Bootstrap kernel component
                            //3 : Kernel and modules
      uint64_t name; //char* to a string naming the area. For free and reserved memory it's either "Low Mem" or "High Mem".
                    //Bootstrap kernel is called "Bootstrap", its separate parts have a precise naming
                    //Kernel and modules are called by their GRUB modules names
    };
     
    typedef struct kernel_information {
      uint64_t command_line; //char* to the kernel command line
      //Memory map
      uint32_t kmmap_size; //Number of entries in kernel memory map
      uint64_t kmmap; //Pointer to the kernel memory map
      arch_specific_info arch_info; //Some arch-specific info
    } kernel_information;
    Readability clearly got poorer, and it was even worse when I tried to modify the code in order to make it work : every single piece of code which used those structures ended up looking like the picture above. There's nothing better than casts everywhere to make code poorly readable.
  • After 20 minutes of code tweaking, I was happy to see that I could get the whole thing to run again, but then I looked again at what I did with a naive look and thought "Ugggggh... That's some very ugly C source code. I can't redistribute it to anyone, and if I look at it in 10 years I'll wonder what it does". Hence I now look for ideas to improve code cleanness
The approach that I'm currently considering is to use three headers. Two for the bootstrap code, and one for the kernel code.
Kernel code header would be the original one :

Code: Select all

typedef struct kernel_memory_map kernel_memory_map;
struct kernel_memory_map {
  uint64_t location;
  uint64_t size;
  unsigned char nature; //0 : Free memory
                        //1 : Reserved address range
                        //2 : Bootstrap kernel component
                        //3 : Kernel and modules
  char* name; //String used to go into more details about the area's function. For free and reserved memory it's either "Low Mem" or "High Mem".
                //Bootstrap kernel is called "Bootstrap", its separate parts have a precise naming
                //Kernel and modules are called by their GRUB modules names
};

typedef struct kernel_information {
  char* command_line; //char* to the kernel command line
  //Memory map
  uint32_t kmmap_size; //Number of entries in kernel memory map
  kernel_memory_map* kmmap; //Gives an insight of the full memory layout.
  arch_specific_info arch_info; //Some arch-specific information (eg startup disk on x86)
} kernel_information;
The part of the bootstrap code which operates on this data would include something like this :

Code: Select all

typedef struct bs_memory_map bs_memory_map;
struct bs_memory_map {
  uint64_t location;
  uint64_t size;
  unsigned char nature; //0 : Free memory
                        //1 : Reserved address range
                        //2 : Bootstrap kernel component
                        //3 : Kernel and modules
  uint64_t name; //String used to go into more details about the area's function. For free and reserved memory it's either "Low Mem" or "High Mem".
                //Bootstrap kernel is called "Bootstrap", its separate parts have a precise naming
                //Kernel and modules are called by their GRUB modules names
};

typedef struct bs_information {
  char* command_line; //char* to the kernel command line
  //Memory map
  uint32_t kmmap_size; //Number of entries in kernel memory map
  kernel_memory_map* kmmap; //Gives an insight of the full memory layout.
  arch_specific_info arch_info; //Some arch-specific information (eg startup disk on x86)
} bs_information;
And I'd write a new part of the bootstrap kernel whose role would be to convert those structures in kernel structures, including the current header :

Code: Select all

typedef struct real_kernel_memory_map real_kernel_memory_map;
struct real_kernel_memory_map {
  uint64_t location;
  uint64_t size;
  unsigned char nature; //0 : Free memory
                        //1 : Reserved address range
                        //2 : Bootstrap kernel component
                        //3 : Kernel and modules
  uint64_t name; //char* to a string naming the area. For free and reserved memory it's either "Low Mem" or "High Mem".
                //Bootstrap kernel is called "Bootstrap", its separate parts have a precise naming
                //Kernel and modules are called by their GRUB modules names
};
 
typedef struct real_kernel_information {
  uint64_t command_line; //char* to the kernel command line
  //Memory map
  uint32_t kmmap_size; //Number of entries in kernel memory map
  uint64_t kmmap; //Pointer to the kernel memory map
  arch_specific_info arch_info; //Some arch-specific info
} real_kernel_information;
This way, I'd greatly reduce use of casts in my code, leading to an excellent readability (the only places where there would be some casts left would be those using kernel_memory_map->name. That's because
1/It's rarely used in the bootstrap code so it doesn't matter much.
2/The kernel memory map can be up to 1000-entries long (in current code, may be increased later as needed), and is stored in a statically allocated 1000-entries buffer. (In fact, I need three of those, for sorting and entries merging purposes). It sounds unnecessary to add one more buffer and data copying operation just to convert one single member to 64-bit.

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 3:51 am
by gerryg400
I think code like this is prone to error. It is almost impossible to pass an eye over it and see problems.

To reduce the amount of casts, to make you code more readable, to make debugging simpler and reduce the chance of typos, I recommend you use some local variables in your code.

i.e assign (kernel_memory_map*)(uint32_t)kinfo->kmmap)[source_index].location (and others) to local variables, casting at the point of assignment. Then in the main part of the code everything can be done with the correct type. You will be amazed how much more readable, reviewable and reliable your code is.

- gerryg400

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 4:08 am
by Neolander
So what you suggest is keeping the current header with its uint64_t pointers and change the code using it to something like this ?

Code: Select all

kernel_memory_map* kmmap = (kernel_memory_map*) (uint32_t) kinfo->kmmap;

(...)

kmmap[source_index].location = 0;
I see an immediate benefit to this approach : it avoids header duplication. Thanks, gonna try ! :wink:

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 4:56 am
by gerryg400
Actually I was thinking something more like..

Code: Select all

kernel_memory_map *prev_location = (kernel_memory_map*)(uint32_t)kinfo->kmmap)[source_index-1].location;
kernel_memory_map *location = (kernel_memory_map*)(uint32_t)kinfo->kmmap)[source_index].location;
kernel_memory_map *prev_size = (kernel_memory_map*)(uint32_t)kinfo->kmmap)[source_index-1].size;
kernel_memory_map *size = (kernel_memory_map*)(uint32_t)kinfo->kmmap)[source_index].size;

if (location != prev_location + prev_size) {
    kmmap_buffer[dest_index].location = prev_location + prev_size;
    kmmap_buffer[dest_index].size = location - prev_location + prev_size;

etc.
}
However I'm sure that you know your code better. The point is that simple code has fewer bugs.

- gerryg400

Re: Overuse of Casting (was Screen Shots...)

Posted: Sat May 15, 2010 1:12 pm
by Neolander
I agree with you on the simplicity thing. That's exactly why I'm rewriting the thing, after all ^^

Don't know if I'm going this far with local variables, but it could be a good idea since this part already had poor readability in the original code, despite some attempts at beautifying it. Going to find what suits it best. Thanks again !