G++ behavior of storing char* literals causing issues
Posted: Thu Aug 25, 2022 4:46 pm
I've been fumbling around with writing a protected mode operating system in C++ using G++ and NASM. I've written a pretty basic real mode command line interface in assembly before, but I haven't done anything like this in a language other than assembly.
Anyway, I've got a dynamic memory allocation system set up that I'm happy with, as well as a custom-made "dynamic array" data container that's kind of like std::vector but with a few differences.
One problem that has had me stuck for a while is the way G++ allocates and stores char* literals. I can never reliably access the strings without them getting corrupted by... something. I'm talking about when you do something like:
Code: Select all
char *foo = "bar";
or
Code: Select all
char foo[] = "bar";
My kernel loads into RAM at location 0x7E00. The size of the program (right now) is 7329 bytes. This means any memory from 0x9AA1 (0x7E00 + 7329) upward, or below 0x7E00, should theoretically be a safe place to store stuff. Well, any time there's something surrounded by quotes in the form of "blah blah blah", it gets stored somewhere, but the characters don't always get copied to that location, or at least are no longer there by the time any other code gets to them.
For example, if I run the code:
Code: Select all
char test[] = "test111";
the string "test111" gets copied to memory location 28364 (0x6ECC), and the pointer is set to 0x6ECC. Reading that memory shows that it worked correctly: "t" is at 0x6ECC, "e" is at 0x6ECD, and so on and so forth. However, most of the time this doesn't work. If I do the exact same thing but modify the string slightly, it fails. For example:
Code: Select all
char test[] = "test11";
will set the pointer to 28349 (0x6EBD). The memory at 0x6EBD will be 0, the next byte will be 0, and everything will just be zeros.
Also, doing
Code: Select all
char *test = "foo";
never works at all. Only the [] form makes it somewhat work.
Now you may be wondering: why is this a problem? Because manually allocating with calloc doesn't solve the problem. I can do:
Code: Select all
char *test = (char*)calloc(5, sizeof(char));
test = "test";
While calloc will indeed allocate an array of the specified size where it's intended to go, the assignment test = "test" changes the value of the pointer and attempts to copy the text to that new location instead of where calloc initially allocated the array. It doesn't matter whether the string in the = "test" assignment is much shorter, the same size, or much longer; it always reallocates. This means it usually doesn't work, except for sometimes.
This problem is especially detrimental to getting anything done because the same broken behavior applies to char array literals passed as function parameters, which means that doesn't work either. I've even written functions to search all of RAM for the strings. Any time the chars get lost like that, they don't appear elsewhere. It's not just a miscalculated pointer; the data is completely gone.
Is there a way to modify the way GCC allocates this stuff? I mean besides spending days, perhaps weeks, trying to figure out where in the source code GCC deals with allocating char arrays and modifying it to work with my memory allocation system instead of whatever memory allocation convention it's currently using that's messing things up. But I guess if that's the only way, then it is what it is.
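For anyone who wants to reproduce the calloc case outside my kernel, here is a minimal sketch of what I'm comparing. It assumes a normal hosted build (regular g++ with the standard library's calloc and free), not my own allocator, and copy_cstr is a hypothetical helper I made up for this post. As far as the language itself goes, assigning a literal to a pointer only changes where the pointer points; nothing is written into the calloc'd block unless the bytes are copied explicitly.

```cpp
#include <cstddef>
#include <cstdlib>   // std::calloc, std::free (hosted build only; an assumption)

// Hypothetical helper (not from my kernel): copy a NUL-terminated string
// into a caller-supplied buffer of `cap` bytes, truncating if needed and
// always terminating. Returns the number of characters copied, not
// counting the terminator.
std::size_t copy_cstr(char *dst, std::size_t cap, const char *src) {
    if (cap == 0) return 0;
    std::size_t i = 0;
    while (i + 1 < cap && src[i] != '\0') {
        dst[i] = src[i];
        ++i;
    }
    dst[i] = '\0';
    return i;
}

// Copying the bytes: the pointer returned by calloc is unchanged and the
// allocated block now holds "test".
bool copy_keeps_buffer() {
    char *buf = static_cast<char *>(std::calloc(5, sizeof(char)));
    if (buf == nullptr) return false;
    char *before = buf;
    copy_cstr(buf, 5, "test");
    bool ok = (buf == before) && buf[0] == 't' && buf[3] == 't' && buf[4] == '\0';
    std::free(buf);
    return ok;
}

// Plain assignment: only the pointer value changes. The calloc'd block is
// never written, so it still contains the zeros calloc put there.
// (const char* is used here because ISO C++ does not allow binding a
// literal to a plain char*; G++ only warns about it.)
bool assignment_rebinds_pointer() {
    char *buf = static_cast<char *>(std::calloc(5, sizeof(char)));
    if (buf == nullptr) return false;
    const char *p = buf;
    p = "test";  // p now points at the literal's own storage, not at buf
    bool rebound = (p != buf) && buf[0] == '\0';
    std::free(buf);
    return rebound;
}
```

On a hosted system I'd expect both functions to return true. If the copy version also fails inside the kernel, that would suggest the literal's bytes never made it into RAM in the first place, rather than anything calloc is doing.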