Page 1 of 2

[Solved] EXT2 strangeness, again :(

Posted: Mon Jan 23, 2017 3:56 pm
by Agola
Hello OSDev forum :|

While I was shortening and optimizing my ext2 code, and comparing it with LevOS' and JS-OS' ext2 code, I've noticed a really strange detail.

Both code assume name of directory entries are null terminated except 4-byte padding while creating a new directory entry for a new file.
For example if the name of entry is "GOOD", because of length is 4-byte aligned, it shouldn't be null terminated. But both code makes the name is "GOOD\0", then adds padding to make it 4-byte aligned, so it turns into "GOOD\0\0\0\0"

Part of these codes:

LevOS: (https://github.com/levex/osdev/blob/mas ... xt2/ext2.c)

Code: Select all

ext2_dir *entry = (ext2_dir *)malloc(sizeof(ext2_dir) + strlen(fil) + 1);
entry->size = sizeof(ext2_dir) + strlen(fil) + 1;
entry->namelength = strlen(fil) + 1;
JS-OS: (https://github.com/JSmith-BitFlipper/JS ... /ext2_fs.c)

Code: Select all

dirent.rec_len = sizeof(dirent.ino) + sizeof(dirent.rec_len) + sizeof(dirent.name_len) + sizeof(dirent.file_type) + lengthOfName + 1; //+1 is NULL terminating \000
dirent.name_len = (u8int)lengthOfName;
dirent.file_type = file_type;

//~ k_printf("dirent.name is :%d\n", *dirent.name);

//+1 being the \000 NULL termination 0 byte at the end of the string
dirent.name = (char*)kmalloc(lengthOfName + 1);
In JS-OS it says null terminating also...

Probably I have a mistake, because it is impossible both code to have same mistake.
What's happening, really?

In http://www.nongnu.org/ext2-doc/ext2.html#IFDIR-NAME-LEN, directory entries are not null terminated also.

My brain triggered an overflow exception, will return back in a while :|

Re: EXT2 strangeness, again :(

Posted: Mon Jan 23, 2017 6:49 pm
by klange
My ext2 driver doesn't support writing, don't use it as a reference.

e: I checked to be sure, and sizeof(ext2_dir_t) = 8

Re: EXT2 strangeness, again :(

Posted: Mon Jan 23, 2017 7:15 pm
by crunch
You are correct, they don't need to be null terminated. I don't think it should fsck anything up if you do add null bytes for padding, as long as the name_len and rec_len values are correct.

Re: EXT2 strangeness, again :(

Posted: Mon Jan 23, 2017 8:00 pm
by gerryg400
crunch wrote:You are correct, they don't need to be null terminated. I don't think it should fsck anything up if you do add null bytes for padding, as long as the name_len and rec_len values are correct.
I suspect that's true. It just wastes a few bytes.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 1:39 am
by iansjack
This illustrates the dangers of relying too much on the features of your high-level language. Use Pascal instead of C and it would be natural to treat strings as size/array combinations rather than zero-terminated arrays.

I wonder if there is a degree of code copying between the OSs that you mention, leading to a perpetuation of a simple error.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 1:54 am
by Agola
klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.

e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Then ToaruOS' ext2 is correct, I'm going to remove from list now.
I'm happy to see it is my mistake at least in ToaruOS :|

Cheers

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 12:33 pm
by mikegonta
Agola wrote:
klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.
e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Please don't pack already "packed" structs.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 2:08 pm
by mikegonta
iansjack wrote:This illustrates the dangers of relying too much on the features of your high-level language.
FAT32 and exFAT differ from EXT2 in the way that the on device file name is represented. Unlike EXT2 the file name is stored in contiguous
directory entries, however the file name itself is not a contiguous array. Instead of processing all of each directory set to transform the all
except one (and in some cases not even that one) available file names into a contiguous array, I first transform the to be looked up file name
so that it is exactly the same (in the case of exFAT) and only slightly different (in the case of FAT32) as the on device format. The comparison
is then very simple and based on the same idea as the original DOS (and also FAT32 short file name) 8 3 format comparison. Not only is the
method faster, but the resulting code is shorter and simpler.
In fact, I use a cut down version of the same method to lookup a short long file name from the root directory in exFAT and FAT32 boot sectors
written in 8086 compatible code.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 3:08 pm
by LtG
mikegonta wrote:
Agola wrote:
klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.
e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and you've clearly marked the need for packing, on other platforms the compiler presumably does the right thing to ensure correctness.

I would say always mark a struct packed if it requires exact memory layout (ie. it is used "externally" in some way).. If some compiler does stupid things then fix the _compiler_, don't make your code worse with a "hack".

Can you give an example where there's an actual problem and the problem is not in the compiler (and therefore omitting packing is a hack to work around a stupid compiler)?

PS. I would also consider using something compiler independent (ie. #define's) if possible, and possibly adding alignment (with #define's as well) so that the code would actually work correctly on other platforms as well, but might be useless for most small hobby projects. With the #define's you can specify the compiler specific way of marking a struct as packed and characteristics of platforms in a single file so it's easy to adjust for each platform.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 3:49 pm
by mikegonta
LtG wrote:
mikegonta wrote:Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and
you've clearly marked the need for packing, ...
I should have mentioned that the advice was for GCC x86/x64. One of the nice things in "C" is the assignment of one struct to another.
The only harm is that, GCC for example, will use a complex (and unnecessary) memmove to ensure that most of the copy loop is aligned
for better performance. Of course, you need some sort of directive (#define, etc.) to prevent the compiler from optimizing non-aligned
on device structures, However, packing is unnecessary when the on device struct (for example the exFAT BPB) is already packed (that is,
all struct components are on size alignment) GCC will (for example with -march=i486) simply initialize the 3 required registers and do
a rep movs.
LtG wrote:on other platforms the compiler presumably does the right thing to ensure correctness.
On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 4:03 pm
by dozniak
mikegonta wrote:On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
That's like the most misguided C advice I've ever heard on this forum.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 4:40 pm
by mikegonta
dozniak wrote:
mikegonta wrote:On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
That's like the most misguided C advice I've ever heard on this forum.
All advice is misguided when taken out of context. For "C" cross platform portability, structs should not be used, whereas if the code
is only for that platform and endedness is not an issue then of course structs are fine.
Here's something for you to read. And, most importantly, this part.

Re: EXT2 strangeness, again :(

Posted: Tue Jan 24, 2017 5:35 pm
by LtG
mikegonta wrote:
LtG wrote:
mikegonta wrote:Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and
you've clearly marked the need for packing, ...
I should have mentioned that the advice was for GCC x86/x64. One of the nice things in "C" is the assignment of one struct to another.
The only harm is that, GCC for example, will use a complex (and unnecessary) memmove to ensure that most of the copy loop is aligned
for better performance. Of course, you need some sort of directive (#define, etc.) to prevent the compiler from optimizing non-aligned
on device structures, However, packing is unnecessary when the on device struct (for example the exFAT BPB) is already packed (that is,
all struct components are on size alignment) GCC will (for example with -march=i486) simply initialize the 3 required registers and do
a rep movs.
LtG wrote:on other platforms the compiler presumably does the right thing to ensure correctness.
On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
Is it guaranteed that without "packed" the compiler will not add padding? Isn't it implementation specific? Granted most likely no sane current compiler would add padding, but do you want to rely on that?

Even if it were guaranteed, and this is my main point, I'd mark it as packed to explicitly convey my intent. If the compiler is bad and does something stupid then you should fix the compiler.

I guess the question is, why does GCC do a complex and unnecessary "memmove" instead of "rep movs"? Can't you tell the compiler the alignment?

Note, with a proper language (which C adheres to some extent) the compiler is going to assume that it can do what ever it wants with internal stuff and only maintain external stuff in a compliant way. This allows a proper compiler to do optimizations more freely. Here GCC is going to assume the struct is properly aligned, is there a guarantee that it comes that way from some 3rd party (the OS?) always?

As I said, I'd rather do things correctly and fix the issue where it is (GCC), though obviously fixing GCC optimizer might be too difficult in which case I might consider a hack but then I would at least add comments..

Re: EXT2 strangeness, again :(

Posted: Wed Jan 25, 2017 1:42 am
by dozniak
mikegonta wrote: Here's something for you to read. And, most importantly, this part.
In that discussion, James-M is the only guy who really knows something about C compilers - he writes them for a living. So thanks for the read, I'm going to repeat now - "That's like the most misguided C advice I've ever heard on this forum from person who has little to no idea about how C works."

Re: [Solved] EXT2 strangeness, again :(

Posted: Wed Jan 25, 2017 2:08 am
by Boris
is there a way to tell your C compiler :
I don't care about how this struct looks in memory. You can reverse members order, skip those ever written to or read.
Optimise using Lto and do it according to my arch.