[Solved] EXT2 strangeness, again :(

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
User avatar
Agola
Member
Member
Posts: 155
Joined: Sun Nov 20, 2016 7:26 am
Location: Somewhere

[Solved] EXT2 strangeness, again :(

Post by Agola »

Hello OSDev forum :|

While I was shortening and optimizing my ext2 code, and comparing it with LevOS' and JS-OS' ext2 code, I've noticed a really strange detail.

Both code assume name of directory entries are null terminated except 4-byte padding while creating a new directory entry for a new file.
For example if the name of entry is "GOOD", because of length is 4-byte aligned, it shouldn't be null terminated. But both code makes the name is "GOOD\0", then adds padding to make it 4-byte aligned, so it turns into "GOOD\0\0\0\0"

Part of these codes:

LevOS: (https://github.com/levex/osdev/blob/mas ... xt2/ext2.c)

Code: Select all

ext2_dir *entry = (ext2_dir *)malloc(sizeof(ext2_dir) + strlen(fil) + 1);
entry->size = sizeof(ext2_dir) + strlen(fil) + 1;
entry->namelength = strlen(fil) + 1;
JS-OS: (https://github.com/JSmith-BitFlipper/JS ... /ext2_fs.c)

Code: Select all

dirent.rec_len = sizeof(dirent.ino) + sizeof(dirent.rec_len) + sizeof(dirent.name_len) + sizeof(dirent.file_type) + lengthOfName + 1; //+1 is NULL terminating \000
dirent.name_len = (u8int)lengthOfName;
dirent.file_type = file_type;

//~ k_printf("dirent.name is :%d\n", *dirent.name);

//+1 being the \000 NULL termination 0 byte at the end of the string
dirent.name = (char*)kmalloc(lengthOfName + 1);
In JS-OS it says null terminating also...

Probably I have a mistake, because it is impossible both code to have same mistake.
What's happening, really?

In http://www.nongnu.org/ext2-doc/ext2.html#IFDIR-NAME-LEN, directory entries are not null terminated also.

My brain triggered an overflow exception, will return back in a while :|
Last edited by Agola on Tue Jan 24, 2017 2:09 am, edited 1 time in total.
Keyboard not found!

Press F1 to run setup.
Press F2 to continue.
klange
Member
Member
Posts: 679
Joined: Wed Mar 30, 2011 12:31 am
Libera.chat IRC: klange
Discord: klange

Re: EXT2 strangeness, again :(

Post by klange »

My ext2 driver doesn't support writing, don't use it as a reference.

e: I checked to be sure, and sizeof(ext2_dir_t) = 8
Last edited by klange on Mon Jan 23, 2017 10:29 pm, edited 1 time in total.
User avatar
crunch
Member
Member
Posts: 81
Joined: Wed Aug 31, 2016 9:53 pm
Libera.chat IRC: crunch
Location: San Diego, CA

Re: EXT2 strangeness, again :(

Post by crunch »

You are correct, they don't need to be null terminated. I don't think it should fsck anything up if you do add null bytes for padding, as long as the name_len and rec_len values are correct.
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: EXT2 strangeness, again :(

Post by gerryg400 »

crunch wrote:You are correct, they don't need to be null terminated. I don't think it should fsck anything up if you do add null bytes for padding, as long as the name_len and rec_len values are correct.
I suspect that's true. It just wastes a few bytes.
If a trainstation is where trains stop, what is a workstation ?
User avatar
iansjack
Member
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: EXT2 strangeness, again :(

Post by iansjack »

This illustrates the dangers of relying too much on the features of your high-level language. Use Pascal instead of C and it would be natural to treat strings as size/array combinations rather than zero-terminated arrays.

I wonder if there is a degree of code copying between the OSs that you mention, leading to a perpetuation of a simple error.
User avatar
Agola
Member
Member
Posts: 155
Joined: Sun Nov 20, 2016 7:26 am
Location: Somewhere

Re: EXT2 strangeness, again :(

Post by Agola »

klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.

e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Then ToaruOS' ext2 is correct, I'm going to remove from list now.
I'm happy to see it is my mistake at least in ToaruOS :|

Cheers
Keyboard not found!

Press F1 to run setup.
Press F2 to continue.
mikegonta
Member
Member
Posts: 229
Joined: Thu May 19, 2011 5:13 am
Contact:

Re: EXT2 strangeness, again :(

Post by mikegonta »

Agola wrote:
klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.
e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Please don't pack already "packed" structs.
Mike Gonta
look and see - many look but few see

https://mikegonta.com
mikegonta
Member
Member
Posts: 229
Joined: Thu May 19, 2011 5:13 am
Contact:

Re: EXT2 strangeness, again :(

Post by mikegonta »

iansjack wrote:This illustrates the dangers of relying too much on the features of your high-level language.
FAT32 and exFAT differ from EXT2 in the way that the on device file name is represented. Unlike EXT2 the file name is stored in contiguous
directory entries, however the file name itself is not a contiguous array. Instead of processing all of each directory set to transform the all
except one (and in some cases not even that one) available file names into a contiguous array, I first transform the to be looked up file name
so that it is exactly the same (in the case of exFAT) and only slightly different (in the case of FAT32) as the on device format. The comparison
is then very simple and based on the same idea as the original DOS (and also FAT32 short file name) 8 3 format comparison. Not only is the
method faster, but the resulting code is shorter and simpler.
In fact, I use a cut down version of the same method to lookup a short long file name from the root directory in exFAT and FAT32 boot sectors
written in 8086 compatible code.
Mike Gonta
look and see - many look but few see

https://mikegonta.com
LtG
Member
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: EXT2 strangeness, again :(

Post by LtG »

mikegonta wrote:
Agola wrote:
klange wrote:My ext2 driver doesn't support writing, don't use it as a reference.
e: I checked to be sure, and sizeof(ext2_dir_t) = 8
I've just checked it too, and that is my bad, sorry.
I didn't see the brackets (zero-length array), in char name[]; in struct ext2_dir_t.

Code: Select all

struct ext2_dir {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t name_len;
	uint8_t file_type;
	char name[];		/* Actually a set of characters, at most 255 bytes */
} __attribute__ ((packed));
Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and you've clearly marked the need for packing, on other platforms the compiler presumably does the right thing to ensure correctness.

I would say always mark a struct packed if it requires exact memory layout (ie. it is used "externally" in some way).. If some compiler does stupid things then fix the _compiler_, don't make your code worse with a "hack".

Can you give an example where there's an actual problem and the problem is not in the compiler (and therefore omitting packing is a hack to work around a stupid compiler)?

PS. I would also consider using something compiler independent (ie. #define's) if possible, and possibly adding alignment (with #define's as well) so that the code would actually work correctly on other platforms as well, but might be useless for most small hobby projects. With the #define's you can specify the compiler specific way of marking a struct as packed and characteristics of platforms in a single file so it's easy to adjust for each platform.
mikegonta
Member
Member
Posts: 229
Joined: Thu May 19, 2011 5:13 am
Contact:

Re: EXT2 strangeness, again :(

Post by mikegonta »

LtG wrote:
mikegonta wrote:Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and
you've clearly marked the need for packing, ...
I should have mentioned that the advice was for GCC x86/x64. One of the nice things in "C" is the assignment of one struct to another.
The only harm is that, GCC for example, will use a complex (and unnecessary) memmove to ensure that most of the copy loop is aligned
for better performance. Of course, you need some sort of directive (#define, etc.) to prevent the compiler from optimizing non-aligned
on device structures, However, packing is unnecessary when the on device struct (for example the exFAT BPB) is already packed (that is,
all struct components are on size alignment) GCC will (for example with -march=i486) simply initialize the 3 required registers and do
a rep movs.
LtG wrote:on other platforms the compiler presumably does the right thing to ensure correctness.
On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
Mike Gonta
look and see - many look but few see

https://mikegonta.com
User avatar
dozniak
Member
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Re: EXT2 strangeness, again :(

Post by dozniak »

mikegonta wrote:On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
That's like the most misguided C advice I've ever heard on this forum.
Learn to read.
mikegonta
Member
Member
Posts: 229
Joined: Thu May 19, 2011 5:13 am
Contact:

Re: EXT2 strangeness, again :(

Post by mikegonta »

dozniak wrote:
mikegonta wrote:On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
That's like the most misguided C advice I've ever heard on this forum.
All advice is misguided when taken out of context. For "C" cross platform portability, structs should not be used, whereas if the code
is only for that platform and endedness is not an issue then of course structs are fine.
Here's something for you to read. And, most importantly, this part.
Mike Gonta
look and see - many look but few see

https://mikegonta.com
LtG
Member
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: EXT2 strangeness, again :(

Post by LtG »

mikegonta wrote:
LtG wrote:
mikegonta wrote:Please don't pack already "packed" structs.
I think that's pretty bad advice. Even based on your article, if this happens to be on x86 or x86_64 then there's no harm done and
you've clearly marked the need for packing, ...
I should have mentioned that the advice was for GCC x86/x64. One of the nice things in "C" is the assignment of one struct to another.
The only harm is that, GCC for example, will use a complex (and unnecessary) memmove to ensure that most of the copy loop is aligned
for better performance. Of course, you need some sort of directive (#define, etc.) to prevent the compiler from optimizing non-aligned
on device structures, However, packing is unnecessary when the on device struct (for example the exFAT BPB) is already packed (that is,
all struct components are on size alignment) GCC will (for example with -march=i486) simply initialize the 3 required registers and do
a rep movs.
LtG wrote:on other platforms the compiler presumably does the right thing to ensure correctness.
On other platforms (ARM, MIPS, etc.) structs should not be used because of endedness and RISC optimizations, use unsigned char arrays instead.
Is it guaranteed that without "packed" the compiler will not add padding? Isn't it implementation specific? Granted most likely no sane current compiler would add padding, but do you want to rely on that?

Even if it were guaranteed, and this is my main point, I'd mark it as packed to explicitly convey my intent. If the compiler is bad and does something stupid then you should fix the compiler.

I guess the question is, why does GCC do a complex and unnecessary "memmove" instead of "rep movs"? Can't you tell the compiler the alignment?

Note, with a proper language (which C adheres to some extent) the compiler is going to assume that it can do what ever it wants with internal stuff and only maintain external stuff in a compliant way. This allows a proper compiler to do optimizations more freely. Here GCC is going to assume the struct is properly aligned, is there a guarantee that it comes that way from some 3rd party (the OS?) always?

As I said, I'd rather do things correctly and fix the issue where it is (GCC), though obviously fixing GCC optimizer might be too difficult in which case I might consider a hack but then I would at least add comments..
User avatar
dozniak
Member
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Re: EXT2 strangeness, again :(

Post by dozniak »

mikegonta wrote: Here's something for you to read. And, most importantly, this part.
In that discussion, James-M is the only guy who really knows something about C compilers - he writes them for a living. So thanks for the read, I'm going to repeat now - "That's like the most misguided C advice I've ever heard on this forum from person who has little to no idea about how C works."
Learn to read.
Boris
Member
Member
Posts: 145
Joined: Sat Nov 07, 2015 3:12 pm

Re: [Solved] EXT2 strangeness, again :(

Post by Boris »

is there a way to tell your C compiler :
I don't care about how this struct looks in memory. You can reverse members order, skip those ever written to or read.
Optimise using Lto and do it according to my arch.
Post Reply