Re: Most reused/versatile code

Posted: Tue Feb 24, 2015 4:33 pm
by iansjack
alexfru wrote: fseek() to positions 0, 1, 2, 4, 8, etc, fgetc(), check for errors. You get the top bit of the size. Similarly you get the rest.
What happens if you have write-only access to the file?

And how many system calls is that going to take on a multi-gigabyte file? All in the name of some illusory "portability" when any operating system will provide a single API call to determine the size of a file.

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 1:52 am
by Combuster
Because portable is exactly the opposite of doing the same thing a hundred times differently.
iansjack wrote: write-only access to the file
That's a troll, right?

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 2:05 am
by alexfru
iansjack wrote:
alexfru wrote: fseek() to positions 0, 1, 2, 4, 8, etc, fgetc(), check for errors. You get the top bit of the size. Similarly you get the rest.
What happens if you have write-only access to the file?
If you can't read the file, what can you use its size for? OTOH, if you're only writing to a file, either you don't care about its size (e.g. it's a log that only grows) or you know exactly how much you're supposed to write, since standard C has no functions to truncate files at a specified size. I think other cases (e.g. calculating the size of files in a directory, taking proper care of links and other peculiarities) are rare and/or need special handling anyway (e.g. POSIX functions, path manipulation / hierarchy navigation, etc). But basic file sizing for basic purposes (i.e. finding out a file's size in order to read or copy it) can be implemented nearly portably.
iansjack wrote:And how many system calls is that going to take on a multi-gigabyte file? All in the name of some illusory "portability" when any operating system will provide a single API call to determine the size of a file.
For a 4G-1B file you'd need something like:
31 fseek() calls for the most significant bit of size
1 fseek() call for bit 30
1 fseek() call for bit 29
...
1 fseek() call for bit 1
1 fseek() call for the least significant bit
Double that because of fgetc().

Less than 100 calls. Not too bad, IMO.
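
For readers who want to see the probing idea spelled out, here is a minimal sketch of it, not alexfru's actual code. The names byte_exists and probe_file_size are made up for illustration. It assumes a stream opened for reading in binary mode, sizes that fit in a long, and it treats a read error the same as end-of-file. It also trashes the current file position.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if a byte can be read at offset pos, 0 otherwise.
   A read error is indistinguishable from end-of-file here. */
static int byte_exists(FILE *f, long pos)
{
    if (fseek(f, pos, SEEK_SET) != 0)
        return 0;
    return fgetc(f) != EOF;
}

/* Hypothetical helper: finds the size of a readable binary stream
   using only fseek() and fgetc(). Leaves the file position changed. */
long probe_file_size(FILE *f)
{
    long last = 0;  /* highest offset known to hold a byte */
    long bit;

    clearerr(f);
    if (!byte_exists(f, 0))
        return 0;   /* empty (or unreadable) */

    /* Phase 1: probe offsets 1, 2, 4, 8, ... to find the top bit. */
    for (bit = 1; byte_exists(f, bit); bit *= 2) {
        last = bit;
        if (bit > LONG_MAX / 2)
            break;  /* doubling again would overflow */
    }

    /* Phase 2: settle each lower bit with one probe apiece. */
    for (bit = last / 2; bit > 0; bit /= 2) {
        if (byte_exists(f, last | bit))
            last |= bit;
    }

    /* The size is one more than the last readable offset. */
    return last == LONG_MAX ? LONG_MAX : last + 1;
}
```

Each probe costs one fseek() and one fgetc(), so the number of calls grows with the number of bits in the size rather than with the size itself.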

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 2:07 am
by alexfru
Combuster wrote:Because portable is exactly the opposite of doing the same thing a hundred times differently.
The discouraged use of fseek(f, 0, SEEK_END) is one of those things that people do thousands of times in pretty much exactly the same way and situation. :)

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 2:46 am
by iansjack
alexfru wrote:Less than 100 calls. Not too bad, IMO.
100 system calls (which are a relatively expensive procedure) to do something that could be done with a single call. Your definition of "not bad" is different to mine. It seems to me that you are using a sledgehammer to crack a nut. We have information that is stored in the meta-description (the directory entry) of the file and rather than using this simple information you want to walk the file counting the bytes (albeit using a binary search rather than a linear one). Crazy (IMO)!

And what about that case of a write-only file?

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 2:59 am
by alexfru
iansjack wrote:
alexfru wrote:Less than 100 calls. Not too bad, IMO.
100 system calls (which are a relatively expensive procedure) to do something that could be done with a single call. Your definition of "not bad" is different to mine. It seems to me that you are using a sledgehammer to crack a nut. We have information that is stored in the meta-description (the directory entry) of the file and rather than using this simple information you want to walk the file counting the bytes (albeit using a binary search rather than a linear one). Crazy (IMO)!
That's the price you pay for portability. You could argue that a bunch of #ifdef's also counts as a portable solution. :)
iansjack wrote:And what about that case of a write-only file?
I wrote about that. Did you not read that part or did you find it unsatisfactory?

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 4:47 am
by iansjack
alexfru wrote:I wrote about that. Did you not read that part or did you find it unsatisfactory?
Ah, sorry - I missed that part of your reply. I've now read it and, yes, I do find it unsatisfactory. I think that saying "I can't see why you would want to do that" is a poor excuse. A simple example would be a program that monitored the size of files and then took some action (such as logging an alert) if a file grew above a certain size. These are confidential files that no-one but the owner should be able to access.

As for #ifdefs - yes, that's a better way of achieving portability than making a huge number of unnecessary system calls (IMO), especially as the probing approach doesn't work in all the circumstances that I can envisage.
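
As a rough illustration of the #ifdef route, the sketch below asks the operating system for the size by path. The name path_file_size is hypothetical. The POSIX branch uses stat(), which needs search permission on the directories in the path but not read permission on the file itself, so it also covers the monitoring case described above. The Windows branch assumes the Microsoft C runtime's _stat64().

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size in bytes of the file at path,
   or -1 if the size cannot be determined. */
long long path_file_size(const char *path)
{
#if defined(_WIN32)
    /* Assumption: Microsoft C runtime, 64-bit size variant. */
    struct __stat64 st;
    if (_stat64(path, &st) != 0)
        return -1;
#else
    /* POSIX: a single call that reads the file's metadata. */
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
#endif
    return (long long)st.st_size;
}
```

For something that is not a regular file (a pipe or a terminal, say) the reported size may not be meaningful, so a caller that cares would also need to check st_mode.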

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 4:51 am
by iansjack
Combuster wrote:Because portable is exactly the opposite of doing the same thing a hundred times differently.
iansjack wrote: write-only access to the file
That's a troll, right?
What??? You seem very keen to shout "troll" when something is beyond your ken. A sure sign of a troll (IMO) :). Is the concept of a write-only file really so foreign to you? I can see that you have never worked in an environment where security is paramount. Or somewhere where an incorruptible audit trail needs to be maintained.

Don't conclude that the concept of a write-only file makes no sense just because of your lack of imagination.

Re: Most reused/versatile code

Posted: Wed Feb 25, 2015 10:34 am
by sortie
Bender wrote:This:

Code:

#include <stdio.h>

/** NOTE: This function trashes the current position in file **/
/** Does no error checking either **/
long int GetFileSize(FILE* FilePointer) {
    fseek(FilePointer, 0L, SEEK_END);
    long int ReturnVal = ftell(FilePointer);
    fseek(FilePointer, 0L, SEEK_SET);
    return ReturnVal;
}
(Everywhere?)
Obvious improvements you should adopt in this code:
  • No error handling, you should do that.
  • You could ftell ahead of time and restore it afterwards.
  • Use off_t and ftello and fseeko for large file support. It's POSIX and not ISO C, but any platforms that don't supply it are braindead, as evidenced by how large file support works on Windows.
  • Use flockfile and funlockfile for thread safety, so the operation is atomic. This is also less portable, I forgot if they made it into C11, but any sane platform will have them. (At this point, you should just stop supporting native Windows C altogether, it's too horrible, check out cygwin or soon midipix if you care).
  • You might as well use fileno and fstat on real files as a much more efficient alternative. Not all FILE objects are backed by file descriptors, so there's still some value in this approach.
  • Really consider whether you really need to know the file size, or whether you can just consider it a stream and process a char, element or line at a time. If it's because you want to read the whole file into memory, there might still be race conditions where the file grows while you read it, so a realloc pattern that doubles the buffer each time is better, until EOF is hit (but such a realloc pattern can be optimized by having the first buffer be the size of the file, so it's almost always right on the first try).
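
The first two suggestions in the list can be sketched in plain ISO C roughly as follows. The name file_size_restoring is made up for illustration. It still uses long, so it is subject to the large-file limit mentioned above, and it assumes a binary stream on a platform where seeking to SEEK_END is meaningful.

```c
#include <stdio.h>

/* Hypothetical helper: returns the stream's size in bytes, or -1L on
   failure, and puts the file position back where the caller left it. */
long file_size_restoring(FILE *fp)
{
    long saved, size;

    saved = ftell(fp);                  /* remember the caller's position */
    if (saved == -1L)
        return -1L;

    if (fseek(fp, 0L, SEEK_END) != 0)   /* jump to the end */
        return -1L;

    size = ftell(fp);                   /* -1L here also means failure */

    if (fseek(fp, saved, SEEK_SET) != 0) /* restore the position */
        return -1L;

    return size;
}
```

This is not atomic with respect to other threads using the same FILE; that is what the flockfile suggestion addresses.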
Sorry. Someone passed me -Wpedantic and -Wsuperior-interfaces and -Wstyle-suggestions -Wimperial-opinions. I gotta teach better C coding.
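
The read-until-EOF pattern from the last list item might look something like this sketch. The name read_all and its parameters are hypothetical. The size_hint argument is where a previously obtained file size would go (0 if unknown); one extra byte is requested so that a correct hint ends in a short read on the first try.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the rest of fp into a malloc'd buffer.
   Returns NULL on failure. The result is NOT NUL-terminated; its
   length is stored in *out_len. The caller frees the buffer. */
char *read_all(FILE *fp, size_t size_hint, size_t *out_len)
{
    size_t cap = size_hint ? size_hint + 1 : 4096;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap)
            break;                      /* short read: EOF or error */

        /* Buffer is full, so the file may have grown: double it. */
        if (cap > (size_t)-1 / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

Because the loop keeps going until fread() comes up short, a file that grows between sizing and reading is still read completely; the hint only saves reallocations.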