Why still seek and read?

OSwhatever · Post by **OSwhatever** » Mon Nov 18, 2013 1:49 pm

I know there is a risk of being beheaded for questioning anything about POSIX here, but I'm going to take my chances. I wonder why POSIX seek and read still live on to this day. In practice, seek and read can be merged into one read call that also states the position you want to read from. I often find the extra seek call unnecessary, and it also hurts performance by issuing an extra call. I don't know the origin of the seek call, but I guess it dates from when non-volatile storage was much slower, seeking to a certain position took ages, and data could only be read in sequence (tape). Today the situation is completely different: flash drives have minimal seek times, and "seeking" on a flash drive is really a pointless operation.

As we implement new operating systems, why should we implement seek down to the lowest level?

kfreezen · Post by **kfreezen** » Mon Nov 18, 2013 2:13 pm

Seek and write could be merged into one call, but then what? You would have to keep track of the position iterator on your own. Having seek and read/write as two separate calls is quite a nice abstraction in my opinion, because otherwise, for every single call, you would have to read back the amount read (or maybe the new position iterator) into the old position iterator. I don't really know if I like that idea very well.

sortie · Post by **sortie** » Mon Nov 18, 2013 2:57 pm

You should probably do some research before asking this question, but there are good answers to your question:

There already exists such a system call: It's called pread(2). It's basically read(2) with an off_t parameter and it has no effect on the implicit file descriptor position used by read(2). There's naturally also a pwrite system call.
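To make the difference concrete, here is a minimal sketch in C that puts the two-call pattern next to pread. The helper names (`read_at_with_seek`, `read_at`, `read_at_demo`) are made up for illustration; only `lseek`, `read` and `pread` are real POSIX calls, and the scratch file comes from `tmpfile`.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX.1-2008 declarations (pread) */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Two-call version: move the descriptor's offset, then read from it.
   Side effect: the file offset ends up at off + (bytes read). */
ssize_t read_at_with_seek(int fd, void *buf, size_t len, off_t off)
{
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);
}

/* One-call version: pread takes the offset directly and leaves the
   descriptor's offset untouched. */
ssize_t read_at(int fd, void *buf, size_t len, off_t off)
{
    return pread(fd, buf, len, off);
}

/* Illustrative check on a scratch file; returns 0 when both helpers
   fetch the same bytes and only the lseek variant moved the offset. */
int read_at_demo(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    char a[3], b[3];
    int ok = write(fd, "0123456789", 10) == 10
          && lseek(fd, 0, SEEK_SET) == 0
          && read_at(fd, a, 3, 4) == 3            /* bytes "456" */
          && lseek(fd, 0, SEEK_CUR) == 0          /* offset untouched */
          && read_at_with_seek(fd, b, 3, 4) == 3  /* bytes "456" again */
          && lseek(fd, 0, SEEK_CUR) == 7          /* offset moved to 4 + 3 */
          && memcmp(a, "456", 3) == 0
          && memcmp(b, "456", 3) == 0;
    fclose(tmp);
    return ok ? 0 : -1;
}
```

Both helpers return the same data; the visible difference is one system call instead of two, and whether the descriptor's offset changes as a side effect.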

The important thing to realize about read(2) is that it works on a lot of different devices. You are probably just thinking of files that have an offset and are seekable, but there are also streams (pipes, character devices, sockets, ...) that are not seekable. The read system call is meant for streams, while the pread system call is meant for files. When you use read on a file, then it emulates a stream by maintaining a file offset in the file descriptor (which the kernel then uses to rewrite the request to a pread call on the kernel device). The lseek system call is meant to control the emulated stream, but it can also be used to learn things about the file (such as its size, but with Linux extensions, also to navigate sparse files).
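As a rough illustration of the file/stream split, the sketch below asks lseek whether a descriptor is seekable and uses it to learn a file's size. The function names are hypothetical. It assumes the usual behaviour that lseek on a pipe, FIFO or socket fails with ESPIPE; some character devices behave differently, so treat the result as a hint rather than a guarantee.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd has a file offset that lseek can query, 0 if not
   (pipes, FIFOs and sockets fail with ESPIPE), -1 on any other error. */
int is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}

/* Learns the size of a seekable file by seeking to its end, then puts
   the offset back where it was. Returns -1 on error. */
off_t size_via_lseek(int fd)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Illustrative check; returns 0 if a pipe reports as non-seekable and a
   scratch file reports as seekable with the expected size. */
int seekable_demo(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;
    int pipe_seekable = is_seekable(p[0]);
    close(p[0]);
    close(p[1]);

    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    int ok = write(fd, "hello", 5) == 5
          && is_seekable(fd) == 1
          && size_via_lseek(fd) == 5
          && lseek(fd, 0, SEEK_CUR) == 5;   /* offset was restored */
    fclose(tmp);
    return (pipe_seekable == 0 && ok) ? 0 : -1;
}
```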

Indeed, it's a feature that most programs use read (or fgetc/fread) instead of pread. Unless they really need seekable files, it allows these programs to use streams (pipes in the most common cases) as input data rather than just files. Streams can be thought of as a subset or inferior version of files, and if you don't strictly need the file semantics, then just use the stream operations.

You might also want to check out the readv(2) and preadv(2) system calls.

Note how the read and write operations are only required to do at least one byte of IO. For instance, if the read system call was required to read as much data as requested, then it wouldn't really work well with the stream principle of "read as much data as currently possible, but at least one byte". read(2) is a good low-level abstraction, but in many cases you want the easier fread that doesn't need reading in a loop for more than a single byte.
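The loop that fread saves you from looks roughly like this. `read_full` is a made-up helper name, and the sketch assumes a blocking descriptor (on a non-blocking one, EAGAIN would be reported as an error here).

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps calling read until len bytes have arrived, end of input is
   reached, or a real error occurs. Returns the number of bytes stored
   in buf (less than len only at end of input), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n > 0)
            done += (size_t)n;   /* short read: keep going */
        else if (n == 0)
            break;               /* end of file, or the writer closed */
        else if (errno != EINTR)
            return -1;           /* EINTR means interrupted: just retry */
    }
    return (ssize_t)done;
}
```

Because it never seeks, the same helper works on a pipe or socket as well as on a regular file.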

Combuster · Post by **Combuster** » Mon Nov 18, 2013 3:03 pm

The major problem is that in practical use, many things can be read and/or written while not being able to be seeked into without such logic - even when tapes are gone you have things such as standard I/O, pipes, sockets and special devices. Therefore logic dictates that the two operations should (still) be separate.

And it's not POSIX per se that's responsible here. The C standard dictates the exact same practice, for pretty much the same reasons.

EDIT: I see Sortie is ninjaediting for the bonus points on this one

Brendan · Post by **Brendan** » Mon Nov 18, 2013 5:26 pm

Hi,

OSwhatever wrote:As we implement new operating systems, why should we implement seek down to the lowest level?

For "block devices" I've always combined seek and read/write; and made them byte addressable rather than block addressable (e.g. so you can ask to read 123 bytes at offset 4567 if you really want to). I mostly do this because it reduces the number of messages and task switches in a micro-kernel (where VFS is in a separate process). I'd also say there's no sane reason why you can't do this and still provide a POSIX/C library that behaves the same, simply by having the library track the current position in the file.
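A minimal sketch of that library-side approach, in C. Everything named `ufile_*` is hypothetical, and `pread` merely stands in for whatever combined "read N bytes at offset X" request the kernel or VFS process would accept; this is not Brendan's actual interface.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical user-space handle: the position lives in the library,
   not in the kernel or the file server. */
struct ufile {
    int   fd;
    off_t pos;
};

/* Stream-style read built on a positional read. */
ssize_t ufile_read(struct ufile *f, void *buf, size_t len)
{
    ssize_t n = pread(f->fd, buf, len, f->pos);
    if (n > 0)
        f->pos += n;
    return n;
}

/* Seek is pure bookkeeping: no message to the kernel at all.
   SEEK_END is left out because it would need a size query. */
off_t ufile_seek(struct ufile *f, off_t off, int whence)
{
    off_t base;
    switch (whence) {
    case SEEK_SET: base = 0;      break;
    case SEEK_CUR: base = f->pos; break;
    default:       errno = EINVAL; return -1;
    }
    if (base + off < 0) {
        errno = EINVAL;
        return -1;
    }
    f->pos = base + off;
    return f->pos;
}

/* Illustrative check on a scratch file; returns 0 on success. */
int ufile_demo(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    struct ufile f = { fileno(tmp), 0 };
    char buf[4];
    int ok = pwrite(f.fd, "abcdefgh", 8, 0) == 8
          && ufile_read(&f, buf, 4) == 4 && memcmp(buf, "abcd", 4) == 0
          && ufile_seek(&f, 2, SEEK_CUR) == 6
          && ufile_read(&f, buf, 4) == 2 && memcmp(buf, "gh", 2) == 0
          && lseek(f.fd, 0, SEEK_CUR) == 0;  /* kernel-side offset never moved */
    fclose(tmp);
    return ok ? 0 : -1;
}
```

One caveat with this design: in POSIX the file offset is shared between descriptors created by dup and across fork, so a position kept purely in one process's library would need extra machinery to reproduce that particular behaviour.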

For "character devices", these are streams and not files. Most file operations (seek, append, truncate, copy, etc) don't make any sense for streams. For this reason, I think it was a mistake for C/POSIX to attempt to treat them the same as files.

Cheers,

Brendan

bluemoon · Post by **bluemoon** » Tue Nov 19, 2013 4:46 am

It depends on the application's usage pattern. read/fread/pread/mmap each have their own pros and cons; you may optimize for a certain scenario, but there is no best API, and they do not conflict.

Jezze · Post by **Jezze** » Tue Nov 19, 2013 1:59 pm

I chose to have the offset as part of my read function because that means I don't need to keep state in the kernel that tracks the current position for each file. Fewer states in general means it is easier to debug and validate for correctness. For streams this offset is simply ignored but you could also make it mean discard this many bytes. This is up to the underlying filesystem.

Combuster · Post by **Combuster** » Wed Nov 20, 2013 12:08 am

Jezze wrote:For streams this offset is simply ignored but you could also make it mean discard this many bytes. This is up to the underlying filesystem.

Which means that reading the first byte and second byte would actually read the first and third byte on anything that's not a file... Is that really what you want?

linguofreak · Post by **linguofreak** » Wed Nov 20, 2013 1:49 am

Brendan wrote:For "character devices", these are streams and not files. Most file operations (seek, append, truncate, copy, etc) don't make any sense for streams. For this reason, I think it was a mistake for C/POSIX to attempt to treat them the same as files.

Look at it this way: under C/POSIX a stream isn't just treated as a file, but rather "file" really means "stream", and many (indeed most) files are members of a derived class* that implements extra operations. This isn't how the C and POSIX standards describe it or how things are usually talked about, but it's how a system that implements the standards actually behaves.

*C isn't object oriented, so it's a bit of an abuse of terminology to be talking about classes here, but we can look at things in object oriented terms even if they aren't implemented in an object oriented language.

mrstobbe · Post by **mrstobbe** » Wed Nov 20, 2013 2:53 am

I'd like to add here that the concept of "seek and read" has serious ramifications on I/O buffering. For example...

1. Starting at "0" and doing a read has allowed an OS to pre-read buffers as needed (the vast majority of cases so why not?).
2. Starting at "0" but then seeking potentially wastes some I/O cycles (see point 1), but allows the OS to know where the program is about to read/write.
3. Reading at one point but then seeking to another and reading again... is well, you know... bad. mkay. But not super bad because of point 2.

Until disk I/O beats memory I/O, I don't think any of the above will become obsolete. Seeking is a good behavior. Fuzzy pre-cognition of the user's intention... It's really as simple as that. It's a good thing.

Kevin · Post by **Kevin** » Fri Nov 22, 2013 5:25 am

The OS doesn't really know if the program is going to read from the current offset, if it's going to write there, or if the next thing is an lseek. It can guess that a read will come and optimise for that case. But the same way, it can guess that after a pread(fd, buf, n, off), the next thing the program will do is a read from off + n. It's no more and no less valid than making the assumption with separate read/lseek.

sortie · Post by **sortie** » Fri Nov 22, 2013 6:09 am

Programs can use functions like readahead(2) to tell the kernel what file data they will be using shortly. I'm sure there are also other APIs to tell the kernel the access pattern.
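readahead(2) is Linux-specific; one portable interface for this kind of hint is posix_fadvise. The wrappers below are a sketch with made-up names (`hint_*`, `fadvise_demo`). The advice is only a hint, the kernel is free to ignore it, and the call reports failure by returning an error number rather than by setting errno.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Whole file (offset 0, length 0 means "to the end") will be read
   front to back, so aggressive read-ahead is likely to pay off. */
int hint_sequential(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}

/* Accesses will jump around, so read-ahead is likely wasted effort. */
int hint_random(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
}

/* The given byte range will be needed soon. */
int hint_will_need(int fd, off_t off, off_t len)
{
    return posix_fadvise(fd, off, len, POSIX_FADV_WILLNEED);
}

/* Illustrative check on a scratch file; returns 0 if every hint was
   accepted. */
int fadvise_demo(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    int fd = fileno(tmp);
    int ok = write(fd, "0123456789", 10) == 10
          && hint_sequential(fd) == 0
          && hint_will_need(fd, 0, 10) == 0
          && hint_random(fd) == 0;
    fclose(tmp);
    return ok ? 0 : -1;
}
```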

Kevin · Post by **Kevin** » Fri Nov 22, 2013 6:22 am

True, but I've yet to see anyone use them.

Anyway, my point was that all of these optimisations are not really coupled to the question of read/pread, they work with both.

mrstobbe · Post by **mrstobbe** » Sun Nov 24, 2013 11:48 pm

My point is that OSes can do that (and quite frankly should if they aren't already). Seeking someplace is an immediate indicator that, as a program, you intend to do something there. If the file's been opened as read only, you know exactly what they plan to do there. Taking advantage of any information provided is clearly a good thing.

EDIT: syntax (comma) to clarify.

Combuster · Post by **Combuster** » Mon Nov 25, 2013 7:12 am

Kevin wrote:Anyway, my point was that all of these optimisations are not really coupled to the question of read/pread, they work with both.

Unless you consider why you would want to use pread - it's when the offset for individual reads is likely to be non-consecutive, and thus you could predict that readahead is probably less effective on that file, and spend some disk transfer cycles elsewhere.
