File Descriptors vs File Pointers

The following issue troubled me a bit yesterday, but I hope I've got it figured out:
A file pointer is something that points to an inode in the inode list. It usually keeps information like: the mode in which the file is opened, how many processes use the file, the inode for the file, and information about locks. I am not sure whether the current position in the file belongs here too - I'd rather put that information in the file descriptor.
A file descriptor is something that belongs to a process and points to a file pointer. In my opinion it should keep the following information: the descriptor ID, a pointer to the file pointer, and the current position in the file. More than one file descriptor can point to the same file pointer.
Is this correct?
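In C, I picture it roughly like this (just a sketch - all struct and field names are made up):

```c
#include <sys/types.h>   /* off_t */

/* One per open file on the system, shared by all descriptors that
   refer to it (a "file pointer" in the sense used above). */
struct file_pointer {
    struct inode *inode;      /* the inode in the inode list           */
    int           mode;       /* mode the file was opened in           */
    int           ref_count;  /* how many descriptors/processes use it */
    struct lock  *locks;      /* lock information                      */
};

/* One per descriptor, owned by a single process; several of these
   may point to the same file_pointer. */
struct file_descriptor {
    int                  fd;        /* descriptor ID                   */
    struct file_pointer *fp;        /* the shared file pointer         */
    off_t                position;  /* current position in the file    */
};
```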
The other thing that puzzles my brain: how does the file actually get transferred to the process in the end? Should the pager do some of the work? (For processes it WILL have to do the work - putting the process image in its place.) Or will I need to write a copy_block() function?
Thanks in advance for your feedback.
Re: File Descriptors vs File Pointers
Yep, that's correct, though I would have both the position and the mode in the file descriptor. Each opened copy of a file (file descriptor) needs its own position so that different processes can do I/O at different places at the same time. Also, if one process has a file opened read/write and another has it opened read-only, the read-only one needs to be denied write access.
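A minimal sketch of that check, assuming the descriptor carries its own mode and position (do_write() here is a made-up helper, not a real API):

```c
#include <errno.h>
#include <fcntl.h>       /* O_ACCMODE, O_WRONLY, O_RDWR */
#include <sys/types.h>

/* Hypothetical back-end that does the actual writing. */
extern ssize_t do_write(struct file_pointer *fp, off_t pos,
                        const void *buf, size_t len);

ssize_t sys_write(struct file_descriptor *fdesc, const void *buf, size_t len)
{
    int acc = fdesc->mode & O_ACCMODE;

    /* A read-only descriptor is denied here, even if another process
       has the same file open read/write via its own descriptor. */
    if (acc != O_WRONLY && acc != O_RDWR)
        return -EBADF;

    ssize_t n = do_write(fdesc->fp, fdesc->position, buf, len);
    if (n > 0)
        fdesc->position += n;  /* each descriptor advances its own offset */
    return n;
}
```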
Your second question opens a few possibilities. An obvious way would be to read straight from disk into user memory on calls to read(). But what about memory mapped files? OK, provide read() for apps, and put in a call to read() in the page fault handler. But what about the cache? OK, have read() look at the cache. But then we get two copies of the same data for memory-mapped files (the cache and the MMF). OK, map MMF pages directly to the cache. But then the cache and the file on disk get out of sync (imagine an app writing directly to the MMF in memory then attempting to read() the file)...
So implement read() by mapping a region of the file temporarily, then doing memcpy() from the temporary MMF to the user buffer. If that region of the file is not cached, read() will invoke a page fault internally. The page fault handler does the actual reading.
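In code, the result might look roughly like this (only a sketch - map_file_region() and unmap_region() are made-up helpers, and EOF clamping and error handling are omitted):

```c
#include <errno.h>
#include <string.h>      /* memcpy */
#include <sys/types.h>

/* Hypothetical helpers that manage a temporary file mapping. */
extern void *map_file_region(struct inode *ip, off_t pos, size_t len);
extern void  unmap_region(void *addr, size_t len);

ssize_t vfs_read(struct file_descriptor *fdesc, void *buf, size_t len)
{
    /* Temporarily map the requested region of the file. */
    void *window = map_file_region(fdesc->fp->inode, fdesc->position, len);
    if (window == NULL)
        return -EIO;

    /* If the region is not cached, this memcpy() page-faults and the
       page fault handler does the actual reading from disk. */
    memcpy(buf, window, len);

    unmap_region(window, len);
    fdesc->position += len;
    return (ssize_t)len;
}
```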
Re: File Descriptors vs File Pointers
That's a good point: keeping the information about the mode of a file in the file descriptor.
As for application loading: would it then be necessary to keep the application data in the block cache? The pager does the grunt work, so the data goes straight into process memory - why bother keeping the blocks in the block cache as well? It would waste some memory...
Say we start a process (a rough code sketch of the fault path follows this walkthrough):
The shell sends a message to the memory manager: give me a process "x". The memory manager creates the address space structure for process "x" and tells the file system service: hey, I want to load process "x" - tell me the blocks where it is located. The file system sets off and fetches the inode (and answers NO if no valid directory entry is found). It builds a list of blocks (actually a dynamically allocated array of ints) and sends it to the memory manager via a pipe (which copies a stream of data from one process to the other).
The memory manager picks up the data, attaches it to the process struct it keeps, and tells the pager: give me a page directory (empty except for the kernel pages). Then it informs the kernel: new process - user stack, code, heap, priority. The kernel creates the process with the entry address 0xdeadbeef, so we know it is a newborn one. The process is put onto the ready queue for its priority.
The moment the process is dispatched to the CPU and starts execution, it stumbles over 0xdeadbeef - and lands in page fault land, where the pager reigns.
The pager asks the memory manager what the ruddy hell is happening here and gets the answer: this is a new process; here is the device and here are some blocks to fetch. It marks those blocks as 'loaded'. The pager inserts standard page tables into the process page directory and then talks to the device in question: load blocks 30, 31, 32, 33 at, say, 0x1000. To do this, the pager has its own fetch_blocks() function.
Now, the process can start work.
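Roughly, the pager side could look like this (just a sketch - the structs and helpers are made up, standing in for the messages described above):

```c
#include <stdint.h>

#define NEWBORN_EIP 0xdeadbeefu   /* entry address marking a newborn process */

struct process { void *pagedir; /* ... */ };

struct fault_info {               /* made-up message layout            */
    int  is_new_process;
    int  device;
    int *blocks;                  /* dynamically allocated block list  */
    int  nblocks;
};

/* Hypothetical helpers standing in for the IPC and paging calls. */
extern void ask_memory_manager(struct process *p, uint32_t addr,
                               struct fault_info *out);
extern void insert_standard_page_tables(void *pagedir);
extern void fetch_blocks(int dev, int *blocks, int nblocks, void *dest);
extern void mark_blocks_loaded(struct fault_info *info);

/* Pager-side fault handler for the walkthrough above. */
void page_fault(struct process *proc, uint32_t fault_addr)
{
    struct fault_info info;

    /* Ask the memory manager what the ruddy hell is happening here. */
    ask_memory_manager(proc, fault_addr, &info);

    if (info.is_new_process) {
        /* Standard page tables first, then pull the image off disk:
           e.g. blocks 30,31,32,33 loaded at 0x1000. */
        insert_standard_page_tables(proc->pagedir);
        fetch_blocks(info.device, info.blocks, info.nblocks,
                     (void *)0x1000);
        mark_blocks_loaded(&info);
        return;
    }

    /* ...ordinary faults (swap-in, stack growth, ...) handled here... */
}
```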
This is only a veeery rough sketch of how it could be done. Have the pager do all the grunt work no one else wants to do. *ggg*.
Thanks for the feedback, Tim!
Re: File Descriptors vs File Pointers
Hmmm... another thing comes to mind: would it make sense to have the pager also keep an LRU list of blocks (with hash chains)? It would be responsible for the data blocks.
The file system handles inodes, superblocks, directories, block and inode allocation, and so forth...
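The classic shape for such a cache would be hash chains for lookup plus a doubly linked LRU list for eviction - something like this sketch (all names made up; lru_touch() would relink a block at the front of the LRU list):

```c
#include <stddef.h>

#define BLOCK_SIZE   4096
#define HASH_BUCKETS 64

/* A cached data block, linked into both a hash chain (lookup by
   device/block number) and a global LRU list (eviction order). */
struct cache_block {
    int  dev;
    int  blockno;
    char data[BLOCK_SIZE];
    struct cache_block *hash_next;            /* chain in one bucket */
    struct cache_block *lru_prev, *lru_next;  /* LRU neighbours      */
};

static struct cache_block *hash_table[HASH_BUCKETS];
static struct cache_block *lru_head;   /* most recently used */
static struct cache_block *lru_tail;   /* eviction candidate */

/* Made-up helper: unlink b and relink it at lru_head. */
extern void lru_touch(struct cache_block *b);

static unsigned hash(int dev, int blockno)
{
    return ((unsigned)dev ^ (unsigned)blockno) % HASH_BUCKETS;
}

/* Look a block up; on a hit, move it to the front of the LRU list. */
struct cache_block *cache_lookup(int dev, int blockno)
{
    struct cache_block *b;

    for (b = hash_table[hash(dev, blockno)]; b != NULL; b = b->hash_next)
        if (b->dev == dev && b->blockno == blockno) {
            lru_touch(b);
            return b;
        }
    return NULL;  /* miss: caller fetches from disk, evicting from
                     lru_tail if the cache is full */
}
```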