Hi,
I am looking at the design of my OS with respect to the device manager and C standard library interface at the moment.
Until now, my kernel has had fairly decent printf support via its 'emergency' VGA text-mode driver, but I am now looking at how to provide this support for user-space programs in my standard library. For this reason, I have started looking at the FILE type and the stdin, stdout and stderr streams.
Does anyone know of any decent examples of how to achieve this kind of interfacing and support at the kernel level? Preferably, I would like to look at as many examples as possible, then implement from scratch based on what I have learned - I'm not really interested in just copying what someone else has done!
Thanks in advance for any pointers...
Adam
Adding FILE Support
Look at the PDCLib sources. Not the v0.4.1 release, but the latest stuff in the repository. More specifically:

- internals/_PDCLIB_int.h, line 273, defining struct _PDCLIB_file_t (which is aliased to FILE in stdio.h);
- functions/_PDCLIB/print.c, which contains the "juice" of the printf() functionality - _PDCLIB_print(), which handles the conversion specifiers of any printf() call;
- functions/stdio/vfprintf.c, which does the wrap-up.
_PDCLIB_print() and the various printf() functions handle user-space buffers, which - when full - are fflush()ed, which in turn uses the write() system call with the file handle stored in the FILE structure. That means what the kernel gets to see is an (int) file descriptor and a (char *) string to be written to the stream so identified. Porting this to kernel space is trivial.
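To make the shape of that concrete, the user-space end boils down to something like the following (a minimal sketch only - this is not PDCLIB's actual layout, and the struct and function names are made up):

```c
#include <unistd.h>   /* write() - or your own syscall wrapper */

#define MYBUFSIZ 512

struct my_file {      /* what FILE aliases, reduced to the bare bones */
    int fd;           /* the handle the kernel knows about            */
    char buf[MYBUFSIZ];
    size_t pos;       /* how much of buf is filled                    */
};

/* flush: hand the buffered bytes to the kernel in a single write() */
int my_fflush(struct my_file *stream) {
    if (write(stream->fd, stream->buf, stream->pos) < 0) return -1;
    stream->pos = 0;
    return 0;
}

/* buffered output: only touch the kernel when the buffer fills up */
int my_fputc(int c, struct my_file *stream) {
    if (stream->pos == MYBUFSIZ && my_fflush(stream) != 0) return -1;
    stream->buf[stream->pos++] = (char)c;
    return (unsigned char)c;
}
```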
Note that the code, while working well, has to be considered "broken" because it only supports printing at the moment. Reading, file positioning, text/binary conversions, multibyte support etc. are still missing.
If you have any questions about the above, just ask - I'm happy to help.
Another source on how this can be done is Paul Edwards' PDPCLIB / stdlib.c.
Every good solution is obvious once you've found it.
Thanks for this. I will have a look through the sources and see what I can see.
The format conversion part of my current printf (the emergency kernel-mode console, which prints directly to video memory) is pretty good, I think, and I have put a few format converters into my stdlib. The only tricky thing on that score is floats (as I think we discussed in a previous topic - you seemed fairly horrified that I write my stdlib functions as I need them!). The main problem from my point of view is the conceptual FILE pointer one.
Unfortunately it looks like a busy few days ahead, so it's now a case of when I get around to it - I'm sure you know how it is - but I'll get back if I have any problems with the implementation.
Cheers again for the links,
Adam
OK - thanks to the references you gave me I now have basic FILE support. I'm just ironing out a few problems with the fact that stdin, stdout and stderr (which points to stdout at the moment) are ring buffers, whereas all of my file handling functions will, of course, be required to handle linear buffers in the future.
At the moment, my processes get assigned stdin, stdout and stderr pointers, if required, on process startup. If a keyboard event happens within a certain process's time-slice, the basic (kernel-mode) keyboard driver writes a character to the stdin for that process.
In contrast, if a process writes to its individual stdout, my basic (TUI) video driver gets scheduled and reads stdout, placing characters on screen as appropriate.
I still have quite a bit of work to do with regard to separate process spaces and IPC - for example, it could get a bit complicated at the moment if there are two text-mode processes expecting console input! I'm going to sort out some kind of registration process, but does the above generally seem reasonable?
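For reference, each stream's ring buffer currently works roughly like this (a simplified sketch with made-up names, not the exact code):

```c
#include <stddef.h>   /* size_t */

#define RING_SIZE 256

struct ring_buffer {
    char data[RING_SIZE];
    size_t head;      /* next slot to write (e.g. the keyboard driver) */
    size_t tail;      /* next slot to read  (e.g. the owning process)  */
};

/* producer side: returns -1 if full, so unread data is never clobbered */
int ring_put(struct ring_buffer *rb, char c) {
    size_t next = (rb->head + 1) % RING_SIZE;
    if (next == rb->tail) return -1;          /* full */
    rb->data[rb->head] = c;
    rb->head = next;
    return 0;
}

/* consumer side: returns -1 if empty, otherwise the next character */
int ring_get(struct ring_buffer *rb) {
    if (rb->tail == rb->head) return -1;      /* empty */
    int c = (unsigned char)rb->data[rb->tail];
    rb->tail = (rb->tail + 1) % RING_SIZE;
    return c;
}
```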
Cheers,
Adam
I am not really sure what you mean by "ring buffer", but you are aware that stdin, stdout and stderr are in no way different from a "normal" stream, aside from the fact that they are already open when main() starts? Look at the freopen() function - you can re-assign stdout to a file stream, or you can connect stdin to a socket if your OS supports sockets-as-files.
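For example, redirecting stdout to a file is plain standard C:

```c
#include <stdio.h>

int main(void) {
    /* from here on, stdout writes land in log.txt instead of the console */
    if (freopen("log.txt", "w", stdout) == NULL) return 1;
    printf("this goes to the file, not the screen\n");
    return 0;
}
```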
Every good solution is obvious once you've found it.
All I meant is that the file buffers for the standard streams wrap when they are full - as long as enough of the data in them has been consumed. I believe the system is fully redirectable to any other stream - although GRUB is currently loading my test userland process, so I don't actually have disk I/O to try this out!
The whole 'sockets as files' business is on the todo list (with a million and one other things). Again, having looked at your (and several other) file implementations, I think I have that right in my head too.
I'm also currently reading up on Linux (and other OS) drivers, as I would like streams to be very closely involved in IPC and user-to-driver communication.
Adam
The UNIX approach to a TTY (that is, the console a program sees) is basically three pipes, automatically bound to the file descriptors 0, 1 and 2:
- 0 is bound to the reading end of a pipe, which is fed by the kernel (or the shell that started the process) with standard input.
- 1 is bound to the writing end of a pipe; whatever you put into it will by default get printed by the kernel, though your shell might have redirected it to some other place (like the next process's 0 in a pipeline, or a file).
- 2 is bound to the writing end of another pipe, which acts the same but is more rarely redirected, as it's supposed to be used for error messages.
Now, in Unix you can use read(0, ...), write(1, ...) and write(2, ...) directly, but the runtime automatically wraps some objects around them.
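For example, this echoes a chunk of input to output without stdio ever being involved:

```c
#include <unistd.h>

int main(void) {
    char buf[128];
    ssize_t n = read(0, buf, sizeof buf);   /* standard input  */
    if (n > 0) write(1, buf, (size_t)n);    /* standard output */
    write(2, "done\n", 5);                  /* standard error  */
    return 0;
}
```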
The object around 0 is known as stdin. A typical implementation has a buffer; reads (fread, scanf, whatever) consume from that buffer, and once it runs empty, the library automatically issues a read() for more data.
The object around 1 is known as stdout. A typical implementation has a buffer; writes go into that buffer, and when it gets full, or a newline character was written, the library automatically issues a write() to get it out for real.
The object around 2 is identical to the object around stdout, except the buffer is flushed after every fwrite/printf/puts/whatever operation, whether or not there was a newline and even if the buffer didn't get full. One could even skip having a buffer at all.
So basically, you need to implement the read()/write() operations in your library as system calls, implement pipes in your kernel, and provide each process with these three pipes. The rest can be done in your library in standard C, which you can either write yourself or lift from one of the free libraries (assuming you don't want to just port a full library).
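The read side of such a wrapper object could look roughly like this (a sketch; the struct and names are made up, not any particular library's):

```c
#include <stdio.h>    /* EOF */
#include <unistd.h>   /* read() */

#define MYBUFSIZ 512

struct my_file {
    int fd;                  /* underlying file descriptor (0 for stdin) */
    char buf[MYBUFSIZ];
    size_t pos, len;         /* read position and amount of valid data   */
};

/* buffered input: only call into the kernel when the buffer runs dry */
int my_getc(struct my_file *stream) {
    if (stream->pos == stream->len) {
        ssize_t n = read(stream->fd, stream->buf, MYBUFSIZ);
        if (n <= 0) return EOF;           /* end of file, or an error */
        stream->len = (size_t)n;
        stream->pos = 0;
    }
    return (unsigned char)stream->buf[stream->pos++];
}
```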
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.
I'll add some more now that I started:
To just support 0, 1 and 2, you only really need pipes. But since read()/write() in Unix are generic calls on any type of file descriptor, and pipes are just one example of those, what you often get as 0, 1 and 2 aren't really pipes but descriptors to tty devices. They look like pipes whose other end is handled directly by the kernel, but they respond to some things you can't do with a pipe... typically you can issue ioctl()s on them to tweak the settings of the tty device, which is a virtual device implemented by a tty driver in the kernel. Or you might want to ask the tty driver how wide the screen we're printing to is, so you can wrap output correctly... and such.
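For example, on a typical Unix the screen-width question is an ioctl on the tty descriptor (TIOCGWINSZ on Linux and the BSDs):

```c
#include <stdio.h>
#include <sys/ioctl.h>   /* ioctl(), TIOCGWINSZ, struct winsize */

int main(void) {
    struct winsize ws;
    if (ioctl(1, TIOCGWINSZ, &ws) == 0)     /* ask the tty behind fd 1 */
        printf("%d columns x %d rows\n", ws.ws_col, ws.ws_row);
    return 0;
}
```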
Anyway, a shell that starts a process will basically do something like this:
```c
#include <sys/wait.h>   /* waitpid() */
#include <unistd.h>     /* fork(), close(), execlp() */

#define MAXFD 1024      /* in reality, query sysconf(_SC_OPEN_MAX) */

int execute_program(char *name) {
    pid_t process = fork();
    if (process) {
        // in reality we'd handle fork() failure and suspends, but...
        // process is now the PID of the child, and waitpid() returns
        // the status once that process exits, resuming our shell;
        // until then it just blocks the shell
        int status;
        waitpid(process, &status, 0);
        return status;
    } else {
        // the new process from fork() gets return value 0;
        // first we close all files we don't want to leave open:
        int i;
        for (i = 3; i < MAXFD; i++) { close(i); }
        // we replace the current program image by calling exec();
        // on success it never returns, since it'll no longer be this program
        execlp(name, name, (char *)NULL);
        _exit(127);     // only reached if exec failed
    }
}
```
Now, file descriptors (well, most of them) are preserved across exec(), so 0, 1 and 2 will still point to the pipes/TTY device that they pointed to in the shell, since we skipped closing those. We could have done other things, though, and done more work before we wait():
```c
#include <unistd.h>     /* fork(), dup2(), close(), execlp() */

int execute_program(char *name, int in, int out, int err) {
    pid_t process = fork();
    if (process) {
        return process;   // parent: hand the PID back, wait() later
    } else {
        int i;
        // replace 0, 1 and 2 with "in", "out" and "err"
        dup2(in, 0);      // look in your favourite unix manual for dup2()
        dup2(out, 1);
        dup2(err, 2);
        // then close the rest (MAXFD as before) and proceed
        for (i = 3; i < MAXFD; i++) { close(i); }
        execlp(name, name, (char *)NULL);
        _exit(127);       // only reached if exec failed
    }
}
```
So that way we can then handle things like this:

```c
int runpipe(char *command1, char *command2) {
    int fds[2];
    pipe(fds);                 // likewise, check pipe() in the UNIX manuals
    int p1 = execute_program(command1, 0, fds[1], 2);
    int p2 = execute_program(command2, fds[0], 1, 2);
    // close the pipe ends in the shell, so we don't keep command2 alive
    // when command1 exits
    close(fds[0]);
    close(fds[1]);
    int st1, st2;
    waitpid(p1, &st1, 0);      // collect both children
    waitpid(p2, &st2, 0);
    return st1 || st2;         // non-zero if either command failed
}
```
Now everything command1 writes into its standard out will go to the standard input of command2. Other than that it's normal: both processes run at the same time. If the pipe's internal buffer gets full, command1 will block writing until command2 reads some of it; if it gets empty, command2 will block reading until command1 writes some more.
When command1 terminates, its descriptors get closed, command2 gets end-of-file and hopefully returns as well. If command2 terminates before command1 does, command1 gets a "broken pipe" signal (or, if it ignores those, EPIPE on write) and hopefully terminates as well.
It's not important in which order one wait()s for the processes: if they end before we wait, the OS keeps the return value (and the process, as a zombie) until we free it with wait().
If we wanted to redirect to or from files, we could either start a process which knows how to read/write files, or just let the shell play the other end of the pipe, reading/writing the files itself, or we could even open a file and pass that descriptor to the new process. It can still be read/written, so the process doesn't have to care, as long as it doesn't need to do anything special with its terminal.
Just like that, you could even have the pipes be sockets, at which point they could be the telnet connection of some remote user, though normally those are wrapped in a "pseudo tty" to allow full terminal functionality. The shell doesn't need to care though, since it either passes its own 0, 1 and 2 onwards or creates more redirections.
That's about it when it comes to Unix tty handling and standard inputs and outputs and such.
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.