Mental dilemma, require assistance (Syscalls)

JamesM · Post by **JamesM** » Thu Nov 08, 2007 7:25 am

Right, as some of you may know I have a 'novel' way of implementing system calls. My kernel is fully C++ based, and I have tried to come up with an interface that allows class member functions to be run as syscalls (in ring 0) with the least possible code:

Context description

Code:

int MyClass::MyFunc(int a, struct bleh b)
{
  START_KERNEL;

  ...

  END_KERNEL(my_return_value);
}

All is fine and dandy. Under the hood the START_KERNEL macro expands to:

Code:

#define START_KERNEL \
  u32int __start_kernel_ring3 = 0; \
  asm volatile("mov %%ss, %0" : "=r"(__start_kernel_ring3)); \
  if(__start_kernel_ring3==0x23) \
  { \
    fastSyscall(); \
  }

The END_KERNEL macro is similar. The asm is there to check whether we're already in ring 0; if so, the ring-change code is simply skipped.
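The macro above compares SS against the literal 0x23, which ties the check to one particular GDT layout. A slightly more general test, sketched here under the assumption of a standard x86 protected-mode setup, looks only at the low two bits of the selector, which hold the privilege level. The helper names `selectorRing` and `isUserSelector` are made up for illustration and are not part of the original kernel.

```cpp
#include <cstdint>

// On x86, the low two bits of a segment selector hold the requested
// privilege level. For a selector loaded into SS, that value matches
// the ring the CPU is currently running in.
constexpr unsigned selectorRing(std::uint16_t selector) {
    return selector & 0x3u;
}

// Hypothetical helper: true when the selector belongs to ring 3 (user mode).
constexpr bool isUserSelector(std::uint16_t selector) {
    return selectorRing(selector) == 3;
}
```

With this, the condition in START_KERNEL could read `isUserSelector(__start_kernel_ring3)` instead of comparing against 0x23 directly.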

fastSyscall is a pile of asm which calls a kernel trampoline via SYSENTER and magically pops out of it in ring 0. It then returns, and the function can execute in ring 0. There is a similar function, fastSysexit(), which is called in END_KERNEL.

The problem

This assembly trampoline could be exploited to run any code in ring 0 mode.

The potential solutions

1. Have a check: if (address_to_return_to > KERNEL_END) PANIC(); This has another problem: arbitrary kernel code can still be run (e.g. just find somewhere where a 'cli' is executed and get that code called directly. Not good).
2. Have a call table. This is possible, but it has a few problems. As mentioned, these syscall functions can be (and mostly are) member functions, which means function pointers cannot easily be obtained, and once obtained there is almost no way to manipulate them. It would also take away some of the transparency in syntax that I have worked so hard for.
3. Use a post-processor to find the valid entry points and populate a table with their locations. This again has its problems - including efficiency: every syscall would have to grep through an array of valid locations to see if it can find itself there.
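On the efficiency worry in option 3: if the post-processor emits the entry points in ascending address order, the lookup need not be a linear scan. The following is only a sketch of that idea; the table contents, `kValidEntries` and `isValidEntry` are invented for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical table of valid syscall entry addresses, assumed to be
// emitted in ascending order by the post-processing step.
constexpr std::array<std::uintptr_t, 4> kValidEntries = {
    0xC0101000u, 0xC0101240u, 0xC0102000u, 0xC0103F80u
};

// Returns true only if `target` is exactly one of the registered entry
// points. std::binary_search requires a sorted range and runs in O(log n).
bool isValidEntry(std::uintptr_t target) {
    return std::binary_search(kValidEntries.begin(), kValidEntries.end(), target);
}
```

Unlike the range check in option 1, an exact-match test like this rejects addresses that merely fall somewhere inside kernel code.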

I'd love to hear any ideas / points you have, and your thoughts. Am I barking totally up the wrong tree? Should I be going back to an int 0x80-style C interface? Or is this possible? I'd really really like to keep my interface C++ based as my entire kernel is written in C++: it seems a little wasteful to have a whole C->C++ and C++->C layer.

Cheers!

JamesM

Solar · Post by **Solar** » Thu Nov 08, 2007 7:40 am

Hmmm... I don't quite get the picture. The example code you are showing above, is that the kernel, or user space?

If it is kernel space, there is no need for START_KERNEL / END_KERNEL, as the kernel is already running in "super" mode.

If it is user space, well, user-space applications should not be able to "magically" have code run in kernel mode; that's the whole idea of the "ring" / "super-mode" shebang: The user code should be forced to call the kernel through well-defined, "narrow" interfaces, with the kernel checking each request for proper authorization (is the process allowed to open that file at all?).

I am sure I am missing something here. Could you elaborate on your concept (semantically, without focus on the C++ syntax / implementation)?

Craze Frog · Post by **Craze Frog** » Thu Nov 08, 2007 7:59 am

Why can't you just declare your class twice, the normal class and a call-from-userspace class which only switches to kernel mode and calls the ordinary member function?

Like this:

Code:

class CWahWah {
  public:
    int MyFunc(int a, int b);
};

int CWahWah::MyFunc(int a, int b) {
  // Do kernel mode stuff here, it's always called from code running in ring 0
}

And then the interface, which is used by userspace programs and can be automatically generated from the main source file:

Code:

class KI_CWahWah {
  public:
    int MyFunc(int a, int b);
};

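int KI_CWahWah::MyFunc(int a, int b) {
  enter_ring0();
  int result = call_actual_MyFunc(a, b);  // forwards to CWahWah::MyFunc in ring 0
  leave_ring0();
  return result;  // hand the kernel's result back to the userspace caller
}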

Then you must of course make sure that none of your kernel code is writable from ring 3, and everything should be all right, I think.

JamesM · Post by **JamesM** » Thu Nov 08, 2007 7:59 am

Solar wrote:Hmmm... I don't quite get the picture. The example code you are showing above, is that the kernel, or user space?

If it is kernel space, there is no need for START_KERNEL / END_KERNEL, as the kernel is already running in "super" mode.

If it is user space, well, user-space applications should not be able to "magically" have code run in kernel mode; that's the whole idea of the "ring" / "super-mode" shebang: The user code should be forced to call the kernel through well-defined, "narrow" interfaces, with the kernel checking each request for proper authorization (is the process allowed to open that file at all?).

The code samples shown (and the point of all this) are kernel code only; that is, code which lives in the supervisor-only area of memory. The point is that the user calls MyClass::MyFunc as an ordinary function. MyClass::MyFunc is actually a syscall, and the first thing it does is try to change into supervisor mode (ring 0).

As for the "well-defined, narrow interfaces": this is what the problem is: how to (if it can be done) narrow the current gateway/interface from all-encompassing to the minimum. I hope that made sense.

Solar wrote:I am sure I am missing something here. Could you elaborate on your concept (semantically, without focus on the C++ syntax / implementation)?

I'll give it a shot!

The user calls a function. Let's call it MyClass::MyFunc. He/She has no idea what the implementation is. The user doesn't need to know this is a system call.

MyClass::MyFunc, when it gets called by the user, resides in the supervisor-only area of memory, and notes that the current privilege level is >0. So it tries to change the current privilege level to 0. It then executes, changes the level back and returns.

That is the concept: Transparency is the issue, and while int 0x80-style interfaces can work well for C code (using macros etc to mask the implementation) for C++ code it is not so easy. The entire stack frame including the "this" pointer must be copied, and that is why I have come up with this alternative method.

I really hope that cleared things up: Thanks for replying and give me a shout if you need more info!!

JamesM · Post by **JamesM** » Thu Nov 08, 2007 8:19 am

After going outside and grabbing lunch, I feel as if I'm being a little tunnel-minded, too set on my current implementation to see a bigger picture (if there is one). My current implementation I have had finished for a long time with parts of the kernel relying on it, so that was probably why I was so fixated on fixing that solution as opposed to getting the best solution.

So now I am changing my question. If you were going to implement a C++ syscall interface, what ideas or methods would you use?

I'm thinking the best plan is a call table just like most people (and my last OS) have, but it seems much more complex when you could be dealing with member function pointers.

E.g. in a normal C call table:

Code:

void *table[MAX_SYSCALLS];
table[0] = (void*)&fork;
table[1] = (void*)&execve;

etc etc: all function pointers can be cast to a common denominator (void*) and stored in an array.

But member pointers cannot be cast to void*; they carry around lots of state and so are much more difficult to mutate. Any ideas here would be helpful!
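One common way around this, sketched below, is not to store the member pointer at all but a small static "thunk" per syscall. Each thunk has the same plain function-pointer type, so the table stays as uniform as the C one. The `Process` class, `SyscallThunk`, `thunk` and `g_table` here are illustrative assumptions, not code from the kernel under discussion.

```cpp
#include <cstddef>

// Hypothetical kernel class with a member function exposed as a syscall.
class Process {
public:
    explicit Process(int pid) : m_pid(pid) {}
    int getPid() const { return m_pid; }
private:
    int m_pid;
};

// Every table entry shares one plain function-pointer type:
// an untyped object pointer plus one integer argument.
using SyscallThunk = int (*)(void* object, int arg);

// The thunk restores the static type and forwards to the member function,
// which is baked in as a template argument rather than stored at run time.
template <typename T, int (T::*Method)() const>
int thunk(void* object, int /*arg*/) {
    return (static_cast<T*>(object)->*Method)();
}

constexpr std::size_t kMaxSyscalls = 16;  // assumed table size

static SyscallThunk g_table[kMaxSyscalls] = {
    &thunk<Process, &Process::getPid>,  // slot 0: Process::getPid
};
```

Calling `g_table[0](&someProcess, 0)` then runs `someProcess.getPid()`. Unused slots are null and would need a check before dispatch.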

(I'm rather unhappy that I have to rewrite most of my syscall interface, but at least it'll be better!)

Solar · Post by **Solar** » Thu Nov 08, 2007 8:53 am

JamesM wrote:MyClass::MyFunc, when it gets called by the user, resides in the supervisor-only area of memory...

...which means the user cannot call it, because the user cannot see it, at least as far as I understand. What you sketched in the code example is a way for user-level code to execute START_KERNEL, i.e. execute arbitrary code with kernel-level privileges. Dumb Idea (TM).

JamesM wrote:That is the concept: Transparency is the issue, and while int 0x80-style interfaces can work well for C code (using macros etc to mask the implementation) for C++ code it is not so easy. The entire stack frame including the "this" pointer must be copied, and that is why I have come up with this alternative method.

Why does the kernel need the "this" pointer? What is it that MyClass does that the kernel must know about it? I fear you're running into the wrong direction, so to speak.

Let me depict the status quo in C, and you tell me where your class concept improves matters:

User calls fopen(), which takes a filename and a "mode" parameter. fopen() passes filename and (preprocessed) "mode" to open(), which is a syscall wrapper: It places its parameters somewhere specific, e.g. in certain registers, and invokes the kernel - by int 0x80, for example. Control passes to the kernel.

The kernel, more specifically the int 0x80 IRQ handler, looks up which service is actually required (e.g. by looking up a "service number" left in a certain register by open()). The service handler knows where to find the parameters (filename and mode), and after having a quick talk with the filesystem driver, gets a "handle" for the newly-opened file, e.g. an integer number. This is placed in a register, and the IRQ handler returns.

open() simply takes the integer from the register, and passes it to fopen() as its return value.

fopen() then sets up a FILE structure, with buffer, pointers, status bytes, the whole shebang, and of course, the handle just received. That FILE structure is returned to the user.

Note how the whole "state" of the FILE structure is userspace-only. There is only one tiny integer in there that makes the connection between the userspace FILE and the kernelspace filedescriptor. Also, fopen() knows nothing about kernel-space. Even open() doesn't know what actually happens, it just knows how to create an interface between C calling conventions and kernel syscalls.
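The flow described above can be mimicked in ordinary user-space code. The sketch below is a simulation only: the `Registers` struct stands in for real CPU registers, `trapHandler` is called directly where a real wrapper would execute `int 0x80`, and all names (`userOpen`, `sysOpen`, `kServices`, the register assignment) are assumptions made for the example.

```cpp
#include <cstdint>

// Stand-in for the register state a real trap handler would receive.
struct Registers {
    std::uint32_t eax;  // service number on entry, result on exit
    std::uint32_t ebx;  // first argument
    std::uint32_t ecx;  // second argument
};

enum Service : std::uint32_t { SYS_OPEN = 0, SYS_COUNT };

// Kernel-side service: pretend to open a file and return a small handle.
static std::uint32_t sysOpen(std::uint32_t /*namePtr*/, std::uint32_t /*mode*/) {
    static std::uint32_t nextHandle = 3;  // 0-2 assumed taken by stdin/stdout/stderr
    return nextHandle++;
}

using ServiceFn = std::uint32_t (*)(std::uint32_t, std::uint32_t);
static const ServiceFn kServices[SYS_COUNT] = { &sysOpen };

// Kernel-side dispatcher: validate the service number, then call via the table.
void trapHandler(Registers& regs) {
    if (regs.eax >= SYS_COUNT) {
        regs.eax = static_cast<std::uint32_t>(-1);  // unknown service
        return;
    }
    regs.eax = kServices[regs.eax](regs.ebx, regs.ecx);
}

// User-side wrapper: marshal the arguments, "trap", and return the handle.
int userOpen(std::uint32_t namePtr, std::uint32_t mode) {
    Registers regs{SYS_OPEN, namePtr, mode};
    trapHandler(regs);  // a real wrapper would execute `int 0x80` here
    return static_cast<int>(regs.eax);
}
```

The only thing that crosses the boundary in either direction is a handful of integers, which is the property Solar is pointing at.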

Now, and because I want to understand this (as I've fiddled with a C++ kernel myself, ages ago): Where does your class, your passing of a "this" pointer to the kernel, become more useful than a simple, static C-style function?

Solar · Post by **Solar** » Thu Nov 08, 2007 8:55 am

Sorry, it took me ages to write that reply, and I missed your last post. The core of my question remains valid, though: Why do you need member functions making the syscall, and passing their state - instead of the member function calling a static, non-member function, and the "state" be damned? Does the kernel need to know about the object's state?

JoeKayzA · Post by **JoeKayzA** » Thu Nov 08, 2007 8:58 am

JamesM wrote:So now I am changing my question. If you were going to implement a C++ syscall interface, what ideas or methods would you use?

Mine would look a bit like this:
Userspace invokes syscalls on objects. The objects reside in kernel space, and their methods correspond to system calls. To invoke a system call, userspace needs a reference to an object; these references are numeric indexes into a table, just like handles under Windows NT or file descriptors under Unix (and resources under clicker32, IIRC). Userspace provides a reference number and a method number, then invokes the syscall handler.

The syscall handler looks up the reference in the table (and gets a real pointer to the object in kernel space), then invokes a dispatcher method on the object (and passes the method number and further arguments), which then calls the implementation.
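A minimal sketch of that two-level scheme might look like the following. It assumes a fixed-size per-process handle table and a single integer argument per call; `KernelObject`, `HandleTable`, `kBadCall` and the method numbering are all invented for the example rather than taken from Joe's kernel.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::int32_t kBadCall = -1;  // assumed error code

// Base class: every kernel object exposes one dispatcher entry point.
class KernelObject {
public:
    virtual ~KernelObject() = default;
    virtual std::int32_t dispatch(std::uint32_t method, std::uint32_t arg) = 0;
};

class Process : public KernelObject {
public:
    enum Method : std::uint32_t { GET_PID = 0 };
    explicit Process(std::int32_t pid) : m_pid(pid) {}
    // Object-level dispatcher: maps a method number to the implementation.
    std::int32_t dispatch(std::uint32_t method, std::uint32_t /*arg*/) override {
        switch (method) {
            case GET_PID: return m_pid;
            default:      return kBadCall;  // unknown method number
        }
    }
private:
    std::int32_t m_pid;
};

// Per-process table mapping opaque handles to kernel objects.
class HandleTable {
public:
    // Returns the new handle, or kBadCall when the table is full.
    std::int32_t insert(KernelObject* object) {
        for (std::size_t i = 0; i < kSlots; ++i) {
            if (m_slots[i] == nullptr) {
                m_slots[i] = object;
                return static_cast<std::int32_t>(i);
            }
        }
        return kBadCall;
    }
    // Central dispatcher: validate the handle, then forward to the object.
    std::int32_t syscall(std::uint32_t handle, std::uint32_t method, std::uint32_t arg) {
        if (handle >= kSlots || m_slots[handle] == nullptr) return kBadCall;
        return m_slots[handle]->dispatch(method, arg);
    }
private:
    static constexpr std::size_t kSlots = 8;
    KernelObject* m_slots[kSlots] = {};
};
```

Userspace only ever holds the integer returned by `insert`; the real pointer never leaves the table, so a forged handle can at worst name another object the process already owns.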

Just a few thoughts..

cheers Joe

Solar · Post by **Solar** » Thu Nov 08, 2007 9:06 am

I don't get it. Why does userspace need a reference on a kernel space object? Sticking to the C status quo I described above, the IRQ handler needs a reference to the object, and that's it - or am I being dense?

JamesM · Post by **JamesM** » Thu Nov 08, 2007 9:10 am

Joe: Great stuff, I like that!

Solar: Valid points, I'll try to address them.

Take for example the C function "getpid()". This is matched by my C++ function Process::getPid(). Now obviously I can't make that static: the "this" pointer is required because the variable specifying the pid of the process (along with all the rest of the process' state) is stored in that Process instance.

I really should have thought of Joe's point first though. I've spent so long with my mind going in circles and didn't even think of having a dispatcher as a superclass!

Solar · Post by **Solar** » Thu Nov 08, 2007 9:41 am

JamesM wrote:Take for example the C function "getpid()". This is matched by my C++ function Process::getPid(). Now obviously I can't make that static: the "this" pointer is required because the variable specifying the pid of the process (along with all the rest of the process' state) is stored in that Process instance.

I don't think so.

Sorry, but how should the user-space process know its PID, without asking the kernel for it using getpid()? It's not as if the PID is determined in user-space: You call getpid(), which does a system call, and returns with the PID as determined by the kernel (who assigned the PID on process startup). The kernel doesn't have to ask any user-space object for the PID. In fact, it would be a Dumb Idea (TM), because that would allow the user-space app to forge its PID...

Colonel Kernel · Post by **Colonel Kernel** » Thu Nov 08, 2007 9:47 am

@JamesM:
Realistically, the only way to have a sane kernel API is to define it with "flat" C-style functions, for all the reasons that Solar has pointed out. (Even the syscall interface in Singularity, which is written in C#, contains only static methods.) However, this doesn't mean that code in user-space can't use C++ convenience classes that wrap the syscall API. This is done all the time to wrap legacy C APIs and make them more palatable.

To use Solar's example of fopen():

Code:

class File
{
public:

    static std::auto_ptr<File> open( const std::string& fileName, const std::string& mode );

    ~File(); // Calls close() if it hasn't been called already.

    void close();

    template <typename T>
    void read( std::vector<T>& buffer ) const;

    template <typename T>
    void write( const std::vector<T>& buffer );

    ...

private:

    FILE* m_file;
    bool m_isOpen; // Set to false by close(); checked in d-tor.
};

This will work with any POSIX-style syscall API and gives applications the kind of convenience that you're looking for (I think).

JamesM · Post by **JamesM** » Thu Nov 08, 2007 1:04 pm

Solar wrote:
JamesM wrote:Take for example the C function "getpid()". This is matched by my C++ function Process::getPid(). Now obviously I can't make that static: the "this" pointer is required because the variable specifying the pid of the process (along with all the rest of the process' state) is stored in that Process instance.
I don't think so.

Sorry, but how should the user-space process know its PID, without asking the kernel for it using getpid()? It's not as if the PID is determined in user-space: You call getpid(), which does a system call, and returns with the PID as determined by the kernel (who assigned the PID on process startup). The kernel doesn't have to ask any user-space object for the PID. In fact, it would be a Dumb Idea (TM), because that would allow the user-space app to forge its PID...

Sorry, you've completely misunderstood! Process::getPid() is the equivalent of getpid(). It is a syscall. The Process object it gets called on resides in kernel space; the user cannot mutate anything (cannot forge its own pid etc). It is a kernel-owned object. It has an accessor method called getPid() which can be called from user space. It swaps from user mode to supervisor mode to handle the request, and thus it is a 'syscall'.

As a point of fact, 'kernel space' is a little wishy-washy as a term in my OS: currently the entirety of kernel space is available for reading by all processes. (This will change.) That is how user-space processes can 'see' kernel objects to call functions on them.

I think, however, that a call like

Code:

syscallManager->call(currentProcess, GETPID);

may suffice and create a safer interface.
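One way such a manager could be put together, purely as a sketch: member function pointers that share a signature can sit in an ordinary array indexed by the call number, so only the listed operations are reachable. The `GETPPID` entry, the `getParentPid` accessor and the -1 error value are assumptions added for the example.

```cpp
// Hypothetical kernel-owned object with two read-only accessors.
class Process {
public:
    Process(int pid, int parentPid) : m_pid(pid), m_parentPid(parentPid) {}
    int getPid() const { return m_pid; }
    int getParentPid() const { return m_parentPid; }
private:
    int m_pid;
    int m_parentPid;
};

enum SyscallId { GETPID = 0, GETPPID, SYSCALL_COUNT };

class SyscallManager {
public:
    // Returns the syscall result, or -1 for a null target or unknown id.
    int call(const Process* process, unsigned id) const {
        // All handlers share one pointer-to-member type, so they fit in an array.
        using Handler = int (Process::*)() const;
        static const Handler table[SYSCALL_COUNT] = {
            &Process::getPid,        // GETPID
            &Process::getParentPid,  // GETPPID
        };
        if (process == nullptr || id >= SYSCALL_COUNT) return -1;
        return (process->*table[id])();
    }
};
```

The call site then keeps the shape from the post: `syscallManager->call(currentProcess, GETPID);`.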

JoeKayzA · Post by **JoeKayzA** » Thu Nov 08, 2007 3:18 pm

Solar wrote:I don't get it. Why does userspace need a reference on a kernel space object? Sticking to the C status quo I described above, the IRQ handler needs a reference to the object, and that's it - or am I being dense?

Well, what I meant was not that userspace code has a pointer or something to a kernel object, but instead an opaque numeric value that corresponds to an object (just like a file descriptor). Providing that value as well as a method number (which is specific to the target object's type) is sufficient to invoke a syscall.

That's because of the capability-like way my syscall facade works. All in all this is nothing revolutionary; the only real difference from a flat procedural syscall interface is that there are multiple levels of syscall dispatching (the central syscall dispatcher forwards to the object's internal syscall dispatcher, which then invokes the syscall implementation). It simply seems a lot cleaner to me for object-based operating systems.

cheers
Joe