Allow tasks to make their own stacks

Tyler
Member
Posts: 514
Joined: Tue Nov 07, 2006 7:37 am
Location: York, England

Post by Tyler »

Avarok wrote:Heh...

Threads are supposed to be parallel transformations on a data set, however that would require that data not be shared at all between them. Then you have processes. Threads are *supposed* to share the data they transform.

With that understood, often the data needs to be in a certain state for a given thread to continue execution. This will typically rely on the output of another thread or external event. The thread *could* just infinitely loop until the result came, or we can have it give up its execution time to something else. The desirable thing often appears to be to cede its time to the thread it's waiting for.

So yeah, again, the difference between threads and processes is that threads aren't isolated from one another in memory, and so can share memory without any additional abstraction. This is better in all cases except when one thread cannot trust another thread (the bad thread could change my stack, or overwrite my code - :evil: )
Threads aren't an abstraction of a task... they are an execution environment that runs the same code as other threads within the process. They share the address space, and therefore the code. You can't have "bad" threads.

As for the Thread Calls mentioned above... I don't see threads waiting for each other to be at exactly the same point just so they can tamper with each other's stacks. It is more likely that such calls would be made through the kernel or using global memory.
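The choice described in the quoted post (loop until the result arrives, or give up execution time) can be sketched in standard C++ roughly as follows. This is only an illustration: `SharedResult`, `waitBySpinning`, `waitByBlocking`, `produce` and `runOnce` are made-up names, not anything from the posts above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state. Both threads live in one address space,
// so they can touch it directly without any IPC layer.
struct SharedResult {
    std::mutex m;
    std::condition_variable readyCv;
    std::atomic<bool> ready{false};
    int value = 0;
};

// Option 1: loop until the result arrives, yielding the CPU on each pass.
int waitBySpinning(SharedResult& r) {
    while (!r.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // only a hint; may return immediately
    }
    return r.value;
}

// Option 2: give up execution time until the producer signals.
int waitByBlocking(SharedResult& r) {
    std::unique_lock<std::mutex> lock(r.m);
    r.readyCv.wait(lock, [&r] { return r.ready.load(); });
    return r.value;
}

// Producer: publish the value, then wake any blocked waiter.
void produce(SharedResult& r, int v) {
    {
        std::lock_guard<std::mutex> lock(r.m);
        r.value = v;
        r.ready.store(true, std::memory_order_release);
    }
    r.readyCv.notify_all();
}

// Run one producer against one waiter and return what the waiter saw.
int runOnce(bool spin, int v) {
    SharedResult r;
    int seen = -1;
    std::thread consumer([&] { seen = spin ? waitBySpinning(r) : waitByBlocking(r); });
    std::thread producer([&] { produce(r, v); });
    producer.join();
    consumer.join();
    assert(seen == v);
    return seen;
}
```

Note that neither option cedes time to one specific thread: `std::this_thread::yield()` lets the scheduler pick whoever is next, and standard C++ has no portable way to name the thread that should run.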
Avarok
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

Well, you're assuming a single code segment loaded by a single party, with multiple threads executing from the same entry point.

I'm not.

Any given "process" could load dozens of entry points and code segments, dozens of stacks, and have dozens of threads assuming a more powerful execution model. You could potentially (though I don't see why) even let the threads wander between "processes".

I'm suggesting that you couldn't "simply thread" everything in a system - that there is a place for the separation of data (currently done by paged memory isolation) and that it's because of security.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
- C. A. R. Hoare
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

You could potentially (though I don't see why) even let the threads wander between "processes".
How, exactly? When you put the thread in a new address space it won't be able to access its own code... :S
You also seem to be blurring the line between threads and processes quite deliberately. Why would you want to have dozens of threads in one process with multiple entry points and code segments? Not only are you reducing the amount of address space available to each thread for personal storage, but you are just asking for memory corruption. Mutexes can only go so far: having dozens of threads concurrently accessing shared memory will lead either to corruption (with badly programmed code) or to starvation (deadlock is just blatantly around the corner).

The difference between threads and processes is threads share memory. If you need dozens of lightweight processes all sharing global memory, perhaps it is time to rethink your design strategy!!

Crazed123:
What, exactly, are synchronous cross-thread calls? I thought the point of a thread was that it ran its own instruction stream, and therefore couldn't be "called" into or out of.

But this stuff sounds interesting and right up my alley (I'm into portal-based IPC.), so please do explain.
I was referring to any interprocess-communication system. Pretty much any IPC system can be used to make synchronous cross-thread calls. For example, in my OS, all my IPC methods are wrappers around a system of remote procedure calls. I think technically it's called remote method invocation, as it's designed to be used on C++ objects.

It is essentially a glorified message-passing system, but the calling semantics are much nicer:

Code:

obj->func(param);
Some stub code inside that function allows it to be called from any process or (later) thread.

Did that explain well what I meant? It doesn't have to be RPC - I'm sure you could tailor almost any functional IPC system to do something similar. In fact, a semaphore signalling system with parameter passing would count as a cross-thread call:

Code:

semaphore s;
param parameters;
retval returnValue;

thread1()
{
  // do stuff
  parameters = params_to_call;
  s.wait();
  // <return value is in returnValue.>
}

thread2()
{
  // do stuff
  while(1) {
    // do more stuff
    if(s.test())
    {
      returnValue = performFunc(parameters);
      s.signal();
    }
  }
}
Over-simplified, but it should get the point across :P
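For anyone who wants something that compiles, here is one possible runnable take on the same idea. It is a sketch under assumptions, not the code from my OS: it uses two semaphores (one for the request, one for the reply) instead of the single `s` with a `test()` method above, it assumes a single caller, and `Semaphore`, `CallSlot`, `serve`, `crossThreadCall` and `demo` are names invented for the example. `performFunc` just doubles its argument as a stand-in.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Minimal counting semaphore built from a mutex and a condition variable.
class Semaphore {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Hypothetical mailbox shared by caller and callee (single caller assumed).
struct CallSlot {
    Semaphore request;   // caller -> callee: parameters are ready
    Semaphore reply;     // callee -> caller: returnValue is ready
    int parameters = 0;
    int returnValue = 0;
};

// Stand-in for the function being called across threads.
int performFunc(int x) { return x * 2; }

// Callee side: serve exactly `calls` requests, then exit.
void serve(CallSlot& slot, int calls) {
    assert(calls >= 0);
    for (int i = 0; i < calls; ++i) {
        slot.request.wait();
        slot.returnValue = performFunc(slot.parameters);
        slot.reply.signal();
    }
}

// Caller side: reads like an ordinary call, but the work runs in the callee.
int crossThreadCall(CallSlot& slot, int param) {
    slot.parameters = param;
    slot.request.signal();
    slot.reply.wait();
    return slot.returnValue;
}

// Drive one callee thread through two calls and return the sum of the results.
int demo() {
    CallSlot slot;
    std::thread callee(serve, std::ref(slot), 2);
    int a = crossThreadCall(slot, 3);
    int b = crossThreadCall(slot, 10);
    callee.join();
    return a + b;
}
```

Splitting request and reply into separate semaphores means the callee blocks instead of polling, at the cost of one more object per call slot.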

JamesM
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Post by Colonel Kernel »

JamesM wrote:
You could potentially (though I don't see why) even let the threads wander between "processes".
How, exactly? When you put the thread in a new address space it won't be able to access its own code... :S
Windows CE implements thread migration by basically shoe-horning all processes into a single virtual address space, then fiddling with page protection bits on task switches. When a thread in one process calls a function in another process, the "active portion" of the virtual address space changes, but the same stack is still being used.

This presentation explains it better than I can...
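As a very rough toy model of that idea (and only of the idea as I described it, not of Windows CE's actual data structures), the sketch below keeps one address space split into slots, flips which slot is accessible on a cross-process call, and leaves the thread's stack alone. Everything here is hypothetical: `kSlots`, `AddressSpace`, `Thread`, `activate`, `callInto` and `returnFrom` are invented names, and the stack is modelled outside the slots for simplicity.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSlots = 8;  // hypothetical slot count for the toy

// One shared address space carved into per-process slots.
struct AddressSpace {
    std::bitset<kSlots> accessible;  // stands in for page protection bits
};

struct Thread {
    int ownerSlot;           // process that created the thread
    int currentSlot;         // process whose code it is running right now
    std::vector<int> stack;  // one stack, kept across migrations
};

// Make only `slot` accessible, as a task switch would in this model.
void activate(AddressSpace& as, int slot) {
    assert(slot >= 0 && static_cast<std::size_t>(slot) < kSlots);
    as.accessible.reset();
    as.accessible.set(static_cast<std::size_t>(slot));
}

// Cross-process call: record where to return, flip protection, keep the stack.
void callInto(AddressSpace& as, Thread& t, int targetSlot) {
    t.stack.push_back(t.currentSlot);
    t.currentSlot = targetSlot;
    activate(as, targetSlot);
}

// Return from a cross-process call: restore the caller's slot.
void returnFrom(AddressSpace& as, Thread& t) {
    assert(!t.stack.empty());
    t.currentSlot = t.stack.back();
    t.stack.pop_back();
    activate(as, t.currentSlot);
}
```

The point of the model is just that the thread never changes address space; only the accessible window moves, which is why its code and stack addresses stay valid.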
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
Avarok
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

JamesM wrote:
You could potentially (though I don't see why) even let the threads wander between "processes".
How, exactly? When you put the thread in a new address space it won't be able to access its own code... :S
Wow... that's a tough one. There are probably a million ways. The most obvious to me is to pause the thread, modify its EIP from the stack, copy the code segment over somewhere and resume. Alternatively, you can remap memory or whatnot. What I was originally thinking though is that the thread is simply a stack and EIP running in *a* code segment. I thought a far call into an entry point combined with an address space switch would migrate the thread to that entry point. The only thing you need to do there is migrate the stack and program it *that way*.
JamesM wrote:You also seem to be blurring the line between threads and processes quite deliberately. Why would you want to have dozens of threads in one process with multiple entry points and code segments?
Well, threads are lighter weight than processes. Having one for each entry point can make sense, having them for event handling can make sense, or dedicating threads to different tasks that you want to do at the same time makes sense as long as you don't have to worry about security.
JamesM wrote:Not only are you reducing the amount of address space available to each thread for personal storage, but you are just asking for memory corruption. Mutexes can only go so far: having dozens of threads concurrently accessing shared memory will lead either to corruption (with badly programmed code) or to starvation (deadlock is just blatantly around the corner).
Actually, since I'm using a 64-bit address space, using static buffers for code and statically declared data, and then separate dynamically allocated buffers for dynamic arrays and stacks, which are inherently address-flexible, memory corruption isn't so much a problem until you have multiple CPUs writing, or writing and reading, the same data in exactly the same cycle. Not sure how that should work out.

The idea though was to use threads on different entry points, so they tend to be doing different things within the address space at any given time - preventing them from typically locking up or starving each other.

Alternatively, the current world creates a separate address space for each "thread", making it a process. Does forking (creating and copying entire address spaces) to handle instances of an external event seem at all more efficient? Or maintaining branches of code for one fork across that process's (possibly tens of thousands of) address spaces?
JamesM wrote:The difference between threads and processes is threads share memory. If you need dozens of lightweight processes all sharing global memory, perhaps it is time to rethink your design strategy!!
I think exactly the opposite. If *everything* executed needs a separate memory space regardless of trust considerations, and then uses shared memory functions, perhaps it's time to rethink my design strategy.
What, exactly, are synchronous cross-thread calls? I thought the point of a thread was that it ran its own instruction stream, and therefore couldn't be "called" into or out of.

But this stuff sounds interesting and right up my alley (I'm into portal-based IPC.), so please do explain.
My take is that you interrupt/call into the thread switcher to call the wait/sleep(), but with a twist. Instead of just passing off to *any* next thread, you tell the thread switcher thingy that you want to dedicate your CPU time to the thread you're waiting for results from.

Code:

<in thread 2>
while(1) {
  waitFor(thread[7], 3);
  doSomething(bob);
  wait();
}

<in thread 7>
bob = 14;
signal(3);
waitFor(thread[2], 1);
return bob;
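On the thread switcher side, the "dedicate my CPU time to the thread I'm waiting for" part could look something like this toy run queue. It is a sketch only: `ToyScheduler`, `makeReady`, `yieldToAnyone` and `yieldTo` are hypothetical names, the caller is simply requeued rather than truly blocked, and the signal numbers from the pseudocode above are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical toy scheduler: thread ids in a run queue, front runs next.
class ToyScheduler {
public:
    void makeReady(int tid) { ready_.push_back(tid); }
    std::size_t readyCount() const { return ready_.size(); }

    // Plain wait(): requeue the caller and run whoever is next in line.
    int yieldToAnyone(int self) {
        ready_.push_back(self);
        int next = ready_.front();
        ready_.pop_front();
        return next;
    }

    // waitFor(target): hand the rest of the time slice to `target` if it is
    // runnable; otherwise fall back to an ordinary yield.
    int yieldTo(int self, int target) {
        assert(self != target);
        auto it = std::find(ready_.begin(), ready_.end(), target);
        if (it == ready_.end()) {
            return yieldToAnyone(self);
        }
        ready_.erase(it);
        ready_.push_back(self);
        return target;
    }

private:
    std::deque<int> ready_;
};
```

Both functions return the id of the thread that should run next, so with threads 3, 5 and 7 ready, a call to `yieldTo(2, 7)` from thread 2 picks thread 7 ahead of 3 and 5.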
I was referring to any interprocess-communication system ... [snip] ... It is essentially a glorified message-passing system, but the calling semantics are much nicer: [snip] Some stub code inside that function can make it be called by any process or (later) thread.
That sounds most dangerous. I'll assume it's not. Perchance you can explain how though. I don't see it.

I liked the semaphore cross-thread-call example. I'm still bad with terminology, so I'll roll with that. :)
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

Avarok wrote:Wow... that's a tough one. There are probably a million ways. The most obvious to me is to pause the thread, modify its EIP from the stack, copy the code segment over somewhere and resume.
Argh! That means that either (a) you have to assume that there's not already some code or a file mmap at that position in the destination address space or (b) ALL code is compiled position-independent. :|

Avarok wrote:Well, threads are lighter weight than processes. Having one for each entry point can make sense, having them for event handling can make sense, or dedicating threads to different tasks that you want to do at the same time makes sense as long as you don't have to worry about security.
You have a point. With my starvation talk in particular, I was assuming a cooperative scheduler, one that does not pre-empt threads. Obviously, because you're making your own OS, you'd have thread pre-emption.
That sounds most dangerous. I'll assume it's not. Perchance you can explain how though. I don't see it.
I don't understand why you think it's dangerous, but I'll have a shot at an example here. Note I don't have my code with me (at work) so it may be a bit off.

Code:

class C : RemoteProcedureCall
{
  RPC_DEFINE1(0, int, C, myMethod, int); // The 0 should just be any unique integer
public:
  C()
  {
     RPC_REGISTER(myMethod);
  }
  int myMethod(int param)
  {
     RPC_SYNC(myMethod, param);
     // Do stuff
     return 5;
  }
};

// Some code somewhere else
C *myC = new C();
myC->myMethod(3); // This will be synchronous and remote.
So, the RPC_REGISTER, DEFINE, SYNC calls are macros that look a little bit like:

Code:

#define RPC_DEFINE1(idx, returnValue, className, func, param) \
  const int __rpcconst__##func = idx; \
  static returnValue __rpccall__##func(className *obj, param p1) \
  { \
    return obj->func(p1); \
  }

#define RPC_REGISTER(func) \
  processManager.getProcess()->registerRpcHandler(__rpcconst__##func, __rpccall__##func);

#define RPC_SYNC(func, param) \
  if (getpid() != targetPid) \
  { \
    return rpcSynchronous(__rpcconst__##func, (void*)param); \
  }
And the RemoteProcedureCall class looks like

Code:

class RemoteProcedureCall
{
protected:
  // Protected rather than private, so that derived classes can construct us.
  RemoteProcedureCall()
  {
    targetPid = getpid(); // The process that created us becomes our "owner" or target.
  }

  int targetPid;

  template <BLAH LOTS OF TEMPLATE PARAMS>
  R rpcSynchronous(Obj, Func, ..........)
  {
 .....
  }
};
The rpcSynchronous call is heavily templated to make it integrate seamlessly. I couldn't remember/be bothered to work out what parameters go where.
It just fills out an RpcCall struct with the object, func ID and parameter (as void*), and passes it to the process manager, which adds it to the target process' queue.
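Since I don't have the real code to hand, here is a minimal sketch of what that queueing step might look like. Treat all of it as assumption: the field names in `RpcCall`, the `RpcHandler` signature, and the `ToyProcess` and `Counter` classes are invented for illustration, and the real thing crosses address spaces where this one is a single-process model.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <queue>

// One queued call: which object, which registered function, one parameter.
// (Hypothetical layout; only the three fields are taken from the description.)
struct RpcCall {
    void* object;
    int   funcId;
    void* param;
};

// Handlers are plain function pointers so they can be stored by id.
using RpcHandler = std::intptr_t (*)(void* object, void* param);

// Toy stand-in for the target process and its call queue.
class ToyProcess {
public:
    void registerRpcHandler(int funcId, RpcHandler h) { handlers_[funcId] = h; }
    void enqueue(const RpcCall& call) { pending_.push(call); }
    bool hasPending() const { return !pending_.empty(); }

    // Run the oldest queued call in this process and return its result.
    std::intptr_t dispatchOne() {
        assert(!pending_.empty());
        RpcCall call = pending_.front();
        pending_.pop();
        auto it = handlers_.find(call.funcId);
        assert(it != handlers_.end());
        return it->second(call.object, call.param);
    }

private:
    std::map<int, RpcHandler> handlers_;
    std::queue<RpcCall> pending_;
};

// Example target object with a static wrapper that unpacks the void* pieces.
struct Counter {
    int total = 0;
    int add(int n) { total += n; return total; }

    static std::intptr_t addWrapper(void* object, void* param) {
        auto* self = static_cast<Counter*>(object);
        int n = static_cast<int>(reinterpret_cast<std::intptr_t>(param));
        return self->add(n);
    }
};
```

Calls are dispatched in the order they were queued, and the integer parameter rides inside the `void*` exactly as the `(void*)param` cast in `RPC_SYNC` suggests.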

Did that sort of explain my system?

It's gone through quite a few changes - originally I was using C++ pointer-to-member functions, but they do NOT play nicely across address spaces. More correctly, they do not play nicely when there are 2 different class definitions. I would have one class definition with all member functions filled out etc, and another which would just be a declaration, exposing the interface of the class to others but not the implementation. Essentially every function in this class definition would just be

Code:

void func()
{
  RPC_SYNC(func);
}
- it would assume you are calling from another address space, and didn't know the definition of the functions. (if you did, you wouldn't have to use IPC in the first place!).

Anyway, this didn't work because GCC created a different vtable for each class definition and that caused pointer-to-member functions to balls up big time. So I came up with this method. It works very well in my OS, and it's pretty speedy, too.

Is that what you were imagining, Avarok?

EDIT: I should also mention that the vtable malarky is the reason every function has a static wrapper - so it can be referenced as a void* and not a pointer-to-member.
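To illustrate the difference (a sketch with invented names `Widget`, `applyThunk`, `callViaThunk` and `callViaMember`, not code from my OS): a pointer-to-member-function is a separate kind of type from an ordinary pointer, whereas a static wrapper is just a function and can be held in a plain function pointer.

```cpp
#include <cassert>
#include <type_traits>

struct Widget {
    int scale = 2;
    int apply(int x) { return x * scale; }

    // Static trampoline: an ordinary function, callable via a plain pointer.
    static int applyThunk(Widget* self, int x) { return self->apply(x); }
};

using PlainFn  = int (*)(Widget*, int);
using MemberFn = int (Widget::*)(int);

// A pointer-to-member is its own category of type, not an ordinary pointer.
static_assert(std::is_member_function_pointer<MemberFn>::value, "member pointer");
static_assert(!std::is_pointer<MemberFn>::value, "not an ordinary pointer");
static_assert(std::is_pointer<PlainFn>::value, "ordinary function pointer");

// Call through the static wrapper, passing the object explicitly.
int callViaThunk(PlainFn fn, Widget& w, int x) {
    assert(fn != nullptr);
    return fn(&w, x);
}

// Call through a pointer-to-member; needs the class definition in scope.
int callViaMember(MemberFn fn, Widget& w, int x) {
    return (w.*fn)(x);
}
```

Both routes give the same answer when the class definitions agree; the wrapper just avoids depending on how the compiler encodes member pointers. Strictly speaking, converting even a plain function pointer to `void*` is only conditionally supported in C++, though it works on the usual desktop compilers.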

JamesM
Avarok
Member
Posts: 102
Joined: Thu Aug 30, 2007 9:09 pm

Post by Avarok »

Well James, I'm developing for the x86-64. So yes, all code can very easily be position independent via RIP Relative Addressing for the code "block/array/space".

The stack is always managed through RBP and RSP, so you simply need to change those registers to reflect the change.

If a thread was programmed to migrate, it could rather efficiently do so on this platform. I just don't see the reason at this time.
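One caveat if the stack itself ends up at a different address: besides RSP and RBP, any saved frame pointers stored inside the stack point at the old location and need rebasing too (as would any other pointers into the stack, which this ignores). A toy sketch of just that fix-up, with made-up names (`ToyContext`, `ToyStack`, `relocate`) and assuming code that keeps a frame-pointer chain terminated by a zero saved RBP:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The two stack registers of a paused thread (toy model).
struct ToyContext {
    std::uint64_t rsp;
    std::uint64_t rbp;
};

// A stack image: words[i] lives at address base + 8*i.
struct ToyStack {
    std::uint64_t base;
    std::vector<std::uint64_t> words;

    std::uint64_t& at(std::uint64_t addr) {
        assert(addr >= base && (addr - base) % 8 == 0);
        return words.at((addr - base) / 8);
    }
};

// Move the stack image to newBase and return the adjusted registers.
// Assumes [rbp] holds the caller's saved RBP and the outermost frame saved 0.
ToyContext relocate(ToyStack& s, ToyContext ctx, std::uint64_t newBase) {
    const std::uint64_t delta = newBase - s.base;  // unsigned wrap handles moves down
    std::uint64_t fp = ctx.rbp;
    while (fp != 0) {
        std::uint64_t& saved = s.at(fp);  // still addressed via the old base
        const std::uint64_t next = saved;
        if (next != 0) {
            saved = next + delta;         // rebase the saved frame pointer
        }
        fp = next;
    }
    s.base = newBase;
    return ToyContext{ctx.rsp + delta, ctx.rbp + delta};
}
```

With frame pointers omitted (a common compiler option) there is no chain to walk, so in that case only the two registers would change, as described above.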

As for your RPC protocol, it looks cool. :)

I suck with templates, so it took a bit to actually read what you posted. :roll: