Contexts, Threads, Capabilities and Interfaces

Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

Pype.Clicker, your analysis is artful!

The idea of thread-specific heaps sounds good, but requires a bit of extra work. In the standard model (as far as I'm aware), the heap is always mapped at the end of the data segment. Mapping, however, is handled at the context (room) level, so special locations would be needed for both the stack and the heap if both were to migrate.

I've actually banged out a pseudo-implementation of how this could all work. So far only the Permit() call is pseudocoded, but it's something to look at. It's attached and relies on the very basic capability unit I've written, which I can't attach right now.

Code: Select all

type
 TCapability = packed record
  lwObjectID: longword;
  wRights: word;
  {CheckValue holds lwObjectID, wRights and a longword check value, all encrypted with a longword key.}
  CheckValue: array [0..9] of byte;
 end;
 PCapability = ^TCapability;

procedure MakeCapability(var Cap: TCapability; lwCheckValue: longword);
function CheckCapability(Cap: TCapability; lwCheckValue: longword): Boolean;

function GetObjectID(): longword;
That's the gist of it, really. MakeCapability() and CheckCapability() do exactly as they say, currently with a very basic encryption algorithm that I really, really hope can't be broken too easily. But that's for another post, and I've isolated the encryption work in those two functions anyway to make it easier to change. GetObjectID() increments a private global variable (in the implementation section of the Pascal unit, where other units that use this one can't see it) and returns its old value.
Pype.Clicker

Re:Contexts, Threads, Capabilities and Interfaces

Post by Pype.Clicker »

Crazed123 wrote: currently with a very basic encryption algorithm that I really, really hope can't be broken too easily.
Well, unless you're a trained genius, your encryption is very likely to be quickly broken or fooled. A couple of generic principles worth recalling:
- make use of timestamps so that any encrypted message has a limited validity period
- make use of enough redundancy so that you can distinguish the message from random garbage

For capability purposes, I'd be tempted to use a digital signature: the capability is cleartext, but it comes with a validator that is basically a one-way hash (MD5 or SHA should do the trick) of that cleartext together with a secret value that only the kernel knows. E.g.

"capability create (...)" will generate "capability.signature=md5(capability.raw,kernel_secret);"
and "capability_check (...)" will ensure that "md5(capability.raw, kernel_secret) == capability.signature".

Of course, one should ensure that the kernel's secret is never disclosed, so that creating and validating capabilities is only achievable by the kernel. Note, too, that you cannot update a capability without asking the kernel to sign it again ...

In a way, one quickly notices that this scheme adds little compared to privately kept capabilities (e.g. capabilities stored at kernel level, where the kernel is the only one that has to worry about them). What can be more interesting is the case where a server "X" creates a capability using its own private secret and delegates that capability to some client "Y", and wants "Y" to be able to prove to the kernel that it has the right to do what "X" offered. This is possible if X has an identifier known by the kernel and the "private secret" of every server is known by the kernel as well. That way, we can have

Code: Select all

 capability_check(cap) {
   /* hash the capability's cleartext fields with the issuing server's secret */
   return cap.signature == md5(cap.raw, too_many_secrets[cap.server_id]);
 }
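The create side of that delegation scheme, sketched in the same pseudo-C style (my_server_id and my_private_secret are placeholder names for server "X"'s identity and secret, not anything defined earlier), would be roughly:

Code: Select all

 /* runs inside server X, which only knows its own secret */
 capability_create(cap) {
   cap.server_id = my_server_id;
   /* the kernel holds the same value in too_many_secrets[my_server_id] */
   cap.signature = md5(cap.raw, my_private_secret);
   return cap;
 }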
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

Well, I'm no trained genius, but I had already come up with the idea of signing capabilities. Look at the code above for capabilities. CheckValue is an array of bytes, "unsigned char CheckValue[10]" if you want it in C. To sign a capability with MakeCapability(), first lwObjectID, then wRights, is copied into the corresponding positions in CheckValue, followed by lwCheckValue itself. The entire thing (CheckValue) is then encrypted using lwCheckValue as the key. This key is kept private by the object making a capability for a client.

CheckCapability(), in turn, decrypts its copy of CheckValue using the given private key (lwCheckValue) and then checks the decrypted data against lwCheckValue, wRights and lwObjectID.

Code: Select all

var
 lwNextObjectID: longword;

procedure MakeCapability(var Cap: TCapability; lwCheckValue: longword);
var
 i, j: longword;
begin
 {Copy lwObjectID and wRights, then the check value itself, into CheckValue.}
 Move(Cap.lwObjectID, Cap.CheckValue[0], sizeof(longword) + sizeof(word));
 Move(lwCheckValue, Cap.CheckValue[6], sizeof(longword));
 {This encryption sucks, but use it for starters.}
 j := 0;
 for i := 0 to 10 - 1 do
  begin
   Cap.CheckValue[i] := Cap.CheckValue[i] + not PByte(@lwCheckValue)[j];
   j := j + 1;
   if j = sizeof(longword) then
    j := 0;
  end;
end;

function CheckCapability(Cap: TCapability; lwCheckValue: longword): Boolean;
var
 i, j: longword;
begin
 CheckCapability := false;
 {Decrypt CheckValue with the given key...}
 j := 0;
 for i := 0 to 10 - 1 do
  begin
   Cap.CheckValue[i] := Cap.CheckValue[i] - not PByte(@lwCheckValue)[j];
   j := j + 1;
   if j = sizeof(longword) then
    j := 0;
  end;
 {...then compare the decrypted fields against the cleartext ones and the key.}
 if (PLongword(@Cap.CheckValue[0])^ = Cap.lwObjectID) and
    (PWord(@Cap.CheckValue[4])^ = Cap.wRights) and
    (PLongword(@Cap.CheckValue[6])^ = lwCheckValue) then
  CheckCapability := true;
end;

function GetObjectID(): longword;
begin
 GetObjectID := lwNextObjectID;
 lwNextObjectID := lwNextObjectID + 1;
end;
There's the current implementation, and I'm very sorry if the indentation doesn't render correctly in code tags. Hopefully it's readable anyway.

I'm no genius, but I've learned my basics.
Pype.Clicker

Re:Contexts, Threads, Capabilities and Interfaces

Post by Pype.Clicker »

Well, the problem is that your "hash" is trivially easy for an attacker to defeat (if I got the code correctly, which I cannot guarantee, since I'm not that used to Pascal).

for instance,

Code: Select all

capability_check[0] == oid[0] + secret_0
Since I can read capability_check[0] and oid[0], I can easily guess secret_0.

Now let's say all I can see is capability_check[0] (not oid[0]); I can still change the rights I have on the object by changing capability_check[4] accordingly: if the initial rights value was X, it has been encoded as X + secret_2, so if I want Y instead, all I have to do is add (Y - X) to the value found in capability_check[4].

Even if I didn't have the code, it would still be quite easy to enumerate all the possible "rights" values for a given capability, or to save a capability and splice the encrypted 'rights' bits from one for a file I can open onto one for a file I shouldn't be able to open...
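To make that concrete, here is a rough illustrative sketch in C (the struct mirrors the Pascal record above, assuming 32-bit longs; forge_rights_byte is a hypothetical helper, not part of the attached unit) of the rights-forging trick just described:

Code: Select all

/* Hypothetical C mirror of the TCapability record above. */
typedef struct {
    unsigned long  lwObjectID;
    unsigned short wRights;
    unsigned char  CheckValue[10];  /* bytes 0..3: oid, 4..5: rights, 6..9: check value (all "encrypted") */
} TCapability;

/* With the additive cipher, CheckValue[4] == low_byte(wRights) + k for some fixed,
   unknown key-derived byte k.  Turning rights X into rights Y therefore only needs
   (Y - X) added to the encrypted byte, plus the cleartext field updated to match. */
void forge_rights_byte(TCapability *cap, unsigned char new_low_byte)
{
    unsigned char old_low_byte = (unsigned char)(cap->wRights & 0xFF);
    cap->CheckValue[4] = (unsigned char)(cap->CheckValue[4] + (unsigned char)(new_low_byte - old_low_byte));
    cap->wRights = (unsigned short)((cap->wRights & 0xFF00) | new_low_byte);
}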

On the other hand, one-way hash functions like MD5 have the properties that:
1. if you change one bit of the source, virtually all the bits of the result change
2. it is very hard to construct a source that produces a given hash
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

I'll write the code here in C just in case you didn't get it, because I didn't understand the part of your post with the bullets. Maybe I didn't read it correctly, or did you type some BBCode wrong? I don't know, but I could use the code in C anyway.

And I'm looking into how MD5 works.

Code: Select all

void MakeCapability(TCapability* Cap, unsigned long ulCheckValue)
{
 /* Copy ulObjectID and wRights, then the check value itself, into the signature. */
 memcpy(&Cap->Signature[0], &Cap->ulObjectID, sizeof(unsigned long) + sizeof(unsigned short));
 memcpy(&Cap->Signature[6], &ulCheckValue, sizeof(unsigned long));
 /* Encrypt each byte: add the complement of one key byte, then XOR with another. */
 for(int i = 0; i < 10; i++)
  Cap->Signature[i] = (Cap->Signature[i]
   + ~((unsigned char*)&ulCheckValue)[(i % (sizeof(unsigned long) * sizeof(unsigned long))) / sizeof(unsigned long)])
   ^ ((unsigned char*)&ulCheckValue)[i % sizeof(unsigned long)];
}

int CheckCapability(TCapability Cap, unsigned long ulCheckValue)
{
 int iValid = 0;
 /* Undo the encryption: XOR first, then subtract the complemented key byte. */
 for(int i = 0; i < 10; i++)
  Cap.Signature[i] = (Cap.Signature[i]
   ^ ((unsigned char*)&ulCheckValue)[i % sizeof(unsigned long)])
   - ~((unsigned char*)&ulCheckValue)[(i % (sizeof(unsigned long) * sizeof(unsigned long))) / sizeof(unsigned long)];
 if( (*((unsigned long*)&Cap.Signature[0]) == Cap.ulObjectID) &&
     (*((unsigned short*)&Cap.Signature[4]) == Cap.wRights) &&
     (*((unsigned long*)&Cap.Signature[6]) == ulCheckValue) )
  iValid = 1;
 return iValid;
}
Again, I'm looking into MD5 and other hash algorithms. I probably won't use MD5 itself because it pads its input out to make the number of bits in it divisible by 512. This means that for a 6-byte capability value, there will be 58 bytes of padding. Not doing it.

[EDIT] Changed the hashing algorithm. The new cipher is in the source code above, which has been changed.

And you were right. My old cipher was trivial to break.[/EDIT]

[edit again] Changed the cipher to use a 64-bit key instead of a 32-bit one, because I didn't like the way that, once a couple of values in the key were guessed by brute force (since one value in 256 must be right for each byte), an attacker could start calculating the rest by plain algebra. Hopefully having 8 bytes of key (expanded to 64 bytes of sub-keys by the pKey type code) will help. I'm probably going to have to find ANOTHER hash function, because the next poster will explain how to break this one in three easy steps...[/edit again]
Pype.Clicker

Re:Contexts, Threads, Capabilities and Interfaces

Post by Pype.Clicker »

Indeed, there was weird stuff in my post. All the xyz[0] became bullets :P

Honestly, I strongly encourage you to reuse an existing, peer-reviewed hash if you want comfortable protection against cheaters in your capability-based system.

And about the padding, you don't need to worry that much: just use 58 bytes of secret rather than 58 bytes of padding ;)

e.g.

Code: Select all

struct stuff_to_sign {
    capability_t cap;
    char secret[512/8 - sizeof(capability_t)];   /* fill the MD5 block with secret instead of padding */
} sign_this;

init() {
    create_some_secret(&sign_this.secret, sizeof(sign_this.secret));
}

makeCapability(...) {
    sign_this.cap=*cap;
    sign_this.cap.signature=empty_signature;     /* hash with the signature field blanked, like the check below */
    cap->signature=md5hash(&sign_this,sizeof(sign_this));
}

checkCapability(...) {
    sign_this.cap=*cap;
    sign_this.cap.signature=empty_signature;
    return md5hash(&sign_this,sizeof(sign_this)) == cap->signature;
}
(okay, i'm overloading the meaning of == here ... sue me)
Kemp

Re:Contexts, Threads, Capabilities and Interfaces

Post by Kemp »

I probably won't use MD5 itself because it pads its input out to make the number of bits in it divisible by 512
Surely it's the output you should be worried about? Messing with the input before processing it won't waste any space in the final result as the output is fixed length with MD5 anyway (of course, this fixed length may be longer than you want). Though the conversation is getting quite involved, so I might have misunderstood what you meant.

btw, Pype, your analogy with RPGs was possibly the best I've seen yet ;D
Pype.Clicker

Re:Contexts, Threads, Capabilities and Interfaces

Post by Pype.Clicker »

Kemp wrote: btw, Pype, your analogy with RPGs was possibly the best I've seen yet ;D
Well, actually it's mainly due to my research activities in "mobile agents", "active networking" and the like ... except that in those situations the rooms may be located on various systems, and your "agents" (the players) may be moving from PC to PC carrying code & data of their own with them...
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

I'd be quite happy to use a peer-reviewed encryption algorithm. However, MD5, SHA-0 and SHA-1 have all had attacks discovered against them. Also, I'd rather just encrypt the secret data (which I've made 8 bytes now) than hash it, as no imperfect hash is perfectly secure. If you could point me towards a peer-reviewed encryption algorithm that can be used on small amounts of data, I'd be quite happy.
Soap_

Re:Contexts, Threads, Capabilities and Interfaces

Post by Soap_ »

Crazed, how about using XTEA (http://en.wikipedia.org/wiki/XTEA) for encrypting caps? It has the advantage that it's fast, easy to implement, works on small amounts of data and is - yes - peer reviewed.
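For reference, the two XTEA routines, following the widely published reference code (128-bit key, 64-bit block; 32 cycles is the usual recommendation), look roughly like this:

Code: Select all

#include <stdint.h>

/* XTEA reference cipher: transforms one 64-bit block in place with a 128-bit key.
   num_rounds is the number of cycles; 32 is the commonly recommended value. */
void xtea_encipher(unsigned int num_rounds, uint32_t v[2], const uint32_t key[4])
{
    uint32_t v0 = v[0], v1 = v[1], sum = 0, delta = 0x9E3779B9;
    for (unsigned int i = 0; i < num_rounds; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += delta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0; v[1] = v1;
}

void xtea_decipher(unsigned int num_rounds, uint32_t v[2], const uint32_t key[4])
{
    uint32_t v0 = v[0], v1 = v[1], delta = 0x9E3779B9, sum = delta * num_rounds;
    for (unsigned int i = 0; i < num_rounds; i++) {
        v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
        sum -= delta;
        v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    v[0] = v0; v[1] = v1;
}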
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

^_^ Thanks a ton, it looks like I've found an algorithm. My biggest trouble with this one is having to insert two bytes of padding into the data to encrypt so that it's a multiple of the block size! Not that that's a problem. Just wondering if there's any useful information that could be encoded there instead of pure padding.
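One way to lay that out, as a rough sketch (all names here are illustrative rather than taken from the actual unit; xtea_encipher is the reference routine quoted above):

Code: Select all

#include <stdint.h>
#include <string.h>

void xtea_encipher(unsigned int num_rounds, uint32_t v[2], const uint32_t key[4]);

/* Illustrative packing of the signed capability data into one XTEA block:
   4 bytes object ID + 2 bytes rights + 2 bytes padding = 8 bytes. */
typedef struct {
    uint32_t object_id;
    uint16_t rights;
    uint8_t  signature[8];   /* the enciphered block, stored beside the cleartext fields */
} capability_t;

void sign_capability(capability_t *cap, const uint32_t key[4])
{
    uint8_t  raw[8] = {0};        /* the last two bytes stay as zero padding */
    uint32_t block[2];

    memcpy(&raw[0], &cap->object_id, 4);
    memcpy(&raw[4], &cap->rights, 2);
    memcpy(block, raw, 8);
    xtea_encipher(32, block, key);
    memcpy(cap->signature, block, 8);
}
The two spare bytes could just as well carry something meaningful instead of zeroes, e.g. a format version or part of a timestamp along the lines suggested earlier in the thread.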
JoeKayzA

Re:Contexts, Threads, Capabilities and Interfaces

Post by JoeKayzA »

This talk seems to have gone into hashing algorithms, but anyway, this still seems to fit in here.

I've spent a few more thoughts on thread migration, especially the issue of whether a migrating thread should take its whole stack with it, or get a new stack in the destination address space. I know you think I'm a bit keen on this, but I'm about to decide whether or not to support this mechanism, and finally how to implement it.

So, one method would be to map the thread's whole stack (which would then need a fixed size) into the destination address space as soon as the thread migrates there, while still keeping the mapping in the address space the thread came from.

The argument for this method is that any arguments the thread wants to transport can be put onto the stack, and are then naturally accessible from within any address space in which the thread operates (since the whole stack is accessible there).

The second method would be to allocate a new stack (or take a pre-allocated stack) in the destination address space.

Advantages here are:
  • The newly allocated stack can be of a different size (depending on the call depth and stack usage common in the local code)
  • It can save address space, since the newly allocated stack can be significantly smaller
The main disadvantage is that all arguments and data that need to be transferred have to be put into a separate location before the thread migrates; in some cases this means copying data from the stack to some other memory location.

The main reason why I would prefer the latter method is that even when you map the whole stack into the destination address space, you cannot rely on preserving the same base address, which invalidates all the pointers that lie on the stack. This makes it impossible to make thread migration implicit: before you migrate, you'd have to adjust pointers (and also make heap objects available in the destination address space; you can't just take the whole heap with you during migration).

So instead you could start off with a new stack, and when you need to take an object with you, you declare it in some 'argument list' that the kernel interprets, copying the corresponding objects into the destination address space (and back again, of course, when the thread migrates back into the caller's address space). When one of these objects resides on the stack, you'd simply put it on the list as well.
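As a rough sketch of what such a call could look like (every name here is hypothetical, just one possible shape for the interface):

Code: Select all

#include <stddef.h>

/* Describes one object the kernel should copy into the destination
   address space before the thread starts running there. */
typedef struct {
    void   *addr;   /* where the object lives in the source address space */
    size_t  len;    /* how many bytes to copy */
} migrate_arg_t;

/* Migrate the calling thread into dest_context.  The kernel copies each
   described object across, rewrites the descriptors with the new addresses,
   and copies everything back when the thread migrates back to the caller. */
int thread_migrate(unsigned long dest_context,
                   migrate_arg_t *args, size_t arg_count);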

So, finally, if you have also planned to implement migrating threads, or have already implemented them, I would be thankful for some comments about how you solved the stack issue. Aside from this, I'd be interested in what you think about the concept in general: do you think it could be useful or not, and why?

THX for reading through this post, I'm afraid it has got a bit longer than I expected. ::)

cheers, Joe
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

I certainly haven't managed to implement migrating threads, as I've been gone for the weekend.

I think that asking the thread for a list of arguments it would like to place onto its new stack is an excellent idea, with one real flaw: how can the thread specify those arguments? In order to pass a dynamic array or linked list in C, it would have to allocate a series of structures on the very heap of the context/process it plans to leave behind. So some kind of weird arrangement would be needed with the kernel to tell it which allocator's free function to call on that memory once the thread has successfully migrated. For this reason, and because programmers can darn well write code with full awareness that they're about to migrate a thread, I really don't think passing an argument list in user memory that the kernel then has to delete is a good idea.

A better solution might be to pass a number of bytes for the kernel to take off the top of the old thread's stack. These are then placed at the bottom of an entirely new stack (whose size can be small, large, equal to the old stack's, whatever), and the old stack is disposed of. This way an "argument list" (probably for the thread's new entry point) can still be passed, but the kernel can delete that memory without having to know anything about user-level structures and allocators.
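In other words, something shaped roughly like this (purely illustrative names again):

Code: Select all

/* Migrate the calling thread into dest_context.  The kernel copies arg_bytes
   from the top of the old stack onto the bottom of a freshly allocated stack
   of new_stack_size bytes, then frees the old stack.  It never needs to know
   anything about user-level structures or allocators. */
int thread_migrate_bytes(unsigned long dest_context,
                         unsigned long arg_bytes,
                         unsigned long new_stack_size);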
Colonel Kernel

Re:Contexts, Threads, Capabilities and Interfaces

Post by Colonel Kernel »

JoeKayzA wrote: Aside from this, I'd be interested in what you think about the concept in general: do you think it could be useful or not, and why?
LOL... isn't that the first question you should be asking, before all the implementation details? ;)

In the second case where the thread has to marshal its parameters before migrating -- what's the difference between this scheme, and an OS that uses synchronous IPC with priority inheritance? In the synchronous IPC case, the client thread packages up the parameters in a message, sends it to the server thread, and blocks until the reply comes back. The server thread inherits the client thread's priority. It has a different thread ID, but I don't see how this makes any functional difference...
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
Crazed123

Re:Contexts, Threads, Capabilities and Interfaces

Post by Crazed123 »

The functional difference is that a migrated thread can stay within its new context permanently.

I've actually managed to figure out a good set of system calls for interface/context handling. They use capabilities to determine who can do what:

Code: Select all

type
 TCapability = packed record
  lwContextID: longword;
  wObjectID,wObjectRights: word;       {This means the signature key is 128 bits long.}
  Signature: array [0..23] of byte;
 end;
 PCapability = ^TCapability;

 TInterfaceRights = (irLink = 0, irMigrate = 1);
 TContextRights = (crImport = 0, crPermit = 1, crMatch = 2, crLink = 4);

 {128-bit key type; matches the signature key size noted above.}
 TKey = array [0..3] of longword;

{
    Whatever one of these points to is only ever linked to by the kernel. Permit finds out what
    rights the context will grant for the named interface right now; if szInterface is nil, it asks
    about the context itself.
}
 TUnknownInterface = packed record
  Permit: function(szInterface: PChar): TCapability;
  {Given the current state of the thread, get the key the context wants to use for this object.}
  Key: procedure(wObjectID: word; var Key: TKey);
 end;

{
   Asks the kernel for a capability to import a certain interface.
   If nil is passed for szInterface it asks a context what rights another context has to load it at all.
}
function Permit(szInterface: PChar; cidContext: longword): TCapability;

{
  Returns whether the given interface definition matches the one the context exports.
}
function Match(icContext: TCapability; szInterfaceDefinition: PChar): integer;

{ 
  Obtains an interface table for an interface if the capability has linking rights.
}
function Link(icInterface: TCapability; pLinkTable: PPointer; lwTableEntries: longword): integer;

{
   Loads a context into virtual address space based on a capability.
}
function Import(icContext: TCapability): integer;

{
   Finds a context id based on a context name or an interface it supports.
}
function FindContext(szContext: PChar; bInterface: Boolean): integer;

{
  If a special capability with superuser rights is passed in, the other capability is automatically made valid by its authority.
}
function Super(cSuper: TCapability; var Capability: TCapability): integer;
I would have posted a source file, but this is taken from multiple files. Particular things I'd like some help/opinions on are what szInterfaceDefinition in the Match() call should be (because I'm somewhat wary of writing a full-blown IDL interpreter in the kernel for converting to internal data structures; perhaps a stripped-down version?), and what the actual internal data structures of interfaces should be.