A Modular System for Detecting EXE Type (C++)

Posted: Tue Feb 05, 2008 10:27 am
by AJ
Hello,

In my OS, I'm just creating a module loader. This works by having a 'Modules' class which stores a linked list of currently loaded Module classes (where Module is the common base class). There will then be a system of classes such as ELF, COFF etc...

When a module is loaded into memory, I want to be able to do something like:

Code:

Modules.Add(void *start, unsigned length);
The 'Modules' container will then establish what format a module is in, and add the appropriate child class to the linked list. Ideally, each of the child classes would have a static function (e.g. ELF::IsValid(void *start, unsigned length); ) which the 'Modules' container can call in order to establish which of the classes to construct.

For this, however, the Modules container needs to know in advance what executable types are available (to know how to call 'IsValid' in the first place). I would like to be able to dynamically add an executable format if required, or at least add a new executable format without manually tinkering with the 'Modules' class.

All the ways I can see around this are either wasteful of resources or a major coding headache. I would really like a nice neat way of doing things while keeping the executable type detection code within the individual executable type classes.

I would be grateful if anyone has any thoughts on a neat way of achieving this.

Cheers,
Adam

Posted: Tue Feb 05, 2008 11:32 am
by lukem95
Well, ELF and COFF both have a signature in the first few bytes, as I'm sure most (if not all) other executable formats do.

Why don't you write a routine that switches on the first few bytes to determine which format matches, and then returns a number assigned to it?
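
A minimal sketch of that kind of routine, for illustration only. The enum and function names are hypothetical, not from anyone's kernel. It checks the ELF magic (0x7F 'E' 'L' 'F') and the "MZ" signature used by DOS/PE executables. COFF is left out on purpose: as far as I know, a bare COFF file starts with a machine-type field rather than a fixed signature, so its check would need to compare against the machine values you support.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical format identifiers; the names are illustrative only.
enum class ExeFormat { Unknown, Elf, MzExe };

// Inspect the leading bytes of an image and report which format they match.
ExeFormat DetectFormat(const std::uint8_t *buf, std::size_t size)
{
    // ELF files begin with 0x7F followed by the ASCII letters "ELF".
    if (size >= 4 && buf[0] == 0x7F && buf[1] == 'E' &&
        buf[2] == 'L' && buf[3] == 'F')
        return ExeFormat::Elf;

    // DOS and PE executables begin with the ASCII letters "MZ".
    if (size >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return ExeFormat::MzExe;

    return ExeFormat::Unknown;
}
```

The catch, relative to the original question, is that every new format means editing this one function.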

Posted: Tue Feb 05, 2008 1:56 pm
by proxy
The way I plan on handling this is that I have a class:

class FileType {
public:
static FileType *instance(const uint8_t *buf, std::size_t size);
};

and that static function will return an object of a type which inherits from FileType and implements its generic interface.

So then you just do:

FileType *const ft = FileType::instance(buf, size);

and ft will be a pointer to the right kind of handler. Beyond that, I have an auto registration system which uses static global constructors to register all my drivers pre-main.

proxy

Posted: Tue Feb 05, 2008 2:05 pm
by Tyler
In my last kernel I had separate blobs of code that were part of the Executable Format Driver. These blobs of code were required to be PIC (position-independent code), and could be injected into other code. So when registering an executable format, the driver simply passes on its code blob and sets it up to be called. Then, when checking a file's type, the kernel ran each blob, in an order based upon preference and likelihood of appearance, until one returned positive.

This way, it is not necessary for all drivers to be loaded in order to check a file. It is also very simple to have the list of file types to check point into already loaded code instead of having the blob loaded twice (in the collection and in the driver). Unfortunately, this is not simple to implement in a standard ELF module with standard compilers and linkers, but it is still quite possible.

EDIT: Ha, I didn't read the very last bit of your post, where you specifically request that the code stays inside the driver. Oh well, I'll keep this post here anyway.

Posted: Wed Feb 06, 2008 2:44 am
by AJ
Thanks for the suggestions, guys. Thinking ahead, looks like I'll need a similar system at some point to detect what file system is on a particular device too....

I was reading about the factory design pattern and am wondering if there's some clean way there to achieve what I want - if not, I'll just have to do as proxy suggests, which means that the main 'Modules' class has to know about all the file types statically - I guess there's not much call for dynamically adding executable formats anyway :)

The 'waste of resources' way would be to have one instance of each class created at boot time. When an exe is loaded, 'Initialise()' is called on the class. If this class is then found to be valid, that class is returned. If not, the next exe type class in the list is called. This means having one class constantly installed for each exe type and could soon end up a bit of a jumble.

Cheers,
Adam

Posted: Wed Feb 06, 2008 11:05 am
by Ready4Dis
Well, this is similar to a problem I had when I was doing gamedev, for loading image files. I had a base image class with virtual functions for checking for a valid file, loading, and unloading. Then I had an image loading class that was just a list of loaders... I could call AddLoader, and it would add whatever was passed its way...

ImageLoad.Add(new BitmapLoader_C);
ImageLoad.Add(new TGALoader_C);

Then to load an image, I just call ImageLoad.Load("Test.bmp");. It would automatically call BitmapLoader_C and TGALoader_C to check if they support the file format. I could dynamically add file formats just by adding the new class (which must inherit from the base Loader_C class and override the correct virtual functions). So, depending on what you needed, I would have a base class for EXE files...

Code:

class BaseModule_C
{
public:
   virtual ~BaseModule_C() {} //Loaders are handled through base pointers
   virtual char IsValid(char *Data, int Size) = 0; //Check if it's valid
   virtual int FindSymbol(char *Data, const char *Symbol, int Size) = 0; //Find a symbol in the file
   virtual void Relocate(char *Data, int Size, int Address) = 0; //Relocate to this address
};

class ModuleLoader_C
{
public:
//Any type of list you prefer
   List_C<BaseModule_C*> ModuleList;
   void AddLoader(BaseModule_C *ptrLoader); //Add this one to our list...
   BaseModule_C* FindLoader(char *Data, int Size); //Return the loader that can load it
};
You can then create an EXE loader, a COFF loader, a.out, ELF and so on by inheriting from the base class, then writing a bit of code to detect the magic values and check for a valid header. You would find the loader, then you can perform whatever you need on the module.

Code:

ModuleLoader_C ModuleLoader;

int main(int argc, char *argv[])
{
   BaseModule_C *ptrLoader;
   int SymbolStart;
   ModuleLoader.AddLoader(new EXELoader_C);
   ModuleLoader.AddLoader(new CoffLoader_C);
   ModuleLoader.AddLoader(new ELFLoader_C);
//Get the loader that can read this :)
   ptrLoader = ModuleLoader.FindLoader(FileData,FileSize);
   if (!ptrLoader)
      return 1; //Assuming FindLoader returns null when no loader matches
//Relocate the file
   ptrLoader->Relocate(FileData,FileSize,LoadAddress);
//find a symbol
   SymbolStart = ptrLoader->FindSymbol(FileData,"Start",FileSize);
   return 0;
}
You can abstract it however far you'd like, but that is the general idea... you can dynamically add module loaders by inheriting from the base class, even while the system is running, without issue.
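
For completeness, a rough sketch of how AddLoader and FindLoader might be filled in. This is an assumption about the missing bodies, not the code from the image loader described above: std::vector stands in for List_C (a kernel would use its own list), the base class is cut down to the detection call, IsValid returns bool and takes const data, and FindLoader is assumed to return nullptr when nothing matches.

```cpp
#include <cstddef>
#include <vector>

// Base class as in the post, reduced to the detection call.
class BaseModule_C {
public:
    virtual ~BaseModule_C() {}
    virtual bool IsValid(const char *Data, int Size) = 0;
};

// Example loader: accepts images that start with the ELF magic bytes.
class ELFLoader_C : public BaseModule_C {
public:
    bool IsValid(const char *Data, int Size) override
    {
        return Size >= 4 && Data[0] == 0x7F && Data[1] == 'E' &&
               Data[2] == 'L' && Data[3] == 'F';
    }
};

class ModuleLoader_C {
public:
    // Add this loader to the list; the caller keeps ownership in this sketch.
    void AddLoader(BaseModule_C *ptrLoader) { ModuleList.push_back(ptrLoader); }

    // Return the first registered loader that accepts the data, or nullptr.
    BaseModule_C *FindLoader(const char *Data, int Size)
    {
        for (std::size_t i = 0; i < ModuleList.size(); ++i)
            if (ModuleList[i]->IsValid(Data, Size))
                return ModuleList[i];
        return nullptr;
    }

private:
    // std::vector stands in for List_C here.
    std::vector<BaseModule_C *> ModuleList;
};
```

Loaders are tried in the order they were added, so it makes sense to register the most common format first.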