hi.
I have started working on my own OS, and for now I've made a simple console kernel in NASM.
Now I want to execute an app, hopefully in exe or com format.
Does anyone know how I can do that, and could you send me an example?
thanks.
ivan
executing apps
Re:executing apps
I'm not sure how far you've come in your OS development, but there's a lot of stuff you have to worry about when it comes to running other programs.
Is your OS multitasking (i.e., at least able to run two functions in the kernel at once)? Once it can do that, you can worry about jumping into another program.
Do you have some sort of interrupt or call gates in your OS so that the programs can interact with the OS (even just to be able to print to the screen, the program will need something like this)?
Have you written at least some standard library calls that will tell the program how to do stuff like tell the operating system to print to the screen (this is where those interrupt gates or call gates come in handy)?
Do you have enough of a memory manager that, if you use paging, you can handle the different virtual address spaces of the executing processes, or at least handle requests to allocate memory for the processes, etc?
Once you have this kind of operating system support for running processes, you can look into the documents online (or open source code) that show how PE (exe) files are laid out. Basically it comes down to parsing a header that tells you what you need to know about the executable (like where the entry point into the program is), creating a space for the program to execute in (following the header information about how big it is), and then either jumping to the entry point (if there is no multitasking) or telling your scheduler about the new process so that it can be jumped to when its turn comes up.
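As a rough illustration of the "parse a header" step, here is a minimal sketch in Python (used only as a readable model, not as kernel code). It assumes a 32-bit PE32 image and reads just three fields; the function names `parse_pe_entry` and `make_fake_pe` are made up for this example, and the synthetic header built by `make_fake_pe` is not a runnable .exe.

```python
import struct

def parse_pe_entry(image: bytes):
    """Return (entry_point_rva, image_base, size_of_image) for a PE32 image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The dword at offset 0x3C (e_lfanew) is the file offset of the PE signature.
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    opt = pe_off + 4 + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    (entry_rva,) = struct.unpack_from("<I", image, opt + 16)      # AddressOfEntryPoint
    (image_base,) = struct.unpack_from("<I", image, opt + 28)     # ImageBase
    (size_of_image,) = struct.unpack_from("<I", image, opt + 56)  # SizeOfImage
    return entry_rva, image_base, size_of_image

def make_fake_pe(entry_rva: int, image_base: int, size_of_image: int) -> bytes:
    """Build a tiny synthetic header for demonstration (hypothetical helper)."""
    pe_off = 0x80
    buf = bytearray(pe_off + 4 + 20 + 96)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_off)
    buf[pe_off:pe_off + 4] = b"PE\0\0"
    opt = pe_off + 24
    struct.pack_into("<H", buf, opt, 0x10B)
    struct.pack_into("<I", buf, opt + 16, entry_rva)
    struct.pack_into("<I", buf, opt + 28, image_base)
    struct.pack_into("<I", buf, opt + 56, size_of_image)
    return bytes(buf)

entry, base, size = parse_pe_entry(make_fake_pe(0x1000, 0x400000, 0x3000))
print(hex(base + entry), hex(size))
```

A real loader would go on to read the section table and map each section; this only shows where the entry point and the size of the address space to reserve come from.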
- Pype.Clicker
Re:executing apps
... or, by any chance, are you working in real mode? (as suggested by the fact that you want to load COM programs).
If you are, just
- find a place in memory where to store your program (it will not be larger than 64K, though)
- load your program at that place, skipping the first 256 bytes (to comply with the COM standard) -- oh yes, this means you need a filesystem, even if it's Q&D
- put whatever information you'd like the program to have in the Program Segment Prefix (iirc), or check the information DOS used to put there and emulate it ...
- load ds and ss (and es, which DOS also sets) with the program's segment, point sp at the top of that segment, and jump to program_segment:0x100
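The steps above can be modelled roughly as follows. This is only a sketch in Python that treats memory as a byte array. The name `load_com`, the returned register dictionary and the initial `sp` of 0xFFFE are assumptions for illustration. The PSP fields it fills in (an `int 20h` at offset 0, and the command tail at 0x80) are the ones DOS itself puts there.

```python
PSP_SIZE = 0x100        # 256-byte Program Segment Prefix
SEGMENT_SIZE = 0x10000  # one real-mode segment, 64 KiB

def load_com(memory: bytearray, segment: int, program: bytes, cmdline: bytes = b""):
    """Place a .com image at segment:0x100 and return the initial registers."""
    base = segment << 4  # real-mode linear address of segment:0000
    if len(program) > SEGMENT_SIZE - PSP_SIZE:
        raise ValueError("program too large for a .com image")
    if len(cmdline) > 126:
        raise ValueError("command tail too long for the PSP")
    # Clear the PSP, then fill in the fields this sketch emulates.
    memory[base:base + PSP_SIZE] = bytes(PSP_SIZE)
    memory[base:base + 2] = b"\xCD\x20"          # int 20h at PSP:0000
    memory[base + 0x80] = len(cmdline)           # command tail length
    memory[base + 0x81:base + 0x81 + len(cmdline)] = cmdline
    memory[base + 0x81 + len(cmdline)] = 0x0D    # tail is terminated by CR
    # The program itself goes right after the PSP.
    start = base + PSP_SIZE
    memory[start:start + len(program)] = program
    return {"cs": segment, "ds": segment, "es": segment, "ss": segment,
            "ip": 0x100, "sp": 0xFFFE}

mem = bytearray(1 << 20)                         # 1 MiB of "real-mode" memory
regs = load_com(mem, 0x1000, b"\xB4\x4C\xCD\x21")  # mov ah,4Ch / int 21h
print(regs)
```

In a real kernel the last step is a far jump to `cs:ip` after loading the segment registers; here the function just reports what they should be.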
Re:executing apps
Keep forgetting about this whole "real mode" thing... Sorry, Ivan...I'm getting a little frustrated trying to get the same thing to work in my own OS...
Even in real mode, though, won't you need some sort of interrupt handler like DOS's interrupt 21h for the program to actually be able to DO anything, like print to the screen, unless the program itself was written to access the hardware directly? iirc, most com programs are designed to run on DOS using at least the int 21h services, but I guess there are some that know how to print to the screen and get input from the keyboard directly...
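To make the point concrete, here is a small Python model of what a minimal int 21h handler has to do. It covers only two DOS functions (AH=09h, print a '$'-terminated string at DS:DX, and AH=4Ch, terminate with the exit code in AL). The function name `int21`, the register dictionary and the `out` list standing in for the screen are assumptions made for this sketch.

```python
def int21(regs: dict, memory: bytes, out: list) -> bool:
    """Handle one int 21h call; return False when the program asks to exit."""
    ah = (regs["ax"] >> 8) & 0xFF
    if ah == 0x09:
        # DS:DX points at a string terminated by '$'.
        addr = (regs["ds"] << 4) + regs["dx"]
        end = memory.index(b"$", addr)
        out.append(memory[addr:end].decode("ascii"))
        return True
    if ah == 0x4C:
        # AL carries the exit code.
        regs["exit_code"] = regs["ax"] & 0xFF
        return False
    raise NotImplementedError(f"int 21h function {ah:#04x} not emulated")

mem = bytearray(0x20000)
mem[0x10200:0x10206] = b"hello$"
screen = []
int21({"ax": 0x0900, "ds": 0x1000, "dx": 0x0200}, mem, screen)
print(screen)
```

In a real kernel the same dispatch would live in the handler installed at interrupt vector 21h, switching on AH.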
Re:executing apps
The thing I don't get is the ORG directive in asm. If you leave it out, the assembler fills it in for you. But how can a program run at ... when it has an ORG of 0x300000?
Re:executing apps
Well, in protected mode with paging enabled, this is a simple matter, as you can give each process its own virtual address space. So when loading the program, you get from the header what address it expects as its org, and then you put it there in the virtual address space. Since paging is enabled, translations are done on the memory addresses, so it looks to the program like it is running at the ORG it needs, but in reality it can be mapped to any physical pages of memory.
In real mode, you would actually have to load it at the physical address it required, so if there was already data in that area you would not be able to run the program.
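A toy model of the paging case, in Python, might look like this. It assumes 4 KiB pages and a flat dictionary in place of real page tables; the class name `AddressSpace` and the frame numbers are invented for the example.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages

class AddressSpace:
    """One process's virtual-to-physical mapping (a dict instead of page tables)."""

    def __init__(self):
        self.pages = {}  # virtual page number -> physical frame number

    def map(self, virt: int, phys_frame: int) -> None:
        self.pages[virt // PAGE_SIZE] = phys_frame

    def translate(self, virt: int) -> int:
        page, offset = divmod(virt, PAGE_SIZE)
        if page not in self.pages:
            raise LookupError(f"page fault at {virt:#x}")
        return self.pages[page] * PAGE_SIZE + offset

# Two processes, both assembled with ORG 0x300000, in different physical frames.
a, b = AddressSpace(), AddressSpace()
a.map(0x300000, 0x120)
b.map(0x300000, 0x200)
print(hex(a.translate(0x300010)), hex(b.translate(0x300010)))
```

Both programs see themselves at 0x300000, while the physical addresses differ, which is why the ORG conflict disappears once each process has its own mapping.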
Re:executing apps
It can't, of course, unless you were very careful not to use any direct memory references (what is known as Position Independent Code). A flat binary object file (such as a DOS .com executable) can only be run beginning at the location of its origin relative to its CS base. So if (to use a real-mode example) CS is set to 0x1000 and your origin is at 0x100, then your code must be loaded at absolute address 0x10100 (0x1000 * 16 + 0x100), and the loader will have to jump to that address (not 0x10000, the beginning of the segment) for it to run correctly. This is in fact exactly what DOS does when it runs a .com file (the first 256 bytes of the segment are used to hold the Program Segment Prefix, a special data area used by DOS for certain bookkeeping details - it's something of a holdover from CP/M, like the .com executable format itself).
To give a p-mode case, assuming that the base of the CS segment is 0x00000000 (which is how it is in most systems), and the origin is 0x300000, then the code is always loaded at 0x300000, and again, the loader needs to jump to there in order to start the program.
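To show why the origin matters, here is a small Python sketch of the arithmetic. It assumes a single `mov eax, imm32` instruction (opcode B8) that refers to a label ten bytes into the file; the helper names are made up for illustration.

```python
import struct

def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address of segment:offset in real mode."""
    return (segment << 4) + offset

def encode_mov_eax_label(org: int, label_offset: int) -> bytes:
    """What a flat-binary assembler bakes in for `mov eax, label`:
    opcode B8 followed by org + label offset as a little-endian dword."""
    return b"\xB8" + struct.pack("<I", org + label_offset)

print(hex(real_mode_linear(0x1000, 0x100)))       # where a .com with CS=0x1000 starts
print(encode_mov_eax_label(0x300000, 0x0A).hex())  # address fixed at assembly time
```

Because the absolute address is fixed when the file is assembled, the bytes are only correct if the image really ends up at that origin (relative to the segment base).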
However, you have to keep in mind that the ORG directive only applies to flat-binary files, which generally are only used for certain very specific systems programming needs (primarily in the early stages of booting); the .com file format is something of an anomaly as these things go.
More typically, an executable file (whether a system program or an application) is in a relocatable format such as OMF, MZ (the DOS .exe format), PE, COFF, a.out, or ELF. When the object file is generated by the compiler, two of the steps in creating an actual executable image are delayed: adding in any referenced external code, and assigning the actual code origin and label address values. Instead, it creates a symbol table which holds the offsets for all the labels in the program, relative to a nominal base of zero. This table is a part of the object file, often preceding the code itself.
The first of these two binding stages is done by the linker (when the executable file is created, for static libraries, or at runtime, for dynamically shared libraries); it resolves all the external references, makes sure that all of them are correct, and adds the new label offsets to the symbol table. The result is what is usually called the executable file, but it is still not the final form of the program.
The second step is done at runtime by the loader, preferably delaying the binding to the last possible stage. The loader determines where the code should be loaded (based on where the process's allocated memory is), and generates the final program image by reading the relative offsets from the symbol table and patching the appropriate addresses into the code.
For example, if this assembly language program
Code:
;; test.asm
[extern baz]
main: mov eax, DWORD bar
      call baz
bar   dd "quux"
is assembled into FOO format, then the object file might look like
Code:
SYMTAB
00000000 "main" 00 00 00 00
00000001 "baz" ** ** ** **
00000002 "bar" 0A 00 00 00
OBJ
B8 *00000002
E8 *00000001
71757578
(All values in hex; keep in mind that x86 is a little endian format, so the address-offset for [tt]bar[/tt] is actually 0x0000000A). Then, if the linker puts [tt]baz[/tt] directly after the end of the data (normally it wouldn't, but we'll assume it for demonstration purposes), it would then change the line in the symbol table to
Code:
00000001 "baz" 0E 00 00 00
Finally, when the loader runs, it gets the appropriate location to load the program at from the process manager - let's say, 0x300000 - and then proceeds to replace all the symbol references with the actual values to be used (for the [tt]call[/tt], whose E8 encoding takes a displacement relative to the next instruction, that is 0x30000E - 0x30000A = 4), and emits a binary image like this:
Code:
B80A003000E80400000071757578
It is this final image that is loaded to 0x300000 (as a binary, of course, not as hex). The loader then jumps to the entry point, as it would with a flat-binary executable (which is essentially what the program now is).
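For the test.asm example, the loader's patching pass could be modelled like this in Python. It is a sketch under assumptions: FOO is a made-up format, so the `RELOCS` list (which field refers to which symbol, and whether it is absolute or relative) is invented here, as are the names `relocate`, `abs32` and `rel32`. Note that E8 is a relative call, so its field gets a displacement from the following instruction rather than an absolute address.

```python
import struct

# Code with the symbol references still zeroed: mov eax, imm32 / call rel32 / "quux"
CODE = bytes.fromhex("B800000000" "E800000000" "71757578")
# Symbol offsets relative to a base of zero, after the linker placed baz at 0x0E.
SYMBOLS = {"main": 0x00, "bar": 0x0A, "baz": 0x0E}
# (offset of the 4-byte field, symbol name, kind of reference) - hypothetical layout
RELOCS = [(1, "bar", "abs32"), (6, "baz", "rel32")]

def relocate(code: bytes, symbols: dict, relocs: list, load_base: int) -> bytes:
    """Patch every symbol reference for an image loaded at load_base."""
    image = bytearray(code)
    for field, name, kind in relocs:
        target = load_base + symbols[name]
        if kind == "abs32":
            value = target
        else:
            # rel32: distance from the end of the 4-byte field to the target
            value = target - (load_base + field + 4)
        struct.pack_into("<I", image, field, value & 0xFFFFFFFF)
    return bytes(image)

print(relocate(CODE, SYMBOLS, RELOCS, 0x300000).hex().upper())
```

With a load address of 0x300000 this yields B8 0A 00 30 00 E8 04 00 00 00 71 75 75 78. The relative call field comes out the same for any load address, which is why only absolute references really need patching at load time.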
For more details on object formats, linking and loading, see the home page for the book Linkers and Loaders by John Levine.
HTH. Comments and corrections welcome.
Re:executing apps
I am going to do the offset thingie for my plugins. They are all connected in a way. So I can't split them. Using the kernel to message between plugins will be slow though. So they can't use virtual memory.