Hi,
I am taking the time to go through my scheduler and task management systems (it's time to look for something a little more robust) and was re-visiting my process-initialisation code. I just wondered how people generally tackle task initialisation. At present, I do the following (section which probably needs re-coding highlighted...):
* Load the object file 'as-is' into RAM.
* Extract relocation data and store it in a 'unified format' understood by my kernel (so ELF, COFF, PE etc. end up looking the same).
* Create a new page table, which includes the top 1GB containing my kernel.
------------Optimisation Needed Below?---------------------
* Switch to new process space having disabled the scheduler.
* Relocate the executable, set up stack(s).
* Switch back to kernel space.
* Add the task to the currently running list.
* Re-enable the scheduler.
-------------------Optimisation Ends?------------------------------
* Run the task when it is due for scheduling.
The bits I don't like are having to disable the scheduler for so long and having to relocate the entire exe rather than paging on-demand. I experimented with not disabling the scheduler at all, but that would mean the additional overhead of having to save the outgoing CR3 for task switches (otherwise, CR3 gets trashed if a task switch occurs during initialisation).
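To make the CR3 point concrete, here is a minimal sketch of a task switch that saves the outgoing CR3 instead of assuming it. The `struct task` layout and the `read_cr3`/`write_cr3` helpers are assumptions for illustration (here they wrap a plain variable so the sketch runs in user space); in a real kernel they would be `mov` instructions to and from CR3.

```c
#include <stdint.h>

/* Hypothetical per-task state; only the field relevant here is shown. */
struct task {
    uint32_t cr3;   /* physical address of the page directory in use */
};

/* Stand-in for the real CR3 register so the sketch runs in user space. */
uint32_t fake_cr3;

uint32_t read_cr3(void)        { return fake_cr3; }
void     write_cr3(uint32_t v) { fake_cr3 = v; }

/* Record whatever address space the outgoing task is actually using
 * (it may be a process space it switched into temporarily), then load
 * the incoming task's address space. */
void switch_address_space(struct task *out, const struct task *in)
{
    out->cr3 = read_cr3();
    if (in->cr3 != out->cr3)   /* skip the reload, and the TLB flush, if unchanged */
        write_cr3(in->cr3);
}
```

With this, a task that is pre-empted while sitting in the new process space gets that same space back when it is next scheduled, at the cost of one extra register read per switch.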
As for paging on demand, that's a bit beyond my kernel at present (I do it for heaps, but not for loadable data).
Any ideas how I can optimise things a bit (or not have to switch to the new task space)?
Cheers,
Adam
A Better Way of Loading Processes
You could use a mutex to protect the code that launches tasks. That way you are guaranteed never to launch more than one process at once. You should probably do that anyway, regardless of the method you choose. I just know that my way has no problems with leaving the scheduler running. Of course, the choice is up to you.
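As a rough illustration of the idea (not the poster's actual code), a launch lock can be sketched with a C11 `atomic_flag`. The names `launch_lock`, `launch_trylock` and `launch_unlock` are made up for this example, and a real kernel mutex would normally block the caller rather than just report failure.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock guarding the "launch a task" code path. */
atomic_flag launch_lock = ATOMIC_FLAG_INIT;

/* Returns true if the caller now owns the lock, false if another
 * launch is already in progress. */
bool launch_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&launch_lock,
                                              memory_order_acquire);
}

/* Releases the lock so the next launch can proceed. */
void launch_unlock(void)
{
    atomic_flag_clear_explicit(&launch_lock, memory_order_release);
}
```

The launch code would call `launch_trylock()` before touching shared state and `launch_unlock()` once the new task is on the run list; the scheduler itself stays enabled throughout.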
Re: A Better Way of Loading Processes
Hi,
AJ wrote: I am taking the time to go through my scheduler and task management systems (it's time to look for something a little more robust) and was re-visiting my process-initialisation code. I just wondered how people generally tackle task initialisation.
I tend to use something like this:
In the kernel API's "create process" function:
* Ask the linear memory manager to create a new address space containing the kernel's pages and nothing else
* Create the framework for the new process (allocate a "process ID" and set some information about the process)
* Create an initial thread for the process (kernel stack, scheduler priority, etc)
* Make the new thread "ready to run"
* Send a "startup" message to the new thread containing details
* Return back to the caller
In the new process' initial thread (the kernel's "process startup code"):
* Wait until the "startup" message is received
* Parse the "startup" message (including getting the executable's file name)
* Attempt to open the executable file
* Attempt to load the executable file's header into the address space
* Check stuff!
* Load the rest of the executable file into the address space (unless I support memory mapped files - never got that far yet though)
* Send a "started OK" message to the thread that started the process
* Jump to the executable file's entry point (actually an IRET to switch back to CPL=3)
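The split between the two lists above can be sketched roughly as follows. Everything here (`struct thread`, `struct startup_msg`, `make_ready`, `create_process`, `startup_get_path`, the single FIFO run queue) is a hypothetical simplification for illustration, with caller-supplied storage standing in for real allocation; it is not the poster's kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum thread_state { THREAD_BLOCKED, THREAD_READY };

/* Hypothetical "startup" message: just enough to find the executable. */
struct startup_msg {
    char path[64];
    int  parent_tid;            /* who should receive "started OK" */
};

struct thread {
    int                tid;
    enum thread_state  state;
    bool               has_msg;
    struct startup_msg msg;
    struct thread     *next;    /* run-queue link */
};

/* One FIFO run queue; a real scheduler would also track priorities. */
struct thread *run_head, *run_tail;

/* Generic "put thread X on a run queue" function. */
void make_ready(struct thread *t)
{
    t->state = THREAD_READY;
    t->next = NULL;
    if (run_tail) run_tail->next = t; else run_head = t;
    run_tail = t;
}

/* Creator side: fill in data, queue the thread, post the message, return.
 * No blocking and no address-space switch happens here. */
int create_process(struct thread *t, int tid, int parent_tid, const char *path)
{
    /* Address space and kernel stack allocation would go here. */
    t->tid = tid;
    t->has_msg = false;
    make_ready(t);

    strncpy(t->msg.path, path, sizeof t->msg.path - 1);
    t->msg.path[sizeof t->msg.path - 1] = '\0';
    t->msg.parent_tid = parent_tid;
    t->has_msg = true;
    return tid;
}

/* New-thread side, first two steps only: take the message and return the
 * executable's path, or NULL if nothing has arrived (a real thread would
 * block here instead). */
const char *startup_get_path(struct thread *self)
{
    if (!self->has_msg)
        return NULL;
    self->has_msg = false;
    return self->msg.path;
}
```

The point of the design is that the loading work runs later, in the new thread and in its own address space, so the creator never needs to switch page tables or disable the scheduler.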
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
I like the sound of that method, but the whole CPL0 --> CPL3 thing threw me a bit.
If I understand you correctly, you would need to set up two ring 0 stacks - one to switch to the kernel-mode portion of code and then one to switch to the user-mode portion of code. These two stacks could be set up next to each other in RAM. When the kernel-mode code finishes executing and needs the iret to start the CPL3 code, it would get the scheduler to adjust its ESP, resulting in the next iret switching to user mode.
I'm going to start playing with it now...
Thanks,
Adam
Hi,
AJ wrote: If I understand you correctly, you would need to set up two ring 0 stacks - one to switch to the kernel-mode portion of code and then one to switch to the user-mode portion of code. These two stacks could be set up next to each other in RAM. When the kernel-mode code finishes executing and needs the iret to start the CPL3 code, it would get the scheduler to adjust its ESP, resulting in the next iret switching to user mode.
All my kernels have always been interruptible and preemptable. Every thread has its own kernel stack (and possibly another user-level stack, if it's not a "kernel thread"). This means that if there are 1234 threads then there are 1234 kernel stacks, and spawning any new thread for any reason involves allocating a new kernel stack for it. In this case, spawning a process like I do is simple.
Some people use a "do the task switch when returning to CPL=3" trick, which seems simpler and means that you can have a single kernel thread (or one per CPU) shared by everything, but it also means you can't have a preemptable kernel and can't have kernel threads. In this case, spawning a process like I do will be a hard...
Cheers,
Brendan
Hi,
I currently have a separate PL0 stack for each task and plan to keep it that way. I have a lot of design to do on paper first before I properly implement my 'final' task manager, but given your first answer, I am now thinking along the lines of creating a 'task setup process' in kernel space.
Once process setup is complete (the new task space, PD, stack etc. are set up), this kernel-stack will then call a 'privilege demotion' interrupt. The task then gets put on the sleep queue. When the scheduler gets round to putting it in the run queue again (depending on priority), it will change the segment selectors to RPL3. I am thinking that a space will be left on the stack for ESP3 and SS3 during the original process creation so that the scheduler can easily 'demote' any process to ring 3. Although it sounds convoluted, I'm sure it won't be too bad to implement...
As you can see, I still have some thinking ahead.
Thanks again for the help and ideas,
Adam
Hi,
AJ wrote: Once process setup is complete (the new task space, PD, stack etc...) are set up, this kernel-stack will then call a 'privilege demotion' interrupt.
Why? The kernel can just push some data onto its own stack, then use IRET to "return" to the CS:EIP that it just pushed onto its stack. You don't need any interrupt (you just pretend there was one).
AJ wrote: The task then gets put on the sleep queue.
Why? At this point the new thread was running, and can continue running...
AJ wrote: When the scheduler gets round to putting it in the run queue again (depending on priority), it will change the segment selectors to RPL3.
I use the same DS, ES, FS, GS segments for both CPL=0 and CPL=3, and never change them. The IRET changes SS and CS as part of switching from CPL=0 to CPL=3. If you use different DS, ES, FS, GS segments for CPL=0 and CPL=3, then you can load the CPL=3 segments into the segment registers before you do the IRET.
AJ wrote: I am thinking that a space will be left on the stack for ESP3 and SS3 during the original process creation so that the scheduler can easily 'demote' any process to ring 3. Although it sounds convoluted, I'm sure it won't be too bad to implement...
Think of it like this...
The kernel API's "create process" function allocates some RAM for some data structures, stuffs some data into those data structures and returns to whoever called it (no blocking, no task switching, etc). It doesn't really need anything special, except for a "put Thread X on a scheduler run queue" function (that should be a generic scheduler function that's used when any blocked thread is unblocked for any reason).
The kernel's "process startup code" (that the new process' initial thread runs) doesn't really need anything special either - some code to get an executable file into an address space (standard file I/O and some memory management), and a (pretend) IRET.
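For the (pretend) IRET, a minimal sketch of the frame the kernel would push before executing IRET on 32-bit x86 might look like this. The selector values assume a hypothetical GDT with user code at 0x18 and user data at 0x20 (RPL=3 added), and `build_user_iret_frame` is an illustrative name; only the frame layout (EIP, CS, EFLAGS, ESP, SS, from lowest address up) comes from the architecture.

```c
#include <stdint.h>

/* Assumed selectors: hypothetical GDT entries 3 and 4, with RPL=3. */
#define USER_CS         0x1Bu
#define USER_SS         0x23u
#define EFLAGS_RESERVED 0x002u   /* bit 1 always reads as set */
#define EFLAGS_IF       0x200u   /* interrupts enabled in user mode */

/* Writes the five dwords that IRET pops when returning to a lower
 * privilege level, growing downwards from kstack_top.
 * Returns the new top of stack (points at the saved EIP). */
uint32_t *build_user_iret_frame(uint32_t *kstack_top,
                                uint32_t entry, uint32_t user_esp)
{
    uint32_t *sp = kstack_top;
    *--sp = USER_SS;                        /* SS for CPL=3 */
    *--sp = user_esp;                       /* ESP for CPL=3 */
    *--sp = EFLAGS_RESERVED | EFLAGS_IF;    /* EFLAGS */
    *--sp = USER_CS;                        /* CS for CPL=3 */
    *--sp = entry;                          /* EIP: executable's entry point */
    return sp;
}
```

In a kernel, the startup code would point ESP at the returned address and execute `iret`; no interrupt ever has to occur for this to work.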
Cheers,
Brendan