Couple of mess thoughts/questions flashing in my mind...

Post by smwikipedia »

Current operating systems usually have the concept of a "process". Each process has a dedicated data block that describes its context, resource allocation status, etc. This data block is usually called the "Process Control Block (PCB)".

For a multitasking OS, there should be multiple PCBs, and they should be kept in an organized manner so that the kernel can switch among them to achieve the multitasking illusion. I can think of an array or a linked list to store the process chain.

I am wondering:

1- Where is this PCB array/linked list stored? I think it should be in the kernel's address space (kernel space). If we use the array approach, should the kernel define some constant such as MAX_PROCESS_COUNT? If we use the linked list approach, we should also limit the maximum number of processes the kernel can support. Is this limit one of the concerns when designing an OS kernel?

2- How about letting the PCB contain only the CR3 register value, i.e. the address of the process's page directory? All the other process-specific info would be loaded from the user address space after CR3 is loaded. This would simplify the design of the process chain, and it would greatly reduce the space required in kernel space, since the major part of the process-specific info would be stored in user address space.

1001- (This is not quite related to the above two bullets, so I use a much larger number)

Suddenly I have the feeling that only the kernel can be called the "real" program running on the box. Other things such as device drivers, user-mode applications, services, etc. are only attachments. The kernel provides a skeleton, a framework, and other things are just like muscles or bricks that plug into the kernel through well-defined interfaces.

Or more radically, the kernel is the only program, and all the other things are just data manipulated by the kernel.

Re: Couple of mess thoughts/questions flashing in my mind...

Post by smwikipedia »

I have heard that the kernel should be mapped into each process's address space, hence the so-called kernel space. But this is a virtual address concept. I am wondering where the kernel lives physically. How should we arrange for it? Are there any design best practices for this?

Re: Couple of mess thoughts/questions flashing in my mind...

Post by Solar »

1) Correct, PCB data is kernel space.

2) Not a good idea. The kernel will need most of the information from the PCB (for example, the scheduler wants to see priorities). It's a good thing to have these fields at fixed offsets in the PCB, so you don't have to go through page tables etc. to get your data.

1001) Not that bad a picture.
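
To make point 2 concrete, here is a minimal sketch of a PCB whose scheduler-relevant fields sit at fixed offsets. The layout, the field names and the `PCB_OFF_*` constants are all hypothetical, chosen only for illustration and not taken from any real kernel; the offsets assume a 32-bit x86 target with 4-byte fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical PCB layout; names are illustrative only. */
enum proc_state { PROC_READY, PROC_RUNNING, PROC_SLEEPING };

struct pcb {
    uint32_t cr3;        /* physical address of the page directory */
    uint32_t pid;        /* process identifier */
    uint32_t state;      /* one of enum proc_state */
    uint32_t priority;   /* larger value = more urgent (assumption) */
    uint32_t kernel_esp; /* saved kernel stack pointer */
};

/* Offsets that an assembly context-switch stub could hard-code. */
#define PCB_OFF_CR3      0
#define PCB_OFF_PRIORITY 12

/* Fail the build if the C layout ever drifts away from the constants. */
static_assert(offsetof(struct pcb, cr3) == PCB_OFF_CR3,
              "cr3 offset changed");
static_assert(offsetof(struct pcb, priority) == PCB_OFF_PRIORITY,
              "priority offset changed");

/* Read the priority the way a stub would: base address plus fixed offset. */
static uint32_t pcb_priority_raw(const void *pcb_base)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)pcb_base + PCB_OFF_PRIORITY,
           sizeof value);
    return value;
}
```

The static assertions are the useful part: they keep the constants used by hand-written assembly in step with the C definition.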


Your second post:

The idea is to have only the kernel run in ring 0 ("kernel space"). Only the kernel controls the page tables, so only it can reach physical memory as it really is; all user processes run in ring 3 ("user space") and only see the virtual addresses the kernel has mapped for them. This way the kernel has ultimate control over what an application can and cannot do, which is the whole idea of the thing. Having the kernel visible at a fixed location in the virtual address range has practical reasons (it makes it easier for the application to communicate with / call the kernel).

Re: Couple of mess thoughts/questions flashing in my mind...

Post by Creature »

Solar wrote: Having the kernel being visible at a fixed location in the virtual address range has practical reasons (makes it easier for the application to communicate with / call the kernel).
Agreed, so you would probably want to have your kernel mapped into every address space, but not necessarily all of the kernel's address space. The kernel may, for example, have its own heap at some virtual address that you do not want shared with other processes (a private kernel heap). In the same way, you could keep the PCBs in something like a slab allocator (or maybe just a static array, depending on what you want) or a mini-heap that is mapped into every process (or is simply part of the kernel), so the scheduler has access to them regardless of what context it is in. If you make the list private to the kernel's own address space, you will need to switch back and forth just to access it.
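
One common way to get the shared part is to copy the kernel's page directory entries into every new process's page directory. The sketch below assumes 32-bit x86 paging without PAE and a 3 GiB / 1 GiB user/kernel split (kernel starting at 0xC0000000); `pd_init_from_kernel` and the constants are hypothetical names, and plain arrays stand in for real page directories.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES       1024u /* one page directory, 4 MiB per entry */
#define KERNEL_FIRST_PDE 768u  /* 0xC0000000 >> 22, assumed 3G/1G split */

typedef uint32_t pde_t;

/*
 * Prepare a page directory for a new process: the user part starts
 * empty, the kernel part reuses the kernel's own entries, so every
 * address space sees the same kernel page tables.
 */
static void pd_init_from_kernel(pde_t *new_pd, const pde_t *kernel_pd)
{
    /* Entries 0..767: nothing mapped in user space yet. */
    memset(new_pd, 0, KERNEL_FIRST_PDE * sizeof(pde_t));

    /* Entries 768..1023: point at the shared kernel page tables. */
    memcpy(new_pd + KERNEL_FIRST_PDE,
           kernel_pd + KERNEL_FIRST_PDE,
           (PD_ENTRIES - KERNEL_FIRST_PDE) * sizeof(pde_t));
}
```

Because the copied entries point at the same page tables, a mapping added inside one of those tables shows up in every process. That only holds if the kernel's page tables are allocated up front; a kernel page directory entry created later would have to be propagated to each process by hand.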

Re: Couple of mess thoughts/questions flashing in my mind...

Post by gravaera »

Hi:
smwikipedia wrote: 1- Where is this PCB array/linked list stored? I think it should be in the kernel's address space (kernel space). If we use the array approach, should the kernel define some constant such as MAX_PROCESS_COUNT? If we use the linked list approach, we should also limit the maximum number of processes the kernel can support. Is this limit one of the concerns when designing an OS kernel?
Yes, the kernel keeps its metadata in its own address space: this way, no matter which address space it is currently working from, it can access the information globally.

The maximum number of processes supported is a constraint you'll decide based mainly on the amount of memory available on the machine.
smwikipedia wrote: 2- How about letting the PCB contain only the CR3 register value, i.e. the address of the process's page directory? All the other process-specific info would be loaded from the user address space after CR3 is loaded. This would simplify the design of the process chain, and it would greatly reduce the space required in kernel space, since the major part of the process-specific info would be stored in user address space.
From a technical point of view, data takes up memory, and it must be in RAM before it can be accessed. This data will inevitably take up N bytes of physical or "real" memory. Whether you split it up between the kernel address space and the userspace address region, or you keep all useful metadata in the kernel's high 1GB of address space, you're going to be using N bytes of RAM on it anyway.

The kernel will, I assume, take 1GB of virtual memory from each process. However, if you think about it, that is a whole gigabyte: you can definitely store all processes' PCBs in there. A good tip when considering things like this is to always ask yourself "Yes, this sounds nice, but from a purely pragmatic point of view, what benefits does it offer? And from a technical point of view, does it make sense?"
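
To put rough numbers on this, the sketch below computes how many PCBs fit into a given memory budget. The 256-byte PCB size and the budgets are made-up example values, and `max_pcbs` is a hypothetical helper, not something from an existing kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Number of PCBs of pcb_size bytes that fit into budget_bytes. */
static uint64_t max_pcbs(uint64_t budget_bytes, uint64_t pcb_size)
{
    if (pcb_size == 0)
        return 0; /* avoid dividing by zero on a nonsensical size */
    return budget_bytes / pcb_size;
}
```

With an assumed 256-byte PCB, `max_pcbs(16 MiB, 256)` gives 65536 processes, and a full 1 GiB kernel region would hold 4194304 of them. In practice, the per-process kernel stack and page tables will likely limit the process count long before the PCBs themselves do.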

Re: Couple of mess thoughts/questions flashing in my mind...

Post by Ready4Dis »

smwikipedia wrote: 1- Where is this PCB array/linked list stored? I think it should be in the kernel's address space (kernel space). If we use the array approach, should the kernel define some constant such as MAX_PROCESS_COUNT? If we use the linked list approach, we should also limit the maximum number of processes the kernel can support. Is this limit one of the concerns when designing an OS kernel?
As already mentioned, kernel space. I personally don't have a max process count; my list is dynamically allocated. You can use either a dynamic array or a linked list, depending on your needs. A fixed array works too, just make sure its size is something reasonable.
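
A dynamically allocated process list could be sketched roughly as below. This is only an illustration: `malloc`/`free` stand in for whatever kernel heap allocator you have, and `struct proc_list`, `proc_create` and `proc_destroy` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pcb {
    uint32_t pid;
    uint32_t cr3;      /* page directory of this process */
    struct pcb *next;  /* next PCB in the singly linked list */
};

struct proc_list {
    struct pcb *head;
    uint32_t next_pid; /* next PID to hand out */
    uint32_t count;    /* number of live processes */
};

/* Allocate a PCB and push it onto the list; NULL means out of memory. */
static struct pcb *proc_create(struct proc_list *list, uint32_t cr3)
{
    struct pcb *p = malloc(sizeof *p); /* stand-in for a kernel allocator */
    if (p == NULL)
        return NULL;
    p->pid = list->next_pid++;
    p->cr3 = cr3;
    p->next = list->head;
    list->head = p;
    list->count++;
    return p;
}

/* Unlink and free the PCB with the given PID; returns 1 if it existed. */
static int proc_destroy(struct proc_list *list, uint32_t pid)
{
    for (struct pcb **link = &list->head; *link != NULL;
         link = &(*link)->next) {
        if ((*link)->pid == pid) {
            struct pcb *dead = *link;
            *link = dead->next;
            free(dead);
            list->count--;
            return 1;
        }
    }
    return 0;
}
```

Here the only limit on the number of processes is the allocator running out of memory. A real kernel would also need locking around the list, which is left out of this sketch.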

smwikipedia wrote: 2- How about letting the PCB contain only the CR3 register value, i.e. the address of the process's page directory? All the other process-specific info would be loaded from the user address space after CR3 is loaded. This would simplify the design of the process chain, and it would greatly reduce the space required in kernel space, since the major part of the process-specific info would be stored in user address space.
Horrible idea: that would mean your scheduler has to switch to every single process whenever it needs information about it. For example: Is the process currently awake or sleeping? What is its priority level? And if one process is looking for a specific process to communicate with, you'd have to traverse the entire list and switch address spaces each and every time. Having this information in kernel space doesn't waste enough space to warrant such extremes.
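
For comparison, this is roughly what the scheduler's scan looks like when state and priority live in kernel space. It is a minimal sketch with a hypothetical PCB layout and a simple "highest priority wins" policy, which is an assumption, not something from the posts above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum proc_state { PROC_READY, PROC_SLEEPING };

struct pcb {
    uint32_t cr3;          /* only loaded when we actually switch */
    uint32_t pid;
    enum proc_state state; /* awake or sleeping */
    int priority;          /* larger value = more urgent (assumption) */
};

/*
 * Pick the ready process with the highest priority; on a tie the
 * earliest table entry wins. Every field read here is in kernel
 * memory, so no CR3 reload happens during the scan.
 */
static const struct pcb *pick_next(const struct pcb *table, size_t count)
{
    const struct pcb *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (table[i].state != PROC_READY)
            continue;
        if (best == NULL || table[i].priority > best->priority)
            best = &table[i];
    }
    return best; /* NULL when nothing is ready to run */
}
```

With the CR3-only design, each iteration of that loop would need an address space switch (and the TLB flush that comes with it) just to read two small fields.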
smwikipedia wrote: 1001- (This is not quite related to the above two bullets, so I use a much larger number)

Suddenly I have the feeling that only the kernel can be called the "real" program running on the box. Other things such as device drivers, user-mode applications, services, etc. are only attachments. The kernel provides a skeleton, a framework, and other things are just like muscles or bricks that plug into the kernel through well-defined interfaces.

Or more radically, the kernel is the only program, and all the other things are just data manipulated by the kernel.
Hmmm... sure, that's one way to look at it. The kernel is basically the first abstraction layer and safety net. Depending on your design (micro, mono, exo, etc.) it can be viewed a bit differently, I guess, so it's hard to give it a strict definition, but it is typically the glue that holds the system together.