I have been reading about the TSS and getting ready to add one to my 64-bit kernel. Here is what the Intel manual says about page boundaries and 64-bit:
Intel Manual; Vol 3A; Section 7.2.1 wrote:If paging is used:
* Avoid placing a page boundary in the part of the TSS that the processor reads during a task switch (the first 104 bytes). The processor may not correctly perform address translation if a boundary occurs in this area. During a task switch, the processor reads and writes into the first 104 bytes of each TSS (using contiguous physical addresses beginning with the physical address of the first byte of the TSS). So, after TSS access begins, if part of the 104 bytes is not physically contiguous, the processor will access incorrect information without generating a page-fault exception.
Later in the manual:
Intel Manual; Vol 3A; Section 7.7 wrote:In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and task switching must be performed by software.
Nothing I have found in the wiki or the forum threads relates to 64-bit mode.
Section 7.2.1 specifically calls out "During a task switch", which is not available in 64-bit mode; the 64-bit mode section calls out the similarities. I'm conflicted on how to interpret this information. So, my question is: Does anyone know if the TSS can span a page boundary with non-contiguous physical frames in 64-bit mode? I'm not going to be in a position to test this myself until I get much more of my kernel ready, and I was hoping someone would have an answer and save me a little aggravation.
(edit: fixed quote blocks)
Thank you!
Adam
The name is fitting: Century Hobby OS -- At this rate, it's gonna take me that long!
Read about my mistakes and missteps with this iteration: Journal
"Sometimes things just don't make sense until you figure them out." -- Phil Stahlheber
While hardware task switching (multiple TSS) is not supported in 64-bit mode, you would still have to define, and hence load, a (single) TSS to facilitate ring3->ring0 jumps (for stack and segment switching in case of interrupts).
The way I read Section 7.2.1 is that the processor performs address translation only once when reading the TSS, i.e. the physical address determined for the first byte is used as the base for the rest of the TSS, regardless of alignment, ignoring the MMU. Hence you should still make sure that your single 64-bit TSS does not cross a page boundary; unless you have very strict memory layout requirements, just force the base of the TSS onto a page boundary [e.g. __attribute__((aligned (4096)))].
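A minimal sketch of that suggestion, assuming GCC or Clang attribute syntax. The field layout follows the 64-bit TSS format in the Intel manual; the names struct tss64, boot_tss and tss_in_one_page are made up for illustration and are not from any existing kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit TSS layout (104 bytes). Packed, because the 8-byte fields
 * start at offset 4 and are therefore not naturally aligned. */
struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];      /* RSP0..RSP2: stacks for privilege changes */
    uint64_t reserved1;
    uint64_t ist[7];      /* IST1..IST7: interrupt stack table */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;  /* offset of the I/O permission bitmap */
} __attribute__((packed));

static_assert(sizeof(struct tss64) == 104, "64-bit TSS must be 104 bytes");

/* Hypothetical boot-CPU TSS, forced onto a page boundary so that the
 * 104 bytes can never straddle two pages. */
struct tss64 boot_tss __attribute__((aligned(4096)));

/* Returns nonzero if the first and last byte of the TSS share a 4 KiB page. */
int tss_in_one_page(const struct tss64 *t)
{
    uintptr_t first = (uintptr_t)t;
    uintptr_t last = first + sizeof(*t) - 1;
    return first / 4096 == last / 4096;
}
```

Page alignment is the blunt option: it costs up to a page per TSS, but it makes the page-boundary question moot whichever reading of the manual is right.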
Thank you for your reply. Your comment on the MMU translation happening only once makes sense.
So far, I have taken the conservative approach of keeping the TSS fully contained within a page. I have actually artificially padded my OS representation of the structure to 128 bytes so I can cleanly fit 32 of them in a page.
milliburn wrote:While hardware task switching (multiple TSS) is not supported in 64-bit mode, you would still have to define, and hence load, a (single) TSS to facilitate ring3->ring0 jumps (for stack and segment switching in case of interrupts).
It is my understanding that in a multiprocessor system, multiple interrupts can be handled at a time (1 on each processor) and that an IDT entry can optionally specify which Interrupt Stack Table entry in the processor's TSS to use for that interrupt (in theory guaranteeing a clean stack). So, let's say that I have 2 processors that are processing page faults concurrently, and that a page fault is set to use IST4 as an example. If both processors were to use the same TSS, then both would load the same stack address and the two handlers would overwrite each other's stack.
For that reason, I still feel I will need 1 TSS for each processor -- even though hardware task switching is not available. Is my thinking reasonable, or am I missing something?
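To make the per-processor idea concrete, here is a rough sketch under stated assumptions: GCC or Clang attributes, a 128-byte padded slot as described above, and one page-fault stack per CPU wired to IST4. The names tss_slot, cpu_tss, pf_stacks and tss_init_cpu, and the 4 KiB stack size, are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE      4096u
#define TSS_SLOT       128u                    /* padded size per TSS */
#define MAX_CPUS       (PAGE_SIZE / TSS_SLOT)  /* 32 slots fit in one page */
#define IST_STACK_SIZE 4096u                   /* assumed stack size */

struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];
    uint64_t reserved1;
    uint64_t ist[7];      /* ist[0] is IST1, so ist[3] is IST4 */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;
} __attribute__((packed));

/* Pad each 104-byte TSS to 128 bytes so that no slot crosses a page. */
union tss_slot {
    struct tss64 tss;
    uint8_t pad[TSS_SLOT];
};

static_assert(sizeof(union tss_slot) == TSS_SLOT, "slot must be 128 bytes");

union tss_slot cpu_tss[MAX_CPUS] __attribute__((aligned(PAGE_SIZE)));
uint8_t pf_stacks[MAX_CPUS][IST_STACK_SIZE] __attribute__((aligned(16)));

/* Give CPU number `cpu` its own IST4 stack. Stacks grow down, so the
 * TSS stores the address just past the end of the stack area. */
void tss_init_cpu(unsigned cpu)
{
    struct tss64 *t = &cpu_tss[cpu].tss;
    t->ist[3] = (uint64_t)(uintptr_t)(pf_stacks[cpu] + IST_STACK_SIZE);
    /* An I/O map base at or beyond the TSS limit means "no I/O bitmap". */
    t->iomap_base = sizeof(struct tss64);
}
```

Each processor would then load its own slot with LTR through its own TSS descriptor; that part is left out here because it depends on how the GDT is organised.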
Adam
eryjus wrote:For that reason, I still feel I will need 1 TSS for each processor -- even though hardware task switching is not available. Is my thinking reasonable, or am I missing something?
Correct. You can either use a TSS per processor, or go a step further and use a GDT per processor. The latter is mostly useful for 32-bit kernels, where you might have a GDT entry per core pointing to per-core state, for example. It's not unheard of in 64-bit mode, though, because (A) you can now have more processors than GDT entries (though such systems are rare) and (B) it lets you do various segmentation tricks. The most notable is that it permits swapping a per-process LDT in and out, which is useful in certain cases, e.g. WINE running 32-bit Windows code.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
eryjus wrote:So, my question is: Does anyone know if the TSS can span a page boundary with non-contiguous physical frames in 64-bit mode?
First; don't forget that the TSS has multiple parts - the first 104 bytes, followed by an (optional) unused gap, followed by an (optional) 32-byte "Software Interrupt Redirection Bit Map" (used by virtual 8086 mode when "Virtual Mode Extensions" are enabled), followed by an (optional) "IO Permission Bit Map". The whole thing doesn't necessarily fit in a 4 KiB page. For example, the IO Permission Bit Map alone may be 65536 bits (or 8192 bytes).
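As a quick check of that arithmetic, the following sketch adds up the parts listed above, assuming no unused gap and including the extra all-ones byte that the Intel manual requires after the end of the IO Permission Bit Map. The function names are made up for illustration.

```c
#include <stddef.h>

#define TSS_FIXED_BYTES 104u   /* part read during a hardware task switch */
#define REDIR_MAP_BYTES 32u    /* software interrupt redirection bit map */
#define PAGE_SIZE       4096u

/* Total TSS size for a bitmap covering `io_ports` ports (0 = no bitmap). */
size_t tss_total_bytes(unsigned io_ports, int with_redir_map)
{
    size_t n = TSS_FIXED_BYTES;
    if (with_redir_map)
        n += REDIR_MAP_BYTES;
    if (io_ports)
        n += (io_ports + 7) / 8 + 1;  /* one bit per port + terminator byte */
    return n;
}

/* Number of 4 KiB pages needed when the TSS starts on a page boundary. */
size_t pages_needed(size_t bytes)
{
    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

With all 65536 ports and the redirection map this comes to 104 + 32 + 8192 + 1 = 8329 bytes, i.e. three pages even with a page-aligned base.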
To me; Intel's "Avoid placing a page boundary in the part of the TSS that the processor reads during a task switch (the first 104 bytes). The processor may not correctly perform address translation if a boundary occurs in this area." looks like the CPU was supposed to do address translation correctly for both pages, but one of Intel's CPUs got it wrong, so rather than fixing it they just amended the documentation to make the dodgy behaviour "legal".
With this in mind; I'd suggest that (for accessing data in the TSS) some CPUs may only do address translation correctly for the first page of whatever they actually access. This would imply:
* For hardware task switching in protected mode, the CPU actually accesses 104 bytes of the TSS, therefore it'd be a bad idea to have a page boundary within those 104 bytes.
* For "CPL=3 -> CPL=0" privilege level switching in protected mode, the CPU actually accesses 2 dwords (SS0 and ESP0 fields only), therefore it'd be a bad idea to have a page boundary within those 8 bytes.
* For "IST table lookup" in long mode, the CPU actually accesses 2 dwords (e.g. "IST1 lower 32 bits" and "IST1 upper 32 bits"), therefore it'd be a bad idea to have a page boundary within those 8 bytes.
* For "IO Permission Bit Map lookup" and "Software Interrupt Redirection Bit Map lookup", the CPU actually accesses 1 byte, therefore it's impossible for a page boundary to matter for that.
Of course this means that for an OS that doesn't use hardware task switching at all, there are many places within the first 104 bytes of a TSS where it'd be safe to have a page boundary.
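If that hypothesis holds (it is a reading of the manual, not confirmed behaviour), the rule for a 64-bit TSS reduces to "no 8-byte stack pointer field may straddle a page boundary". A small sketch of that check, using the documented field offsets (RSP0..RSP2 from offset 4, IST1..IST7 from offset 36); the function names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* True if bytes [base+off, base+off+len) touch two different pages. */
bool straddles_page(uintptr_t base, size_t off, size_t len)
{
    uintptr_t first = base + off;
    uintptr_t last = first + len - 1;
    return first / PAGE_SIZE != last / PAGE_SIZE;
}

/* True if none of RSP0..RSP2 or IST1..IST7 crosses a page boundary
 * when the 64-bit TSS is placed at virtual address `base`. */
bool tss64_base_is_safe(uintptr_t base)
{
    for (size_t i = 0; i < 3; i++)          /* RSP0..RSP2 at offset 4 */
        if (straddles_page(base, 4 + 8 * i, 8))
            return false;
    for (size_t i = 0; i < 7; i++)          /* IST1..IST7 at offset 36 */
        if (straddles_page(base, 36 + 8 * i, 8))
            return false;
    return true;
}
```

For example, a base 4 bytes before a page boundary passes this check (only the reserved dword sits in the first page), while a base 8 bytes before the boundary fails because RSP0 would be split across the two pages.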
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.