[SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Fri Dec 03, 2010 3:45 am
by prajwal
Hi Friends,

I am using Intel x86 TSS framework for task management

It works perfectly on an emulator, on my 4-year-old Dell laptop, on my PC with an AMD processor, and on a brand-new laptop with a quad-core Intel processor.
But I have a 1 year old laptop which is having a dual core xeon pentium 3 family processor.

On that machine, a task switch simply crashes my system.

Please note that the kernel itself boots fine and runs with its own TSS (no LDT, ring 0). The moment I switch to a TSS (user/kernel task) that has its own LDT, it crashes. In this case it is a TSS running in ring 0, with an LDT, using the same PDBR (page directory base register) as the kernel.
I have made sure that the TSS is zeroed with memset during initialization, that IO_MAP_SIZE is set to 103 (the same value as in the TSS descriptor), and that the debug T bit is set to 0.
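
For reference, a minimal C sketch of the kind of setup described above: a 104-byte 32-bit TSS that is zeroed, has the T bit clear, and has no I/O permission bitmap. The names (tss32, tss32_init, TSS32_LIMIT) and the parameter list are made up for illustration and are not the poster's code. The sketch sets the I/O map base to 104, one past the limit of 103; any base at or beyond the limit means there is no I/O bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit TSS layout (104 bytes). Every field is naturally aligned,
 * so no packing attribute is needed; the static_asserts check that. */
struct tss32 {
    uint16_t prev_task_link, reserved0;
    uint32_t esp0;
    uint16_t ss0, reserved1;
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3;                 /* PDBR */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t trap;                /* bit 0 is the debug trap (T) flag */
    uint16_t iomap_base;
};

static_assert(sizeof(struct tss32) == 104, "TSS must be 104 bytes");
static_assert(offsetof(struct tss32, iomap_base) == 102, "unexpected layout");

/* Limit value for the TSS descriptor: size minus one, i.e. 103. */
#define TSS32_LIMIT (sizeof(struct tss32) - 1)

/* Hypothetical initializer: zero everything, then fill in the fields
 * a simple flat-memory task needs. Selector values are the caller's. */
void tss32_init(struct tss32 *t, uint32_t cr3, uint32_t eip, uint32_t esp,
                uint32_t eflags, uint16_t code_sel, uint16_t data_sel,
                uint16_t ldt_sel)
{
    memset(t, 0, sizeof *t);      /* also clears the T bit and reserved fields */
    t->cr3 = cr3;
    t->eip = eip;
    t->esp = esp;
    t->eflags = eflags;
    t->cs = code_sel;
    t->ss = t->ds = t->es = t->fs = t->gs = data_sel;
    t->ldt_selector = ldt_sel;
    t->iomap_base = (uint16_t)sizeof *t;   /* beyond the limit: no I/O bitmap */
}
```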

Again, please note that it works fine on many new and old laptops, except the newer laptop I have.

I know the details I have provided are rather vague, but that is all I have.

Any sort of help (suggestions for debugging, common mistakes, anything processor-specific I have to take care of, etc.) is greatly appreciated.

Thanks in advance,

Regards,
- MosMan

Re: Triple Fault on TaskSwitch (Xeon Processor)

Posted: Fri Dec 03, 2010 4:22 am
by Combuster
Common mistakes that fit the symptoms: buffer overflows, uninitialized memory, race conditions. None of them are easy to check though.

Did you try printf-debugging before switching to the "broken" tss?

Re: Triple Fault on TaskSwitch (Xeon Processor)

Posted: Fri Dec 03, 2010 5:46 am
by prajwal
Thanks Combuster. A race condition is something I want to check again.

But the behaviour is consistent: it either always crashes on my new laptop or never crashes on the other PCs I mentioned in my initial post.

printf-debugging... yeah, I did. I put an infinite while loop in the PIT handler to see whether the task switch succeeds, because the PIT handler is the only place control can reach after the task switch, once a timer IRQ arrives. In fact, my task is also simply a while(1); loop.

Regards
- MosMan

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Sun Dec 05, 2010 2:14 am
by prajwal
OK, the problem is fixed. Thanks for your help.
The issue was with the alignment of the TSS.
I changed my GDT, LDT and, more importantly, TSS to be page (4 KB) aligned in memory.
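
One way to get that kind of alignment in C11 is alignas, sketched below. The array names and sizes are hypothetical placeholders, not the poster's code, and in a freestanding kernel the linker script also has to preserve the section alignment, which this sketch simply assumes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define KPAGE 4096u
static_assert((KPAGE & (KPAGE - 1)) == 0, "page size must be a power of two");

/* Hypothetical statically allocated tables, each starting on a 4 KB boundary. */
alignas(KPAGE) uint8_t  tss_storage[104];   /* one 32-bit TSS */
alignas(KPAGE) uint64_t gdt_storage[8];     /* 8 descriptors of 8 bytes */
alignas(KPAGE) uint64_t ldt_storage[4];     /* 4 descriptors of 8 bytes */

/* Returns 1 if p is a multiple of `alignment` bytes, 0 otherwise. */
int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```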

Thanks,
- MosMan

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Sun Dec 05, 2010 2:28 am
by gerryg400
Just curious, are you using hardware task switching? As far as I remember, the TSS need not be page aligned, but it should lie wholly within a single physical page.

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 1:13 am
by prajwal
Hi gerryg400, yes, I use hardware task switching.

Yeah, I think you are right. The TSS should lie within a single page. As far as I remember, the Intel manuals impose no page-alignment constraint on the TSS location. I read the latest document while fixing this and didn't see any such constraint.

Also, in my case the first 64 MB of memory is mapped 1:1 in all processes (kernel area), so there is no question of a page fault in this area. My user TSS was at address 530 KB and of length 103 bytes. So it did not span a page boundary and was contained within a single physical page. As I mentioned, it works on many new laptops, just not on my new laptop. So, as part of my "brute force bug fix strategy :lol: ", I just changed the TSS base address to 532 KB. And there you go, it worked!!
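
The address arithmetic above can be checked with a small helper. This is only a sketch: the function names are invented here, 4 KB pages are assumed, the TSS is taken as 104 bytes (a limit of 103), and base + len is assumed not to wrap around 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSS_PAGE_SIZE 4096u

/* Offset of an address within its 4 KB page. */
uint32_t page_offset(uint32_t addr)
{
    return addr % TSS_PAGE_SIZE;
}

/* True if the byte range [base, base + len) touches more than one page. */
bool crosses_page(uint32_t base, uint32_t len)
{
    assert(len > 0);
    return (base / TSS_PAGE_SIZE) != ((base + len - 1) / TSS_PAGE_SIZE);
}

/* True if the address is a multiple of the page size. */
bool is_page_aligned(uint32_t addr)
{
    return page_offset(addr) == 0;
}
```

With these helpers, a 104-byte TSS at 530 KB starts 2048 bytes into its page and does not cross a boundary, while 532 KB is exactly page aligned. That matches the description in the post: the old location was legal but not aligned.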

I also moved the LDT to a page-aligned address, but I don't think the LDT was the cause, because I tried task switching with my new task using the GDT instead of an LDT... and it still crashed! So I concluded that it was because of the TSS location. Once it started working, I didn't spend much time on a postmortem.

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 3:12 am
by Combuster
You might want to check the errata list for your processor for anything relevant - hardware task switching with unaligned TSSes is uncommon enough to be unheard of. You do need the exact processor model to find the right list (I can't look it up with the listed info - there are no dual-core chips in the Pentium 3 generation, only chips that can be used in multi-socket configurations). If you find something in there, you can at least be sure it was not caused by some other nasties.
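
As a rough sketch of how the exact model could be pinned down: the processor signature returned in EAX by CPUID leaf 1 encodes family, model and stepping, which is what errata documents are indexed by. The decoder below only handles the bit fields; how the raw EAX value is obtained (inline assembly, a compiler intrinsic, or a boot-time dump) is left open, and the names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID with EAX=1.
 * The extended family is added only when the base family is 0xF;
 * the extended model is prepended only for base family 0x6 or 0xF. */
struct cpu_signature decode_cpuid_signature(uint32_t eax)
{
    unsigned base_model  = (eax >> 4) & 0xFu;
    unsigned base_family = (eax >> 8) & 0xFu;
    unsigned ext_model   = (eax >> 16) & 0xFu;
    unsigned ext_family  = (eax >> 20) & 0xFFu;

    struct cpu_signature s;
    s.stepping = eax & 0xFu;
    s.family = (base_family == 0xFu) ? base_family + ext_family : base_family;
    s.model = (base_family == 0x6u || base_family == 0xFu)
                  ? (ext_model << 4) | base_model
                  : base_model;
    return s;
}
```

For instance, a raw value of 0x000006FB decodes to family 6, model 15, stepping 11, and 0x000106A5 decodes to family 6, model 26, stepping 5.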

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 4:01 am
by gerryg400
I would say that is your problem. It's a good example of how a rarely used feature (e.g. hardware task switching) might perhaps receive less thorough testing and thus be more likely to contain a bug.

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 7:06 am
by Combuster
I'd still like to see an Intel document confirming it...

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 2:35 pm
by gerryg400
See page 54 item G45 http://www.orpheuscomputing.com/downloa ... n-spec.pdf

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 6:15 pm
by gerryg400
berkus wrote:
gerryg400 wrote:See page 54 item G45 http://www.orpheuscomputing.com/downloa ... n-spec.pdf
403 with referer logged.
Hey Berkus, what does that mean? I can view that page and download the doc.

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 7:01 pm
by gerryg400
Should I attach the actual document to this post?
prajwal wrote:
But I have a 1 year old laptop which is having a dual core xeon pentium 3 family processor.
Isn't Pentium 3 a lot older than 1 year?

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 8:34 pm
by Brendan
Hi,
gerryg400 wrote:
But I have a 1 year old laptop which is having a dual core xeon pentium 3 family processor.
Isn't Pentium 3 a lot older than 1 year ?
Heh. There are no dual-core Pentium III chips, and Xeons are for servers not laptops.

By omitting the least-likely information and assuming "laptop" and "dual-core" are correct, and also assuming that "1 year old" means it was purchased one year ago (and designed anywhere from 12 months to 3 years ago), I'd assume it's "Core" or "Core 2" (and maybe a "mobile" and/or "Celeron" variation of one of these). Normal Nehalem based CPUs use too much power for laptops and the "mobile Nehalem" CPUs are too new, and PentiumM and Netburst based CPUs are too old.
gerryg400 wrote:See page 54 item G45 http://www.orpheuscomputing.com/downloa ... n-spec.pdf
Item G45 (on page 40 of that document) is "Processor May Assert DRDY# on a Write with No Data", and begins with "When a MASKMOVQ instruction is misaligned across a chunk boundary". Item G72 (the only item on page 54 of that document) is "AGTL+ Receiver May Induce Falling Edge Ledges and Undershoot Levels". Neither of these seems likely.



Cheers,

Brendan

Re: [SOLVED] Triple Fault on TaskSwitch (Xeon Processor)

Posted: Mon Dec 06, 2010 10:07 pm
by gerryg400
Sorry, that should have been G54 on page 45. Two typos! But since it's prolly not a Pentium III, I guess it's not this issue...