Trying to Debug A Triple-Fault in Init

AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Post by AJ »

Hi,

*EDIT: Don't waste time on this - answer below!*

Firstly, I'd better say "hello". It's been a while - children, work, divorce, business and so on...

Anyway - I've picked up my old long-mode higher-half kernel, which boots as far as adding a second thread and doing some cooperative multitasking. The whole thing is C++, booted with BootBoot, and I'm writing it in as architecture-agnostic a way as possible. For example, I have a "Machine" class which is inherited by "Pc" and will later also be inherited by RPi...you get the idea. Same thing for the Cpu and X86_64 classes. When I add a single private uint64_t field to my Pc class, I suddenly go from a decently working early kernel to something that seems to triple fault at random.

I'm running on QEMU with GDB attached. I thought my problem was something happening in GCC's _init, as the CPU reset seems to happen there, but on looking further, there's no single instruction that seems to be the problem. For example:

0xfffffffff80055dd in frame_dummy ()
(gdb)
0xfffffffff8005610 in frame_dummy ()
(gdb)
0xfffffffff800561a in frame_dummy ()
(gdb)
0x000000000000fff0 in ?? ()
(gdb)
and on another occasion:

(gdb) si
0xfffffffff80054d4 in register_tm_clones ()
(gdb) si
0x000000000000fff0 in ?? ()
Sometimes, it even gets as far as my own constructor code.

Remove that uint64_t field from my Pc class, and all is well again.

So logic seems to suggest that:

1. The size of the Pc object is causing this (the same happens if I add a void* field).
2. It's not happening in a single location, so we're looking at something like an interrupt.

BUT: this all happens before I set up my IDT, and IF is zero. When I remove that one field, suddenly we go back to booting successfully with a working IDT, GDT, TSS, Page Frame Allocator, new and delete doing their thing...

I haven't yet done anything with my APs, and if an AP is detected entering kmain, it is halted.

I just don't quite know where to start. Any pointers?

Cheers,
Adam

Re: Trying to Debug A Triple-Fault in Init

Post by AJ »

Do you ever work on something for a whole day, post a question and then suddenly the answer hits you?

I'll leave this here for anyone else daft enough to do the same as me. The solution perfectly explains the "random" nature of this - it was an AP.

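if (Cpu::IsBsp())
{
	// Only the bootstrap processor runs GCC's init code (global constructors).
	_init();

	DebugConsole& debug = DebugConsole::GetInstance();
	Machine& machine = Machine::GetInstance();
	machine.AddDefaultConsoleDevices(debug);

	INFO("Initialising Caracal v1.0");
	VINFO("BSP ID: " << (uint64_t)Cpu::ProcessorId());

	if (Machine::GetInstance().Boot())
	{
		INFO("Architecture-specific boot routine complete");
	}
	else
	{
		FATAL("Boot routine failed");
	}
}
else
{
	// The culprit: an AP can get here before the BSP has finished _init(),
	// so Machine may not be initialised yet when GetInstance() is called.
	Machine::GetInstance().HaltCurrentCore();
}

...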
That's the gist of my kmain function.

Something about adding an extra field to Pc (which inherits Machine) made my initialisation take long enough that the AP kicked in and attempted to use the Machine::GetInstance() function before Machine was initialised. D'oh! Swap the Machine::GetInstance().HaltCurrentCore() call in the else branch for a simple for loop, and the problems go away.

Now all I need to do is have the APs wait for a signal from the BSP, and they can all get playing nicely with each other.
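
For anyone wondering what that BSP-to-AP signal might look like, here is a minimal sketch, not the actual Caracal code. The names g_bspReady, SignalApsReady and WaitForBsp are hypothetical. The important assumption is that the flag is constant-initialised (std::atomic<bool> has a constexpr constructor), so it already holds a valid value before _init() has run any global constructors, and the AP path touches nothing else.

```cpp
#include <atomic>

// Hypothetical flag (not from the original kernel). It is constant-initialised,
// so it is valid even before _init() has run the global constructors.
static std::atomic<bool> g_bspReady{false};

// BSP side: call once _init() and the early boot routine have completed.
inline void SignalApsReady()
{
	g_bspReady.store(true, std::memory_order_release);
}

// AP side: spin until the BSP says it is safe to use constructed objects.
// Nothing in here depends on a global constructor having run.
inline void WaitForBsp()
{
	while (!g_bspReady.load(std::memory_order_acquire))
	{
#if defined(__x86_64__)
		__builtin_ia32_pause(); // spin-wait hint on x86-64 (GCC/Clang builtin)
#endif
	}
}
```

Under these assumptions the BSP would call SignalApsReady() at the end of its branch in kmain, and each AP would call WaitForBsp() in the else branch before touching Machine or anything else that relies on a constructor. The release/acquire pair is there so that whatever the BSP wrote before signalling is visible to an AP once it sees the flag set.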

Thanks for reading. Back in style...

Adam