[SOLVED] Page Fault when calling a virtual function

RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

[SOLVED] Page Fault when calling a virtual function

Post by RowDaBoat »

Hi,

I'm developing a 64-bit kernel in C++ and I'm trying to solve a bug that causes an Exception 14 (Page Fault) at address 0x0.
The exception is raised when I call a virtual method on a local object. The very first thing the kernel does is instantiate the object, and the virtual call is the second.

What puzzles me most is that the error seems to disappear if I comment out a few lines of code after the buggy call, but reappears as soon as I uncomment them. I have verified that those lines don't actually get executed before the crash by printing a character to the first byte of video memory.

At first I thought that I was accidentally writing over the vtable, but the kernel crashes immediately on the first call to a virtual method, before I even write to memory. Then I considered the possibility of having a missing section in my linker script, and verified that I have text, rodata, data and bss sections present. After that I considered that I might be missing a proper paging setup, or just writing outside the mapped region, but the bootloader is already taking care of mapping a big chunk of memory before jumping into long mode.

Has anyone seen similar behaviour? I really appreciate your time and help.

This is my setup:
Language: C++
Bootloader: Pure64
Cross Compiler: gcc 4.8.2 x86-64

Linker script:

Code: Select all

OUTPUT_FORMAT("binary")
ENTRY(loader)
SECTIONS
{
	.text 0x0000000000100000 :
	{
		*(.text)
	}
	rodata = .;
	.rodata :
	{
		*(.rodata)
	}
	data = .;
	.data :
	{
		*(.data)
	}
	bss = .;
	.bss :
	{
		*(.bss)
	}
	endOfKernel = .;
}
Here's a reduced version of my kernel's code:

Code: Select all

int main() {
	*((char*)(0xb8000)) = 'K';
	WyrmLog log;
	log.clear();  //Clear calls the function that causes the problem
	*((char*)(0xb8000)) = 'R'; //This never gets executed.

	//If I comment out the lines below, the error on the previous line stops happening
	MemoryMap map(log, (void*)(getStackBase() + sizeof(uint64_t)));
	PageAllocator pageAllocator(map.startAddress(), map.totalMemory(), log);
	configurePageAllocator(map, pageAllocator);
	
	void * mem = pageAllocator.alloc(1);
	long long int test = 1;
	RedBlackTreeNodeImpl<long long int> * node = new (mem) RedBlackTreeNodeImpl<long long int>(test);
	...
}

//Log is the base class of WyrmLog
class Log {
public:
	virtual Log& chr(int c) = 0;
	...
};

Log& WyrmLog::clear() {
	*((char*)(0xb8000)) = 'F';

	for (int i = 0; i < Columns * Lines; i++) {
		chr(' '); //This is the pure virtual method causing the problem
	}

	xPos = yPos = 0;
	return *this;
}

Log& WyrmLog::chr(int c) {
	*((char*)(0xb8000)) = '1'; //This never gets executed
	...
	return *this;
}
*EDIT* I changed the post title to a more proper name and added a bit more of my source code.
*EDIT* I corrected an allocation on the NULL address (it was not being executed anyway)
Last edited by RowDaBoat on Thu Dec 25, 2014 4:25 pm, edited 2 times in total.
If I could only switch you
If I could set your stack
If I could only switch you
That would really be a breakthru
Octocontrabass
Member
Posts: 5590
Joined: Mon Mar 25, 2013 7:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by Octocontrabass »

What is the command line you use for compiling your kernel?

Does the compiler/linker print any warnings or errors?

Unfortunately I'm not familiar enough with C++ to give you any help beyond the wiki.
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

Thanks for your response!

No warnings are printed :(
This is the line I use to link my kernel:

Code: Select all

/home/arkanrow/opt/cross/bin/x86_64-elf-ld -T Wyrm.ld -nostdlib -o WyrmKernel.bin Loader.o Wyrm.o libsupc++.o  MemoryManager/MemoryManager.a Log/Log.a Collections/Collections.a Scheduler/Scheduler.a StandardLibrary/stdlib.a CPU/cpulib.a /home/arkanrow/opt/cross/lib/gcc/x86_64-elf/4.8.2/libgcc.a
The static libraries and object files are being built with lines similar to the following:

Code: Select all

/home/arkanrow/opt/cross/bin/x86_64-elf-g++ -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/StandardLibrary -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/Collections -fno-exceptions -fno-rtti -c Log.cpp -o Log.o
/home/arkanrow/opt/cross/bin/x86_64-elf-g++ -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/StandardLibrary -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/Collections -fno-exceptions -fno-rtti -c WyrmLog.cpp -o WyrmLog.o
/home/arkanrow/opt/cross/bin/x86_64-elf-ar rvs Log.a Log.o WyrmLog.o
I've been following that exact wiki page from the beginning, and I have actually implemented the __cxa_pure_virtual function required by gcc. Any thoughts?
AndrewAPrice
Member
Posts: 2303
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Page Fault when calling the implementation of a pure vir

Post by AndrewAPrice »

What is the class definition of WyrmLog?
My OS is Perception.
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

These are the definitions for Log and WyrmLog:

Code: Select all

class Log {
private:
	static char buf[128];

public:
	virtual Log& newline() = 0;
	virtual Log& tab() = 0;
	virtual Log& chr(int c) = 0;
	virtual Log& clear() = 0;

	Log& str(const char * string);
	Log& bits8(uint8_t value);
	Log& bits16(uint16_t value);
	Log& bits32(uint32_t value);
	Log& bits64(uint64_t value);
	Log& addr(void * value);
	Log& laddr(void * value);
	Log& haddr(void * value);
	Log& count(uint32_t value);
	Log& dec(int32_t value);
	Log& hex(uint64_t value, int digits);
	Log& sizeb(uint64_t value);
	Log& sizeKb(uint64_t value);
	Log& sizeMb(uint64_t value);

protected:
	void printBase(uint32_t value, int completeTo, int base);
	int uintToBase(uint32_t value, char * buf, int base);
};

Code: Select all

class WyrmLog : public Log {
private:
	static const int Columns = 80;
	static const int Lines = 25;
	static const int Video = 0xB8000;

	static char buf[128];

	VideoCell * video;
	int xPos;
	int yPos;
	char color;

public:
	WyrmLog();
	Log& newline();
	Log& tab();
	Log& chr(int c);
	Log& clear();
	Log& mark(char c);
	//TODO Tab character

private:
	void scroll();
};
AndrewAPrice
Member
Posts: 2303
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Page Fault when calling the implementation of a pure vir

Post by AndrewAPrice »

Hmm, a page fault accessing address 0 sounds like the vtable isn't filled out for your log object, causing it to jump to 0. Maybe the constructor isn't being called? What about "WyrmLog log()"?
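One way to check this theory is to read the object's hidden vtable pointer and the first slot it points to, and see whether either is zero. This is only a sketch, assuming the Itanium C++ ABI that GCC uses on x86-64 (the vtable pointer is the first pointer-sized word of a polymorphic object). `Base`, `Impl`, `vtablePointer` and `firstSlot` are hypothetical names, not part of the kernel in this thread, and a freestanding kernel would have to supply its own memcpy.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for Log and WyrmLog.
struct Base {
    virtual Base& chr(int c) = 0;
};

struct Impl : Base {
    Base& chr(int) { return *this; }
};

// Assumption: Itanium C++ ABI, where the first pointer-sized word of a
// polymorphic object holds the address of its vtable's first function slot.
inline std::uintptr_t vtablePointer(const Base& object) {
    std::uintptr_t vptr = 0;
    std::memcpy(&vptr, &object, sizeof vptr);
    return vptr;
}

// Reads the first function slot of that vtable (here, the slot for chr).
inline std::uintptr_t firstSlot(const Base& object) {
    std::uintptr_t slot = 0;
    std::memcpy(&slot, reinterpret_cast<const void*>(vtablePointer(object)), sizeof slot);
    return slot;
}
```

If `vtablePointer(log)` is sane but `firstSlot(log)` reads as 0, the table itself is missing or damaged in the loaded image rather than the object.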
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

I've checked again and the constructor is being called.
Isn't the line WyrmLog log(); interpreted as a function declaration returning WyrmLog by the compiler?
Thanks!
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Page Fault when calling the implementation of a pure vir

Post by Schol-R-LEA »

If you are declaring the object as class WyrmLog rather than Log, and using a local variable rather than a pointer to memory on the heap, then it shouldn't matter if it is a virtual function or not; it will always use the WyrmLog version of the function directly, without a vtable lookup. Whatever the cause of the problem is, it isn't because it is a pure virtual function.

As for

Code: Select all

WyrmLog log();
That would be an explicit call to the c'tor for WyrmLog in the local variable declaration, which is what you want.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Page Fault when calling the implementation of a pure vir

Post by Schol-R-LEA »

A few observations:
  • Since only one WyrmLog can be directed at the video at once, you may want to make WyrmLog explicitly a singleton class.
  • In the methods you posted earlier, you return *this for method declared &WyrmLog. You should only be declaring them as this; the extra indirection isn't needed, as this is already a pointer. Your compiler ought to have thrown a warning, or even an error, on that; add the -Wall switch to your build command string to make sure it does in the future.
  • Remember that the video text buffer is made up of character-attribute pairs. I would make a simple packed struct type made up of a char and a uint8_t to represent the video elements:

    Code: Select all

    struct __attribute__ ((__packed__))  DisplayChar
    {
        char text;
        uint8_t attribute;
    };
    You'll need to #include <cstdint> for the uint8_t type, which means having a version of it in your library. This is advisable anyway, as the size-specified types are more precise when it comes to system work.
  • You have the video base as a static const int variable. You should be using a pointer rather than an int for this. If you use the struct above, change that to static DisplayChar * const TEXT_BUFFER_PAGE_0 = (DisplayChar *) 0xB8000;.
  • You should make your video handling (not just for logging) flexible enough to handle different text modes. As such, the max width and max height should not be constants.
  • You have the base of the video buffer, but you don't have anything representing its current state. You need a pointer to the current char-attribute pair for the cursor, and some sort of indicator for the text mode. A method for moving the cursor, along the lines of goto_xy(), would be advisable as well.
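A minimal sketch of the character-attribute pair idea, written so that it can be tried against an ordinary array instead of the real buffer at 0xB8000. `DisplayChar` follows the struct above; `putCell` and `kColumns` are hypothetical names used only for illustration, and the 80-column layout is an assumption.

```cpp
#include <cstdint>

// One cell of the text buffer: character byte first, attribute byte second.
struct __attribute__((__packed__)) DisplayChar {
    char text;
    std::uint8_t attribute;
};

static_assert(sizeof(DisplayChar) == 2, "a text-mode cell must be exactly two bytes");

const int kColumns = 80;  // assumption: 80x25 text mode

// Writes one cell at column x, row y in a buffer laid out row by row.
inline void putCell(DisplayChar* buffer, int x, int y, char c, std::uint8_t attribute) {
    DisplayChar* cell = buffer + (y * kColumns + x);
    cell->text = c;
    cell->attribute = attribute;
}
```

In a kernel the buffer argument would be the text buffer address cast to `DisplayChar*`, assuming that address is mapped.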
xenos
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia
Contact:

Re: Page Fault when calling the implementation of a pure vir

Post by xenos »

Schol-R-LEA wrote:In the methods you posted earlier, you return *this for method declared &WyrmLog. You should only be declaring them as this; the extra indirection isn't needed, as this is already a pointer. Your compiler ought to have thrown a warning, or even an error, on that; add the -Wall switch to your build command string to make sure it does in the future.
The type &WyrmLog is a reference type, not a pointer. If the declaration was *WyrmLog, then your statement would be correct and the function would be supposed to return a pointer, so returning this would be correct. But for the reference type &WyrmLog it should be correct to return *this.
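The difference can be sketched with a small class that offers both return styles. `ChainLog`, `chrPtr` and `count` are hypothetical names for illustration only.

```cpp
// Hypothetical logger used only to contrast the two return types.
class ChainLog {
public:
    ChainLog() : count(0) {}

    // Reference return type: the returned expression must be an object, hence *this.
    ChainLog& chr(int) { ++count; return *this; }

    // Pointer return type: `this` is already a pointer, so it is returned as-is.
    ChainLog* chrPtr(int) { ++count; return this; }

    int count;
};
```

With the reference version calls chain as `logger.chr('a').chr('b');`, with the pointer version as `logger.chrPtr('a')->chrPtr('b');`.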
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Page Fault when calling the implementation of a pure vir

Post by Schol-R-LEA »

*Hits self on head* Oh, right. Never mind, it was early morning and I must have spaced on that. Using -Wall is still a good idea, though. You want as many warnings as you can get, even if you think you know they don't apply to what you are doing, because they often do indicate something you have overlooked.

In fact, I spaced on a number of things I should have noticed, like the fact that WyrmLog does have a cursor variable, which is a pointer to a video cell type just as I described. D'oh!
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

Schol-R-LEA wrote:Since only one WyrmLog can be directed at the video at once, you may want to make WyrmLog explicitly a singleton class.
Yes, I'm actually refactoring the constant out by receiving it in the constructor. I hate singletons.
Schol-R-LEA wrote:In the methods you posted earlier, you return *this for method declared &WyrmLog.
What XenOS said. Plus, yes, I should be using -Wall; how could I forget that :shock:
Schol-R-LEA wrote:Remember that the video text buffer is made up of character-attribute pairs.
That's exactly what the line

Code: Select all

VideoCell * video;
is doing; VideoCell is my char+attrib pair.
Schol-R-LEA wrote:You'll need to #include <cstdint> for the uint8_t type, which means having a version of it in your library.
I created my own version of that header, named stdint.h. I'm not sure now why I chose that name, probably because I never liked includes that don't end in .h.
Schol-R-LEA wrote:You have the video base as a static const int variable.
Oh my god... how could I? Changing that.
Schol-R-LEA wrote:You should make your video handling (not just for logging) flexible enough to handle different text modes. As such, the max width and max height should not be constants.
Sure, but (for now) the purpose of WyrmLog is just to log the bootstrap steps of the kernel in 80x25 text mode. My ideal kernel would handle logging in a userspace process, which would include all the more complex handling you suggested. (I guess that won't happen for a while :P)
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

Thanks a lot for your replies!
Schol-R-LEA wrote:If you are declaring the object as class WyrmLog rather than Log, and using a local variable rather than a pointer to memory on the heap, then it shouldn't matter if it is a virtual function or not; it will always use the WyrmLog version of the function directly, without a vtable lookup. Whatever the cause of the problem is, it isn't because it is a pure virtual function.
Yes, you are right, on a second look the place where I'm calling 'chr' shouldn't require a vtable lookup. However, I just tried removing the virtual keyword and implementing 'chr' as an empty function in the base, and now it works: it actually runs the code in chr, but crashes again with the same error when it tries to call another function declared as pure virtual (newline).

So, based on these behaviors, I'm almost sure that somehow methods declared as pure virtual are getting messed up in my kernel.
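For what it's worth, a call such as chr(' ') inside clear() is an implicit this->chr(' '), so in the general case it is still dispatched through the vtable even when clear() itself was invoked on a local object (an optimizer may devirtualize it, but that is not guaranteed). A minimal sketch with hypothetical classes `BaseLog`, `ScreenLog` and `CountingLog`, none of which come from the kernel:

```cpp
class BaseLog {
public:
    virtual BaseLog& chr(int c) = 0;
};

class ScreenLog : public BaseLog {
public:
    ScreenLog() : screenWrites(0) {}
    BaseLog& chr(int) { ++screenWrites; return *this; }

    // clear() is not virtual here, yet the chr call below goes through `this`.
    BaseLog& clear() {
        for (int i = 0; i < 4; i++) {
            chr(' ');  // implicit this->chr(' '): dispatched on the dynamic type
        }
        return *this;
    }

    int screenWrites;
};

// Overrides chr again, so a virtual call from clear() must land here.
class CountingLog : public ScreenLog {
public:
    CountingLog() : overrides(0) {}
    BaseLog& chr(int) { ++overrides; return *this; }
    int overrides;
};
```

Calling clear() on a local CountingLog runs CountingLog::chr four times and ScreenLog::chr never, which shows that the inner call is resolved through the vtable.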
Schol-R-LEA wrote:As for WyrmLog log();
I'm pretty sure that WyrmLog log(); is the declaration of a function named log returning a concrete WyrmLog object. I have tried that before, and the compilation errors that it generates seem to back up my belief. I have also searched for this on Stack Overflow: http://stackoverflow.com/questions/5300 ... onstructor.
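This can be checked at compile time with the standard `std::is_function` trait. A small sketch, where `Widget` is a hypothetical stand-in for WyrmLog and `emptyParensDeclareFunction` is a name made up for the example (compilers usually also warn about the first declaration):

```cpp
#include <type_traits>

// Hypothetical stand-in for WyrmLog.
struct Widget {
    Widget() : ready(true) {}
    bool ready;
};

// Returns true when `Widget name();` is parsed as a function declaration
// and `Widget name;` as an object definition.
inline bool emptyParensDeclareFunction() {
    Widget logger();  // declares a function named logger; no Widget is constructed
    Widget object;    // defines and default-constructs a Widget
    return std::is_function<decltype(logger)>::value
        && !std::is_function<decltype(object)>::value
        && object.ready;
}
```

To default-construct a local object, omit the parentheses (`Widget object;`) or, in C++11 and later, use braces (`Widget object{};`).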
RowDaBoat
Posts: 13
Joined: Tue Nov 04, 2014 12:01 pm

Re: Page Fault when calling the implementation of a pure vir

Post by RowDaBoat »

Hi, I managed to make some progress!
tl;dr I used . = ALIGN(4096); on every section in the linker script and the bug is no longer happening.

However, this does not satisfy me since I don’t completely understand why this fixed my problem.
This is the whole process I followed:
First of all, since I’m not doing anything really weird with memory, I assumed that both versions of my kernel (the crashing and the working one) are bugged, but only one showed the symptoms, so I could debug the working version to find the error.
I logged the address of my kernel’s sections and I found out that they were:

Code: Select all

text = 0x100000
rodata = 0x10232E
data = 0x1028E6
bss = 0x103670
Then I debugged with bochs to obtain the address of the vtable for WyrmLog. It happened to be 0x103580, which is inside the data section.

My output format is "flat binary" (a bootloader requirement). Normally in an ELF file the vtable is in the rodata section, and I suppose this is the same for flat binaries. I considered that maybe my kernel was being loaded incorrectly by the bootloader.
So I thought about changing my bootloader to GRUB2 and read this article, http://wiki.osdev.org/Creating_a_64-bit_kernel, where I noticed that the linker script had page-aligned sections. I tried that in my script and presto… it worked.
However, when I debugged with bochs again I found out that the addresses of my sections were

Code: Select all

text = 0x100000
rodata = 0x103000
data = 0x104000
bss = 0x105000
and the vtable is still located inside the data section at address 0x104c90
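The new section addresses are consistent with plain round-up-to-page arithmetic. A small sketch, assuming the section sizes did not change between the two builds; `alignUp` and the constant names are hypothetical, and the input values are the addresses logged above:

```cpp
#include <cstdint>

// Rounds address up to the next multiple of alignment (a power of two),
// which is what `. = ALIGN(4096);` does to the location counter.
constexpr std::uint64_t alignUp(std::uint64_t address, std::uint64_t alignment) {
    return (address + alignment - 1) & ~(alignment - 1);
}

// Section starts logged before the change.
constexpr std::uint64_t oldRodata = 0x10232E;
constexpr std::uint64_t oldData   = 0x1028E6;
constexpr std::uint64_t oldBss    = 0x103670;

// Assumption: each section keeps its size and starts at the next page boundary.
constexpr std::uint64_t newRodata = alignUp(oldRodata, 4096);
constexpr std::uint64_t newData   = alignUp(newRodata + (oldData - oldRodata), 4096);
constexpr std::uint64_t newBss    = alignUp(newData + (oldBss - oldData), 4096);
```

Under that assumption the three results match the addresses logged after the change, so the alignment moved the sections without otherwise rearranging them.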

So these are my new questions:
  • Why does this work? Do sections have to be page aligned because of the memory protection model or is there any other reason I still don’t know of?
  • Have I really fixed the error, or is it working just by chance and it might reappear in the future when the kernel grows?
  • Why did my vtables end up in the data section?
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Page Fault when calling the implementation of a pure vir

Post by Combuster »

This just sounds like a typical memory corruption bug. Sadly those are also the most annoying to debug. What you can do is dump the address and first 16 bytes at various places in the code to the screen and check when any of them change. Looking up where in the data section that address points might also help indicate which code is responsible for that mess.
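A sketch of such a dump helper, written without any library calls so it could be dropped into a freestanding kernel. `hexDump` is a hypothetical name; in the kernel the resulting text would be sent to the log rather than kept in a buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Formats `count` bytes starting at `address` as two hex digits each,
// separated by spaces and terminated by '\0'.
// `out` must have room for at least 3 * count characters (1 if count is 0).
inline void hexDump(const void* address, std::size_t count, char* out) {
    static const char digits[] = "0123456789ABCDEF";
    const std::uint8_t* bytes = static_cast<const std::uint8_t*>(address);
    std::size_t pos = 0;
    for (std::size_t i = 0; i < count; i++) {
        out[pos++] = digits[bytes[i] >> 4];
        out[pos++] = digits[bytes[i] & 0x0F];
        out[pos++] = (i + 1 < count) ? ' ' : '\0';
    }
    if (count == 0) {
        out[0] = '\0';
    }
}
```

Calling it with the vtable address and a count of 16 right after the constructor, and again just before the failing call, shows whether those bytes change in between.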
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]