[SOLVED] Page Fault when calling a virtual function
Posted: Wed Nov 26, 2014 11:01 pm
by RowDaBoat
Hi,
I'm developing my 64-bit kernel in C++ and I'm trying to solve a bug that causes an Exception 14 (Page Fault) at address 0x0.
The exception is raised when I call a virtual method on a local object. The very first thing the kernel does is instantiate the object, and the virtual call is the second.
What puzzles me most is that the error seems to disappear if I comment out a few lines of code after the buggy call, but reappears as soon as I uncomment them. I have verified that those lines don't actually get executed before the crash by printing a character to the first byte of video memory.
At first I thought that I was accidentally writing over the vtable, but the kernel crashes immediately on the first call to a virtual method, before I even write to memory. Then I considered the possibility of a missing section in my linker script, and verified that the text, rodata, data and bss sections are present. After that I considered that I might be missing a proper paging setup, or writing outside the mapped pages, but the bootloader already takes care of mapping a big chunk of memory before jumping into long mode.
Has anyone seen similar behaviour? I really appreciate your time and help.
This is my setup:
Language: C++
Bootloader: Pure64
Cross Compiler: gcc 4.8.2 x86-64
Linker script:
Code: Select all
OUTPUT_FORMAT("binary")
ENTRY(loader)
SECTIONS
{
.text 0x0000000000100000 :
{
*(.text)
}
rodata = .;
.rodata :
{
*(.rodata)
}
data = .;
.data :
{
*(.data)
}
bss = .;
.bss :
{
*(.bss)
}
endOfKernel = .;
}
Here's a reduced version of my kernel's code:
Code: Select all
int main() {
    *((char*)(0xb8000)) = 'K';
    WyrmLog log;
    log.clear(); //clear() calls the function that causes the problem
    *((char*)(0xb8000)) = 'R'; //This never gets executed.
    //If I comment out the lines below, the error on the previous line stops happening
    MemoryMap map(log, (void*)(getStackBase() + sizeof(uint64_t)));
    PageAllocator pageAllocator(map.startAddress(), map.totalMemory(), log);
    configurePageAllocator(map, pageAllocator);
    void * mem = pageAllocator.alloc(1);
    long long int test = 1;
    RedBlackTreeNodeImpl<long long int> * node = new (mem) RedBlackTreeNodeImpl<long long int>(test);
    ...
}
//Log is the base class of WyrmLog
class Log {
public:
    virtual Log& chr(int c) = 0;
    ...
};
Log& WyrmLog::clear() {
    *((char*)(0xb8000)) = 'F';
    for (int i = 0; i < Columns * Lines; i++) {
        chr(' '); //This is the pure virtual method causing the problem
    }
    xPos = yPos = 0;
    return *this;
}
Log& WyrmLog::chr(int c) {
    *((char*)(0xb8000)) = '1'; //This never gets executed
    ...
    return *this;
}
*EDIT* I changed the post title to a more proper name and added a bit more of my source code.
*EDIT* I corrected an allocation on the NULL address (it was not being executed anyway)
Re: Page Fault when calling the implementation of a pure vir
Posted: Thu Nov 27, 2014 4:02 pm
by Octocontrabass
What is the command line you use for compiling your kernel?
Does the compiler/linker print any warnings or errors?
Unfortunately I'm not familiar enough with C++ to give you any help beyond the wiki.
Re: Page Fault when calling the implementation of a pure vir
Posted: Thu Nov 27, 2014 4:40 pm
by RowDaBoat
Thanks for your response!
No warnings are printed.
This is the line I use to link my kernel:
Code: Select all
/home/arkanrow/opt/cross/bin/x86_64-elf-ld -T Wyrm.ld -nostdlib -o WyrmKernel.bin Loader.o Wyrm.o libsupc++.o MemoryManager/MemoryManager.a Log/Log.a Collections/Collections.a Scheduler/Scheduler.a StandardLibrary/stdlib.a CPU/cpulib.a /home/arkanrow/opt/cross/lib/gcc/x86_64-elf/4.8.2/libgcc.a
The static libraries and object files are being built with lines similar to the following:
Code: Select all
/home/arkanrow/opt/cross/bin/x86_64-elf-g++ -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/StandardLibrary -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/Collections -fno-exceptions -fno-rtti -c Log.cpp -o Log.o
/home/arkanrow/opt/cross/bin/x86_64-elf-g++ -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/StandardLibrary -I/media/psf/Home/Projects/C-C++/Wyrm/Kernel/Collections -fno-exceptions -fno-rtti -c WyrmLog.cpp -o WyrmLog.o
/home/arkanrow/opt/cross/bin/x86_64-elf-ar rvs Log.a Log.o WyrmLog.o
I've been following that exact wiki page from the beginning, and I have actually implemented the __cxa_pure_virtual function required by gcc. Any thoughts?
Re: Page Fault when calling the implementation of a pure vir
Posted: Thu Nov 27, 2014 7:43 pm
by AndrewAPrice
What is the class definition of WyrmLog?
Re: Page Fault when calling the implementation of a pure vir
Posted: Thu Nov 27, 2014 10:34 pm
by RowDaBoat
These are the definitions for Log and WyrmLog:
Code: Select all
class Log {
private:
static char buf[128];
public:
virtual Log& newline() = 0;
virtual Log& tab() = 0;
virtual Log& chr(int c) = 0;
virtual Log& clear() = 0;
Log& str(const char * string);
Log& bits8(uint8_t value);
Log& bits16(uint16_t value);
Log& bits32(uint32_t value);
Log& bits64(uint64_t value);
Log& addr(void * value);
Log& laddr(void * value);
Log& haddr(void * value);
Log& count(uint32_t value);
Log& dec(int32_t value);
Log& hex(uint64_t value, int digits);
Log& sizeb(uint64_t value);
Log& sizeKb(uint64_t value);
Log& sizeMb(uint64_t value);
protected:
void printBase(uint32_t value, int completeTo, int base);
int uintToBase(uint32_t value, char * buf, int base);
};
Code: Select all
class WyrmLog : public Log {
private:
static const int Columns = 80;
static const int Lines = 25;
static const int Video = 0xB8000;
static char buf[128];
VideoCell * video;
int xPos;
int yPos;
char color;
public:
WyrmLog();
Log& newline();
Log& tab();
Log& chr(int c);
Log& clear();
Log& mark(char c);
//TODO Tab character
private:
void scroll();
};
Re: Page Fault when calling the implementation of a pure vir
Posted: Thu Nov 27, 2014 11:25 pm
by AndrewAPrice
Hmm, a page fault accessing address 0 sounds like the vtable isn't filled out for your log object, causing it to jump to 0. Maybe the constructor isn't being called? What about "WyrmLog log()"?
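To make the "jump to 0" idea concrete, here is a small hand-rolled model of a virtual call. It is only a sketch under assumptions: FakeObject, Slot, fake_chr, kGoodTable, kZeroedTable and call_slot0 are hypothetical names, and the layout is a simplification of the usual scheme (an object carries a hidden pointer to a table of function addresses), not the compiler's actual generated code.

```cpp
#include <cassert>

struct FakeObject;                              // hypothetical model of an object with a vtable pointer
typedef int (*Slot)(FakeObject* self, int c);   // one table entry: a plain function address

struct FakeObject {
    const Slot* vptr;   // models the hidden vtable pointer stored inside the object
    int lastChar;
};

// Plays the role of an overriding method such as chr().
int fake_chr(FakeObject* self, int c) {
    self->lastChar = c;
    return c;
}

const Slot kGoodTable[1]   = { &fake_chr };  // table filled in as intended
const Slot kZeroedTable[1] = { nullptr };    // table whose contents are missing or were overwritten

// Models "load slot 0 from the table, then call it".
// A real virtual call has no null check: it would jump to address 0 and fault.
// The check exists only so that this sketch can run safely in user space.
int call_slot0(FakeObject* obj, int c) {
    Slot target = obj->vptr[0];
    if (target == nullptr) {
        return -1;  // stands in for the page fault at 0x0
    }
    return target(obj, c);
}

void demo_fake_vtable() {
    FakeObject good = { kGoodTable, 0 };
    FakeObject bad  = { kZeroedTable, 0 };
    assert(call_slot0(&good, 'x') == 'x');
    assert(call_slot0(&bad, 'x') == -1);
}
```

If the real table looks like kZeroedTable at run time, the interesting question becomes whether those bytes were never loaded or were overwritten later.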
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 12:24 am
by RowDaBoat
I've checked again and the constructor is being called.
Isn't the line WyrmLog log(); interpreted as a function declaration returning WyrmLog by the compiler?
Thanks!
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 6:48 am
by Schol-R-LEA
If you are declaring the object as class WyrmLog rather than Log, and using a local variable rather than a pointer to memory on the heap, then it shouldn't matter if it is a virtual function or not; it will always use the WyrmLog version of the function directly, without a vtable lookup. Whatever the cause of the problem is, it isn't because it is a pure virtual function.
As for WyrmLog log(); that would be an explicit call to the c'tor for WyrmLog in the local variable declaration, which is what you want.
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 7:20 am
by Schol-R-LEA
A few observations:
- Since only one WyrmLog can be directed at the video at once, you may want to make WyrmLog explicitly a singleton class.
- In the methods you posted earlier, you return *this for methods declared &WyrmLog. You should only be returning this; the extra indirection isn't needed, as this is already a pointer. Your compiler ought to have thrown a warning, or even an error, on that; add the -Wall switch to your build command string to make sure it does in the future.
- Remember that the video text buffer is made up of character-attribute pairs. I would make a simple packed struct type made up of a char and a uint8_t to represent the video elements:
Code: Select all
struct __attribute__ ((__packed__)) DisplayChar
{
    char text;
    uint8_t attribute;
};
You'll need to #include <cstdint> for the uint8_t type, which means having a version of it in your library. This is advisable anyway, as the size-specified types are more precise when it comes to system work.
- You have the video base as a static const int variable. You should be using a pointer rather than an int for this. If you use the struct above, change that to something like static DisplayChar * const TEXT_BUFFER_PAGE_0, initialized to (DisplayChar *) 0xB8000.
- You should make your video handling (not just for logging) flexible enough to handle different text modes. As such, the max width and max height should not be constants.
- You have the base of the video buffer, but you don't have anything representing its current state. You need a pointer to the current char-attribute pair for the cursor, and some sort of indicator for the text mode. A method for moving the cursor, along the lines of goto_xy(), would be advisable as well.
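The suggestions in this list can be sketched roughly as follows. This is a minimal illustration, not WyrmLog's actual code: TextScreen, put_char and demo_text_screen are hypothetical names, the wrap-around behaviour is an assumption, and an ordinary array stands in for the buffer at 0xB8000 so the sketch can run outside a kernel.

```cpp
#include <cassert>
#include <cstdint>

// One text-mode cell: a character byte followed by an attribute (colour) byte.
struct __attribute__((__packed__)) DisplayChar {
    char text;
    std::uint8_t attribute;
};
static_assert(sizeof(DisplayChar) == 2, "a text-mode cell must be exactly two bytes");

// Hypothetical view of a text buffer with a cursor; width and height are data, not constants.
struct TextScreen {
    DisplayChar* base;   // 0xB8000 on real hardware, any array in a test
    int columns;
    int lines;
    int x;               // cursor column
    int y;               // cursor line
};

// Move the cursor to the given column and line.
void goto_xy(TextScreen& s, int x, int y) {
    s.x = x;
    s.y = y;
}

// Write one cell at the cursor and advance it, wrapping to the next line.
// Scrolling is deliberately left out; the cursor stops advancing on the last line.
void put_char(TextScreen& s, char c, std::uint8_t attribute) {
    DisplayChar& cell = s.base[s.y * s.columns + s.x];
    cell.text = c;
    cell.attribute = attribute;
    if (++s.x == s.columns) {
        s.x = 0;
        if (s.y + 1 < s.lines) {
            ++s.y;
        }
    }
}

void demo_text_screen() {
    static DisplayChar fake[80 * 25];          // stand-in for video memory
    TextScreen screen = { fake, 80, 25, 0, 0 };
    goto_xy(screen, 3, 1);
    put_char(screen, 'K', 0x07);
    assert(fake[1 * 80 + 3].text == 'K');
    assert(fake[1 * 80 + 3].attribute == 0x07);
    assert(screen.x == 4 && screen.y == 1);
}
```

Keeping the dimensions in the struct rather than in constants is what would let the same code serve a different text mode later.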
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 7:37 am
by xenos
Schol-R-LEA wrote: In the methods you posted earlier, you return *this for method declared &WyrmLog. You should only be declaring them as this; the extra indirection isn't needed, as this is already a pointer. Your compiler ought to have thrown a warning, or even an error, on that; add the -Wall switch to your build command string to make sure it does in the future.
The type &WyrmLog is a reference type, not a pointer. If the declaration was *WyrmLog, then your statement would be correct and the function would be supposed to return a pointer, so returning this would be correct. But for the reference type &WyrmLog it should be correct to return *this.
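A small sketch of the difference, using a hypothetical Counter class in place of WyrmLog: a method with a reference return type returns *this, while a method with a pointer return type returns this.

```cpp
#include <cassert>

struct Counter {   // hypothetical class, standing in for WyrmLog
    int n;
    Counter() : n(0) {}

    // Reference return type: the object itself is returned, so dereference this.
    Counter& bumpRef() { ++n; return *this; }

    // Pointer return type: the address is returned, so return this unchanged.
    Counter* bumpPtr() { ++n; return this; }
};

// Chains both styles on one object and reports how many calls were made.
int chained_total() {
    Counter c;
    c.bumpRef().bumpRef();     // reference style chains with '.'
    c.bumpPtr()->bumpPtr();    // pointer style chains with '->'
    assert(&c.bumpRef() == &c);
    return c.n;
}
```

Both forms refer to the same object; only the chaining syntax at the call site differs.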
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 7:57 am
by Schol-R-LEA
*Hits self on head* Oh, right. Never mind, it was early morning and I must have spaced on that. The comment about using -Wall is still a good idea, though. You want as many warnings as you can get, even if you think you know they don't apply to what you are doing, because they often do indicate something you have overlooked.
In fact, I spaced on a number of things I should have noticed, like the fact that WyrmLog does have a cursor variable, which is a pointer to a video cell type just as I described. D'oh!
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 9:10 am
by RowDaBoat
Schol-R-LEA wrote: Since only one WyrmLog can be directed at the video at once..., you may want to make WyrmLog explicitly a singleton class.
Yes, I'm actually refactoring the constant out by receiving it in the constructor. I hate singletons.
Schol-R-LEA wrote: In the methods you posted earlier, you return *this for method declared &WyrmLog.
What XenOS said, plus, yes, I should be using -Wall, how could I forget that.
Schol-R-LEA wrote: Remember that the video text buffer is made up of character-attribute pairs.
That's exactly what the line VideoCell * video; is doing; VideoCell is my char+attrib pair.
Schol-R-LEA wrote: You'll need to #include <cstdint> for the uint8_t type, which means having a version of it in your library.
I created my own version of that header, stdint.h. I'm not sure now why I gave it that name, probably because I never liked includes that don't end in .h.
Schol-R-LEA wrote: You have the video base as a static const int variable.
Oh my god... how could I? Changing that.
Schol-R-LEA wrote: You should make your video handling (not just for logging) flexible enough to handle different text modes. As such, the max width and max height should not be constants.
Sure, but (for now) the purpose of WyrmLog is just to log the bootstrap steps of the kernel in 80x25 text mode. My ideal kernel would handle logging in a userspace process, which would include all the more complex handling you suggested. (I guess that won't happen for a while.)
Re: Page Fault when calling the implementation of a pure vir
Posted: Fri Nov 28, 2014 9:28 am
by RowDaBoat
Thanks a lot for your replies!
Schol-R-LEA wrote: If you are declaring the object as class WyrmLog rather than Log, and using a local variable rather than a pointer to memory on the heap, then it shouldn't matter if it is a virtual function or not; it will always use the WyrmLog version of the function directly, without a vtable lookup. Whatever the cause of the problem is, it isn't because it is a pure virtual function.
Yes, you are right, on a second look the place where I'm calling 'chr' shouldn't require a vtable lookup. However, I just tried removing the virtual keyword and implementing 'chr' as an empty function in the base, and now it works: it actually runs the code in chr, but it crashes again with the same error when it tries to call another function declared as pure virtual (newline).
So, based on these behaviors, I'm fairly sure that somehow, methods declared as pure virtual are getting messed up in my kernel.
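One detail worth checking here: a call such as chr(' ') made from inside another member function goes through this, so in general the compiler still has to dispatch it dynamically, even when the outer call was made on a local object. The sketch below shows that language rule with hypothetical classes (BaseLog, ScreenLog, CountingLog); it says nothing about what a particular optimizer does with the real WyrmLog.

```cpp
#include <cassert>

struct BaseLog {                       // hypothetical stand-in for Log
    virtual ~BaseLog() {}
    virtual BaseLog& chr(int c) = 0;
};

struct ScreenLog : BaseLog {           // hypothetical stand-in for WyrmLog
    int written = 0;
    BaseLog& chr(int) override { ++written; return *this; }

    // chr(' ') here means this->chr(' '): the override of the dynamic type is chosen.
    BaseLog& clear() {
        for (int i = 0; i < 4; i++) {
            chr(' ');
        }
        return *this;
    }
};

struct CountingLog : ScreenLog {       // a further-derived class that overrides chr again
    int intercepted = 0;
    BaseLog& chr(int) override { ++intercepted; return *this; }
};

// clear() is ScreenLog's code, yet every chr() call lands in CountingLog.
int intercepted_calls() {
    CountingLog log;
    log.clear();
    assert(log.written == 0);
    return log.intercepted;
}
```

Because ScreenLog::clear() cannot know whether it is running on a ScreenLog or on something derived from it, the call inside the loop is the kind that normally reads the vtable.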
As for
Code:
WyrmLog log();
I'm pretty sure that WyrmLog log(); is the declaration of a function named log returning a concrete WyrmLog object. I have tried that before, and the compilation errors it generates seem to back up my belief. I have also searched for this on Stack Overflow: http://stackoverflow.com/questions/5300 ... onstructor.
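The parsing rule can be checked directly with type traits. In this sketch Widget is a hypothetical stand-in for WyrmLog, and the two helper functions are only there to make the result observable; some compilers also emit a warning on the declaration with parentheses.

```cpp
#include <cassert>
#include <type_traits>

struct Widget {          // hypothetical stand-in for WyrmLog
    int value;
    Widget() : value(7) {}
};

// True if "Widget log();" inside a function body declares a function.
bool parens_declare_a_function() {
    Widget log();   // parsed as: a function named log, no parameters, returning Widget
    return std::is_function<decltype(log)>::value;
}

// True if "Widget log;" defines an object and runs the default constructor.
bool no_parens_define_an_object() {
    Widget log;
    return std::is_same<decltype(log), Widget>::value && log.value == 7;
}

void demo_vexing_parse() {
    assert(parens_declare_a_function());
    assert(no_parens_define_an_object());
}
```

So with empty parentheses no object is created and no constructor runs; writing WyrmLog log; is the form that default-constructs a local object.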
Re: Page Fault when calling the implementation of a pure vir
Posted: Sun Dec 07, 2014 8:14 am
by RowDaBoat
Hi, I managed to make some progress!
tl;dr I used . = ALIGN(4096); on every section in the linker script and the bug is no longer happening.
However, this does not satisfy me since I don’t completely understand why this fixed my problem.
This is the whole process I followed:
First of all, since I’m not doing anything really weird with memory, I assumed that both versions of my kernel (the crashing and the working one) are bugged, but only one showed the symptoms, so I could debug the working version to find the error.
I logged the addresses of my kernel’s sections and found that they were:
Code: Select all
text = 0x100000
rodata = 0x10232E
data = 0x1028E6
bss = 0x103670
Then I debugged with bochs to obtain the address of the vtable for WyrmLog. It happened to be 0x103580, which is inside the data section.
My output format is "flat binary" (bootloader requirement). Normally in an ELF file, the vtable is in the rodata section, and I suppose this is the same for flat binaries. I considered that maybe my kernel was being loaded incorrectly by the bootloader.
So I thought about changing my bootloader to GRUB2 and read this article: http://wiki.osdev.org/Creating_a_64-bit_kernel. Then I noticed that the linker script there had page-aligned sections. I tried that in my script and presto… it worked.
However, when I debugged with bochs again I found that the addresses of my sections were:
Code: Select all
text = 0x100000
rodata = 0x103000
data = 0x104000
bss = 0x105000
and the vtable is still located inside the data section, at address 0x104c90.
So these are my new questions:
- Why does this work? Do sections have to be page aligned because of the memory protection model or is there any other reason I still don’t know of?
- Have I really fixed the error, or is it working just by chance and it might reappear in the future when the kernel grows?
- Why did my vtables end up in the data section?
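The two address listings above are consistent with plain page rounding, which can be checked with a little arithmetic. The sketch below assumes each section kept the same size between the two builds; align_up, Layout and page_align are hypothetical helper names. It only reproduces the numbers and does not explain why the aligned layout avoids the fault.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t kPageSize = 4096;   // the value passed to ALIGN(4096)

// Round addr up to the next multiple of alignment (alignment must be a power of two).
std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}

struct Layout {
    std::uint64_t text, rodata, data, bss;   // start address of each section
};

// Recompute section starts with page alignment, keeping every section's size
// (assumption: the sizes did not change between the two builds).
Layout page_align(const Layout& old) {
    Layout out;
    out.text   = old.text;
    out.rodata = align_up(out.text   + (old.rodata - old.text),   kPageSize);
    out.data   = align_up(out.rodata + (old.data   - old.rodata), kPageSize);
    out.bss    = align_up(out.data   + (old.bss    - old.data),   kPageSize);
    return out;
}

void demo_layout() {
    Layout before = { 0x100000, 0x10232E, 0x1028E6, 0x103670 };
    Layout after = page_align(before);
    assert(after.rodata == 0x103000);
    assert(after.data   == 0x104000);
    assert(after.bss    == 0x105000);
}
```

Starting from the unaligned addresses, this yields 0x103000, 0x104000 and 0x105000, the same values reported after adding ALIGN(4096).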
Re: Page Fault when calling the implementation of a pure vir
Posted: Mon Dec 08, 2014 1:30 am
by Combuster
This just sounds like a typical memory corruption bug. Sadly those are also the most annoying to debug. What you can do is dump the address and first 16 bytes at various places in the code to the screen and check when any of them change. Looking up where in the data section that address points might also help indicate which code is responsible for that mess.
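A minimal sketch of that technique, assuming a region of interest such as the vtable address found with the debugger. Snapshot, take_snapshot and first_changed_byte are hypothetical names; the loops avoid library calls so that the same idea could be carried into a freestanding kernel, where the result would be printed through the log instead of asserted.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kWatchBytes = 16;   // how many leading bytes to watch

struct Snapshot {
    const volatile unsigned char* address;   // where the bytes were read from
    unsigned char bytes[kWatchBytes];        // their values at snapshot time
};

// Copy the first 16 bytes at address so they can be compared later.
Snapshot take_snapshot(const void* address) {
    Snapshot s;
    s.address = static_cast<const volatile unsigned char*>(address);
    for (std::size_t i = 0; i < kWatchBytes; i++) {
        s.bytes[i] = s.address[i];
    }
    return s;
}

// Return the index of the first byte that differs from the snapshot, or -1 if none does.
int first_changed_byte(const Snapshot& s) {
    for (std::size_t i = 0; i < kWatchBytes; i++) {
        if (s.address[i] != s.bytes[i]) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

void demo_snapshot() {
    unsigned char region[kWatchBytes] = { 0 };   // stand-in for the watched memory
    Snapshot s = take_snapshot(region);
    assert(first_changed_byte(s) == -1);
    region[5] = 0xAA;                            // simulated stray write
    assert(first_changed_byte(s) == 5);
}
```

Calling first_changed_byte at several points along the boot path narrows down between which two checkpoints the stray write happens.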