(Fixed) Array Triple Fault

Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

LtG wrote:
Octacone wrote: page_directory = (page_directory_t*) page_directory_address;
so TUI.Put_Hex((uint32_t) & page_directory, 0x0E); should return 0x11D000 right? No! It returns 0x11E050. Stuff likely getting overwritten. Right?
page_directory is a pointer, would you want to print that, not the _address_ of the pointer (the ampersand "&" in the second line).

Also I don't really get the point in your page_directory_t, why are there some virtual page tables? Also, I think customarily the "typedef struct X {..} X_t" stuff has the _t only on the latter "name", not the first one (X vs X_t)..
Because for some reason when I do (uint32_t) page_directory instead of (uint32_t) & page_directory it returns 0x0.
But then again something is horribly wrong. Even if I set it to 0xDEAD it returns 0x0.
Don't worry I know that:
pointer address = (uint32_t) that pointer
variable address = (uint32_t) & that variable

The other thing you asked is just my personal pedantic preference. My code looked at altogether is very clean, "pixel" perfect in a way. I have a very personal coding style.
Last edited by Octacone on Sat Jul 15, 2017 6:48 am, edited 2 times in total.
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

eryjus wrote: OK, so what do we know? I mean, really know and not assume to be true.

We know that cr3 has the value 0 in it when the system triple faults. I assume that it is written correctly initially (and the triple fault does not happen on the instruction immediately following setting the cr3 register on purpose).

Now, if you are setting cr3 on purpose only once in your code, then you are executing something else that is setting it unintentionally. What might that be?

* You are executing data
* You are overwriting code
* You have several pages mapped to the same frame (my bet is on this one right now)

In particular, you are going to have a very big uphill battle to convince many people you found a compiler bug. Trust me on this (personal experience), even when you have convinced yourself 47 different ways you have a bug you probably don't. Let's assume there isn't a compiler bug.

So, how do we go about determining the real reason for the failure? Guessing at the cause and commenting out some code is not the optimal way to get to the root cause.

I would recommend you use something like `i686-elf-objdump -d kernel.elf` to compare your registers at crash with the line of assembly in EIP. Then scroll up until you can determine which function that is. Then go and review your C++ function to determine what line it is failing on and in particular which part of that line it is. Then, what are you assuming to be true with that line/state and are those valid assumptions?

I'm not trying to call you out, but I would not assume that your paging code is perfect yet (http://forum.osdev.org/viewtopic.php?f= ... 36&start=0). Remember, this is all still new and you might still have a bug buried deep in this code. My point is that you will want to disassemble what is in memory at the address of EIP at the time of the crash and compare it to what was in the output of *-objdump -- do they match? If not, find out why not. Verify your paging structures. You already indicate that this might be a problem, so it might be worthwhile to follow this line of thinking.
Okay let's assume it is not a compiler bug.
I did everything as you suggested. Here is some info:
EIP (00100a3b) does not point to the start of any function; the closest thing I could find is this (note that the listing omits the two leading zeros):

00100a34 <Set_Paging_Bit>:
  100a34:	8b 44 24 04          	mov    0x4(%esp),%eax
  100a38:	0f 22 c0             	mov    %eax,%cr0
  >>> 100a3b:	c3                   	ret  <<< 
The question remains: why does assigning a value to something fail? something = some address does nothing; it just remains 0.
Something is likely being overwritten. Duplicate mappings: why would there be any? I'll take a closer look once again, but I doubt it.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Undefined Array Triple Fault

Post by iansjack »

"ret" causes error = corrupted stack.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

iansjack wrote:"ret" causes error = corrupted stack.
That's the answer I've been looking for. So your theory did prove correct after all. Good job!

Now the question is, why does it happen? Any clues?

Edit: Google suggests the usual causes: bad pointers, assembly functions, violations of the one definition rule, pointers to stack variables.
Bochs says bx_dbg_read_linear: physical address not available for linear 0x0000000000100a3b, notice the 0x100A3B address?
Last edited by Octacone on Sat Jul 15, 2017 7:57 am, edited 1 time in total.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Undefined Array Triple Fault

Post by iansjack »

You have to inspect the stack just before the "ret" to see what values are on it. You also need to check that esp is correct. Then you need to work backwards to see what went wrong. Using watches in gdb can help to determine how a particular memory location is being written to.
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Re: Undefined Array Triple Fault

Post by simeonz »

The investigation shows that the page directory pointer in cr3 is simply uninitialized or incorrectly initialized. The first memory access after enabling paging, the ret instruction, causes a page fault. It is not that the stack is incorrectly mapped, as we initially believed; the entire virtual address space is incoherent, hence the triple fault.
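
To make that concrete, here is a minimal sketch of the lookup the MMU performs once the paging bit is set, written as ordinary user-space C++ so it can be run outside the kernel. Everything in it is an assumption for illustration: the names (`is_mapped`, `kPresent`), the use of a plain array in place of physical memory, and classic 4 KiB pages without PSE or PAE. It is not the code from this thread.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>

// Hypothetical, simplified model of 32-bit x86 paging (4 KiB pages only).
constexpr uint32_t kPresent = 0x1;  // bit 0 of a directory or table entry

// Returns true if `linear` has a present page directory entry and a present
// page table entry. `tables` stands in for physical memory: the frame number
// stored in a directory entry is used directly as an index into `tables`.
bool is_mapped(const uint32_t* directory,
               const uint32_t (*tables)[1024],
               size_t table_count,
               uint32_t linear) {
    assert(directory != nullptr && tables != nullptr);

    uint32_t pde = directory[linear >> 22];           // top 10 bits
    if (!(pde & kPresent)) return false;              // no page table here

    uint32_t table_index = pde >> 12;                 // "frame" of the table
    if (table_index >= table_count) return false;     // points outside the model

    uint32_t pte = tables[table_index][(linear >> 12) & 0x3FF];  // middle 10 bits
    return (pte & kPresent) != 0;
}
```

With an all-zero directory the function returns false for every address, including 0x100a3b and whatever ESP points at, which would be consistent with the "physical address not available" message from Bochs.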

The array you were investigating earlier has little to do with this directly. Unless your bootloader is loading the executable improperly, whatever side effects the array over-allocation causes are arbitrary. Please do investigate your pre-paging code with the debugger; that is the best approach for any problem like this.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Undefined Array Triple Fault

Post by iansjack »

That's true, if there is a page fault. If there is, investigating the stack in gdb will reveal this as it will produce a "memory inaccessible" error.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

You guys are amazing!

After some painful line-by-line stepping, pressing Enter just before the "ret" instruction revealed:
Cannot access memory at address 0x100a34
Here is the screenshot:
Image

ESP changed from 0x11CF94 to 0x11CF64, normal? Don't think so.
Last edited by Octacone on Sat Jul 15, 2017 2:29 pm, edited 1 time in total.
Ch4ozz
Member
Posts: 170
Joined: Mon Jul 18, 2016 2:46 pm
Libera.chat IRC: esi

Re: Undefined Array Triple Fault

Post by Ch4ozz »

So, if you had done what I said 3 pages ago, we would have been done much earlier.
Disassemblers don't lie; you just have to understand assembly.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

Ch4ozz wrote: So, if you had done what I said 3 pages ago, we would have been done much earlier.
Disassemblers don't lie; you just have to understand assembly.
GDB is a very useful tool, once you learn how to use it.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

What is the next step? Do you think it is paging caused or kernel caused?
LtG
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: Undefined Array Triple Fault

Post by LtG »

Octacone wrote:What is the next step? Do you think it is paging caused or kernel caused?
It seems to be caused by you setting CR3 to zero (or not setting it at all, so it is still zero from system reset). Or did I miss some replies? So either single-step through the whole code, or set a breakpoint where you're supposed to set CR3 and then single-step (or use breakpoints) until the triple fault; by that time CR3 should be zero. Now you know during which code sequence it's zeroed; find where that happens and fix it.
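
One cheap way to catch this earlier is to refuse to load CR3 with a value that cannot be right. A minimal sketch, assuming the physical memory manager signals failure by returning 0 and never hands out frame 0; both function names are made up for illustration, and the hosted `assert` stands in for whatever panic routine the kernel has:

```cpp
#include <cassert>
#include <stdint.h>

// A page directory base must be 4 KiB aligned. Treating 0 as invalid is an
// assumption of this sketch (allocator returns 0 on failure), not a hardware rule.
bool is_plausible_page_directory(uint32_t physical_address) {
    return physical_address != 0 && (physical_address & 0xFFFu) == 0;
}

// Intended to be called right before the value is written to CR3.
uint32_t checked_cr3_value(uint32_t physical_address) {
    assert(is_plausible_page_directory(physical_address));
    return physical_address;
}
```

With the values from this thread, 0x11D000 passes, while 0x0 and 0x11E050 (the address of the pointer variable rather than its value) are both rejected.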

Octacone wrote:
LtG wrote:
Octacone wrote: page_directory = (page_directory_t*) page_directory_address;
so TUI.Put_Hex((uint32_t) & page_directory, 0x0E); should return 0x11D000 right? No! It returns 0x11E050. Stuff likely getting overwritten. Right?
page_directory is a pointer, would you want to print that, not the _address_ of the pointer (the ampersand "&" in the second line).

Also I don't really get the point in your page_directory_t, why are there some virtual page tables? Also, I think customarily the "typedef struct X {..} X_t" stuff has the _t only on the latter "name", not the first one (X vs X_t)..
Because for some reason when I do (uint32_t) page_directory instead of (uint32_t) & page_directory it returns 0x0.
But then again something is horribly wrong. Even if I set it to 0xDEAD it returns 0x0.
Don't worry I know that:
pointer address = (uint32_t) that pointer
variable address = (uint32_t) & that variable
Not sure if this is a communication issue or something else, but I think everything you said in that quote is wrong. I'm not sure if you understand how pointers work.

When you take the address of a pointer, the result may very well be 0x11E050. Given that in practice pointers alias with addresses (though the two are different things, we can relatively safely assume they fully alias each other here):
0x11E050 is where the compiler has decided to store your page_directory variable (which IIRC is itself a pointer).

Note also that if you set _the pointer_ to a random value (0xDEAD) and then print the value at the pointed-to address (0xDEAD in the VM), it's entirely possible that there's nothing there and you would get 0x0. The virtual machine may very well have zeroed out the memory.
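
A small sketch of the difference, for reference. The struct layout and the helper names are placeholders, not the code from this thread; 0x11D000 is just the value quoted above and is never dereferenced:

```cpp
#include <cassert>
#include <stdint.h>

struct page_directory_t { uint32_t entries[1024]; };  // placeholder layout

// The value stored in the pointer: where the page directory lives.
uintptr_t pointer_value(page_directory_t* const& p) {
    return reinterpret_cast<uintptr_t>(p);
}

// The address of the pointer variable itself: where the compiler put `p`.
uintptr_t pointer_location(page_directory_t* const& p) {
    return reinterpret_cast<uintptr_t>(&p);
}

void pointer_demo() {
    page_directory_t* page_directory =
        reinterpret_cast<page_directory_t*>(0x11D000);

    // (uint32_t) page_directory corresponds to this: the stored address.
    assert(pointer_value(page_directory) == 0x11D000u);

    // (uint32_t) &page_directory corresponds to this: wherever the variable
    // itself happens to live, unrelated to 0x11D000.
    assert(pointer_location(page_directory) ==
           reinterpret_cast<uintptr_t>(&page_directory));
}
```

If printing the pointer itself shows 0x0 right after the assignment, then the value being assigned is 0x0, and that is the thing to chase.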

Given all this, I suggested above that it's entirely possible that this is either a communication issue or you don't really understand pointers.

But in any case unless the CR3 is shown to be valid at the time (immediately before) of the triple fault, then I'd concentrate on why CR3 is invalid (zero).

Octacone wrote: The other thing you asked is just my personal pedantic preference. My code looked at altogether is very clean, "pixel" perfect in a way. I have a very personal coding style.
I'm assuming this refers to the typedef stuff (and not the virtual page stuff)? Having the struct named the same as the typedef doesn't seem pedantic to me; I mean, you are giving two _different_ things the same name, which just seems confusing. Also of note, the _t at the end is used to signify it's a typedef, and the first occurrence isn't a typedef. But to each their own I guess.

Btw, I also asked about the virtual page tables in your structs, what are they about?
LtG
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: Undefined Array Triple Fault

Post by LtG »

Again, with respect to the typedefs: if your issue is that you don't want extra names introduced and that's why you use the same name twice, you can also do the following:
typedef struct {
    int a;
    float b;
} name_of_struct_t;

Notice there's no name between "struct" and the first "{".. Or maybe you have some other reason why you want to use the same name twice?
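
Since the kernel is C++ rather than C, a further option (assuming nothing in the code base also has to compile as C) is to drop the typedef altogether, because in C++ the struct name is already a type name. The names below are hypothetical:

```cpp
#include <cassert>
#include <stdint.h>

// In C++ a struct name can be used as a type without the `struct` keyword,
// so no typedef is needed to avoid writing "struct some_struct" everywhere.
struct page_table_entry {   // hypothetical name, for illustration only
    uint32_t value;
};

// `page_table_entry` is used directly as a parameter type here.
uint32_t frame_of(page_table_entry entry) {
    return entry.value & 0xFFFFF000u;  // upper 20 bits hold the frame address
}

void typedef_free_example() {
    page_table_entry entry{0x11D003};  // frame 0x11D000, low flag bits set
    assert(frame_of(entry) == 0x11D000u);
}
```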

Also note that some people think typedef shouldn't be used to give structs shorter names in this manner, but that's a bit more offtopic.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

LtG wrote: Your reply. x2
Replying to both of your replies at once, since I can't nest more than 3 quotes for some reason.

I don't want it to be located at 0x0. PMM should provide an address for it to be placed at.
This could be a communication issue as well.
So I created a variable (page_directory_t* page_directory) that I need to place at some address returned by the physical memory manager. Can't be more simple than that.
The problem lies right there, why is it 0?
Right now, as I am writing this reply, I am also investigating a strange bug of sorts: the PMM actually returns 0x0 (for some baloney reason). More on this later once I confirm it; it worked perfectly before.
As for the other thing, that is just my personal preference, plus I don't have to write "struct some_struct" every time so that the compiler knows it is a struct. I just like it that way; it can't do any harm.
Octacone
Member
Posts: 1138
Joined: Fri Aug 07, 2015 6:13 am

Re: Undefined Array Triple Fault

Post by Octacone »

Here are some things I discovered.

-O2 enabled, PMM works perfectly, returns page aligned addresses, all good.
-O2 disabled, PMM broken as hell, only returns 0x0.

I am so mad at the compiler. Why does it keep breaking my code! Then that means, no PMM -> no VMM. Thus it is probably not VMM's fault.

Edit:
Managed to track down the issue.
You would never have guessed...
GRUB only reports 16 KILOBYTES of memory available with -O2 disabled. 16 freaking kilobytes, and then I wonder why my memory managers don't work.
Now I need to find out why that is.
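
While tracking that down, it may be worth checking that the memory fields are valid before trusting them. A hedged sketch, assuming the kernel is booted via the Multiboot 1 protocol (the thread does not say how the memory size is read): the bootloader magic 0x2BADB002 is passed in EAX, and `mem_lower`/`mem_upper` are only meaningful when bit 0 of `flags` is set. The struct below is a hypothetical cut-down view of the first three fields only, and `reported_memory_kib` is an invented helper:

```cpp
#include <cassert>
#include <stdint.h>

constexpr uint32_t kMultibootBootloaderMagic = 0x2BADB002u;
constexpr uint32_t kFlagMemValid = 1u << 0;  // mem_lower / mem_upper are valid

// Hypothetical subset: only the first three fields of the Multiboot 1
// information structure, which is all this check needs.
struct multiboot_info_head {
    uint32_t flags;
    uint32_t mem_lower;  // KiB of memory starting at address 0
    uint32_t mem_upper;  // KiB of memory starting at 1 MiB
};

// Returns the total reported KiB, or 0 if the values cannot be trusted.
uint32_t reported_memory_kib(uint32_t magic, const multiboot_info_head* info) {
    if (magic != kMultibootBootloaderMagic || info == nullptr) return 0;
    if (!(info->flags & kFlagMemValid)) return 0;
    return info->mem_lower + info->mem_upper;
}

void memory_report_example() {
    multiboot_info_head info{kFlagMemValid, 640, 130048};  // made-up values
    assert(reported_memory_kib(kMultibootBootloaderMagic, &info) == 130688u);
}
```

If the magic or the flag check fails only in the unoptimized build, that would point at how the values reach the C++ code (for example, the registers or stack slots they are read from) rather than at GRUB itself.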