
Bochs translates page inconsistently?

Posted: Thu Dec 28, 2017 2:59 pm
by nakst
Hi,
I've been debugging a peculiar bug in my kernel that so far has only appeared on Bochs.
Through printf-debugging I managed to narrow down what was going wrong to a certain memory location.
In the Bochs GUI debugger, I used the `page` command to see which physical address the memory is at.
I then did a linear memory dump with the virtual address, and a physical memory dump with the address from the `page` command.
Thinking they would be the same, I was surprised to find that while the physical dump gave the values I was expecting, the linear memory dump gave completely different values - as if the page address had been translated differently.
I'm worried this might be a bug in Bochs - although I could be wrong. I'm using version 2.6.0, compiled with instructions from http://wiki.osdev.org/Bochs#Compiling_Bochs_from_Source.
Does anyone know what might be happening?

Re: Bochs translates page inconsistently?

Posted: Thu Dec 28, 2017 10:32 pm
by MichaelPetch
Do you have a project on GitHub (or a similar service) that we could look at? If we could test your actual code and you told us how to reproduce the problem, we might be able to give you an answer. Although Bochs has bugs, I'd be surprised if the paging mechanism were the issue, but it is possible.

Maybe you didn't properly invalidate a page at some point?

Re: Bochs translates page inconsistently?

Posted: Fri Dec 29, 2017 12:44 am
by stlw
>Maybe you didn't properly invalidate a page at some point?

Good guess about the TLB.
BTW, you should also be able to dump Bochs' internal TLB through the param tree or using the 'tlb' shortcut, and see what you have in the TLB at the same time.
You can also enable memory tracing in the Bochs debugger using 'trace-mem on' and see what actually happens in Bochs during the memory access.

Re: Bochs translates page inconsistently?

Posted: Fri Dec 29, 2017 12:51 pm
by nakst
Okay, it's working now!

Here are the two things I did:

1. I updated to Bochs 2.6.8 (I couldn't get 2.6.9 to compile for some reason), which appeared to get rid of the inconsistent page translation in the debugger. (It also broke ATA DMA, which is something else I'll need to investigate.)

2. I found and fixed a longstanding error in my TLB invalidation code that was causing my bug.


uintptr_t virtualAddressU = _virtualAddress;
_virtualAddress &= 0x0000FFFFFFFFF000;
...
tlbShootdownVirtualAddress = _virtualAddress; // <-- The bug is here!
tlbShootdownPageCount = pageCount;
tlbShootdownRemainingProcessors = scheduler.processors - 1;
ProcessorSendIPI(TLB_SHOOTDOWN_IPI);

On the marked line, I changed `_virtualAddress` to `virtualAddressU`, fixing my problem.
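
The post does not spell out why the masked value broke invalidation. One plausible explanation, assuming the address was a higher-half kernel address and the shootdown handler passes it to `invlpg`: the mask `0x0000FFFFFFFFF000` clears bits 63..48, which turns a higher-half address into a non-canonical one, and as far as I know `invlpg` on a non-canonical address in 64-bit mode does nothing. The sketch below only illustrates the arithmetic. The helper names and the example address are hypothetical and not taken from the kernel in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The mask used in the snippet above: it keeps bits 47..12 only. */
#define TABLE_INDEX_MASK 0x0000FFFFFFFFF000ULL

/* Hypothetical helper: with 48-bit virtual addresses, an address is
 * canonical when bits 63..47 are all zeros or all ones. */
bool is_canonical48(uint64_t address) {
    uint64_t upper = address >> 47;   /* the 17 bits 63..47 */
    return upper == 0 || upper == 0x1FFFF;
}

/* Hypothetical alternative: round down to the page base while keeping
 * the upper (sign-extension) bits intact. */
uint64_t page_base(uint64_t address) {
    return address & ~0xFFFULL;
}

/* Small self-check using a made-up higher-half address. */
void check_example(void) {
    uint64_t original = 0xFFFF800000123456ULL;   /* assumed example value */
    uint64_t masked = original & TABLE_INDEX_MASK;

    assert(is_canonical48(original));            /* what the CPU actually uses */
    assert(masked == 0x0000800000123000ULL);     /* upper bits are gone */
    assert(!is_canonical48(masked));             /* no longer a valid target */
    assert(page_base(original) == 0xFFFF800000123000ULL);
    assert(is_canonical48(page_base(original)));
}
```

For a lower-half address the two values only differ in the low 12 bits, which would explain how an error like this can stay unnoticed for a long time.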

Thanks for the help.