[solved?] Trouble with bochs

Posted: Mon May 07, 2007 1:31 am
by lode
Could somebody please tell me why bochs feels so crappy?

Code: Select all

bx_dbg_read_linear: physical memory read error (phy=0xc0001f02, lin=0xc0001f02
<the missing ) actually is missing in bochs also>
OK. Just before this, the OS pulls a null pointer from the scheduler (meaning there are no processes available to run).
So I'm not very surprised that it barfs when it dereferences the null and slams the result into CR3.

BUT

Code: Select all

void schedule()
{
   cp = runque->pop();
   runque->add(cp);

   if (cp == 0)
     cp = idler;
   printf("CP: 0x%08x\n",cp);
}
Why doesn't that if catch the null? In Bochs, that is.
In Bochs what I see is:

Code: Select all

Process id 2 terminated.
CP: 0x00000000
QEMU hangs at this point, using 100% CPU.
Bochs throws the aforementioned error.

In VMware it shows the correct idle thread pointer:

Code: Select all

Process id 2 terminated.
CP: 0xc03fdba0
CP: 0xc03fdba0
CP: 0xc03fdba0
<repeat forever>
Any ideas?

Posted: Mon May 07, 2007 3:02 am
by Combuster
It probably has something to do with relying on undefined behaviour. Under Bochs and QEMU the pointer to the idle thread does not get set and keeps pointing to 0 (which indirectly causes a load from the IVT). You should check that said pointer is indeed written (and possibly that it is not overwritten) under all circumstances. If you have paging enabled, ensure that the TLB contents are valid as well.

Try Bochs' debugger to find out which of these conditions does not hold.
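
For what it's worth, here is a minimal sketch of a scheduler step that makes the idle fallback explicit. `Process`, `RunQueue` and `pick_next` are hypothetical stand-ins, since the thread does not show the real types, and `pop()` is assumed to return null on an empty queue. It differs from the posted `schedule()` in two ways: it asserts that the idle pointer is set, and it checks for null before re-queueing, so a null never lands in the queue.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for the kernel's process structure.
struct Process { int id; };

// Hypothetical run queue; pop() is assumed to return nullptr when empty.
struct RunQueue {
    std::deque<Process*> items;

    Process* pop() {
        if (items.empty()) return nullptr;
        Process* p = items.front();
        items.pop_front();
        return p;
    }

    void add(Process* p) { items.push_back(p); }
};

// Pick the next process to run, falling back to the idle thread.
Process* pick_next(RunQueue& q, Process* idler) {
    // Catch an unset idle pointer here rather than at the CR3 load.
    assert(idler != nullptr && "idle thread must be set before scheduling");

    Process* next = q.pop();
    if (next == nullptr)
        return idler;   // nothing runnable: do not re-queue a null

    q.add(next);        // round-robin: send it to the back of the queue
    return next;
}
```

If the assert fires under one emulator only, that points at initialisation order or memory being overwritten rather than at the `if` itself.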

Posted: Mon May 07, 2007 10:16 am
by lode
The pointer gets set; I added an assert in there.
Now I actually manage to get a page fault for address 0xf... something.
It points to an unmapped area like it should.
The offending value is pulled from 0x000008 physical (real-mode IVT?)
Now, the funny thing is that 0 - 4M is a large page which is NOT MAPPED after boot.
Why won't it fault right at the null dereference?
Doesn't Bochs flush the TLB on a CR3 change?

Posted: Mon May 07, 2007 11:23 am
by lode
Seems that in Bochs my fault handler's attempt to kill the offending process also faults.

Actually, adding a few kernel threads shows that the process manager is unable to find and remove them from the queue on Bochs.
Interesting, since the queue handling is in C.

Posted: Mon May 07, 2007 2:10 pm
by Combuster
Sounds like something along the lines of vaxocentrism or something related to it.
Could you attach the full source code and a floppy image? Things running on only a subset of emulators is usually an indication of wrong assumptions or broken code.

Posted: Mon May 07, 2007 2:28 pm
by lode
The bug went away after adding several printf statements for diagnostics.
Could it be that Bochs was pulling the HD image from some kind of cache?
I hereby reassign all blame from Bochs to Samba. ;)

Edit:
Concerning vaxocentrism I admit items 3 and 5. :twisted: