
Crash on memcpy (movsb) ?

Posted: Sun Dec 17, 2006 12:20 pm
by MagicalTux
Hello,

I recently tried to run my OS on VMware (until now I had always used qemu), and it crashes while loading the OS. After various tests, I found that the error happens while the kernel is being loaded into memory (at 0x100000)...

Error message from VMware:
*** Virtual machine kernel stack fault (hardware reset) ***
The virtual machine just suffered a stack fault in kernel mode. On a real computer, this would amount to a reset of the processor. It can be caused by an incorrect configuration of the virtual machine, a bug in the operating system, or a problem in the VMware Workstation software. Press OK to reboot virtual machine or Cancel to shut it down.

I used my printf() function to see where it crashes, and found that the problem happens during the call to memcpy() (it's just a simple memcpy I coded, which uses movsb, and it works well on qemu).

If anyone knows why this happens, I'd be happy to have an explanation (it seems that I can read and write this part of memory directly, etc.).

I also tried with a simple copy-loop (for+*(mem++)=buf) and I got the same result.

Posted: Sun Dec 17, 2006 2:09 pm
by Brynet-Inc
Interesting...

Code:

void *
memcpy(void *dest, const void *src, size_t len)
{
  char *d = dest;
  const char *s = src;
  while (len--)
    *d++ = *s++;
  return dest;
}

Posted: Sun Dec 17, 2006 3:47 pm
by Combuster
Have you zeroed memory and flushed the appropriate pages if necessary? Skipping either is a common reason why code doesn't run outside emulators.

Posted: Mon Dec 18, 2006 12:55 am
by MagicalTux
Ok, the crash still happens with Brynet-Inc's function (a bit modified).

FYI, this happens in the bootloader. 512 bytes of data are read to memory location 0x68010 (using a 16-bit interrupt), then copied to 0x100000 by the 32-bit bootloader code.

Code:

void *memcpy(void *dest, const void *src, size_t len) {
  kprintf("memcpy: %p->%p (len=%d)\r\n", src, dest, len);
  char *d = dest;
  const char *s = src;
  while (len--)
    *d++ = *s++;
  kprintf("memcpy: finished\r\n");
  return dest;
}
This just shows:

Code:

memcpy: 0x00068010->0x00100000 (len=512)
(and then VMware crashes, still with the same message)

I tried putting the kprintf() inside the while loop... VMware no longer shows a message saying it crashed, but the copy freezes after a few bytes.
Shot: http://mikan.ookoo.org/vmware_crash_memcpy.png

Btw, this is what my initial memcpy() function looked like:

Code:

void *memcpy (void *__restrict __dest,
    __const void *__restrict __src, uint32_t __n) {
  __asm__ ("cld\n"
    "rep\n"
    "movsb"
    : /* no output */
    : "S"(__src), "D"(__dest), "c"(__n));
  return __dest;
}
(found it on IBM's website, and it works well on qemu)
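
A note on the inline assembly above: in GCC's extended asm syntax the first colon-separated list holds outputs and the second holds inputs. Since `rep movsb` modifies ESI, EDI and ECX and writes to memory, those registers are normally declared as read-write operands, with a `"memory"` clobber added. Below is a minimal sketch under those assumptions, not the original author's code. The name `copy_bytes` is hypothetical (chosen to avoid clashing with the C library), and the fallback loop exists only so the snippet builds on non-x86 hosts.

```c
#include <stddef.h>

/* Hypothetical name; a sketch of a rep movsb copy with full constraints. */
void *copy_bytes(void *dest, const void *src, size_t n) {
    void *ret = dest;   /* rep movsb advances the destination register, so save it */
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "cld\n\t"       /* clear the direction flag: copy forward */
        "rep movsb"     /* copy n bytes from [src] to [dest] */
        : "+S"(src), "+D"(dest), "+c"(n)  /* all three are read and modified */
        :                                  /* no further inputs */
        : "memory", "cc");
#else
    /* Fallback for non-x86 hosts: plain byte loop. */
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

None of this would have changed the crash in this thread, since the plain C loop fails the same way, but it keeps the compiler from assuming the registers still hold their old values after the asm statement.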

Posted: Mon Dec 18, 2006 2:48 am
by Walling
Have you enabled the A20 line? If not, writes to 1 MiB (and to every odd megabyte above it) wrap around to the megabyte below instead of reaching the memory you intended. Though it's odd that it freezes at 0x100024 :?
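
To make the wrap-around concrete: with the A20 line disabled, address bit 20 is forced to zero on the bus. The following is a small host-side sketch of that masking (the function name `a20_masked` is made up for illustration), showing where the addresses from this thread would land.

```c
#include <stdint.h>

/* Physical address actually reached while the A20 line is masked:
   address bit 20 is forced to zero, every other bit is unchanged. */
uint32_t a20_masked(uint32_t addr) {
    return addr & ~(UINT32_C(1) << 20);
}
```

Under this model `a20_masked(0x100000)` is `0x000000` and `a20_masked(0x100024)` is `0x000024`, while the source buffer at `0x68010` is unaffected. So the copy would be overwriting the very bottom of memory rather than the intended kernel load address.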

Posted: Mon Dec 18, 2006 6:46 am
by MagicalTux
Ok, it seems to be related to the A20 line. I don't understand why: I had already written code to enable it, but apparently it didn't work.

Well, I guess I'll have to try to fix that, thanks for the help, now I know where I have to look :)
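
One way to find out whether an A20 routine actually worked is to test for aliasing: write different values to two addresses exactly 1 MiB apart and see whether the second write clobbered the first. The following is a rough sketch only. The function name `a20_enabled` is hypothetical, and it assumes both pointers refer to scratch memory that is safe to overwrite briefly, with interrupts off and no paging in the way.

```c
#include <stdint.h>

/* Returns 1 if `low` and `high` are distinct memory (A20 enabled),
   0 if a write through `high` shows up at `low` (A20 masked). */
int a20_enabled(volatile uint32_t *low, volatile uint32_t *high) {
    uint32_t saved_low = *low;
    uint32_t saved_high = *high;

    *low = 0x12345678u;
    *high = 0x87654321u;              /* overwrites *low if the two alias */
    int distinct = (*low == 0x12345678u);

    *high = saved_high;               /* restore original contents */
    *low = saved_low;
    return distinct;
}
```

In a bootloader the two arguments would be some scratch address below 1 MiB and that same address plus 0x100000. On a host machine the function can be tried with two ordinary variables (distinct) or with the same pointer twice (aliased).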


EDIT:
Ok, the code I initially took from grub (I didn't want to bother with that) didn't work, so I replaced it with this tiny snippet:

Code:

        outb(0x92, inb(0x92)|2);
And... it worked! Thanks!
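
For readers reusing this one-liner: port 0x92 is System Control Port A, where bit 1 gates A20 and bit 0 triggers a fast CPU reset on machines that implement the port. Here is a slightly more defensive sketch of the value computation, assuming that bit layout. The helper name `fast_a20_value` is invented for illustration.

```c
#include <stdint.h>

/* Given the current value read from port 0x92, return the value to
   write back: A20 gate (bit 1) set, fast-reset (bit 0) kept clear. */
uint8_t fast_a20_value(uint8_t current) {
    return (uint8_t)((current | 0x02u) & ~0x01u);
}
```

With the `inb`/`outb` helpers from the post above, this would be used as `outb(0x92, fast_a20_value(inb(0x92)));`. Not every machine implements the fast A20 gate, so this is a convenience for common hardware and emulators, not a complete solution.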

Posted: Mon Dec 18, 2006 7:23 am
by Brendan
Hi,
MagicalTux wrote: Ok, the code I initially took from grub (I didn't want to bother with that) didn't work, so I replaced it with this tiny snippet:

Code:

        outb(0x92, inb(0x92)|2);
And... it worked! Thanks!
You might like to read something like this web page about A20 issues.....


Cheers,

Brendan

Posted: Mon Dec 18, 2006 10:43 am
by MagicalTux
Yeah, I know about A20 issues, but so far I've had no bug reports from people with old machines.

Anyway, my OS won't work on old hardware (pre-Pentium), so it's probably not that problematic. Maybe one day more than two people will use my OS, in which case I'll consider implementing a proper fix for the A20 problem.