[SOLVED] My memcpy and memzero destroy the stack???

theesemtheesem
Member
Posts: 31
Joined: Thu Mar 20, 2014 2:22 pm
Location: London, UK

Post by theesemtheesem »

Hi All,

This is my first post so sorry if I do something stupid.
Anyway, to the heart of the matter.

I am developing a kernel in Free Pascal. I have these two procedures:

Code: Select all

procedure kernel_memzero(dest : pointer; numbytes : dword);cdecl; [public, alias: 'kernel_memzero'];
begin
  asm
        mov edi, dest
        mov ecx, numbytes
        xor al, al
        test ecx, ecx           // nothing to do for a zero-length request
        jz @done

@loop1:
        mov [edi], al           // store one zero byte

        add edi, 1
        dec ecx
        jnz @loop1
@done:
  end ['EAX','ECX','EDI'];
end;


procedure kernel_memcpy(source, dest : pointer; numbytes : dword); cdecl; [public, alias: 'kernel_memcpy'];
begin
  asm
        mov esi, source
        mov edi, dest
        mov ecx, numbytes
        test ecx, ecx           // nothing to do for a zero-length request
        jz @done

@loop1:
        mov al, [esi]           // copy one byte from source to dest
        mov [edi], al

        add esi, 1
        add edi, 1
        dec ecx
        jnz @loop1
@done:
  end ['EAX','ECX','EDI','ESI'];
end;
Please don't mind that these are slow and shuffle only 1 byte at a time; at this point that is not important.
The problem is that these DO work when called from another procedure, e.g. my screen handling
procedures use memzero and memcpy to scroll the screen up, all variables are initialized with memzero,
large variables or blocks are moved around, and it all works. However, if I try to use these inside my kmain,
they cause the system to go bananas. The screen gets trashed, and the system reboots. Looking into
Bochs's logs is not very useful: there's info about accessing null descriptors in the GDT, and the cause
of the reboot is always a triple fault. CR2 has a value of FFFFFFFC or something similar, and ESP is 00000000.
Please note that my kernel is not higher half, so addresses like FFFFFFFC should normally invoke the page fault handler.
This does not happen, so it seems that the IDT gets trashed as well. Please note that EIP seems to be more or less where it should be.

So my guess is that somehow the stack gets trashed? Maybe the return address gets trashed? Changing the
size of the stack seems to change the behavior a bit. I mean, if I make the kernel stack ridiculously large, say 128KB
or thereabouts, it is often possible to use memzero once or twice in kmain, as long as the number of bytes to zero/copy
is small. Copying/zeroing 1000 bytes (1000, not 0x1000) always crashes the kernel.

Another funny issue is that even disabling interrupts with cli before calling memzero/memcpy does not help.

Does anyone have any ideas?

Cheers,
Andrew
Last edited by theesemtheesem on Wed Apr 02, 2014 8:06 am, edited 1 time in total.
eryjus
Member
Posts: 286
Joined: Fri Oct 21, 2011 9:47 pm
Libera.chat IRC: eryjus
Location: Tustin, CA USA

Re: My memcpy and memzero destroy the stack???

Post by eryjus »

I cannot speak to the inner workings of Pascal, but I know that in C (cdecl on x86) the calling function "owns" the ESI and EDI registers (the called function must preserve them), while the called function "owns" the ECX and EAX registers (it may clobber them freely). Since you are manipulating registers in asm directly, I would compile to asm and review what is really happening under the covers.
Adam

The name is fitting: Century Hobby OS -- At this rate, it's gonna take me that long!
Read about my mistakes and missteps with this iteration: Journal

"Sometimes things just don't make sense until you figure them out." -- Phil Stahlheber

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

Yes, I thought about that, and disassembled the kernel to see what is going on. It turns out that the registers are saved by the compiler before the asm block starts (it pushes them to the stack and pops them back afterwards). That's what the ['EAX','ECX','EDI'] at the end of the asm block is for. Again, the funny thing is that it works fine when called from anywhere other than kmain (which is the equivalent of C's main). Now I tried to rewrite these functions in plain Pascal (without asm, so to speak), and the problem is still there. Honestly I am losing my sanity here, because this just makes no sense...

Cheers,
Andrew
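
Editor's sketch: the pure-Pascal rewrite mentioned above is not shown in the thread, so the following is only a guess at what such a version could look like. The names pas_memzero and pas_memcpy are hypothetical, and the program runs as an ordinary hosted Free Pascal program rather than inside a kernel. The while loops mean that a zero-length request does nothing.

```pascal
program memutil_sketch;
{$mode objfpc}

{ Hypothetical pure-Pascal stand-ins for kernel_memzero / kernel_memcpy.
  These are NOT the poster's routines, only an illustration. }

procedure pas_memzero(dest: pointer; numbytes: dword);
var
  p: PByte;
  i: dword;
begin
  p := PByte(dest);
  i := 0;
  { test-at-top loop: numbytes = 0 writes nothing }
  while i < numbytes do
  begin
    p[i] := 0;
    Inc(i);
  end;
end;

procedure pas_memcpy(source, dest: pointer; numbytes: dword);
var
  s, d: PByte;
  i: dword;
begin
  s := PByte(source);
  d := PByte(dest);
  i := 0;
  { forward byte-by-byte copy; overlapping regions are not handled }
  while i < numbytes do
  begin
    d[i] := s[i];
    Inc(i);
  end;
end;

var
  a, b: array[0..15] of byte;
  k: integer;
begin
  for k := 0 to 15 do
    a[k] := k + 1;                 { a = 1, 2, ..., 16 }
  pas_memcpy(@a[0], @b[0], 16);    { b becomes a copy of a }
  pas_memzero(@a[0], 8);           { clear the first half of a }
  pas_memzero(@a[8], 0);           { zero-length call leaves a[8] alone }
  writeln(a[7], ' ', a[8], ' ', b[15]);   { prints: 0 9 16 }
end.
```

If a version like this also crashes in the kernel, that points away from the inline asm and towards the environment the routines run in.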
Bender
Member
Posts: 449
Joined: Wed Aug 21, 2013 3:53 am
Libera.chat IRC: bender|
Location: Asia, Singapore

Re: My memcpy and memzero destroy the stack???

Post by Bender »

Hi,

Code: Select all

        mov al, [esi]
        mov [edi], al

        add esi, 1
        add edi, 1
        dec ecx
I don't see any problem with the above code, but since I guess you have tried everything else, try this:

Code: Select all

;; Clear DF (the direction flag) so the string instructions move UP in memory
cld
@loop:
lodsb
stosb
loop @loop
Well, I am assuming that DS:ESI points to your source and ES:EDI to your destination. (Notice the segments.)
I did have some problems with mov al, [esi] before; I have no idea what caused them, but now I use
rep movsb in my memcpy function.
Can you post what output you get? A dump? The Bochs log?
-Bender
"In a time of universal deceit - telling the truth is a revolutionary act." -- George Orwell
(R3X Runtime VM)(CHIP8 Interpreter OS)

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

Thanks for your reply. Your code is more elegant and I think it will be faster, as it uses fewer instructions.
Unfortunately, no change: both still work when called from anywhere else besides kmain, and crash the kernel
with lots of interesting garbage on the screen when called from kmain :(

This is the code in kmain:

Code: Select all

        //kmem_kmalloc(size,owner : dword)   size in bytes, owner - tag for kmem_debug_dumpheapalloc
        //so we can see who allocated each block
        tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);  // alloc a 9MB block
        tmpp2:=kmem_kmalloc((1 shl 20),$ABCD0002);  // alloc a 1MB block
        tmpp3:=kmem_kmalloc((2 shl 20),$ABCD0003);  // alloc a 2MB block
        kconsole_putstr('zero tmpp1');
        kernel_memzero(tmpp1,5000); // zero just the first 5000 bytes
        kconsole_putstr('zero tmpp2');
        kernel_memzero(tmpp2,5000);
        kconsole_putstr('zero tmpp3');
        kernel_memzero(tmpp3,5000);
As You can see I am working on kernel malloc at this moment. I am sure that the addresses returned by kmalloc
are OK - they fall within the range that I have put aside for the kernel heap. Furthermore, when I do:

Code: Select all

        var p : pchar;
(...)
        p:=kmem_kmalloc(10,$ABCD4444);
        p^:='TESTING!'#0;
        kconsole_putstr(p);
everything works. But doing

Code: Select all

        kernel_memzero(p,10);
causes a crash.

It is crazy, since kernel_memzero is, as I explained, used all over the place: it clears the screen, and it is used to clear the
bottom of the screen when it needs to be scrolled (and memcpy moves the rest up).

I am attaching two Bochs log files: one with memzero commented out (bochs_ok.txt), one with memzero actually used (bochs_error.txt). I've been trying to tackle this problem for a week now and it's driving me insane.

Funny note: Bochs and QEMU behave differently. Bochs reboots after showing garbage on the screen, while QEMU just keeps
on going and going and going and spews trash to the screen, but does not reboot.

Any ideas are welcome, at this point I will try just about anything.

Cheers,
Andrew
Attachments
bochs_error.txt: memzero used in kmain
bochs_ok.txt: memzero commented out in kmain

Re: My memcpy and memzero destroy the stack???

Post by Bender »

Is your stack sane? Can you try it without involving PUSH/POP?
jnc100
Member
Posts: 775
Joined: Mon Apr 09, 2007 12:10 pm
Location: London, UK

Re: My memcpy and memzero destroy the stack???

Post by jnc100 »

I note you use the cdecl calling convention in the memcpy routine. Are you also using this at the call site (in kmain)? If not and the calling function is using stdcall I can see how your stack would be trashed.

Regards,
John.
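
Editor's sketch: a small hosted Free Pascal program, not kernel code, that mixes the two conventions. It assumes the callee's declaration is visible at the call site; in that case the compiler generates the call according to the callee's convention, whatever convention the calling function itself uses. Trouble of the kind described above is expected mainly when a declaration and the actual implementation disagree. The names sum3 and caller_stdcall are hypothetical.

```pascal
program callconv_sketch;
{$mode objfpc}

{ Callee uses cdecl: on i386 the caller removes the arguments from the stack. }
function sum3(a, b, c: longint): longint; cdecl;
begin
  result := a + b + c;
end;

{ Caller is itself stdcall, like kmain in the thread. The calls to sum3
  below are still emitted as cdecl calls, because that is how sum3 is declared. }
function caller_stdcall(x: longint): longint; stdcall;
var
  i: longint;
begin
  result := 0;
  for i := 1 to 1000 do
    result := result + sum3(x, i, -i);   { each call contributes x }
end;

begin
  writeln(caller_stdcall(7));   { prints: 7000 }
end.
```

A thousand mixed calls in a row complete with a balanced stack, so kmain being stdcall should not by itself break calls to cdecl routines declared in the same program.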

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,
Yes, as a matter of fact kmain is stdcall.

Code: Select all


procedure kmain(mbinfo: Pmultiboot_info_t; mbmagic: DWORD); stdcall;

All the functions/procedures are cdecl in my kernel, just in case I ever needed to interface some C code with it...
Is that what's causing the problem?

Anyway, I changed kmain to cdecl just now and still no joy :( Or should I change everything else to stdcall?

As to the stack - it seems that there is something there, because as I said if I increase the stack size
the kernel will survive one or two calls to memzero from kmain. Still lots and lots of other functions are called (and all of these are cdecl too) - see attached source file, and it only fails precisely at memzero.

If anyone is willing to have an in-depth look I can put the complete sources somewhere.

Best Regards,
Andrew
Gigasoft
Member
Posts: 856
Joined: Sat Nov 21, 2009 5:11 pm

Re: My memcpy and memzero destroy the stack???

Post by Gigasoft »

You can single step through your code in Bochs to see exactly where it fails. Then you don't have to guess.

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

OK, now this is really weird. When I single-step it, it works. Always.
Am I correct in my assumption that Bochs does not deliver interrupts during single stepping?
Because if it does not, then the problem must be in my ISRs.

Cheers
Andrew

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

For anyone who follows this: I have found what triggers the problem, but I still do not understand why it happens.
memzero and memcpy break when they are called inside a repeat...until loop. They work from kmain when called "on their own", but die miserably inside a loop.
So, to illustrate:

Code: Select all

        tmppchar:=kmem_kmalloc(1000,$AABBCCDD);
        kernel_memzero(tmppchar,1000);
        kstrings_const2pchar('AAA'#66#77#88#0#0#0,tmppchar);
        tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);
        tmpp2:=kmem_kmalloc((1 shl 20),$ABCD0002);
        tmpp3:=kmem_kmalloc((2 shl 20),$ABCD0003);
        kconsole_putstr('zero tmpp1');
        kernel_memzero(tmpp1,1000);
        kconsole_putstr('zero tmpp2');
        kernel_memzero(tmpp2,1000);
        kconsole_putstr('zero tmpp3');
        kernel_memzero(tmpp3,1000);

        kconsole_putstr(tmppchar);
        kmem_debug_blockinfo(tmppchar);
        kmem_debug_dumpmem(tmppchar-8,4);
works fine.

But:

Code: Select all

        repeat
        tmppchar:=kmem_kmalloc(1000,$AABBCCDD);
        kernel_memzero(tmppchar,1000);
        kstrings_const2pchar('AAA'#66#77#88#0#0#0,tmppchar);
        tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);
        tmpp2:=kmem_kmalloc((1 shl 20),$ABCD0002);
        tmpp3:=kmem_kmalloc((2 shl 20),$ABCD0003);
        kconsole_putstr('zero tmpp1');
        kernel_memzero(tmpp1,1000);
        kconsole_putstr('zero tmpp2');
        kernel_memzero(tmpp2,1000);
        kconsole_putstr('zero tmpp3');
        kernel_memzero(tmpp3,1000);

        kconsole_putstr(tmppchar);
        kmem_debug_blockinfo(tmppchar);
        kmem_debug_dumpmem(tmppchar-8,4);
        until 1=1; //loop once
dies as soon as the first memzero is called.

Now I have no idea why.

Cheers,
Andrew

Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

Nope :( still no joy

If it's outside a loop it only works for small numbers of bytes.
So zeroing 1000 bytes is now OK,
but zeroing 2000 bytes crashes.

DARN!

Cheers,
Andrew

Re: My memcpy and memzero destroy the stack???

Post by eryjus »

Andrew,

A couple of observations:
1. You are allocating memory inside the loop. Are you sure you are not consuming all your memory, and the kernel_memzero() function is just demonstrating the issue (yeah, I know it is a single iteration...)?
2. You are not checking for successful memory allocation. You should check for the proper return value of kmem_kmalloc() before trying to manipulate the contents of the memory pointer.
3. Can you trivialize the test scenarios? For example:

Code: Select all

tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);
kernel_memzero(tmpp1,1000);
When that is successful:

Code: Select all

tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);
{ notice the allocation is outside the loop }
repeat
    kernel_memzero(tmpp1,1000);
until 1=1;
And, when that is successful:

Code: Select all

tmpp1:=kmem_kmalloc((9 shl 20),$ABCD0001);
i := 0;
repeat
    kernel_memzero(tmpp1,1000);
    i := i + 1;
until i=100;
Is it just repeat...until loops? Or do all loop constructs have this issue? The challenge here is to narrow down the specific circumstances under which you have an issue. So, we go back to identifying and verifying assumptions...

Disclaimer: my Pascal is rusty -- it's been a while.
Adam
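
Editor's sketch for observation 2: the thread never says what kmem_kmalloc returns on failure, so this hosted Free Pascal program assumes it returns nil. fake_kmalloc and alloc_and_zero are hypothetical stand-ins, and the 1 MB limit is invented so that the failure path can be exercised.

```pascal
program alloc_check_sketch;
{$mode objfpc}

{ Hypothetical stand-in for kmem_kmalloc. ASSUMPTION: failure is reported
  by returning nil. The owner tag is accepted but unused in this sketch. }
function fake_kmalloc(size, owner: dword): pointer;
begin
  result := nil;
  if size > (1 shl 20) then   // pretend requests over 1 MB cannot be satisfied
    exit;
  GetMem(result, size);
end;

{ Allocate and zero a block, touching the memory only after the allocation
  is known to have succeeded. }
function alloc_and_zero(size, owner: dword): pointer;
begin
  result := fake_kmalloc(size, owner);
  if result = nil then
    exit;                      // report the failure to the caller instead of writing
  FillChar(result^, size, 0);
end;

var
  p: pointer;
begin
  p := alloc_and_zero(1000, $ABCD0001);
  if p = nil then
    writeln('1000 bytes: allocation failed')
  else
  begin
    writeln('1000 bytes: ok, first byte = ', PByte(p)^);
    FreeMem(p);
  end;

  p := alloc_and_zero(9 shl 20, $ABCD0002);
  if p = nil then
    writeln('9 MB: allocation failed, nothing was written')
  else
    FreeMem(p);
end.
```

With a check like this in place, a failed allocation shows up as a message rather than as a write through a bad pointer.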


Re: My memcpy and memzero destroy the stack???

Post by theesemtheesem »

Hi,

Thanks to everyone for your comments and input, I appreciate it very much.
It turns out that there is a problem somewhere in my paging code.

In desperation I started to turn sections of the kernel off to see what happens.
I disabled the keyboard and mouse handlers: no joy.
I disabled the RTC handler: no joy.
Next in line went the PIT, and finally I even disabled initializing the PIC.
But as soon as I disabled paging (my kernel is not higher half, so at this stage
it just sits above 1 MB) everything started to run smoothly. I will go back to examine
the paging code. There must be a stupid mistake in there; something must be mapped wrong.

Once again, thanks to everyone for your help and support.

Cheers,
Andrew
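
Editor's sketch: when hunting for "something mapped wrong", it can help to work out which page directory and page table entries a given address depends on. The thread does not say which paging mode the kernel uses; the code below assumes classic 32-bit x86 paging with 4 KiB pages and no PAE. All names are hypothetical, and it runs as an ordinary hosted Free Pascal program.

```pascal
program vaddr_split_sketch;
{$mode objfpc}

type
  TVAddrParts = record
    pd_index: dword;      { bits 31..22: page directory entry }
    pt_index: dword;      { bits 21..12: page table entry }
    page_offset: dword;   { bits 11..0 : offset inside the 4 KiB page }
  end;

{ ASSUMPTION: 32-bit non-PAE paging with 4 KiB pages. }
function split_vaddr(vaddr: dword): TVAddrParts;
begin
  result.pd_index := vaddr shr 22;
  result.pt_index := (vaddr shr 12) and $3FF;
  result.page_offset := vaddr and $FFF;
end;

{ Number of page tables needed to map the range [0, top). One page table
  covers 4 MiB, so any range reaching past 4 MiB needs more than one. }
function tables_needed(top: dword): dword;
begin
  result := top shr 22;
  if (top and $3FFFFF) <> 0 then
    Inc(result);
end;

var
  parts: TVAddrParts;
begin
  parts := split_vaddr($00100000);   { 1 MB, where the kernel sits }
  writeln(parts.pd_index, ' ', parts.pt_index, ' ', parts.page_offset);   { 0 256 0 }

  parts := split_vaddr($FFFFFFFC);   { the CR2 value from the first post }
  writeln(parts.pd_index, ' ', parts.pt_index, ' ', parts.page_offset);   { 1023 1023 4092 }

  writeln(tables_needed(16 shl 20)); { a range ending at 16 MB needs 4 tables }
end.
```

Comparing these indices against the entries the paging setup actually fills in is one way to spot a range that was never mapped.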