Strange instruction pointer behaviour

Posted: Wed Nov 02, 2011 9:57 pm
by kfreezen
Hello,

I am working on multitasking in my kernel. The current sequence of events is:

1. Allocate pages to map the stack at 0xE0000000.
2. Call new_start.

Code: Select all

new_start: ; switches to the new stack and calls new_main
	cli
	nop

	mov esi, [esp+4]	; first (cdecl) argument: the new stack pointer

	mov esp, esi
	mov ebp, esp

	push esp
	push ebp

	sti
	call new_main

	jmp $			; hang here if new_main ever returns
As you can see, new_start ends by calling new_main:

Code: Select all

int new_main() {	
	fs_root = (fs_node_t*) init_initrd(initrdloc);
	init_tasking();
	
	KB_Init();
	
	int a = fork();
	
	kprintf("%x\n", a);
	
	return 0;
}
Somewhere along the way, EIP ends up holding a pointer below 1MB.
The CPU raises interrupt 5 at 0x2c6e1.
Thanks in advance for your help.

EDIT: In case it's needed, here is my task switch code.

Code: Select all

void switch_task() {
	if(!cur_task) {
		return;
	}

	#ifdef TASK_DEBUG
	kprintf("switch_task()\n");
	#endif
	
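	UInt32 esp, ebp, eip;
	asm volatile("mov %%esp, %0" : "=r"(esp));
	asm volatile("mov %%ebp, %0" : "=r"(ebp));
	eip = read_eip();	/* address of the instruction right after this call */

	/* A task resumed by the jmp below also lands here, with 0x12345 in eax. */
	if(eip == 0x12345) {
		return;
	}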
	
	kprintf("%x\n", eip);
	
	cur_task->eip = eip;
	cur_task->esp = esp;
	cur_task->ebp = ebp;
	
	cur_task = cur_task->next;
	if(!cur_task) cur_task = ready_queue;

	eip = cur_task->eip;
	esp = cur_task->esp;
	ebp = cur_task->ebp;
	
	cur_dir = cur_task->pd;
	
	#ifdef TASK_DEBUG
	kprintf("eip,esp,ebp:%x,%x,%x\n", eip, esp, ebp);
	#endif
	
	asm volatile("         \
	 cli;                 \
	 mov %0, %%ecx;       \
	 mov %1, %%esp;       \
	 mov %2, %%ebp;       \
	 mov %3, %%cr3;       \
	 mov $0x12345, %%eax; \
	 sti;                 \
	 jmp *%%ecx           "
	 	: : "r"(eip), "r"(esp), "r"(ebp), "r"(cur_dir->phys));
}
edit2: ecx should contain the correct eip when I jump to it, since an int3 placed right before the jmp reports the correct address. I'm wondering whether the stack is set up properly. If it isn't, how should I set it up?

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 1:26 am
by xenos
My first advice is to use the Bochs debugger and single-step through your code to see what's going on. I guess either your stack or your page tables get messed up somehow, but I cannot say for sure based on the information in your post.

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 3:22 am
by gerryg400
a) In new_start you do this

Code: Select all

 mov ebp, esp
Where does the value of ebp come from? Is your stack initialised to something sane?

b) switch_task() has potential problems and should not be coded this way. It reads and writes esp and ebp in a C function. That's just not reliable, because the compiler is free to use the stack and frame pointer however it likes around your inline assembly.

c) I don't understand how your program ever leaves switch_task(). At the end of switch_task() there is a jmp which seems to take you back to the middle of switch_task() in a different thread. How's that supposed to work?

d) Your code doesn't save and restore the EFLAGS register. You must do that.
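
To illustrate points b) and d): one common arrangement (a sketch under assumptions, not this kernel's actual code) is to write the switch routine in assembly so that it pushes EFLAGS and the callee-saved registers, swaps esp, pops the same values in reverse order and finishes with ret. A new task then only needs a stack pre-filled with the values that routine will pop. The helper below builds such a frame in an ordinary array so it can be tried in user space; `build_initial_frame`, the slot order and the initial EFLAGS value are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout, lowest address first, matching a switch routine
 * that ends with: popf; pop edi; pop esi; pop ebx; pop ebp; ret */
enum { SLOT_EFLAGS, SLOT_EDI, SLOT_ESI, SLOT_EBX, SLOT_EBP, SLOT_EIP, FRAME_SLOTS };

#define EFLAGS_RESERVED 0x002u  /* bit 1 of EFLAGS always reads as 1 */
#define EFLAGS_IF       0x200u  /* interrupt enable flag */

/* Fill the top of `stack` (an array of `words` 32-bit slots) with an initial
 * frame. Returns the index of the slot the task's saved esp should point at. */
size_t build_initial_frame(uint32_t *stack, size_t words, uint32_t entry)
{
    assert(stack != NULL && words >= FRAME_SLOTS);

    size_t sp = words - FRAME_SLOTS;   /* the stack grows towards index 0 */
    stack[sp + SLOT_EFLAGS] = EFLAGS_RESERVED | EFLAGS_IF;
    stack[sp + SLOT_EDI]    = 0;
    stack[sp + SLOT_ESI]    = 0;
    stack[sp + SLOT_EBX]    = 0;
    stack[sp + SLOT_EBP]    = 0;       /* zero ebp ends a frame-pointer chain */
    stack[sp + SLOT_EIP]    = entry;   /* consumed by the final ret */
    return sp;
}
```

With a layout like this the interrupt flag comes back together with the rest of EFLAGS, so no separate sti is needed before control reaches the new task.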

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 9:11 am
by kfreezen
gerryg400 wrote:Where does the value of ebp come from? Is your stack initialised to something sane?
Yes, I believe so. My conclusion is that the bug comes from this check:

Code: Select all

if(eip == 0x12345) {
      return;
}
That check itself returns properly, but it must somehow mess up the stack pointer. As I stated before, ecx contains the correct value. I believe eip picks up the sub-1MB value on the return.
gerryg400 wrote:c) I don't understand how your program ever leaves switch_task(). At the end of switch_task() there is a jmp which seems to take you back to the middle of switch_task() in a different thread. How's that supposed to work?
It's supposed to work by "tricking" 0x12345 into the "eip" variable via eax: the jmp lands just after the read_eip() call, so whatever is in eax is taken as its return value, and the resumed task returns from switch_task().
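
For readers unfamiliar with the idea, a rough user-space analogy is setjmp/longjmp: one call site is reached twice, and a sentinel value tells the two arrivals apart. This is only an analogy, not the kernel code; `demo_two_returns` and `RESUMED` are names made up for the example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define RESUMED 0x12345   /* same sentinel value as in switch_task() */

static jmp_buf saved;

/* Counts how often execution passes the save point and returns the sentinel
 * once it has been reached through longjmp(). */
int demo_two_returns(int *passes)
{
    assert(passes != NULL);

    if (setjmp(saved) == RESUMED) {  /* second arrival: "eip == 0x12345" */
        (*passes)++;
        return RESUMED;
    }

    (*passes)++;                     /* first arrival: context was just saved */
    longjmp(saved, RESUMED);         /* plays the role of "jmp *%ecx" */
}
```

The important difference is that longjmp() restores a stack pointer that setjmp() saved for exactly this purpose, whereas switch_task() relies on whatever the compiler happened to leave in esp and ebp.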

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 10:08 am
by Solar
kfreezen wrote:I'm wondering whether the stack is set up properly. If it isn't how can I go about it?
gerryg400 wrote:Is your stack initialised to something sane?
kfreezen wrote:Yes, I believe so.
Looks like you're trying to catch your own tail. 8) How about showing us the piece of code that sets up your stack?
kfreezen wrote:That itself returns properly, however, that code must somehow mess up the stack pointer. As I stated before, ecx contains the correct value. I believe eip gets that sub-1MB value on the return.
Which could perfectly well be the result of a stack not properly set up or smashed. (You remember that C gets the return address off the stack?)
gerryg400 wrote:c) I don't understand how your program ever leaves switch_task(). At the end of switch_task() there is a jmp which seems to take you back to the middle of switch_task() in a different thread. How's that supposed to work?
kfreezen wrote:It's supposed to work by "tricking" 0x12345 into the "eip" variable via eax.
Setting eip? In a C function?

{Mythbuster voice} OK, you wait here until I get into that bunker over there and close the door. Then you can try that. 8)

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 10:29 am
by kfreezen
Solar wrote:(You remember that C gets the return address off the stack?)
Of course. Is there any reference on how to set up a new stack?
Solar wrote:Setting eip? In a C function?
Notice I said the variable, not the actual instruction pointer. Setting the instruction pointer itself is done by the "jmp *%%ecx" you see in the original code. Or maybe that is what you meant.

Here is the move_stack() function:

Code: Select all

void move_stack(Pointer new_stack_start, UInt32 size) {
	UInt32 i;
	
	/*int status = alloc_pages(NULL, (Pointer)new_stack_start-size, (Pointer)new_stack_start+0x1000);*/
	
	for(i=(UInt32)new_stack_start+0x1000; i>=((UInt32)new_stack_start-size); i -= 0x1000) {
		alloc_page(NULL, (Pointer) i-0x1000);
	}
	
	memset(new_stack_start-size,0,size+0x100);
	init_orig_dir();
}
That and the new_start are all I am doing to set up the stack. The init_orig_dir() function clones the kernel directory so that when a directory is cloned, the kernel can tell which pages are kernel code and kernel heap and which are not.
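
To see which pages the loop above actually touches, it can be replayed in user space without any page tables. This is a minimal sketch: `lowest_page_mapped` is a made-up helper, and the size of 0x2000 used in the note below is an assumption, since the real call to move_stack() is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Replays the for-loop from move_stack() and reports the lowest address it
 * would pass to alloc_page(), plus the number of calls made. */
uint32_t lowest_page_mapped(uint32_t new_stack_start, uint32_t size, unsigned *pages)
{
    /* Guard against unsigned wrap-around, which the original loop ignores. */
    assert(new_stack_start <= UINT32_MAX - PAGE_SIZE);
    assert(size <= new_stack_start && new_stack_start - size >= PAGE_SIZE);

    uint32_t lowest = 0;
    *pages = 0;
    for (uint32_t i = new_stack_start + PAGE_SIZE;
         i >= new_stack_start - size;
         i -= PAGE_SIZE) {
        lowest = i - PAGE_SIZE;   /* the address alloc_page() receives */
        (*pages)++;
    }
    return lowest;
}
```

For a stack top of 0xE0000000 and the assumed size of 0x2000, the loop makes four calls and the lowest page starts at 0xDFFFD000, one page below new_stack_start-size.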

edit: OK, I moved the init_orig_dir() call to the start of the above function. However, now it triple faults. I will try to determine the cause.

Re: Strange instruction pointer behaviour

Posted: Thu Nov 03, 2011 3:37 pm
by kfreezen
Here is a Bochs crash log.

Code: Select all

00701970784i[CPU0 ] CPU is in protected mode (active)
00701970784i[CPU0 ] CS.d_b = 32 bit
00701970784i[CPU0 ] SS.d_b = 32 bit
00701970784i[CPU0 ] EFER   = 0x00000000
00701970784i[CPU0 ] | RAX=000000000000008e  RBX=0000000002504000
00701970784i[CPU0 ] | RCX=00000000dfffff28  RDX=0000000000000008
00701970784i[CPU0 ] | RSP=00000000dfffcffc  RBP=00000000dfffd014
00701970784i[CPU0 ] | RSI=00000000dffffffc  RDI=000000000002c6db
00701970784i[CPU0 ] |  R8=0000000000000000   R9=0000000000000000
00701970784i[CPU0 ] | R10=0000000000000000  R11=0000000000000000
00701970784i[CPU0 ] | R12=0000000000000000  R13=0000000000000000
00701970784i[CPU0 ] | R14=0000000000000000  R15=0000000000000000
00701970784i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df if tf SF zf AF PF cf
00701970784i[CPU0 ] | SEG selector     base    limit G D
00701970784i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00701970784i[CPU0 ] |  CS:0008( 0001| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  DS:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  SS:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  ES:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  FS:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  GS:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00701970784i[CPU0 ] |  MSR_FS_BASE:0000000000000000
00701970784i[CPU0 ] |  MSR_GS_BASE:0000000000000000
00701970784i[CPU0 ] | RIP=0000000000101107 (0000000000101107)
00701970784i[CPU0 ] | CR0=0xe0000011 CR2=0x00000000dfffcff8
00701970784i[CPU0 ] | CR3=0x0011e000 CR4=0x00000000
00701970784i[CPU0 ] 0x0000000000101107>> mov byte ptr ss:[ebp-24], al : 8845E8
00701970784e[CPU0 ] exception(): 3rd (14) exception with no resolution, shutdown status is 00h, resetting
I do not believe that the address in ESP (RSP) is mapped, which would give three page faults in a row and lead to the triple fault.
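
As a side calculation, the faulting address in CR2 can be split into its paging indices to see which directory and table entries would have to be present. This is a minimal sketch, assuming plain 32-bit paging with 4 KiB pages (CR4 is 0 in the log, so neither PAE nor PSE is enabled); `VAddrParts` and `split_vaddr` are names made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t pd_index;  /* bits 31..22: index into the page directory */
    uint32_t pt_index;  /* bits 21..12: index into the page table */
    uint32_t offset;    /* bits 11..0: byte offset inside the 4 KiB page */
} VAddrParts;

/* Split a 32-bit virtual address the way the MMU does for 4 KiB pages. */
VAddrParts split_vaddr(uint32_t vaddr)
{
    VAddrParts p;
    p.pd_index = vaddr >> 22;
    p.pt_index = (vaddr >> 12) & 0x3FFu;
    p.offset   = vaddr & 0xFFFu;

    /* Sanity check: the three parts must recombine to the input. */
    assert(((p.pd_index << 22) | (p.pt_index << 12) | p.offset) == vaddr);
    return p;
}
```

For CR2 = 0xdfffcff8 this gives directory entry 895, table entry 1020 and offset 0xff8, i.e. the page starting at 0xdfffc000.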
edit: I also extended the range of the mapping, and it still triple faults.
edit2: It appears that esp somehow ends up at the bottom address of the mapped stack section minus 4 (0xdfffcffc, as seen here).
edit3: I pinpointed the triple fault to the CloneDirectory() function.

Code: Select all

PageTable* CloneTable(PageDirectory* pd, PageTable* src, UInt32* phys) {
	#ifdef PAGING_TRACE
	kprintf("CloneTable(%x,%x,%x)\n", pd, src, phys);
	#endif
	
	PageTable* table = (PageTable*)kmalloc_ex(sizeof(PageTable), true, phys, false);
	
	memset(table, 0, sizeof(PageTable));
	
	int i;
	
	#ifdef PAGING_CHECKPOINT
	kprintf("CHECKPOINT:CloneTable().1\n");
	asm volatile("int $3");
	#endif
	
	for(i=0; i<1024; i++) {
		if((src->t[i]&0xFFFFF000)) {
			
			alloc_page(pd, (Pointer) src->t[i]);
		
			UInt32 flags_mask = ~(0xFFFFF000);
			table->t[i] |= (src->t[i]&flags_mask);
			
			#ifdef PAGING_CHECKPOINT
			if(i>=181) asm volatile("int $3");
			#endif
			
			copy_page_phys((src->t[i]&0xFFFFF000), (table->t[i]&0xFFFFF000));
		}
	}
	
	return table;
}
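
One thing that stands out in the loop as posted: table->t[i] only ever receives the flag bits, so unless alloc_page() fills the entry in some way not visible here, the destination handed to copy_page_phys() is frame 0. The user-space model below is a sketch of what the loop appears to be meant to do, with a pretend physical memory and a made-up `alloc_frame()` allocator; it is not a drop-in replacement for the kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES     1024
#define PAGE_SIZE   0x1000u
#define FRAME_MASK  0xFFFFF000u
#define FLAGS_MASK  0x00000FFFu
#define NUM_FRAMES  8                    /* tiny pretend physical memory */

uint8_t phys_mem[NUM_FRAMES][PAGE_SIZE];
static uint32_t next_free_frame = 4;     /* frames 0-3 count as already in use */

/* Hypothetical allocator: returns a page-aligned "physical" address. */
static uint32_t alloc_frame(void)
{
    assert(next_free_frame < NUM_FRAMES);
    return (next_free_frame++) * PAGE_SIZE;
}

/* Give every mapped entry of src a fresh frame in dst and copy its contents. */
void clone_table(uint32_t *dst, const uint32_t *src)
{
    memset(dst, 0, ENTRIES * sizeof dst[0]);
    for (int i = 0; i < ENTRIES; i++) {
        if (!(src[i] & FRAME_MASK))
            continue;                    /* same "is it mapped" test as above */

        uint32_t src_frame = (src[i] & FRAME_MASK) / PAGE_SIZE;
        assert(src_frame < NUM_FRAMES);

        uint32_t new_addr = alloc_frame();
        dst[i] = new_addr | (src[i] & FLAGS_MASK);   /* frame AND flags */
        memcpy(phys_mem[new_addr / PAGE_SIZE], phys_mem[src_frame], PAGE_SIZE);
    }
}
```

The point of the sketch is the order of operations: the new entry gets its frame address before the copy, so the copy has a real destination.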