Stack and interrupt problems

mariuszp
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm


Post by mariuszp »

In my interrupt handlers, including the interrupt handler for system calls, the stack keeps breaking (values suddenly change). The kernel is preemptive, so interrupts can occur in kernel mode while system calls are being handled. Before I made it preemptive, the stack never broke. Most of the time when the stack breaks, a #PF or #GP happens in this function:

Code:

Dir *parsePath(const char *path, int flags, int *error)
{
	kprintf_debug("start of parsePath()\n");

	*error = VFS_NO_FILE;			// default error
	// TODO: relative paths
	if (path[0] != '/')
	{
		return NULL;
	};

	SplitPath spath;
	if (resolveMounts(path, &spath) != 0)
	{
		return NULL;
	};

	char token[128];
	char *end = (char*) &token[127];
	const char *scan = spath.filename;

	if (spath.fs->openroot == NULL)
	{
		return NULL;
	};

	Dir *dir = (Dir*) kmalloc(sizeof(Dir));
	memset(dir, 0, sizeof(Dir));
	if (spath.fs->openroot(spath.fs, dir, sizeof(Dir)) != 0)
	{
		kfree(dir);
		return NULL;
	};

	while (1)
	{
		char *put = token;
		while ((*scan != 0) && (*scan != '/'))
		{
			if (put == end)
			{
				*put = 0;
				panic("parsePath(): token too long: '%s'\n", token);
			};
			*put++ = *scan++;
		};
		*put = 0;

		kprintf_debug("token '%s'\n", token);
		if (strlen(token) == 0)
		{
			if (*scan == 0)
			{
				return dir;
			};

			if (dir->close != NULL) dir->close(dir);
			kfree(dir);
			return NULL;
		};

		while (strcmp(dir->dirent.d_name, token) != 0)
		{
			if (dir->next(dir) != 0)
			{
				if (dir->close != NULL) dir->close(dir);
				kfree(dir);
				return NULL;
			};
		};

		if (*scan == '/')
		{
			if ((dir->stat.st_mode & VFS_MODE_DIRECTORY) == 0)
			{
				*error = VFS_NOT_DIR;
				if (dir->close != NULL) dir->close(dir);
				kfree(dir);
				return NULL;
			};

			if ((!vfsCanCurrentThread(&dir->stat, 1)) && (flags & VFS_CHECK_ACCESS))
			{
				if (dir->close != NULL) dir->close(dir);
				kfree(dir);
				*error = VFS_PERM;
				return NULL;
			};

			Dir *subdir = (Dir*) kmalloc(sizeof(Dir));
			memset(subdir, 0, sizeof(Dir));

			if (dir->opendir(dir, subdir, sizeof(Dir)) != 0)
			{
				kfree(subdir);
				if (dir->close != NULL) dir->close(dir);
				kfree(dir);
				return NULL;
			};

			if (dir->close != NULL) dir->close(dir);
			kfree(dir);
			dir = subdir;

			scan++;		// skip over '/'
		}
		else
		{
			return dir;
		};
	};
};
When I look at the instruction that causes the #PF or #GP, it is a different instruction each time, but in every case I have found so far it takes a pointer from the stack and dereferences it. This leads me to ask:

I know that when an interrupt occurs in user mode, the CPU loads RSP with the RSP0 value from the TSS. As far as I know, if an interrupt occurs in kernel mode, the interrupt stack frame is pushed onto the current stack and RSP0 is not used. Is that correct?
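
To make the stack-selection rule in this question concrete, here is a minimal C model of where the interrupt frame ends up in long mode. It is only a sketch: `TssModel` and `rspAtHandlerEntry` are hypothetical names, not part of the Glidix source. It encodes three architectural facts: a non-zero IST index in the gate always switches stacks, a privilege change loads RSP0, and otherwise the current stack is reused. In all cases the CPU aligns RSP down to 16 bytes before pushing SS, RSP, RFLAGS, CS and RIP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the TSS fields that matter for stack selection. */
typedef struct {
    uint64_t rsp0;      /* TSS.RSP0: kernel stack top of the running thread */
    uint64_t ist[7];    /* TSS.IST1..IST7 */
} TssModel;

/* Returns the RSP value seen by the first instruction of the handler. */
static uint64_t rspAtHandlerEntry(const TssModel *tss, uint64_t currentRsp,
                                  int fromUserMode, int istIndex,
                                  int hasErrorCode)
{
    uint64_t rsp;

    assert(istIndex >= 0 && istIndex <= 7);
    if (istIndex != 0)
        rsp = tss->ist[istIndex - 1];  /* IST entry: unconditional switch */
    else if (fromUserMode)
        rsp = tss->rsp0;               /* privilege change: load RSP0 */
    else
        rsp = currentRsp;              /* same privilege: keep current stack */

    rsp &= ~(uint64_t) 0xF;            /* long mode aligns RSP to 16 bytes */
    rsp -= 5 * 8;                      /* SS, RSP, RFLAGS, CS, RIP */
    if (hasErrorCode)
        rsp -= 8;                      /* error code, for exceptions that have one */
    return rsp;
}
```

With IST index 0 and an interrupt arriving in kernel mode, the result depends only on `currentRsp`, which matches the assumption in the question: RSP0 is not consulted.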

When I do a task switch, I change the value of RSP0 in the TSS to a distinct per-thread value, so I do not see how the stack could be breaking. On the other hand, since parsePath() is where those exceptions are raised MOST (but not all) of the time, something could be wrong with it, even though it appears correct.
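
The per-thread RSP0 update described here could look roughly like the sketch below. The `Tss64` layout follows the architectural 64-bit TSS format (104 bytes, RSP0 at offset 4). `ThreadModel`, its fields and `setKernelStackFor` are assumptions made for illustration, since the actual Glidix thread structure is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit TSS layout as defined by the architecture. */
typedef struct __attribute__((packed)) {
    uint32_t reserved0;
    uint64_t rsp0;          /* loaded on a ring 3 -> ring 0 transition */
    uint64_t rsp1;
    uint64_t rsp2;
    uint64_t reserved1;
    uint64_t ist[7];
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopbOffset;
} Tss64;

_Static_assert(offsetof(Tss64, rsp0) == 4, "RSP0 must sit at offset 4");
_Static_assert(sizeof(Tss64) == 104, "a 64-bit TSS is 104 bytes");

/* Hypothetical per-thread record; the field names are assumptions. */
typedef struct {
    uint8_t *kernelStack;       /* lowest address of the thread's kernel stack */
    size_t   kernelStackSize;   /* size of that stack in bytes */
} ThreadModel;

/* Point RSP0 at the 16-byte aligned top of the next thread's kernel stack. */
static void setKernelStackFor(Tss64 *tss, const ThreadModel *next)
{
    uintptr_t top;

    assert(next->kernelStack != NULL && next->kernelStackSize >= 16);
    top = (uintptr_t) next->kernelStack + next->kernelStackSize;
    tss->rsp0 = (uint64_t) (top & ~(uintptr_t) 0xF);
}
```

Note that RSP0 must point at the top (highest address) of the stack, because the CPU pushes downward from it.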

If I am right about the RSP0 thing, then what else could be causing the stack to break? And if I'm wrong, how can I make the kernel preemptive without breaking the stack?
sortie
Member
Posts: 931
Joined: Wed Mar 21, 2012 3:01 pm
Libera.chat IRC: sortie

Re: Stack and interrupt problems

Post by sortie »

I suggest you show your interrupt handlers. It's very likely that they corrupt the interrupted state.
jnc100
Member
Posts: 775
Joined: Mon Apr 09, 2007 12:10 pm
Location: London, UK

Re: Stack and interrupt problems

Post by jnc100 »

Also, as you're using 64-bit mode (I assume so because you mention RSP), are you compiling with the appropriate flags (disabling the red zone and the use of MMX/SSE etc.)?
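
For background on why the red zone matters in a preemptive kernel: the System V AMD64 ABI lets a leaf function keep locals in the 128 bytes below RSP without moving RSP, while a kernel-mode interrupt pushes its frame onto that same stack, directly below RSP. The following is a small simulation of that collision, not kernel code. The function names and the byte-offset model of the stack are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 256, FRAME_BYTES = 40 };  /* SS, RSP, RFLAGS, CS, RIP */

/* Model of a kernel-mode interrupt: the frame lands on the current stack,
 * below RSP after RSP has been aligned down to 16 bytes. */
static void pushInterruptFrame(uint8_t *stack, unsigned rsp)
{
    unsigned aligned;

    assert(rsp >= FRAME_BYTES + 16 && rsp <= STACK_BYTES);
    aligned = rsp & ~15u;
    memset(stack + aligned - FRAME_BYTES, 0xCC, FRAME_BYTES);
}

/* Returns 1 if a leaf function's local survives an interrupt, 0 otherwise.
 * rsp is a byte offset into the simulated stack. */
static int localSurvives(unsigned rsp, int useRedZone)
{
    uint8_t stack[STACK_BYTES];
    unsigned local;

    assert(rsp >= 128 && rsp <= STACK_BYTES);
    memset(stack, 0, sizeof stack);
    if (useRedZone) {
        local = rsp - 8;    /* stored below RSP, RSP left untouched */
    } else {
        rsp -= 16;          /* -mno-red-zone: reserve space first */
        local = rsp + 8;    /* the local now sits above RSP */
    }
    stack[local] = 0x5A;
    pushInterruptFrame(stack, rsp);
    return stack[local] == 0x5A;
}
```

In this model `localSurvives(192, 1)` reports the local as overwritten, while `localSurvives(192, 0)` reports it intact, which is the effect `-mno-red-zone` is meant to guarantee.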

Regards,
John.
mariuszp
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: Stack and interrupt problems

Post by mariuszp »

My interrupt handlers:

Code:

;	Glidix kernel
;
;	Copyright (c) 2014, Madd Games.
;	All rights reserved.
;	
;	Redistribution and use in source and binary forms, with or without
;	modification, are permitted provided that the following conditions are met:
;	
;	* Redistributions of source code must retain the above copyright notice, this
;		list of conditions and the following disclaimer.
;	
;	* Redistributions in binary form must reproduce the above copyright notice,
;		this list of conditions and the following disclaimer in the documentation
;		and/or other materials provided with the distribution.
;	
;	THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
;	AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
;	IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
;	DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
;	FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
;	DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
;	SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
;	CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
;	OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
;	OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

section .text
bits 64

[global _intCounter]
_intCounter dq 0

[global loadIDT]
[extern idtPtr]
loadIDT:
	mov rax, qword idtPtr
	lidt [rax]
	ret

%macro pushAll 0
	push	 r15
	push	 r14
	push	 r13
	push	 r12
	push	 r11
	push	 r10
	push	 r9
	push	 r8
	push	 rax
	push	 rcx
	push	 rdx
	push	 rbx
	push	 rbp
	push	 rsi
	push	 rdi
%endmacro

%macro popAll 0
	pop	rdi
	pop	rsi
	pop	rbp
	pop	rbx
	pop	rdx
	pop	rcx
	pop	rax
	pop	r8
	pop	r9
	pop	r10
	pop	r11
	pop	r12
	pop	r13
	pop	r14
	pop	r15
%endmacro

[extern isrHandler]

isrCommon:
	pushAll

	mov			ax, ds
	push			rax

	mov			ax, 0x10
	mov			ds, ax
	mov			es, ax
	mov			fs, ax
	mov			gs, ax

	mov			rdi, rsp		; pass a pointer to registers as argument to isrHandler
	mov			rbx, rsp		; save the RSP (RBX is preserved, remember).
	and			rsp, ~0xF		; align on 16-byte boundary.
	call	 		isrHandler
	mov			rsp, rbx		; restore the real stack

	pop			rbx
	mov			ds, bx
	mov			es, bx
	mov			fs, bx
	mov			gs, bx

	popAll
	add			rsp, 16
	;sti
	iretq

%macro ISR_NOERRCODE 1
	global isr%1
	isr%1:
		cli
		push qword 0 
		push qword %1
		jmp isrCommon
%endmacro

%macro ISR_ERRCODE 1
	global isr%1
	isr%1:
		cli
		push qword %1
		jmp isrCommon
%endmacro

%macro IRQ 2
	global irq%1
	irq%1:
		cli
		push qword 0
		push qword %2
		jmp isrCommon
%endmacro

ISR_NOERRCODE 0
ISR_NOERRCODE 1
ISR_NOERRCODE 2
ISR_NOERRCODE 3
ISR_NOERRCODE 4
ISR_NOERRCODE 5
ISR_NOERRCODE 6
ISR_NOERRCODE 7
ISR_ERRCODE	 8
ISR_NOERRCODE 9
ISR_ERRCODE	 10
ISR_ERRCODE	 11
ISR_ERRCODE	12
ISR_ERRCODE	13
ISR_ERRCODE	14
ISR_NOERRCODE 15
ISR_NOERRCODE 16
ISR_NOERRCODE 17
ISR_NOERRCODE 18
ISR_NOERRCODE 19
ISR_NOERRCODE 20
ISR_NOERRCODE 21
ISR_NOERRCODE 22
ISR_NOERRCODE 23
ISR_NOERRCODE 24
ISR_NOERRCODE 25
ISR_NOERRCODE 26
ISR_NOERRCODE 27
ISR_NOERRCODE 28
ISR_NOERRCODE 29
ISR_NOERRCODE 30
ISR_NOERRCODE 31

IRQ	0,	32
IRQ	1,	33
IRQ	2,	34
IRQ	3,	35
IRQ	4,	36
IRQ	5,	37
IRQ	6,	38
IRQ	7,	39
IRQ	8,	40
IRQ	9,	41
IRQ	10,	42
IRQ	11,	43
IRQ	12,	44
IRQ	13,	45
IRQ	14,	46
IRQ	15,	47
And the CFLAGS:

Code:

-ffreestanding -mcmodel=large -mno-red-zone -mno-mmx -mno-sse -mno-sse2 -fno-common -fno-builtin -I include -Wall -Werror
Here's the Regs structure (a pointer to it is passed to the C code by the interrupt handlers as shown above):

Code:

typedef struct {
	uint64_t ds;
	uint64_t rdi, rsi, rbp, rbx, rdx, rcx, rax;
	uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
	uint64_t intNo;
	uint64_t errCode;
	uint64_t rip, cs, rflags, rsp, ss;
} PACKED Regs;
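
One way to rule out a mismatch between this structure and the push order in isrCommon is to pin the layout with compile-time checks. This is only a sketch: the `PACKED` definition is an assumption about how the macro is defined in Glidix, and `cameFromUserMode` is a hypothetical helper. The offsets follow from the handler code above: DS is pushed last (lowest address), then the 15 general-purpose registers, then the interrupt number and error code, then the five words pushed by the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKED __attribute__((packed))  /* assumed definition of PACKED */

typedef struct {
    uint64_t ds;
    uint64_t rdi, rsi, rbp, rbx, rdx, rcx, rax;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t intNo;
    uint64_t errCode;
    uint64_t rip, cs, rflags, rsp, ss;
} PACKED Regs;

/* The layout must mirror the pushes in isrCommon, lowest address first. */
_Static_assert(offsetof(Regs, ds) == 0, "DS is pushed last");
_Static_assert(offsetof(Regs, rdi) == 8, "RDI is the last GPR pushed");
_Static_assert(offsetof(Regs, r15) == 120, "R15 is the first GPR pushed");
_Static_assert(offsetof(Regs, intNo) == 128, "interrupt number follows the GPRs");
_Static_assert(offsetof(Regs, errCode) == 136, "error code is pushed before intNo");
_Static_assert(offsetof(Regs, rip) == 144, "hardware frame starts at RIP");
_Static_assert(sizeof(Regs) == 184, "23 quadwords in total");

/* Hypothetical helper: the saved CS tells where the interrupt came from. */
static int cameFromUserMode(const Regs *regs)
{
    assert(regs != NULL);
    return (regs->cs & 3) == 3;     /* RPL 3 means ring 3 */
}
```

If the checks compile, the C side and the assembly side agree on the frame layout, so a corrupted frame would have to come from somewhere else.
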
I also have a function called switchContext(), which takes a pointer to a Regs structure and loads the register state from it (used for task switching):

Code:

[global switchContext]
switchContext:
	; The argument is stored in RDI, and is the address of a Regs structure.
	; If we move the stack there, we can easily do a context switch with a
	; bunch of pops.
	cli
	mov	rsp,		rdi

	; first we switch the DS
	pop	rbx
	mov	ds,		bx
	mov	es,		bx
	mov	fs,		bx
	mov	gs,		bx

	; GPRs
	pop	rdi
	pop	rsi
	pop	rbp
	pop	rbx
	pop	rdx
	pop	rcx
	pop	rax
	pop	r8
	pop	r9
	pop	r10
	pop	r11
	pop	r12
	pop	r13
	pop	r14
	pop	r15

	; ignore "intNo" and "errCode"
	add	rsp,		16

	; the rest is popped by an interrupt return
	iretq
See the 'cli' at the top? That instruction was not there yesterday when I first posted this question. I added it afterwards in case it fixed anything, and I cannot reproduce the bug anymore. But I am not certain that it is gone, because even before the change the bug did not occur every time.
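
A possible explanation for why the 'cli' matters, assuming switchContext() could previously be entered with interrupts enabled: after `mov rsp, rdi`, RSP points into the Regs structure itself, so an interrupt arriving in the middle of the pops would push its frame over the already-popped fields and over whatever memory precedes the structure. The simulation below only illustrates that reasoning. The memory model and the function name are made up, and it counts only the five words pushed by the CPU, not the further pushes done by the handler.

```c
#include <assert.h>
#include <stdint.h>

enum {
    WORDS_BEFORE = 8,   /* unrelated data stored just below the Regs structure */
    REGS_WORDS   = 23,  /* size of Regs in quadwords */
    FRAME_WORDS  = 5,   /* SS, RSP, RFLAGS, CS, RIP */
    TOTAL_WORDS  = WORDS_BEFORE + REGS_WORDS
};

/* Returns how many words of the unrelated data get overwritten when an
 * interrupt arrives after `poppedWords` pops in switchContext. */
static int neighbourWordsClobbered(int poppedWords)
{
    const uint64_t intact = 0x1111111111111111ULL;
    uint64_t mem[TOTAL_WORDS];
    int rsp, i, clobbered = 0;

    assert(poppedWords >= 0 && poppedWords <= REGS_WORDS);
    for (i = 0; i < TOTAL_WORDS; i++)
        mem[i] = intact;

    rsp = WORDS_BEFORE + poppedWords;   /* word index RSP points at */
    rsp &= ~1;                          /* 16-byte alignment is 2 quadwords */
    for (i = 0; i < FRAME_WORDS && rsp > 0; i++)
        mem[--rsp] = 0xDEADBEEFDEADBEEFULL;     /* hardware frame push */

    for (i = 0; i < WORDS_BEFORE; i++)
        if (mem[i] != intact)
            clobbered++;
    return clobbered;
}
```

In this model an interrupt right after `mov rsp, rdi` overwrites five words outside the structure, and the damage only stops once enough words have been popped for the frame to fit inside the already-consumed part of Regs. That would be consistent with corruption that shows up only occasionally and in unrelated code.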

So, is anything in this code incorrect?
mariuszp
Member
Posts: 587
Joined: Sat Oct 16, 2010 3:38 pm

Re: Stack and interrupt problems

Post by mariuszp »

Now in VirtualBox it raises #GP on the LTR instruction, with the TSS selector being 0x2B, and the error code of the #GP is 0x28.
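
The two numbers in this report can be related by decoding them. A selector-style error code holds EXT in bit 0, IDT in bit 1, TI in bit 2 and the descriptor index in bits 3 to 15, while a selector holds the RPL in bits 0 to 1 and the index in bits 3 to 15. The helper names below are hypothetical; the bit layout is the architectural one.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned external;  /* bit 0 (EXT): event was external to the program */
    unsigned inIdt;     /* bit 1 (IDT): index refers to the IDT */
    unsigned inLdt;     /* bit 2 (TI): LDT instead of GDT, if IDT is 0 */
    unsigned index;     /* bits 3..15: descriptor index */
} SelectorError;

static SelectorError decodeSelectorError(uint64_t errCode)
{
    SelectorError e;

    assert((errCode >> 16) == 0);   /* upper bits are reserved */
    e.external = (unsigned) (errCode & 1);
    e.inIdt    = (unsigned) ((errCode >> 1) & 1);
    e.inLdt    = (unsigned) ((errCode >> 2) & 1);
    e.index    = (unsigned) ((errCode >> 3) & 0x1FFF);
    return e;
}

/* Descriptor index encoded in a segment selector. */
static unsigned selectorIndex(uint16_t selector)
{
    return selector >> 3;
}
```

Error code 0x28 decodes to GDT index 5, and selector 0x2B is also index 5 (with RPL 3), so the #GP is about the TSS descriptor itself. Which check failed cannot be told from the post alone. LTR raises #GP with the selector as error code when, for example, the descriptor is not an available TSS (including one already marked busy) or lies beyond the GDT limit.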