Hi!
I recently implemented kernel threads in my x86_64 kernel, and it's finally (sort of) working. I can start a few dozen threads, and they get scheduled all right. But after a while, threads start to die. All the threads do is write to the screen in a tight loop. Somewhere in the logging functions, parameters are all of a sudden 0 (or some other random value) where they shouldn't be.
I suspect that a timer interrupt fires at a bad moment and somehow clobbers the registers used for parameter passing. I triple-checked the thread state saving and restoring code [1], but cannot find a problem there. I also checked whether a nested interrupt could have confused some code, but in [2] the interrupt handlers run with interrupts disabled, so they cannot nest (I have only the BSP running so far).
I also disabled the red zone, so that should not be the problem. When disassembling the function that crashes, I can see that the compiler generates code to save the register used to pass in the parameter to the stack. Right after that, if I check the value, it's zero. I have been debugging this for a while now (which is pretty much impossible, since the timer tends to fire faster than I can debug) and am out of ideas. Any suggestion on what to look out for would be appreciated.
[1] https://github.com/mduft/tachyon3/blob/ ... 64/state.S
[2] https://github.com/mduft/tachyon3/blob/ ... 6_64/idt.S - line 49
Did I forget to mention something?
Thanks for the help,
markus
Strange register (or stack?) clobber problem.
Re: Strange register (or stack?) clobber problem.
Are your kernel stacks 16-byte aligned? I got caught by that.
Re: Strange register (or stack?) clobber problem.
Thanks for the hint. I double-checked and added a few log lines. Nothing is going wrong there, however; all stacks are page aligned (4K).
Code:
trace: allocated new stack at 0xffffffff80000000 (16384 bytes comm, 16384 bytes res)
trace: allocated new stack at 0x0000800000000000 (8192 bytes comm, 1048576 bytes res)
trace: allocated new stack at 0x00007ffffff01000 (8192 bytes comm, 1048576 bytes res)
trace: allocated new stack at 0x00007fffffe02000 (8192 bytes comm, 1048576 bytes res)
trace: allocated new stack at 0x00007fffffd03000 (8192 bytes comm, 1048576 bytes res)
trace: allocated new stack at 0x00007fffffc04000 (8192 bytes comm, 1048576 bytes res)
trace: allocated new stack at 0x00007fffffb05000 (8192 bytes comm, 1048576 bytes res)
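A side note on the alignment question: a page-aligned stack base is necessarily 16-byte aligned, but what the System V AMD64 ABI actually constrains is the value of RSP at function entry (RSP + 8 must be a multiple of 16, because a call has just pushed a return address). The sketch below shows one way to check both. It is only an illustration: `is_aligned` and `initial_rsp` are hypothetical helper names, not functions from the kernel discussed here, and it assumes the traced address is the lowest address of the stack and that the thread entry point is reached by a jump or iretq rather than a call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uintptr_t addr, uintptr_t align) {
    return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical helper: pick the initial RSP for a new thread.
 * Assumes stack_base is the lowest address of the stack and that the
 * entry function is entered without a call instruction, so we leave
 * room for the return-address slot a call would have pushed.
 */
static uintptr_t initial_rsp(uintptr_t stack_base, size_t size) {
    uintptr_t top = (stack_base + size) & ~(uintptr_t)0xF; /* round down to 16 */
    return top - 8; /* at entry, (rsp + 8) % 16 == 0 as the ABI expects */
}
```

For example, `initial_rsp(0x00007ffffff01000, 8192)` yields an address ending in `ff8`, so the entry function sees the same alignment it would see after a normal call.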
Re: Strange register (or stack?) clobber problem.
Have you disabled the red zone? (With it enabled, GCC may keep data at negative offsets from RSP, where an interrupt taken on the same stack would overwrite it.)
Re: Strange register (or stack?) clobber problem.
Yes, I disabled the red zone. The disassembly at the relevant location is in the listing that follows.
The noteworthy thing is: at 21a, the values passed in are "sane" (i.e. not NULL/zero). But the cmpq at 23a finds fmt to be zero, and the kernel stops (with fatal error "null format ..."). I have no idea what the problem could be. The code for the interrupt handler(s) is here, and the state saving/restoring code is here. I need somebody with good asm-fu to have a look at this please, as I'm totally stuck.
Code:
000000000000021a <log_format_message>:
static void log_format_message(char* buf, size_t len, char const* fmt, va_list args) {
21a: 55 push %rbp
21b: 48 89 e5 mov %rsp,%rbp
21e: 48 83 ec 70 sub $0x70,%rsp
222: 48 89 7d a8 mov %rdi,-0x58(%rbp)
226: 48 89 75 a0 mov %rsi,-0x60(%rbp)
22a: 48 89 55 98 mov %rdx,-0x68(%rbp)
22e: 48 89 4d 90 mov %rcx,-0x70(%rbp)
char c;
char* p = buf;
232: 48 8b 45 a8 mov -0x58(%rbp),%rax
236: 48 89 45 f0 mov %rax,-0x10(%rbp)
if(!fmt)
23a: 48 83 7d 98 00 cmpq $0x0,-0x68(%rbp)
23f: 0f 85 9e 05 00 00 jne 7e3 <log_format_message+0x5c9>
fatal("null format in log_format_message!\n");
thanks
Re: Strange register (or stack?) clobber problem.
Thanks to those that tried to help, I FOUND IT... finally *phew*.
It seems I forgot to protect the registers from being clobbered by GCC-generated code that ran _before_ they were saved (the code that works out where to save the registers to... uh).
Now it works, and kernel threading is pretty stable (128 threads, all printing thread ID + system time, running endlessly without problems).
thanks again!
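For readers who hit the same symptom, here is a toy C model of the ordering bug described above. It is not the kernel's actual code: `regs_t`, `find_save_area`, `isr_entry_buggy`, `isr_entry_fixed` and `isr_exit` are hypothetical names, and the "CPU" is just a global struct. The only fact it relies on is that, under the System V AMD64 ABI, compiled C code may freely overwrite caller-saved registers such as rdi, rsi, rdx and rcx, which are exactly the argument registers seen in the disassembly.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the four argument registers from the disassembly. */
typedef struct { uint64_t rdi, rsi, rdx, rcx; } regs_t;

static regs_t cpu;          /* simulated live CPU registers */
static regs_t thread_state; /* simulated per-thread save area */

/*
 * Stands in for compiler-generated C code that locates the save area.
 * A real function would clobber caller-saved registers as a side effect;
 * here that is modelled by zeroing them.
 */
static regs_t *find_save_area(void) {
    cpu.rdi = 0; cpu.rsi = 0; cpu.rdx = 0; cpu.rcx = 0;
    return &thread_state;
}

/* Buggy order: call C first, then save whatever is left in the registers. */
static void isr_entry_buggy(void) {
    regs_t *slot = find_save_area();
    *slot = cpu;
}

/* Fixed order: snapshot first (a real stub would push onto the stack), then call C. */
static void isr_entry_fixed(void) {
    regs_t on_stack = cpu;
    regs_t *slot = find_save_area();
    *slot = on_stack;
}

/* Resume the interrupted thread from its save area. */
static void isr_exit(void) {
    cpu = thread_state;
}
```

With the buggy entry, a thread interrupted just after entering a function resumes with rdx (the `fmt` argument in the listing) already zeroed, which matches the "null format" failure. With the fixed entry, the values survive the round trip.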