Structure is being corrupted by function call

Posted: Fri Apr 15, 2022 4:12 am
by lubenard
Hello!

I am currently developing a very small kernel in C, and I have implemented fake multi-terminal handling.

It consists of a structure that looks like this:

Code:

typedef struct s_terminal {
	t_shell *first;
	t_shell *second;
	t_shell *third;
	t_shell *active_shell;
}	t_terminal;
I initialise my structure in the function init_shell:

Code:

void	init_shell() {
	t_terminal real_term;
	t_shell first;
	t_shell second;
	t_shell third;

	memset(&real_term, 0, sizeof(t_terminal));
	memset(&first, 0, sizeof(t_shell));
	memset(&second, 0, sizeof(t_shell));
	memset(&third, 0, sizeof(t_shell));
	
	real_term.first = &first;
	real_term.second = &second;
	real_term.third = &third;
	first.is_shell_init = 1;
	real_term.active_shell = &first;

	real_term.active_shell->cursor_pos = 0;
	real_term.active_shell->cmd_hist_size = 4;
	real_term.active_shell->cmd_hist_curr = 4;
	real_term.active_shell->start_cmd_line = terminal_writestr("Shell > ");
	terminal = &real_term; // terminal is a global variable
	
	printk(KERN_INFO, "End init shell");
}
I also developed a small function to check this structure for debugging purposes.
The calling function calls it right after init_shell returns.
Here it is:

Code:

void check_term_struct() {
	if (terminal != 0) {
		printk(KERN_INFO, "-------------------------------");
		printk(KERN_INFO, "terminal is located at %p", &terminal);
		printk(KERN_INFO, "terminal->active is located at %p", terminal->active_shell);
		printk(KERN_INFO, "terminal->first is located at %p", terminal->first);
		printk(KERN_INFO, "terminal->second is located at %p", terminal->second);
		printk(KERN_INFO, "terminal->third is located at %p", terminal->third);
		printk(KERN_INFO, "terminal->active->cmd_size= %d", terminal->active_shell->cmd_size);
		printk(KERN_INFO, "-------------------------------");
	}
}
That's the context. Now the bug:

At some point I realised that the global terminal structure was being overwritten.
I searched through my code, and here is what I found:

Code:

Breakpoint 1, init_shell () at srcs/io/shell/shell.c:312
312	}
1: /a terminal->first = 0x10cda4
2: /a terminal->second = 0x10bb78
3: /a terminal->third = 0x10a94c
(gdb) n
k_main (mb_mmap=0xf000e2c3, magic=4026597203) at srcs/kernel/kernel.c:68
68		check_term_struct();
1: /a terminal->first = 0x10cda4
2: /a terminal->second = 0x10bb78
3: /a terminal->third = 0x10a94c
(gdb) s
check_term_struct () at srcs/io/shell/shell.c:28
28	void check_term_struct() {
1: /a terminal->first = 0x10cda4
2: /a terminal->second = 0x10bb78
3: /a terminal->third = 0x10a94c
(gdb) s
check_term_struct () at srcs/io/shell/shell.c:29
29		if (terminal != 0) {
1: /a terminal->first = 0x10cda4
2: /a terminal->second = 0x10bb78
3: /a terminal->third = 0x10a94c
(gdb) s
30			printk(KERN_INFO, "-------------------------------");
1: /a terminal->first = 0x10cda4
2: /a terminal->second = 0x10bb78
3: /a terminal->third = 0x10a94c
(gdb) s
printk (info_type=-268377405, str=0xf000ff53 <error: Cannot access memory at address 0xf000ff53>) at srcs/lib/printk/printk.c:61
61	void printk(int info_type, const char *str, ...) {
1: /a terminal->first = 0x1041bb <check_term_struct+43>
2: /a terminal->second = 0x0
3: /a terminal->third = 0x10766a
As you can see, the structure seems to be overwritten when printk is called, but I cannot put my finger on what is causing it. Any help would be greatly appreciated!
Currently, the only initialised parts are the terminal (for screen writing) and the COM port (for debugging). Neither memory management nor the GDT is initialised at this point.
Thanks!

Re: Structure is being corrupted by function call

Posted: Fri Apr 15, 2022 5:10 am
by iansjack
You define real_term as a local variable; you then set a global variable to the address of this local variable (which is allocated on the stack). As soon as your init_shell() routine returns, the local variable goes out of scope and its lifetime ends (the stack is free to be overwritten), so the global variable becomes invalid. The same applies to first, second, and third, which real_term points to.

You need to allocate real_term on the heap via a malloc() call (or "new" in C++).
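
A minimal sketch of the heap approach, under some loud assumptions: it is written for a hosted C environment, so calloc() and free() from <stdlib.h> stand in for whatever allocator the kernel would have to provide first (the original post says no memory management is initialised yet). The t_shell layout is a hypothetical, trimmed-down version containing only fields that appear in the thread, the terminal_writestr() call is left out, and the int return value is an addition for reporting allocation failure.

```c
#include <assert.h>  /* hosted-only: documents the call-once precondition */
#include <stdlib.h>  /* calloc/free stand in for a kernel allocator */

/* Hypothetical, trimmed-down t_shell: only fields that appear in the thread. */
typedef struct s_shell {
	int is_shell_init;
	int cursor_pos;
	int cmd_hist_size;
	int cmd_hist_curr;
} t_shell;

typedef struct s_terminal {
	t_shell *first;
	t_shell *second;
	t_shell *third;
	t_shell *active_shell;
} t_terminal;

t_terminal *terminal; /* global pointer, as in the original post */

/* Returns 0 on success, -1 if an allocation fails. */
int init_shell(void) {
	assert(terminal == NULL); /* meant to be called once */

	/* calloc zeroes the memory, so the memset calls are not needed. */
	t_terminal *term = calloc(1, sizeof(*term));
	t_shell *shells = calloc(3, sizeof(*shells));
	if (term == NULL || shells == NULL) {
		free(term);   /* free(NULL) is a no-op */
		free(shells);
		return -1;
	}

	term->first = &shells[0];
	term->second = &shells[1];
	term->third = &shells[2];
	term->first->is_shell_init = 1;
	term->active_shell = term->first;

	term->active_shell->cursor_pos = 0;
	term->active_shell->cmd_hist_size = 4;
	term->active_shell->cmd_hist_curr = 4;

	terminal = term; /* heap objects outlive this function */
	return 0;
}
```

Because the objects now live on the heap, the pointers stored in terminal stay valid after init_shell() returns, no matter how many other functions are called afterwards.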

Re: Structure is being corrupted by function call

Posted: Fri Apr 15, 2022 6:09 am
by lubenard
Yes, I can see why, but the whole point was to avoid having to allocate the memory.
Do you know where I can find documentation about this, please?
I just implemented it, and it works great!
Thanks!

Re: Structure is being corrupted by function call

Posted: Fri Apr 15, 2022 10:59 pm
by nullplan
lubenard wrote: Do you know where I can find documentation about this, please?
The C standard can tell you a lot about object lifetimes.
lubenard wrote: I just implemented it, and it works great!
Didn't you just say that it doesn't work? Anyway, to be extremely formal about things, by the time init_shell() ends, so does the lifetime of real_term, first, second, and third. At that point, they will have ceased to be. They'll lie astiff, bereft of life; they will be ex-objects. Any pointer to them is a dangling pointer, and dereferencing it is undefined behavior. In the usual implementation, you might get a few drops of life out of them as long as your stack does not touch the same memory area, and then it all goes to hell when you call one function too many.

To solve this problem properly, these objects need more than block lifetime (what the C standard calls automatic storage duration). The C standard gives you only two alternatives: program lifetime (static storage duration) or allocated lifetime. There is also thread-local lifetime, but that doesn't really apply to kernels. You can select program lifetime by declaring all of these variables "static", but that also means you will only ever get one instance of them at run time. Or you can allocate them on the heap. These are your only options.
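
A minimal sketch of the "static" option, assuming a hosted C environment so that it can be compiled and tried outside the kernel. The t_shell layout is a hypothetical, trimmed-down version with only fields that appear in the thread, the terminal_writestr() call is left out, and check_term_struct() here is a simplified stand-in for the original debug function that uses assert() instead of printk().

```c
#include <assert.h>  /* hosted-only: used by the stand-in check below */
#include <string.h>  /* memset; a kernel would use its own implementation */

/* Hypothetical, trimmed-down t_shell: only fields that appear in the thread. */
typedef struct s_shell {
	int is_shell_init;
	int cursor_pos;
	int cmd_hist_size;
	int cmd_hist_curr;
} t_shell;

typedef struct s_terminal {
	t_shell *first;
	t_shell *second;
	t_shell *third;
	t_shell *active_shell;
} t_terminal;

t_terminal *terminal; /* global pointer, as in the original post */

void init_shell(void) {
	/* static: exactly one instance of each object, alive for the whole run. */
	static t_terminal real_term;
	static t_shell first, second, third;

	/* Kept from the original so a repeated call starts from a clean state. */
	memset(&real_term, 0, sizeof(real_term));
	memset(&first, 0, sizeof(first));
	memset(&second, 0, sizeof(second));
	memset(&third, 0, sizeof(third));

	real_term.first = &first;
	real_term.second = &second;
	real_term.third = &third;
	first.is_shell_init = 1;
	real_term.active_shell = &first;

	real_term.active_shell->cursor_pos = 0;
	real_term.active_shell->cmd_hist_size = 4;
	real_term.active_shell->cmd_hist_curr = 4;

	terminal = &real_term; /* valid after return: the objects are not on the stack */
}

/* Simplified stand-in for the poster's debug check: aborts if the state is broken. */
void check_term_struct(void) {
	assert(terminal != NULL);
	assert(terminal->active_shell == terminal->first);
	assert(terminal->first->is_shell_init == 1);
}
```

The only change to the logic of the original init_shell() is the storage class of the four objects. The trade-off is the one described above: there is a single terminal and exactly three shells for the whole lifetime of the kernel.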