'Bogus' Memory

Therx

'Bogus' Memory

Post by Therx »

As you may have gathered from my previous post, I'm rewriting my OS with a GUI built in. A lot of the code can just be ripped from my old OS. So far I've added the device manager (DM) and tasks. But a problem which keeps popping up is with my temporary memory management. It works just by adding the size of each request to a pointer to get the pointer for the next request, and so on. There is no free() yet. My testing code has just two tasks: one waits for a key press and then displays the key at one coordinate; the other constantly prints "HELLO WORLD from TASK2" at another coordinate. At first it all works fine, but after several key presses random coloured dashes start to appear (on a real PC). In Bochs it complains of running in bogus memory. Here is the relevant part of bochsout.txt:

00190681838p[CPU ] >>PANIC<< prefetch: running in bogus memory
00190681838i[SYS ] Last time is 1055589325
00190681838i[CPU ] protected mode
00190681838i[CPU ] CS.d_b = 32 bit
00190681838i[CPU ] SS.d_b = 32 bit
00190681838i[CPU ] | EAX=00000001 EBX=f000ff53 ECX=00000000 EDX=000000a0
00190681838i[CPU ] | ESP=00000018 EBP=f000ff53 ESI=f000ff53 EDI=f000ff53
00190681838i[CPU ] | IOPL=0 NV UP EI PL NZ NA PO NC
00190681838i[CPU ] | SEG selector base limit G D
00190681838i[CPU ] | SEG sltr(index|ti|rpl) base limit G D
00190681838i[CPU ] | DS:0010( 0002| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | ES:0010( 0002| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | FS:0010( 0002| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | GS:0010( 0002| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | SS:0010( 0002| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | CS:0018( 0003| 0| 0) 00000000 000fffff 1 1
00190681838i[CPU ] | EIP=f000ff53 (f000ff53)
00190681838i[CPU ] | CR0=0x60000011 CR1=0x00000000 CR2=0x00000000
00190681838i[CPU ] | CR3=0x00000000 CR4=0x00000000

I've attached the files from the OS which are relevant to the problem. The test code is in main.c; dm.c is the device manager. The most likely problem files are tasks.c and mm.c. I've spent an age looking for the bug but I can't find it. Please can someone look through the code and suggest possible problems? If there are any other parts of the OS you need to see, just say and I'll post them.

Thanks in advance for any help at all

Pete

PS: I think the problem occurred once before I added the multitasking, but there was another bug at the time, and when I fixed that one this problem went away too, so I forgot about it till now.

[attachment deleted by admin]
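
Since the attachment is gone, here is a minimal sketch of the kind of allocator the post describes (add the request size to a pointer to get the next block, no free()). It is not the original mm.c: the names `heap`, `HEAP_SIZE` and `bump_alloc`, the 8-byte alignment and the bounds check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap region; a real kernel would point this at free memory. */
#define HEAP_SIZE 4096u
static _Alignas(8) uint8_t heap[HEAP_SIZE];
static size_t heap_used = 0;   /* offset of the next free byte */

/* Round n up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Bump allocation: hand out the current position, then advance it.
 * Returns NULL rather than running past the end of the heap. */
void *bump_alloc(size_t size)
{
    size_t start = align_up(heap_used, 8);
    if (size == 0 || start > HEAP_SIZE || size > HEAP_SIZE - start)
        return NULL;
    heap_used = start + size;
    return &heap[start];
}
```

Note that the read and update of `heap_used` are not atomic. If a task switch can happen between them, two tasks may be handed overlapping blocks, so in a multitasking kernel a function like this would need the same kind of protection discussed later in the thread.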
Pype.Clicker

Re:'Bogus' Memory

Post by Pype.Clicker »

I fear you'll be the only one able to help yourself this time ...

A few hints I can suggest:
- What happens if you just don't wait for keys at all? Can you run your 2 tasks for as long as you want?

- Make sure both tasks have executed correctly before the problem occurs (i.e. isn't there a problem with task switches?).

- Make sure the stacks aren't overflowing. This is a common problem with multitasking if one isn't aware of every single bit.
If you receive an interrupt while you are already handling another interrupt because the IF flag wasn't preserved across the switch, you can easily get a stack overflow and damage some code of your kernel, overwrite the other task's return pointer, etc.

- You could try to activate more of Bochs's debugging options so that you know what code made the jump/call to bogus memory.

- Also make sure you don't have static shared variables without synchronization code. I once had a problem with my "print" function because I had 2 threads mixing commands to the display buffer...

- Your "stack*" pointer in keyboard.c seems weird to me. What is it used for? Isn't there a risk of a race condition if 2 threads do a kbd_read?
Therx

Re:'Bogus' Memory

Post by Therx »

Pype.Clicker wrote:- What happens if you just don't wait for keys at all? Can you run your 2 tasks for as long as you want?
If I change the first task so it just prints out "Hi from Task1" then the error still occurs (ruling out a bug in the kbd driver). If I change task 1 so it's the same as the original except that it doesn't print out the key, then the error goes away.

OK, so the bug is in sysprintf or its helper functions.
Pype.Clicker wrote:- Make sure both tasks have executed correctly before the problem occurs (i.e. isn't there a problem with task switches?).
Not quite sure what you mean.
Pype.Clicker wrote:- Make sure the stacks aren't overflowing. This is a common problem with multitasking if one isn't aware of every single bit.
If you receive an interrupt while you are already handling another interrupt because the IF flag wasn't preserved across the switch, you can easily get a stack overflow and damage some code of your kernel, overwrite the other task's return pointer, etc.
How could I check for this?
Pype.Clicker wrote:- You could try to activate more of Bochs's debugging options so that you know what code made the jump/call to bogus memory.
How?
Pype.Clicker wrote:- Also make sure you don't have static shared variables without synchronization code. I once had a problem with my "print" function because I had 2 threads mixing commands to the display buffer...
This is probably the problem. What should I do? The issue is that in the display mode I'm in, plotting a pixel means writing a bit to each of the four planes of video memory. Should I therefore make it so that only one instance of write_pixel can run at a time? How would I do this so that it forms a fair queue?
Pype.Clicker wrote:- Your "stack*" pointer in keyboard.c seems weird to me. What is it used for? Isn't there a risk of a race condition if 2 threads do a kbd_read?
Don't worry, the kbd driver is temporary, just to check things, and until it's properly written there won't be two tasks doing kbd_read.


Thanks for your hints. Now at least I know that the problem is in one of three functions in video.c: sysprintf, putchar or write_pixel. Most probably the latter, due to the outportb calls done to switch the VGA memory plane. So I've got to add a queuing system to the beginning of that function. Would an asm("cli"); at the beginning and an asm("sti"); at the end do (stopping task switches)? That solution would explain the random dashes experienced on a real PC but wouldn't fix a bug of running in "bogus memory". That is most likely related to the task switching / stack allocation. The latter means the fault could also be in the mm.

EDIT: I've added the enable/disable instructions and that solves the problem on an actual PC, but it crashes Bochs, i.e. Windows reports an illegal operation.
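
To make the plane-switching hazard concrete, here is a rough sketch of a planar write_pixel that runs against simulated hardware rather than a real VGA card. The original video.c is not available, so everything here is an assumption: the 640x480 layout, the simplified `outportb` and `vram_store` stand-ins, and the shortcut of reading the old byte straight from the simulated plane (real hardware needs its own read-plane selection). The port numbers 0x3C4/0x3C5 and map-mask index 0x02 are the standard VGA sequencer registers.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_SEQ_INDEX    0x3C4  /* sequencer index port */
#define VGA_SEQ_DATA     0x3C5  /* sequencer data port */
#define VGA_SEQ_MAP_MASK 0x02   /* sequencer register selecting writable planes */
#define BYTES_PER_ROW    80     /* assumes a 640-pixel-wide mode, 8 pixels per byte */
#define ROWS             480

/* Simulated hardware state so the sketch runs without a VGA card. */
static uint8_t planes[4][BYTES_PER_ROW * ROWS];
static uint8_t seq_index;
static uint8_t map_mask = 0x0F;

/* Stand-in for the kernel's outportb(): models only the map-mask register. */
static void outportb(uint16_t port, uint8_t value)
{
    if (port == VGA_SEQ_INDEX)
        seq_index = value;
    else if (port == VGA_SEQ_DATA && seq_index == VGA_SEQ_MAP_MASK)
        map_mask = value & 0x0F;
}

/* Stand-in for a store to video memory: the byte lands in every enabled plane. */
static void vram_store(uint32_t offset, uint8_t value)
{
    for (int p = 0; p < 4; p++)
        if (map_mask & (1u << p))
            planes[p][offset] = value;
}

void write_pixel(unsigned x, unsigned y, uint8_t colour)
{
    assert(x < BYTES_PER_ROW * 8 && y < ROWS);
    uint32_t offset = y * BYTES_PER_ROW + x / 8;
    uint8_t bit = (uint8_t)(0x80u >> (x % 8));

    for (int p = 0; p < 4; p++) {
        /* Critical section starts: a task switch between these steps lets
         * another task change the index or the map mask underneath us. */
        outportb(VGA_SEQ_INDEX, VGA_SEQ_MAP_MASK);
        outportb(VGA_SEQ_DATA, (uint8_t)(1u << p));
        uint8_t old = planes[p][offset];   /* simulation shortcut for the read */
        uint8_t updated = (colour & (1u << p)) ? (uint8_t)(old | bit)
                                               : (uint8_t)(old & ~bit);
        vram_store(offset, updated);
        /* Critical section ends. */
    }
}
```

The index port, the map mask and the video memory write together form shared state. If another task runs write_pixel between the plane selection and the store, the byte goes to the wrong plane, which would fit the randomly coloured dashes seen on the real PC.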

Pete
Pype.Clicker

Re:'Bogus' Memory

Post by Pype.Clicker »

Therx wrote:
- Make sure the stacks aren't overflowing. This is a common problem with multitasking if one isn't aware of every single bit.
If you receive an interrupt while you are already handling another interrupt because the IF flag wasn't preserved across the switch, you can easily get a stack overflow and damage some code of your kernel, overwrite the other task's return pointer, etc.
How could I check for this?
For instance, check the "esp" value at the time of the crash. If it's within the faulty task's stack area, then there's probably no overflow. If it's outside that area ... well, it clearly overflowed.

Note that placing a not-present page at the end of the kernel stacks will cause a "triple fault" on overflow if the page fault handler is not a task gate, since the CPU then has no valid stack on which to push the exception frame.
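
The ESP check described above can be sketched as a small range test. The `task_stack` record and its field names are hypothetical; a real kernel would take the bounds from whatever its task structure stores. In the dump earlier in the thread ESP is 00000018, which lies in the first bytes of memory and would fail this test for any plausibly placed task stack. The repeated value f000ff53 in EBX, EBP, ESI, EDI and EIP also looks like a real-mode interrupt vector (F000:FF53), which would be consistent with registers having been popped from the table at address 0; that reading is an inference from the dump, not something confirmed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record: the stack occupies
 * [stack_base, stack_base + stack_size) and grows downwards. */
struct task_stack {
    uint32_t stack_base;
    uint32_t stack_size;
};

/* True if esp points into the task's stack. esp may equal the top of the
 * region, which is where it sits while the stack is empty. */
bool esp_in_stack(const struct task_stack *t, uint32_t esp)
{
    assert(t != NULL);
    return esp >= t->stack_base && esp - t->stack_base <= t->stack_size;
}
```

Calling something like this from the timer interrupt or the task switcher, and halting with a message when it fails, catches an overflow closer to where it happens than waiting for the eventual jump into bogus memory.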
Therx wrote:
- You could try to activate more of Bochs's debugging options so that you know what code made the jump/call to bogus memory.
How?
I think the first step is to compile Bochs with --enable-debugger, then check Bochs's manual for the internal debugger docs (it should be something like "display call" or similar).
Therx wrote:
- Also make sure you don't have static shared variables without synchronization code. I once had a problem with my "print" function because I had 2 threads mixing commands to the display buffer...
This is probably the problem. What should I do? The issue is that in the display mode I'm in, plotting a pixel means writing a bit to each of the four planes of video memory. Should I therefore make it so that only one instance of write_pixel can run at a time? How would I do this so that it forms a fair queue?
You have several options, one of them being a "display server": the only task allowed to access video memory, which displays the commands queued up by other processes.
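
As a rough illustration of the display-server option, the queue between the client tasks and the server could be a fixed-size FIFO like the one below. The `draw_cmd` layout, the capacity and the function names are all invented for this sketch, and the push side would itself need protecting (for example by running with interrupts disabled) if several tasks can enqueue at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 64u   /* arbitrary size chosen for illustration */

/* Hypothetical drawing command: plot one pixel. */
struct draw_cmd {
    uint16_t x, y;
    uint8_t  colour;
};

struct draw_queue {
    struct draw_cmd slots[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest queued command */
    size_t count;   /* number of queued commands */
};

/* Called by client tasks. Returns false when the queue is full. */
bool draw_queue_push(struct draw_queue *q, struct draw_cmd cmd)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->slots[(q->head + q->count) % QUEUE_CAPACITY] = cmd;
    q->count++;
    return true;
}

/* Called only by the display server. Returns false when nothing is queued. */
bool draw_queue_pop(struct draw_queue *q, struct draw_cmd *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Because commands come out in the order they went in, the server draws requests first come, first served, and only one task ever touches the VGA registers.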

Alternatively, you may have a monitor that prevents a task from starting a display access while another task is displaying.
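
One way to get the "fair queue" behaviour asked about is a ticket lock, sketched below as a swapped-in technique rather than anything from the thread or from Clicker. It assumes a C11 compiler with `<stdatomic.h>` and a preemptive scheduler, since a waiting task simply spins until its turn; it must not be taken from an interrupt handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Ticket lock: each waiter takes a number and is served in that order. */
struct ticket_lock {
    atomic_uint next_ticket;   /* next number to hand out */
    atomic_uint now_serving;   /* number currently allowed in */
};

void ticket_lock_acquire(struct ticket_lock *l)
{
    assert(l != NULL);
    unsigned my_ticket = atomic_fetch_add(&l->next_ticket, 1u);
    while (atomic_load(&l->now_serving) != my_ticket) {
        /* Spin. A kernel would more likely yield to the scheduler here. */
    }
}

void ticket_lock_release(struct ticket_lock *l)
{
    assert(l != NULL);
    atomic_fetch_add(&l->now_serving, 1u);
}
```

A write_pixel guarded this way would call `ticket_lock_acquire` on one shared lock before touching the VGA registers and `ticket_lock_release` afterwards, so tasks get access in the order they asked for it.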
Therx wrote:
So I've got to add a queuing system to the beginning of that function. Would an asm("cli"); at the beginning and an asm("sti"); at the end do (stopping task switches)?
asm("cli") will be helpful, yes, but I doubt the correctness of asm("sti") here: what if your code has been called from an interrupt? IF was 0 and you CLI: no change; then you STI and enable interrupts inside an interrupt handler ... not fun at all :-(

The best way around this is to save the state of the flags in a temporary variable before you CLI and restore that state (instead of STI-ing) when you're done. You can take a look at the kSysLock and kSysUnlock macros in src/head/sys/mutex.h in Clicker's CVS browser.
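
A minimal sketch of the save-and-restore idea follows. It is not the kSysLock/kSysUnlock code from Clicker; the names `irq_save`, `irq_restore` and `with_irqs_disabled` and the `KERNEL_BUILD` macro are made up for illustration. The inline assembly branch is the usual GCC form for 32-bit x86 and only works in ring 0, so a stand-in branch that tracks the IF bit in a plain variable is provided to let the logic be tried in an ordinary program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_IF 0x200u   /* interrupt-enable flag, bit 9 of EFLAGS */

#ifdef KERNEL_BUILD   /* hypothetical macro: real instructions, ring 0 only */
static inline uint32_t irq_save(void)
{
    uint32_t flags;
    __asm__ __volatile__("pushfl; popl %0; cli" : "=r"(flags) : : "memory");
    return flags;
}

static inline void irq_restore(uint32_t flags)
{
    __asm__ __volatile__("pushl %0; popfl" : : "r"(flags) : "memory", "cc");
}
#else                 /* hosted stand-in: EFLAGS simulated by a variable */
static uint32_t sim_eflags = EFLAGS_IF;

static inline uint32_t irq_save(void)
{
    uint32_t flags = sim_eflags;
    sim_eflags &= ~EFLAGS_IF;   /* models CLI */
    return flags;
}

static inline void irq_restore(uint32_t flags)
{
    sim_eflags = flags;         /* models POPF */
}
#endif

/* Run fn with interrupts off, then put IF back the way it was found. */
void with_irqs_disabled(void (*fn)(void))
{
    assert(fn != NULL);
    uint32_t flags = irq_save();
    fn();
    irq_restore(flags);
}
```

The point of returning the old flags is that the pair nests: if the caller already had interrupts disabled (for instance inside an interrupt handler), `irq_restore` leaves them disabled instead of switching them on as a bare STI would.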