Help with scheduler (source attached)

Posted: Wed Jul 18, 2007 7:47 am
by AndrewAPrice
I'm in the process of transferring my old C kernel over into C++ (I've completely restructured the code and how everything works). I'm having a problem with my scheduler.

In kernel/timer.cpp, there's a method Timer::Handler() that gets called upon each clock interrupt. This function is supposed to rotate the system between tasks. Here is what it currently looks like:

Code: Select all

unsigned int Timer::Handler(unsigned int oldEsp)
{
    if(m_currentThread != 0)
    {
        m_currentThread->GetStack()->esp0 = (unsigned int *)oldEsp;
        m_currentThread = m_currentThread->GetNext();
    }

    if(m_currentThread == 0)
    {
        m_currentThread = m_firstThread;
    }

    m_activeThread = m_firstThread;

    memory.LoadProcessPageTable(m_currentThread->GetPageTable());

    return (unsigned int)m_currentThread->GetStack()->esp0;
}
What that function should do:
- Set m_currentThread (the currently loaded thread) to the next thread on the linked list.
- If m_currentThread points to 0 (either we've passed the end of the list, or it's the first call and nothing is currently running) then point to the first thread again (m_firstThread).
- Set the active thread to m_firstThread. The active thread is currently hard coded, and it's the thread keyboard events are sent to.
- Load the thread's page table into memory.
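
The rotation described in the list above can be sketched in isolation. This is only an illustration, not the kernel's actual code: the Thread struct and the NextThread function are hypothetical stand-ins, and only the next pointer and the wrap-around rule are modelled.

```cpp
// Hypothetical stand-in for the kernel's thread class; only the link matters here.
struct Thread {
    int id;
    Thread *next;   // 0 marks the end of the list
};

// Choose the thread to run after `current`. Falls back to `first` when
// nothing is running yet (current == 0) or the end of the list was passed.
Thread *NextThread(Thread *current, Thread *first)
{
    Thread *next = (current != 0) ? current->next : 0;
    return (next != 0) ? next : first;
}
```

With two threads on the list, repeated calls alternate between them, which is the behaviour the handler is meant to produce.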

(Actually I should replace the name Thread with Process because while they started out as threads, they now each have their own address space.)

All the function does is keep the system running m_firstThread. If I manually enter

Code: Select all

m_currentThread = m_firstThread->GetNext();
above memory.LoadProcessPageTable, the system ends up in the 2nd thread, so I know the rest of the switching code works.

I think it may have something to do with optimization, but I am compiling with: (where $GPP='i586-elf-g++' and $FILE='memory.cpp')

Code: Select all

    $GPP -c $FILE -nostdlib -fno-builtin -fno-rtti -fno-exceptions
I have included my kernel in case someone has the time to have a look at it. To build the source, just run build.sh from bash (you will want to edit it and change the cd path first). There's also a build.bat which runs build.sh through Cygwin's bash. The 4 folders included in the attachment are:
- Kernel - The kernel (it builds kernel.bin). The disk image is also stored in here (dev_kernel_grub.img).
- Sample Program - My testing program. It should loop and print a yellow '!' on the screen (SampleProgram.prog).
- Library - My user library (generates Library.lib which is statically linked to .prog files).
- AnnoyMe - It should flood the screen with white 'a's (AnnoyMe.prog).

The programs are loaded through Grub with the module command. Everything works fine if I only load one module, but I would like my OS to be a multitasking system. The desired effect I'm looking for is to visually see the operating system switch between the two programs (by outputting "!!!!!!!!!!!!aaaaaaaaa!!!!!!!!!!!aaaaaaaaaaa" so I can see each program output while it has its share of the processor).

EDIT: This forum has a 64KB limit and the .zip is 125KB. So I uploaded it to my website instead: http://messiahandrw.netfast.org/Source.zip
EDIT 2: It may just be my browser, but that link brings up advertisements when I click on it. Copying and pasting the address works fine.

Posted: Wed Jul 18, 2007 12:46 pm
by frank
I haven't found the problem, just one little thing I want to comment on. Unless you have some reason for the double call, TimerHandler in the following code snippet is unnecessary.

Code: Select all

unsigned int TimerHandler(unsigned int oldEsp)
{
    return timer.Handler(oldEsp);
}

extern "C"
{
    unsigned int timer_handler(unsigned int oldEsp)
    {
        return TimerHandler(oldEsp);
    }
}
You could just put timer.Handler(oldEsp) in timer_handler.

EDIT: I see several examples of code like that. Why?

Posted: Wed Jul 18, 2007 6:46 pm
by AndrewAPrice
I have a reason for that. My assembler code can't see C++ functions, so I have to jump to a C function (timer_handler). C can't access classes, so I need a wrapper in C++ (TimerHandler) which can then call my C++ member function (Timer::Handler).

Posted: Wed Jul 18, 2007 7:58 pm
by frank
MessiahAndrw wrote:I have a reason for that. My assembler code can't see C++ functions, so I have to jump to a C function (timer_handler). C can't access classes, so I need a wrapper in C++ (TimerHandler) which can then call my C++ member function (Timer::Handler).
When you make a function extern "C" it does not magically become a C function. It just stops the compiler from mangling the function's name into something cryptic like _Z12Read_ConsolePKcmmPv. You can still do everything in an extern "C" function that you can do in a normal one: access classes, call member functions and all that wonderful stuff. You are just adding an unnecessary level of function calls. But it's your kernel, so do things how you want to do them.
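
A minimal sketch of the single-wrapper version, assuming a global Timer object named timer as in the snippet above. The Timer class here is a simplified, hypothetical stand-in (its tick counter is made up for illustration); the real one lives in kernel/timer.cpp.

```cpp
// Hypothetical, simplified Timer used only to make the sketch self-contained.
class Timer {
public:
    Timer() : m_ticks(0) {}

    // Counts the call and returns the stack pointer unchanged.
    unsigned int Handler(unsigned int oldEsp)
    {
        m_ticks++;
        return oldEsp;
    }

    unsigned int Ticks() const { return m_ticks; }

private:
    unsigned int m_ticks;
};

Timer timer;

// extern "C" only turns off name mangling, so the assembly stub can refer to
// the plain symbol timer_handler. The body is still compiled as C++ and can
// call a member function directly.
extern "C" unsigned int timer_handler(unsigned int oldEsp)
{
    return timer.Handler(oldEsp);
}
```

The assembly side keeps calling timer_handler exactly as before; only the intermediate TimerHandler function goes away.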

Anyways about the scheduler, I can't seem to find anything obviously wrong with it. The included boot disk boots up and then proceeds to flood the screen with A's so something is working right. I will look a little harder when I find some more time.

Posted: Thu Jul 19, 2007 1:08 am
by AndrewAPrice
Thanks, I'll fix that now. I still can't get my head around why my scheduler doesn't rotate tasks properly.

EDIT: If I comment out

Code: Select all

        console.SetColour(0xE, 0x0);
        console.WriteCharacter('!');
in Sample Program/main.cpp then I get a page fault and the timer handler isn't called at all. If I include those lines then Sample Program runs fine (except for the fact that the timer doesn't switch tasks; I know the timer is running because I'm using the good old printf trick, or in my case console.WriteCharacter()). But if Sample Program is the only module loaded, then it runs fine regardless of those lines. This is just getting weird.

EDIT2: Hmm.. I found something interesting. Sometimes, instead of page faulting, I get the following (it should only be output once, so I think the program may be loading twice):

Code: Select all

Welcome to Sample Program.prog!
Welcome to Sample Program.prog!

Posted: Thu Jul 19, 2007 2:23 am
by Kevin McGuire
You might try printing the value of the pointer for the task each time it switches, along with the next pointer to ensure that it is actually selecting the next thread.

Then if that does not help, try inserting an asm("cli; hlt; nop; nop") in the exit routine (returning from the interrupt) in the kernel and run with BOCHS. Once it appears to have halted, press ctrl+c. Lastly, type print-stack or info registers to determine whether it is actually loading the next task, by checking whether esp0 or eip is changing.

Just move forward incrementally to find the point where it is working incorrectly, then move backwards to find how it is working incorrectly (zero in on the problem spot). This method usually solves any problem like that.

Posted: Thu Jul 19, 2007 3:51 am
by AndrewAPrice
My scheduler does return the new EIP correctly.

I added the line

Code: Select all

    console << "Welcome to AnnoyMe.prog\n";
to the beginning of AnnoyMe/main.cpp and now I'm encountering another strange symptom:

I will either get:

Code: Select all

Welcome to SampleProgram.prog!
Welcome to SampleProgram.prog!
or

Code: Select all

Welcome to SampleProgram.prog!
Welcome to AnnoyMe.prog
However, after the initial message, it still gets stuck in the first loaded program's loop and doesn't even touch the second's.

I'm thinking it could be related to either of these problems:
- Do I need to disable paging or re-write to cr3 when I change a page directory entry? The following code gets called during my task switch: (I know the page is loaded in supervisor mode, but I want to get this working before I concentrate on protection).

Code: Select all

void Memory::LoadProcessPageTable(unsigned int *pageTable)
{
    m_pageDirectory[1] = (unsigned int)pageTable | 3; // supervisor, r/w, present
}
- Can interrupts fire before other interrupts return? Could the timer fire an interrupt while a syscall to print a string to the screen is still running? My first instruction when an interrupt is called is cli to avoid this, but is it still possible?

Posted: Thu Jul 19, 2007 4:10 am
by AndrewAPrice
It all works fine now! I added

Code: Select all

write_cr3((unsigned int)m_pageDirectory);
to the end of Memory::LoadProcessPageTable and it works perfectly! That one little bug cost me 2 perfectly good days of coding. Mind you, I did sort out a lot of other bugs along the way.
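
Why the reload matters can be shown with a toy model. On x86 the processor caches translations in the TLB, and editing a page directory entry in memory does not invalidate what is already cached; writing cr3 drops the cached (non-global) entries. The ToyMmu below is purely illustrative and hypothetical: it caches one value per directory index, which is far simpler than real hardware.

```cpp
#include <map>

// Toy model only: a page directory in memory plus a cache of entries that
// stands in for the TLB. Real hardware differs in many details.
struct ToyMmu {
    unsigned int directory[1024];               // what the kernel edits
    std::map<unsigned int, unsigned int> tlb;   // cached entries, keyed by index

    ToyMmu() : directory() {}

    // Uses the cached entry if there is one; otherwise reads the directory
    // and caches the result.
    unsigned int Lookup(unsigned int index)
    {
        std::map<unsigned int, unsigned int>::iterator it = tlb.find(index);
        if (it != tlb.end())
            return it->second;
        tlb[index] = directory[index];
        return directory[index];
    }

    // Stand-in for reloading cr3: drops every cached entry.
    void ReloadCr3() { tlb.clear(); }
};
```

In this model, changing directory[1] after a Lookup(1) keeps returning the old value until ReloadCr3() is called, which mirrors the symptom of the system staying in the first loaded program.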

Posted: Thu Jul 19, 2007 4:14 am
by pcmattman
Shouldn't you be loading a new CR3 per process on the task switch?

Posted: Thu Jul 19, 2007 6:55 am
by AndrewAPrice
pcmattman wrote:Shouldn't you be loading a new CR3 per process on the task switch?
I was talking in #osdev on irc.freenode.net and I gained inspiration for a completely new way to overcome the 4MB limit I have imposed on myself for applications. I probably won't start implementing it while I can still handle the 4MB limit.