OSDev.org

Posted: **Tue Aug 19, 2008 4:17 pm**

I've been working on my APIC code that boots the existing APs, but once again, my Xeon processors don't like the pre-existing code. I've read through the Intel manuals about how to init APs and have almost mimicked the code from the manual, but even that doesn't work. This code works in Bochs and on my C2D, btw.

Code:

unsigned int svr = *(unsigned int*)(offset + 0xF0);   // read the SVR
svr |= 0x100;                                         // set bit 8 to enable the APIC
*(unsigned int*)(offset + 0xF0) = svr;                // write the new SVR

*(unsigned int*)(offset + 0x300) = 0x000C4500;        // send a broadcast INIT IPI
apic_wait(1000);                                      // wait
*(unsigned int*)(offset + 0x300) = 0x000C4610;        // send a broadcast SIPI
apic_wait(100);                                       // wait
*(unsigned int*)(offset + 0x300) = 0x000C4610;        // send a second broadcast SIPI

This would 'normally' send the startup commands to all APs and then send them off to 0x1000:0x0000, where I have some trampoline code present. They never make it to the vector code. I can read the CPUs and cores just fine by parsing, and they all have normal, valid information, but sending the IPIs seems to have no effect. Any help would be appreciated.

Posted: **Tue Aug 19, 2008 4:55 pm**

As far as I know, your startup IPI will start the AP at 0x10000 (0x1000:0000).

Posted: **Tue Aug 19, 2008 5:29 pm**

oops, typo.
I'll fix that in a sec.

I think I have figured it out though. I added:

Code:

*(unsigned int*)(offset + 0x310) = (unsigned int)0xFF000000;      // set the destination as 0xFF

before all the lower-dword assignments.
I also reduced the timeout to only 10.
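Put together, the sequence with the destination write might look like the sketch below. This is only an illustration under assumptions: `apic_read`, `apic_write`, `send_ipi`, `start_aps`, and `spin_delay` are hypothetical names, the register base is passed in as a pointer instead of using `offset`, and the delay units are unknown in the original code (Intel's documentation suggests roughly 10 ms after INIT and 200 µs between SIPIs). The command values are the ones from the posts above.

```c
#include <stdint.h>

#define APIC_SVR      0x0F0u      /* spurious interrupt vector register */
#define APIC_ICR_LOW  0x300u      /* interrupt command register, low dword */
#define APIC_ICR_HIGH 0x310u      /* interrupt command register, high dword */
#define ICR_PENDING   (1u << 12)  /* delivery status: set while the IPI is in flight */

typedef void (*delay_fn)(unsigned ticks);   /* stand-in for apic_wait() */

/* Crude placeholder delay; a real kernel would use a calibrated timer. */
static void spin_delay(unsigned ticks)
{
    for (volatile unsigned i = 0; i < ticks * 1000u; i++) { }
}

static inline uint32_t apic_read(volatile uint32_t *base, uint32_t reg)
{
    return base[reg / 4];
}

static inline void apic_write(volatile uint32_t *base, uint32_t reg, uint32_t val)
{
    base[reg / 4] = val;
}

/* Write the destination first: the write to the low dword sends the IPI. */
static void send_ipi(volatile uint32_t *base, uint32_t dest, uint32_t cmd)
{
    apic_write(base, APIC_ICR_HIGH, dest << 24);
    apic_write(base, APIC_ICR_LOW, cmd);
    while (apic_read(base, APIC_ICR_LOW) & ICR_PENDING) { }  /* wait for delivery */
}

void start_aps(volatile uint32_t *base, delay_fn wait)
{
    /* Set bit 8 of the SVR to enable the local APIC. */
    apic_write(base, APIC_SVR, apic_read(base, APIC_SVR) | 0x100u);

    send_ipi(base, 0xFF, 0x000C4500u);   /* broadcast INIT */
    wait(10);                            /* assumed: "timeout of 10", units unknown */
    send_ipi(base, 0xFF, 0x000C4610u);   /* broadcast SIPI, vector 0x10 -> 0x10000 */
    wait(10);
    send_ipi(base, 0xFF, 0x000C4610u);   /* second SIPI */
}
```

On real hardware `base` would be the mapped local APIC address; passing it in also makes it possible to exercise the write order against a plain array.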

Now both Bochs and the Xeons are happy, and it boots all 8 cores to the trampoline code, through the 64-bit init, and into the kernel. Now I just need to figure out how to synchronize the processors and let them share caches, so that when I printf() on different CPUs the output won't be all garbled. Does anyone know a way to do this?

Posted: **Tue Aug 19, 2008 5:53 pm**

Can't you just keep a global lock for any kernel output routines? You might get deadlocks if you print from interrupts that way, though.
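A minimal sketch of such a global output lock, assuming a C11 compiler with `<stdatomic.h>`. The names `spinlock_t`, `print_lock`, `locked_puts`, and the `screen` buffer are hypothetical; `screen` stands in for whatever the real output routine writes to.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } spinlock_t;

static inline void spin_lock(spinlock_t *l)
{
    /* Spin until the flag was previously clear; on x86 a pause hint could go here. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) { }
}

static inline bool spin_trylock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static spinlock_t print_lock = { ATOMIC_FLAG_INIT };
static char screen[256];      /* stand-in for the real output device */
static unsigned cursor;

/* Every CPU goes through this one function, so output is never interleaved
 * mid-string and the shared cursor is only touched under the lock. */
void locked_puts(const char *s)
{
    spin_lock(&print_lock);
    while (*s && cursor < sizeof screen - 1)
        screen[cursor++] = *s++;
    spin_unlock(&print_lock);
}
```

The deadlock mentioned above would occur if an interrupt handler called `locked_puts` on a CPU that already holds `print_lock`. One common way around it is to disable interrupts on the local CPU while the lock is held, which this sketch does not do.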

Posted: **Tue Aug 19, 2008 5:55 pm**

That sounds like a completely typical "resource sharing" problem that is typically solved with a spinlock or a non-locking algorithm?

And what turned out to be the problem with getting the xeons to longjump into Pmode?

Posted: **Tue Aug 19, 2008 6:08 pm**

bewing wrote:That sounds like a completely typical "resource sharing" problem that is typically solved with a spinlock or a non-locking algorithm?

I think it is more of a synchronization problem, as printing from one CPU produces different results than printing from another, which leads me to think that the offset variables are being accessed incorrectly because the offsets are not the same across all CPUs.

bewing wrote:And what turned out to be the problem with getting the xeons to longjump into Pmode?

I explained that in my previous post.

I've already implemented basic 'lock', 'unlock', and 'spin' functions, but even when one CPU holds the lock and the others are spinning, printf only works properly on the BSP.

Posted: **Tue Aug 19, 2008 7:16 pm**

Here's how my printk() function works in my kernel. I've got no problems with it on SMP.

I use two locks: a spinlock called logbuf_lock and a semaphore called console_lock. Everything is buffer based, and uses a couple of housekeeping variables to keep track of where in the buffer you are. These variables are log_start, log_end, and con_start.

The first thing printk does is grab the logbuf_lock. Once it has that, it spews into a rather large (32K) buffer using vsnprintk (my vsnprintf variant for the kernel). It then attempts to grab the console_lock semaphore (using semaphore_tryacquire). If it cannot, it releases the logbuf_lock spinlock and exits.

If printk was able to grab the console_lock semaphore, it immediately calls a utility function release_console_lock(). This function is just a huge for loop. As soon as you enter the for loop, release_console_lock grabs the logbuf_lock spinlock again. It makes sure there is something in the log to print by seeing if con_start is equal to log_end. If that is true, the loop exits (still holding the spinlock). Otherwise, I set some local variables, _start and _end to con_start and log_end, respectively. I also set con_start to log_end, as those two being the same is the only way to exit the loop. Then I release the logbuf_lock spinlock. Next up, I call the console printing routine, passing _start and _end.

The console printing routine loops through every registered console driver, printing whatever is in the log buffer between _start and _end. In this way, my kernel messages can go to serial output and be displayed on the screen quite easily.

After the console printing routine exits, we jump back to the top of that for loop in release_console_lock. We grab the logbuf_lock again, and if we're lucky, nobody else printed to the buffer while we were printing to screen, and we can exit. Otherwise it's lather, rinse, repeat. Once out of the for loop, we release the console semaphore, and then release the logbuf_lock spinlock. Don't think that's backwards; remember, the console lock was grabbed way up in printk().

Anyway, doing things this way allows most processors to keep spewing to the printk buffer, and then happily go on their way to something else. Only one processor at a time ever handles the task of "draining" the printk buffer by calling the console printing routines.
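The flow described above can be sketched roughly as follows. This is not the poster's actual code, just an illustration under several assumptions: the semaphore is replaced by a simple try-lock, printk formats into a small local buffer before copying into the log, printk drops logbuf_lock before calling release_console_lock (the post does not spell this out), and log_start and overrun handling are left out. The names `lock`, `try_lock`, `unlock`, `call_console_drivers`, `consoles`, and `mem_console` are hypothetical.

```c
#include <stdarg.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define LOG_SIZE 32768u   /* 32K as in the post; power of two so indices wrap cleanly */

typedef void (*console_write_fn)(char c);   /* hypothetical driver hook */

static atomic_flag logbuf_lock  = ATOMIC_FLAG_INIT;  /* spinlock */
static atomic_flag console_lock = ATOMIC_FLAG_INIT;  /* try-lock standing in for the semaphore */
static char log_buf[LOG_SIZE];
static unsigned log_end, con_start;   /* free-running; reduced modulo LOG_SIZE on use */
static console_write_fn consoles[4];
static unsigned n_consoles;

static char captured[128];            /* demo "console" that appends to memory */
static unsigned captured_len;
static void mem_console(char c)
{
    if (captured_len < sizeof captured - 1)
        captured[captured_len++] = c;
}

static void lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) { }
}
static bool try_lock(atomic_flag *l)
{
    return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}
static void unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Hand every byte in [start, end) to every registered console. */
static void call_console_drivers(unsigned start, unsigned end)
{
    for (unsigned i = start; i != end; i++)
        for (unsigned c = 0; c < n_consoles; c++)
            consoles[c](log_buf[i % LOG_SIZE]);
}

void release_console_lock(void)
{
    for (;;) {
        lock(&logbuf_lock);
        if (con_start == log_end)
            break;                     /* drained: leave the loop holding logbuf_lock */
        unsigned start = con_start, end = log_end;
        con_start = log_end;
        unlock(&logbuf_lock);          /* other CPUs may keep logging while we print */
        call_console_drivers(start, end);
    }
    unlock(&console_lock);             /* console lock first, then the spinlock */
    unlock(&logbuf_lock);
}

int printk(const char *fmt, ...)
{
    char tmp[256];
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(tmp, sizeof tmp, fmt, ap);
    va_end(ap);
    if (n < 0)
        return n;
    if (n >= (int)sizeof tmp)
        n = (int)sizeof tmp - 1;       /* output was truncated */

    lock(&logbuf_lock);
    for (int i = 0; i < n; i++)
        log_buf[log_end++ % LOG_SIZE] = tmp[i];
    bool drain = try_lock(&console_lock);
    unlock(&logbuf_lock);              /* assumption: dropped before draining */
    if (drain)
        release_console_lock();
    return n;
}
```

Releasing the console lock while still holding logbuf_lock appears to be what keeps messages from being stranded: a CPU that logs after the final emptiness check cannot finish its own printk until the drainer has already given up the console lock, so its try-lock succeeds and it drains its own message.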

Of course, there's all kinds of other housekeeping here that I didn't cover. For instance, setting log_start and log_end appropriately, and how to handle the situation where log_end wraps back to the beginning of the buffer, but log_start is back at the end somewhere. And what happens when you are in a kernel panic and want to dump the buffer to screen, but some other processor has the locks, and can't release them because panic() sends an IPI to halt all processors? Well, in that situation, I have an "in_panic" global that gets set, and if that is set, printk zaps all the locks so it can happily grab them and dump the buffer.

The only other problem I really know of here is that it is possible to lose messages, or get "corrupt" messages if you're printing to the buffer faster than you can drain it. If that happens, simply increasing the buffer size will help greatly.

My printk also uses syslog style message numbers, so I can filter what gets displayed (on each console) by setting the syslog_level variable in the console driver. Usually I'll have all debug messages go to a serial port, while only the more "important" stuff gets displayed on screen. I also provide some boot variables that can change this so you don't need to recompile all the time, and of course there will be a system call to handle it as well.

And finally, one of the first things I do in my kernel is grab the console_lock semaphore, which is a statically allocated variable. Doing that means I can easily use printk long before the console drivers are set up, and then flush the buffer without losing any messages after the console drivers are available.

Posted: **Tue Aug 19, 2008 9:41 pm**

That is definitely something I will look into (the printf scheduling algorithm).

I've been reading that I should be forcing locked operations on memory, but I can't find out how to do this properly. Also, should I be invalidating caches whenever I need the processors to see the same updated variables, or not?
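One way to get locked memory operations without hand-written assembly is C11 `<stdatomic.h>`; the sketch below assumes a C11 compiler, and `cpus_online`, `ap_checkin`, `publish`, and `try_consume` are hypothetical names. On x86, GCC and Clang typically compile `atomic_fetch_add` to a `lock`-prefixed instruction, and the hardware keeps caches coherent for ordinary write-back memory, so no manual cache invalidation should be needed for shared variables like these.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static atomic_uint cpus_online;   /* bumped once by each AP as it comes up */
static atomic_bool data_ready;    /* flag guarding shared_value */
static uint32_t shared_value;     /* plain data, published via data_ready */

/* Atomic read-modify-write: concurrent callers never lose an increment. */
void ap_checkin(void)
{
    atomic_fetch_add(&cpus_online, 1);
}

/* Writer: store the data, then set the flag with release ordering so the
 * data write cannot be reordered after the flag write. */
void publish(uint32_t v)
{
    shared_value = v;
    atomic_store_explicit(&data_ready, true, memory_order_release);
}

/* Reader: an acquire load of the flag guarantees that, once the flag is
 * seen as true, the matching shared_value is visible too. */
bool try_consume(uint32_t *out)
{
    if (!atomic_load_explicit(&data_ready, memory_order_acquire))
        return false;
    *out = shared_value;
    return true;
}
```

The BSP could, for example, spin on `atomic_load(&cpus_online)` until every AP has called `ap_checkin()`.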

Posted: **Wed Aug 20, 2008 6:09 am**

01000101 wrote:
bewing wrote:And what turned out to be the problem with getting the xeons to longjump into Pmode?
I explained that in my previous post.

I thought you were having trouble getting the xeon BSPs to boot into Pmode in the first place? No? It was only secondary cores and APIC issues the whole time?

Posted: **Wed Aug 20, 2008 8:27 am**

Oh, if you're referring to my other post about my bootloader, that was due to my real-mode segment selectors being wrong, so I set them all to CS, and then everything worked fine.

This post was just about booting APs. I'm going through my old 32-bit dins code, rewriting it for 64-bit and then working out some compatibility issues between my code and the Xeon processors.

APIC issues.