*SOLVED* Problem with local APIC

giszo
Member
Posts: 124
Joined: Tue Nov 06, 2007 2:37 pm
Location: Hungary

*SOLVED* Problem with local APIC

Post by giszo »

Hi!

Today I started doing some SMP development on my kernel. I wrote some code to parse the MP tables; that part is done, and I now get a list of the available CPUs in the system. As the next step I wrote some code to handle the local APIC.

The problem comes here... From the MP tables I got two CPU instances with local APIC IDs 0 and 1. When I read the APIC ID register of the local APIC on the BSP, I got 4. I've gone through the local APIC part of the Intel manuals looking for what I'm doing wrong, but I'm out of ideas...

What I'm doing to get the value from the APIC ID register:
* Get the base address of the APIC registers from the MP configuration table (this will be 0xFEE00000)
* Map the above address somewhere to the kernel address space
* Read the value from the mapped address + 0x20 and shift it right by 24 bits
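
The three steps above can be sketched in C roughly as follows. This is only a minimal sketch, not giszo's actual code: `lapic_read` and `lapic_id` are hypothetical helper names, and `base` is assumed to be whatever virtual address the kernel mapped physical 0xFEE00000 to.

```c
#include <stdint.h>

#define LAPIC_REG_ID 0x20u  /* offset of the local APIC ID register */

/* Read a 32-bit local APIC register. `base` is the mapped register base
 * (hypothetical: the virtual address chosen by the kernel). */
static inline uint32_t lapic_read(const volatile void *base, uint32_t offset)
{
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/* The APIC ID sits in the top byte of the ID register, so shift right by 24. */
static inline uint8_t lapic_id(const volatile void *base)
{
    return (uint8_t)(lapic_read(base, LAPIC_REG_ID) >> 24);
}
```

If the mapping is correct, `lapic_id(base)` on the BSP should return the same ID that the MP table lists for the bootstrap processor.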

If anyone has an idea what could be wrong, please let me know ;)

Thanks,
giszo

Ps: I'm getting these results in VMware Workstation if this makes any difference...
Last edited by giszo on Fri Dec 12, 2008 11:54 am, edited 1 time in total.
IanSeyler
Member
Posts: 326
Joined: Mon Jul 28, 2008 9:46 am
Location: Ontario, Canada

Re: Problem with local APIC

Post by IanSeyler »

It looks like what you are doing is correct, as the APIC ID is in bits 24-27 of the register at APIC base + 0x20 (bits 24-31 on Pentium 4 and later processors).

I'm also developing an OS with SMP. Currently it runs fine in a virtual machine but triple faults on the physical hardware I've tested.

If needed, here is some good tutorial code on initializing SMP:
http://www.cs.usfca.edu/~cruse/cs630f08/mphello.s

Good luck!
BareMetal OS - http://www.returninfinity.com/
Mono-tasking 64-bit OS for x86-64 based computers, written entirely in Assembly
Hyperdrive
Member
Posts: 93
Joined: Mon Nov 24, 2008 9:13 am

Re: Problem with local APIC

Post by Hyperdrive »

giszo wrote: The problem comes here... From the MP tables I got two CPU instances with local APIC IDs 0 and 1. When I read the APIC ID register of the local APIC on the BSP, I got 4. I've gone through the local APIC part of the Intel manuals looking for what I'm doing wrong, but I'm out of ideas...

What I'm doing to get the value from the APIC ID register:
* Get the base address of the APIC registers from the MP configuration table (this will be 0xFEE00000)
* Map the above address somewhere to the kernel address space
* Read the value from the mapped address + 0x20 and shift it right by 24 bits

If anyone has an idea what could be wrong, please let me know ;)

Thanks,
giszo

Ps: I'm getting these results in VMware Workstation if this makes any difference...
This looks okay and *should* work this way. At least in my implementation it does. I didn't test it in VMware, so maybe there are some "surprises" which I'm not aware of.

What do you mean by "Map the above address somewhere to the kernel address space"? When you "read the value from the mapped address + 0x20", are you sure that this maps to physical address 0xFEE00020? In other words: Are you sure that your mapping is right?
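
One way to sanity-check the mapping is to look at the page table entry that backs the chosen virtual address. The sketch below assumes 32-bit non-PAE paging with 4 KiB pages (the thread does not say which paging mode giszo's kernel uses), and `make_lapic_pte` and `pte_maps_phys` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT   (1u << 0)
#define PTE_WRITABLE  (1u << 1)
#define PTE_PWT       (1u << 3)   /* page-level write-through */
#define PTE_PCD       (1u << 4)   /* page-level cache disable */
#define PTE_FRAME     0xFFFFF000u /* physical frame address, bits 12-31 */

#define LAPIC_PHYS_BASE 0xFEE00000u

/* Build a page table entry that maps the local APIC register page uncached. */
static inline uint32_t make_lapic_pte(void)
{
    return LAPIC_PHYS_BASE | PTE_PRESENT | PTE_WRITABLE | PTE_PWT | PTE_PCD;
}

/* True if `pte` is present and points at the 4 KiB frame containing `phys`. */
static inline bool pte_maps_phys(uint32_t pte, uint32_t phys)
{
    return (pte & PTE_PRESENT) && (pte & PTE_FRAME) == (phys & PTE_FRAME);
}
```

Dumping the real entry and running it through a check like `pte_maps_phys(pte, 0xFEE00020u)` would show quickly whether reads at the mapped address + 0x20 actually reach the APIC ID register.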

Regards,
Thilo
LoseThos
Member
Posts: 112
Joined: Tue Oct 30, 2007 6:41 pm
Location: Las Vegas, NV USA

Re: Problem with local APIC

Post by LoseThos »

Here's my multiprocessing code.

Code:

asm {
////**************************PROCEDURE*************************
	ALIGN	16,OC_NOP
	USE16
MP_INIT_START::
	JMP	MP2_START2
	ALIGN	4,OC_NOP
MP2_SYS_TEMP_PTR:	DU4	0,0;

MP2_START2:
	CLI

	MOV_EAX_CR0
	OR	EAX,0x60000000
	MOV_CR0_EAX

	INVD

	MOV	AX,MP_VECTOR_ADDRESS/16
	MOV	DS,AX

	LGDT	U4 [MP_SYS_TEMP_PTR]

	MOV	EAX,SYS_START_CR0|0x60000000
	MOV_CR0_EAX

	DU1	0x66,0xEA;		 //JMP SYS_CS_SEL:MP_INIT_OS
	DU4	MP_INIT_OS;
	DU2	SYS_CS_SEL;

MP_INIT_END::
	USE32
MP_INIT_OS:
	MOV	AX,ZERO_DS_SEL
	MOV	DS,AX
	MOV	ES,AX
	MOV	FS,AX
	MOV	GS,AX
	MOV	SS,AX

	FLDCW	U2 [SYS_INIT_FLOAT_CTRL_WORD]

@@1:	LOCK
	BTS	U4 [SYS_MP_CNT_LOCK],0
	JC	@@1
	WBINVD	//This might be paranoid
	MOV	ESI,U4 [SYS_MP_CNT]
	LOCK
	INC	U4 [SYS_MP_CNT]
	LOCK
	BTR	U4 [SYS_MP_CNT_LOCK],0


	MOV_EAX_CR0
	AND	EAX,~0x60000000 //enable cache
	MOV_CR0_EAX

	IMUL2	ESI,CPU_CACHED_SIZE
	ADD	ESI,U4 [SYS_CPU_CACHED]

	LEA	EAX,U4 CPU_START_STACK_TOP[ESI]
	MOV	ESP,EAX
	PUSH	U4 SYS_START_RFLAGS
	POPFD
	PUSH	U4 0
	CALL	INIT_EM64T
USE64
	PUSH	RSI
	CALL	CP_SET_GS_BASE
	POP	RSI
@@2:	MOV	RBX,U8 CPU_SETH_TSS[RSI]
	OR	RBX,RBX
	JZ	@@2
	MOV	U8 TSS_GS[RBX],RSI
	MOV	RAX,RBX
	CALL	SET_FS_BASE

	JMP	I4 RESTORE_CONTEXT
};

void MPInt(U8 num,U8 cpu_num=1)
{
  U4 *dl=MP_ICR_LOW,*dh=MP_ICR_HIGH;
  while (LBts(&mp_ctrl->flags,MPCCf_APIC_LOCKED))
    SwapInNextTask;
  PushFD;
  Cli;
  while (*dl&0x1000);
  *dh=mp_apic_ids[cpu_num]<<24;
  *dl=0x4800+num;
  PopFD;
  LBtr(&mp_ctrl->flags,MPCCf_APIC_LOCKED);
}

void MPEOI()
{
  U4 *dl=MP_EOI;
  while (LBts(&mp_ctrl->flags,MPCCf_APIC_LOCKED))
    SwapInNextTask;
  *dl=0;
  LBtr(&mp_ctrl->flags,MPCCf_APIC_LOCKED);
}

void MPIntAll(U8 num)
{ //(All but self)
  U4 *dl=MP_ICR_LOW;
  while (LBts(&mp_ctrl->flags,MPCCf_APIC_LOCKED))
    SwapInNextTask;
  PushFD;
  Cli;
  while (*dl&0x1000);
  *dl=0xC4800+num;
  PopFD;
  LBtr(&mp_ctrl->flags,MPCCf_APIC_LOCKED);
}

void MPNMInt()
{
  U4 *dl=MP_ICR_LOW;
  *dl=0xC4400;
}

void MPHalt()
{
  mp_cnt=1;
  MPNMInt; //Hlt All other processors
}

void MPWbInvdAll()
{
  MPIntAll(I_WBINVD);
  WbInvd;
}

void MPInitAPIC()
{
  RaxRbxRcxRdx cpu_id;
  I8 i;
  U4 *d;
  d=MP_SVR;
  *d|=MP_APIC_ENABLED;

  CpuId(1,&cpu_id);
  i=cpu_id.rbx>>24&0xFF;
//This is the only way I could get it to work
  mp_apic_ids[Gs->num]=1<<i;

//We're not supposed to change this
  d=MP_LDR;
  *d=mp_apic_ids[Gs->num]<<24;

  d=MP_DFR;
  *d=0xF0000000;
  MemSet(MP_IRR,0,0x20);
  MemSet(MP_ISR,0,0x20);
  MemSet(MP_TMR,0,0x20);

  SetRAX(Gs->tr);
  asm {
	LTR	U4 RAX
  }

  if (Gs->num)
    InitIDT;
}

void MPWaitForTask()
{
  U8 timeout=0;
  MPCmdStruct *tempm;
  Preempt(OFF);
  Sti;
  Bts(&Fs->task_flags,TSSf_IDLE);
  while (TRUE) {
    while (TRUE) {
      if (GetTimeStamp>timeout) {
	WbInvd;
	FinishOffDyingTsses;
	timeout=GetTimeStamp+time_stamp_freq>>9;
      } else if (mp_ctrl->next_waiting==mp_ctrl)
	SwapInNextTask;
      else
	break;
    }
    while (LBts(&mp_ctrl->flags,MPCCf_LOCKED))
      SwapInNextTask;
    tempm=mp_ctrl->next_waiting;
    while (tempm!=mp_ctrl && !Bt(&tempm->target_cpu_mask,Gs->num))
      tempm=tempm->next;
    if (tempm!=mp_ctrl) {
      RemQue(tempm);
      LBts(&tempm->flags,MPCf_DISPATCHED);
      tempm->handler_cpu=Gs->num;
      LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
      Btr(&Fs->task_flags,TSSf_IDLE);
      switch (tempm->cmd_code) {
	case MPCT_CALL:
	  tempm->result=CallInd(tempm->add,tempm->data);
	  Preempt(OFF);
	  Sti;
	  break;
	case MPCT_SPAWN_TASK:
	  if (tempm->desc)
	    tempm->tss=Spawn(tempm->add,tempm->data,tempm->desc);
	  else
	    tempm->tss=Spawn(tempm->add,tempm->data,"MP Job");
	  break;
      }
      if (Bt(&tempm->flags,MPCf_FREE_ON_COMPLETE)) {
	Free(tempm->desc);
	Free(tempm);
      } else {
	while (LBts(&mp_ctrl->flags,MPCCf_LOCKED))
	  SwapInNextTask;
	InsQue(tempm,mp_ctrl->last_done);
	LBts(&tempm->flags,MPCf_DONE);
	LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
      }
      LBtr(&Gs->uncached_address->uncached_flags,CPUUf_NOT_READY);
      Bts(&Fs->task_flags,TSSf_IDLE);
    } else
      LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
  }
}

MPCmdStruct *MPQueueJob(void *add,void *data=NULL,
       I1 *desc=NULL,
       U8 flags=1<<MPCf_FREE_ON_COMPLETE,
       BoolI1 spawn=FALSE,I8 target_cpu_mask=ALL_MASK)
{
  MPCmdStruct *tempm=MAllocHCZ(sizeof(MPCmdStruct),mp_heap);
  if (desc)
    tempm->desc=StrNewHC(desc,mp_heap);
  if (spawn)
    tempm->cmd_code=MPCT_SPAWN_TASK;
  else
    tempm->cmd_code=MPCT_CALL;
  tempm->add=add;
  tempm->data=data;
  tempm->target_cpu_mask=target_cpu_mask;
  tempm->flags=flags;
  while (LBts(&mp_ctrl->flags,MPCCf_LOCKED))
    SwapInNextTask;
  InsQue(tempm,mp_ctrl->last_waiting);
  LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
  return tempm;
}

MPCmdStruct *MPJob(void *add,void *data=NULL,
       U8 flags=1<<MPCf_FREE_ON_COMPLETE,
       I8 target_cpu_mask=ALL_MASK)
//Set flags to zero if you wish to
//get the result.
{
  return MPQueueJob(add,data,NULL,flags,FALSE,target_cpu_mask);
}

I8 MPJobResult(MPCmdStruct *tempm)
{
  I8 result;
  while (!Bt(&tempm->flags,MPCf_DONE))
    SwapInNextTask;
  while (LBts(&mp_ctrl->flags,MPCCf_LOCKED))
    SwapInNextTask;
  RemQue(tempm);
  LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
  result=tempm->result;
  Free(tempm->desc);
  Free(tempm);
  return result;
}

TssStruct *MPSpawn(void *add,I8 data=0,I1 *desc=NULL,I8 target_cpu=ALL_MASK)
{
  TssStruct *result;
  MPCmdStruct *tempm=MPQueueJob(add,data,desc,0,TRUE,target_cpu);
  while (!tempm->tss)
    SwapInNextTask;
  result=tempm->tss;
  while (LBts(&mp_ctrl->flags,MPCCf_LOCKED))
    SwapInNextTask;
  RemQue(tempm);
  LBtr(&mp_ctrl->flags,MPCCf_LOCKED);
  Free(tempm->desc);
  Free(tempm);
  return result;
}

void MPInitCPUTask()
{
  MPInitAPIC;
  Fs->rip=&MPWaitForTask;
  Fs->time_slice_start=GetTimeStamp;
  RestoreContext;
}

void MPStart()
{
  TssStruct *tss;
  I1 buf[128];
  U4 *d;
  MPMainStruct *mp=MP_VECTOR_ADDRESS;
  I8 i=0;
  CPUCachedStruct *c;
  CPUUncachedStruct *uc;
  BlkPool *bp,*saved_bp;
  I8 shared_blks;
  MPCmdStruct *tempm,*tempm1;

  PushFD;
  Cli;
  if (mp_cnt>1) {
    MPHalt;
    BusyWait(10000);

    tempm=mp_ctrl->next_waiting;
    while (tempm!=mp_ctrl) {
      tempm1=tempm->next;
      RemQue(tempm);
      Free(tempm->desc);
      Free(tempm);
      tempm=tempm1;
    }

    tempm=mp_ctrl->next_done;
    while (tempm!=&mp_ctrl->next_done) {
      tempm1=tempm->next;
      RemQue(tempm);
      Free(tempm->desc);
      Free(tempm);
      tempm=tempm1;
    }

    mp_cnt=1;
  }
  MemSet(&cpu_cached[1],0,sizeof(CPUCachedStruct)*(MP_MAX_PROCESSORS-1));
  MemSet(&cpu_uncached[1],0,sizeof(CPUUncachedStruct)*(MP_MAX_PROCESSORS-1));
  MemCpy(MP_VECTOR_ADDRESS,MP_INIT_START,MP_INIT_END-MP_INIT_START);
  mp->sys_temp_ptr=MAXGDT*16-1+gdttab><(U1 *)<<16;
  mp_cnt=1;
  mp_cnt_lock=0;

  d=MP_LVT3;
  *d=*d&0xFFFFFF00+MP_VECTOR;
  WbInvd;

  d=MP_ICR_LOW;
  *d=0xCC500; //assert init IPI
  BusyWait(10000);

  *d=0xC4600+MP_VECTOR; //start-up
  BusyWait(200);
  *d=0xC4600+MP_VECTOR;

  BusyWait(10000);

  for (i=1;i<mp_cnt;i++) {
    c =&cpu_cached[i];
    uc=&cpu_uncached[i];
    SPrintF(buf,"Seth Task CPU#%d",i);

    shared_blks=1; //One 2Meg blk
    bp=Alloc2MegMemBlks(&shared_blks,sys_code_bp);
    BlkPoolInit(bp,shared_blks<<12);

    saved_bp=Gs->code_bp;
    Gs->code_bp=bp;
    tss=Spawn(&MPInitCPUTask,0,buf,NULL,NULL,DEFAULT_STACK,FALSE);
    tss->in_queue_signature=TSSS_IN_QUEUE_SIGNATURE;
    InitCPUCachedStruct(i,c);
    c->uncached_address=InitCPUUncachedStruct(uc);
    c->code_bp=bp;
    Gs->code_bp=saved_bp;
    c->tr=InitRealTssStruct;
    c->seth_tss=tss;
    WbInvd;
  }
  PopFD;
}

void MPInit()
{
  I8 shared_blks=1; //One 2Meg blk
  TssStruct *tss;
  RaxRbxRcxRdx ee;
  CPUCachedStruct *c;
  BlkPool *bp;
  CpuId(0x1,&ee);

  mp_cnt=1;
  mp_cnt_lock=0;
  WbInvd;
  bp=AllocUncachedMemBlks(&shared_blks);
  mp_heap=IndependentHeapCtrlInit(bp,shared_blks<<12);

  mp_ctrl=MAllocHCZ(sizeof(MPCmdCtrl),mp_heap);
  mp_ctrl->next_waiting=mp_ctrl->last_waiting=mp_ctrl;
  mp_ctrl->next_done=mp_ctrl->last_done=&mp_ctrl->next_done;

  mp_crash=MAllocHCZ(sizeof(MPCrashStruct),mp_heap);

  cpu_cached  =MAllocHCZ(sizeof(CPUCachedStruct)*MP_MAX_PROCESSORS,adam_tss->code_heap);
  cpu_uncached=MAllocHCZ(sizeof(CPUUncachedStruct)*MP_MAX_PROCESSORS,mp_heap);

  c=cpu_cached;
  MemCpy(c,&sys_temp_cpu0_struct,sizeof(CPUCachedStruct));
  c->cached_address  =cpu_cached;
  c->uncached_address=InitCPUUncachedStruct(cpu_uncached);
  SetGs(c);
  tss=adam_tss;
  tss->time_slice_start=GetTimeStamp;
  tss->in_queue_signature=TSSS_IN_QUEUE_SIGNATURE;
  c->seth_tss=tss;
  c->code_bp=sys_code_bp;
  c->data_bp=sys_data_bp;
  c->tr=InitRealTssStruct;
  c->idle_tss=Spawn(0,NULL,"Idle",Fs,NULL,DEFAULT_STACK,FALSE);
  if (Bt(&ee.rdx,9))
    MPInitAPIC;
}

asm {MP_CRASH::}
void MPCrash()
{
  MPEOI;
  mp_cnt=1;
  Raw(ON);
  dc_flags|=DCF_SHOW_DOLLAR;
  coutln "MP Crash CPU#",mp_crash->cpu_num," Tss:",mp_crash->tss;
  WbInvd;
  Debugger(mp_crash->msg,mp_crash->msg_num);
}

01000101
Member
Posts: 1599
Joined: Fri Jun 22, 2007 12:47 pm

Re: Problem with local APIC

Post by 01000101 »

LoseThos:... try not to post your *entire* multiprocessing code. Multiple reasons, A: you just said, here's my code and plopped it on here without any explanations or well... anything; B: the OP asked for something specific, not an entire codeset like that.

Anyway, I have a feeling that you will find the answer to the strange APIC IDs in 7-24 of vol. 3A of the Intel manual. It breaks down what the APIC ID really means.

You say that you're getting a 4, but MP reports 0... that makes some sense, as the MP table probably only reports the processor ID and doesn't give you the Cluster ID (which is bits 2-3 on P6 and 3-4 on Xeons). So 4 would equal a Processor ID of 0 and a Cluster ID of 1 on a P6 machine.
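
The arithmetic in this post can be written out as a small C sketch. It takes the P6 field layout exactly as described above (processor ID in bits 0-1, cluster ID in bits 2-3) as an assumption; `struct p6_apic_id` and `split_p6_apic_id` are hypothetical names used only for illustration.

```c
#include <stdint.h>

struct p6_apic_id {
    uint8_t cluster;    /* bits 2-3 of the APIC ID */
    uint8_t processor;  /* bits 0-1 of the APIC ID */
};

/* Split a 4-bit P6-style APIC ID into its cluster and processor fields. */
static inline struct p6_apic_id split_p6_apic_id(uint8_t apic_id)
{
    struct p6_apic_id out;
    out.processor = apic_id & 0x3u;
    out.cluster   = (apic_id >> 2) & 0x3u;
    return out;
}
```

With this layout, an APIC ID of 4 splits into cluster 1 and processor 0, which is the reading suggested here.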
giszo
Member
Posts: 124
Joined: Tue Nov 06, 2007 2:37 pm
Location: Hungary

Re: Problem with local APIC

Post by giszo »

Actually I found the problem. My APIC code was ok, the bug was in my paging functions that made a corrupt mapping for the local APIC. This was the reason for the garbage from the APIC registers. :)

Thanks for your help!
Hyperdrive
Member
Posts: 93
Joined: Mon Nov 24, 2008 9:13 am

Re: *SOLVED* Problem with local APIC

Post by Hyperdrive »

Congrats, giszo.

Just a little note...

If IDs from the MPSpec/ACPI tables and in the Local APIC register do not match - how could one sanely kick the APs to get to work?

1) Which ID would you assume to send the INIT-SIPI-SIPI? The one from the BIOS tables or the one that you discovered in the APIC registers plus some maths to match them? You can't be sure. You could just try and maybe there would be an AP starting - but is it the one you wanted?

2) You could do a broadcast. But that would start the disabled, maybe erratic, CPUs, too.

Either way, not the way to go.
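
To make the two options concrete, here is a rough C sketch of how the Interrupt Command Register values differ between a targeted IPI and a broadcast. It assumes xAPIC mode with physical destination mode, and the macro and function names are hypothetical; real startup code also needs the delays between INIT and the two SIPIs, which are left out here.

```c
#include <stdint.h>

#define ICR_DELIVERY_INIT     (5u << 8)   /* delivery mode: INIT */
#define ICR_DELIVERY_STARTUP  (6u << 8)   /* delivery mode: Start-Up (SIPI) */
#define ICR_LEVEL_ASSERT      (1u << 14)
#define ICR_SHORTHAND_OTHERS  (3u << 18)  /* all excluding self */

/* High dword of the ICR: destination APIC ID in bits 24-31. */
static inline uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* Low dword for an INIT IPI; broadcast != 0 selects "all excluding self". */
static inline uint32_t icr_low_init(int broadcast)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT |
           (broadcast ? ICR_SHORTHAND_OTHERS : 0u);
}

/* Low dword for a SIPI; the AP starts executing at start_page * 0x1000. */
static inline uint32_t icr_low_sipi(uint8_t start_page, int broadcast)
{
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | start_page |
           (broadcast ? ICR_SHORTHAND_OTHERS : 0u);
}
```

For a targeted IPI the high dword is written first with `icr_high_for(id)`, and writing the low dword sends it. For a broadcast the destination field is ignored, which is why the choice of ID only matters in the targeted case.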

Regards,
Thilo
giszo
Member
Posts: 124
Joined: Tue Nov 06, 2007 2:37 pm
Location: Hungary

Re: *SOLVED* Problem with local APIC

Post by giszo »

Hyperdrive wrote:Congrats, giszo.

Just a little note...

If IDs from the MPSpec/ACPI tables and in the Local APIC register do not match - how could one sanely kick the APs to get to work?

1) Which ID would you assume to send the INIT-SIPI-SIPI? The one from the BIOS tables or the one that you discovered in the APIC registers plus some maths to match them? You can't be sure. You could just try and maybe there would be an AP starting - but is it the one you wanted?

2) You could do a broadcast. But that would start the disabled, maybe erratic, CPUs, too.

Either way, not the way to go.

Regards,
Thilo
Currently I'm using only the MP tables because I have no support for the ACPI tables yet. With those CPU local APIC IDs coming from the MP tables I can successfully power up the AP processors in VMware and in Bochs as well. Unfortunately I don't have any SMP machine at home to test on.

So what would be the best way to start the APs? :)

giszo
01000101
Member
Posts: 1599
Joined: Fri Jun 22, 2007 12:47 pm

Re: *SOLVED* Problem with local APIC

Post by 01000101 »

What I do is:

* test for usable CPUs in the MP tables.
* if there are no unusable CPUs detected, broadcast the startup IPIs
* if there are, then get the IDs of the usable CPUs from the MP tables
* then send the startup IPIs to each individual usable CPU.
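
The steps above could look roughly like this in C. It is a sketch under assumptions: the processor entries are taken to be already collected into an array (in the real MP configuration table they are mixed with other entry types), and `mp_all_entries_usable` and `mp_usable_ap_ids` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MP_CPU_ENABLED   0x01u  /* CPU flags bit 0: entry is usable */
#define MP_CPU_BOOTSTRAP 0x02u  /* CPU flags bit 1: entry is the BSP */

/* Processor entry (entry type 0) of the MP configuration table, 20 bytes. */
struct mp_processor_entry {
    uint8_t  type;
    uint8_t  lapic_id;
    uint8_t  lapic_version;
    uint8_t  cpu_flags;
    uint32_t cpu_signature;
    uint32_t feature_flags;
    uint32_t reserved[2];
};

/* Returns 1 if no entry is marked unusable, so a broadcast is an option. */
static int mp_all_entries_usable(const struct mp_processor_entry *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!(cpus[i].cpu_flags & MP_CPU_ENABLED))
            return 0;
    return 1;
}

/* Collects the APIC IDs of usable APs (the BSP is skipped); returns the count. */
static size_t mp_usable_ap_ids(const struct mp_processor_entry *cpus, size_t n,
                               uint8_t *out, size_t max)
{
    size_t count = 0;
    for (size_t i = 0; i < n && count < max; i++) {
        uint8_t flags = cpus[i].cpu_flags;
        if ((flags & MP_CPU_ENABLED) && !(flags & MP_CPU_BOOTSTRAP))
            out[count++] = cpus[i].lapic_id;
    }
    return count;
}
```

The caller would broadcast when `mp_all_entries_usable` returns 1 and otherwise send the IPIs one by one to the IDs returned by `mp_usable_ap_ids`.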

I don't use ACPI, but if I did, I'd only start up the CPUs that were both usable and had matching IDs between the two tables.
quok
Member
Posts: 490
Joined: Wed Oct 18, 2006 10:43 pm
Location: Kansas City, KS, USA

Re: *SOLVED* Problem with local APIC

Post by quok »

Parsing the ACPI tables to grab the SMP information isn't very hard at all, as it doesn't require interpreting any of the ACPI byte code. It's very similar to parsing the MP tables.

In my kernel, I look for and parse ACPI tables first, and if those don't exist (or don't contain any SMP information), then I look for and parse the MP tables.
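
As an illustration of how little is needed on the ACPI side, here is a hedged C sketch that walks the entries of the MADT (the ACPI table that lists the local APICs) and collects the APIC IDs of enabled Processor Local APIC entries. It is not quok's code; `madt_enabled_apic_ids` is a hypothetical name, and `entries` is assumed to point just past the 44-byte MADT header.

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_LOCAL_APIC 0u
#define MADT_LAPIC_ENABLED   0x1u

/* Walk the variable-length MADT entries and store the APIC IDs of enabled
 * Processor Local APIC entries in `out`; returns how many were stored. */
static size_t madt_enabled_apic_ids(const uint8_t *entries, size_t len,
                                    uint8_t *out, size_t max)
{
    size_t off = 0, count = 0;

    while (off + 2 <= len) {
        uint8_t type = entries[off];
        uint8_t elen = entries[off + 1];

        if (elen < 2 || off + elen > len)
            break;  /* malformed entry: stop instead of looping forever */

        if (type == MADT_TYPE_LOCAL_APIC && elen >= 8) {
            /* Layout: type, length, ACPI processor ID, APIC ID, flags (LE) */
            uint32_t flags = (uint32_t)entries[off + 4]
                           | (uint32_t)entries[off + 5] << 8
                           | (uint32_t)entries[off + 6] << 16
                           | (uint32_t)entries[off + 7] << 24;
            if ((flags & MADT_LAPIC_ENABLED) && count < max)
                out[count++] = entries[off + 3];
        }
        off += elen;
    }
    return count;
}
```

If this returns zero IDs, or no MADT is found at all, the kernel can fall back to the MP tables as described.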

The MP Specification, though, is long outdated and doesn't properly support things like hyperthreading and multi-core processors, although in practice I haven't yet come across a situation where the information I needed wasn't there to get SMP working correctly.
Hyperdrive
Member
Posts: 93
Joined: Mon Nov 24, 2008 9:13 am

Re: *SOLVED* Problem with local APIC

Post by Hyperdrive »

I also use ACPI tables and fall back to MPSpec tables in the case there aren't any ACPI tables. I cross-check using information from both tables. If this check fails you have four options: Rely on ACPI tables, rely on MPSpec tables, fall back to uniprocessor operation or refuse to boot. For now I chose "rely on ACPI tables".
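
The cross-check and the four fallback options could be sketched in C as below. This is only an illustration of the decision described here, not Hyperdrive's implementation: all names are hypothetical, and the ID lists are assumed to contain no duplicates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum smp_source { SMP_USE_ACPI, SMP_USE_MP, SMP_UNIPROCESSOR, SMP_REFUSE_BOOT };

static bool id_in(const uint8_t *ids, size_t n, uint8_t id)
{
    for (size_t i = 0; i < n; i++)
        if (ids[i] == id)
            return true;
    return false;
}

/* True if both lists hold the same set of APIC IDs (order is ignored). */
static bool apic_id_sets_match(const uint8_t *a, size_t na,
                               const uint8_t *b, size_t nb)
{
    if (na != nb)
        return false;
    for (size_t i = 0; i < na; i++)
        if (!id_in(b, nb, a[i]))
            return false;
    return true;
}

/* Pick the table to trust; `on_mismatch` is the policy for a failed check. */
static enum smp_source choose_smp_source(bool have_acpi, bool have_mp,
                                         bool tables_match,
                                         enum smp_source on_mismatch)
{
    if (!have_acpi && !have_mp)
        return SMP_UNIPROCESSOR;
    if (!have_mp)
        return SMP_USE_ACPI;
    if (!have_acpi)
        return SMP_USE_MP;
    return tables_match ? SMP_USE_ACPI : on_mismatch;
}
```

Passing `SMP_USE_ACPI` as `on_mismatch` corresponds to the "rely on ACPI tables" choice mentioned above.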

Parsing ACPI tables has the advantage of discovering the NUMA nodes in the system and all that related stuff.

Until now I have started all the APs individually. But, as 01000101 pointed out, the special case in which no AP is marked as disabled allows you to broadcast the start sequence, which speeds things up a bit.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: *SOLVED* Problem with local APIC

Post by Brendan »

Hi,
Hyperdrive wrote: Until now I have started all the APs individually. But, as 01000101 pointed out, the special case in which no AP is marked as disabled allows you to broadcast the start sequence, which speeds things up a bit.
How do you determine if an AP is disabled?

The enable/disable bit in the "CPU flags" field of Processor Entries (for the Multi-Processor Specification), and the enable/disable bit in the "flags" field of the Processor Local APIC Structure (for ACPI) only tells you if the entry is enabled/disabled - it doesn't correspond to enabled/disabled CPUs. For example, a BIOS might always generate tables with enough entries for 64 CPUs and then (if there are 4 CPUs installed) only enable 4 of these entries (leaving 60 entries that are "disabled" because they weren't used). For a computer with 4 CPUs where one of those CPUs fails its BIST (Built In Self Test) the BIOS could generate tables that have 3 entries where all entries are enabled (and where there simply isn't any entry for the CPU that failed).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Hyperdrive
Member
Posts: 93
Joined: Mon Nov 24, 2008 9:13 am

Re: *SOLVED* Problem with local APIC

Post by Hyperdrive »

Brendan wrote:
Hyperdrive wrote: Until now I have started all the APs individually. But, as 01000101 pointed out, the special case in which no AP is marked as disabled allows you to broadcast the start sequence, which speeds things up a bit.
How do you determine if an AP is disabled?

The enable/disable bit in the "CPU flags" field of Processor Entries (for the Multi-Processor Specification), and the enable/disable bit in the "flags" field of the Processor Local APIC Structure (for ACPI) only tells you if the entry is enabled/disabled - it doesn't correspond to enabled/disabled CPUs. For example, a BIOS might always generate tables with enough entries for 64 CPUs and then (if there are 4 CPUs installed) only enable 4 of these entries (leaving 60 entries that are "disabled" because they weren't used). For a computer with 4 CPUs where one of those CPUs fails its BIST (Built In Self Test) the BIOS could generate tables that have 3 entries where all entries are enabled (and where there simply isn't any entry for the CPU that failed).
You are right. I was aware of your first argument: it does not break the algorithm, it just inhibits the "nice" special case. Most systems I have seen didn't work this way, so I considered it a minor issue. (Of course, one has to consider "all" (potentially evil) systems out there, not only the few you have gotten your hands on.)

Your second point, however, is the one that shows the situation in which the requirement "avoid starting defective APs" is violated. At the time of my post it slipped my mind. But as I was implementing my MP code, I definitely came across this and maybe decided for this reason to start APs individually in all cases. At least that's what some comment lines in my code say. :oops:

Thanks for putting this straight.

Regards,
Thilo