Need some tips for multiple processing (MP) implementation

iman
Member
Posts: 84
Joined: Wed Feb 06, 2019 10:41 am
Libera.chat IRC: ImAn

Post by iman »

Hi.
I have my OS at the point where I'd like to start implementing symmetric multiprocessing (SMP), so that tasks can be scheduled across more than one CPU core.
So far I have parsed the MP_FLOATING_POINTER and MP_CONFIGURATION_TABLE, so I know that there are 4 logical processors available, all enabled: one is the BSP and the other three are APs. The physical address of the Local APIC is 0xFEE00000. There is also one I/O APIC available at address 0xFEC00000.
Now I need to know which steps to follow, and in what order (e.g. LAPIC configuration, sending the INIT IPI and SIPI, disabling the PIC, etc.), to bring the system up for multiprocessing. I have started to look at the Intel MP manual, which explains the whole process, but I feel I need some extra implementation tips.
I would appreciate it if someone could give me some pointers.
Best.
Iman.
Iman Abdollahzadeh
Github
Codeberg
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Need some tips for multiple processing (MP) implementati

Post by nexos »

First, parse the MP tables. Then, start the BSP's Local APIC. Next, I map all I/O APICs. Then, send an INIT IPI followed by a startup IPI to each AP, passing the address of the trampoline code. This code runs in real mode, so you will need to put the APs in protected mode (or long mode, if your OS is 64-bit). Then the APs should load the GDT and IDT, which will be the same as the BSP's. Lastly, the APs should initialize their LAPICs. It is a very complicated process, but here are a few helpful resources:
The Intel MP Spec (of course)
MIT's xv6 operating system. (My LAPIC code is based off of this)
And my operating system (All the related files can be found at https://github.com/Nexware-Project/NexO ... x86-common).
Good luck!
nexos
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
nullplan
Member
Posts: 1790
Joined: Wed Aug 30, 2017 8:24 am

Re: Need some tips for multiple processing (MP) implementati

Post by nullplan »

What particular thing are you having problems with? The general MP startup procedure for anything after the 486 is:
  1. Send INIT IPI to other processor.
  2. Wait 10 milliseconds, as specified in the Intel SDM and the MP specification.
  3. Send startup IPI to other processor.
  4. Wait a couple of milliseconds for other processor to begin working.
  5. If other processor did not begin working, send second startup IPI
  6. Now, wait a very long time (a couple of seconds).
  7. If other processor is still not working, send another INIT IPI to it and write off that processor as a lost cause.
This requires an MP trampoline capable of showing the other processor is working immediately, but you can do that by setting a bit in memory. Having to send a second startup IPI is very rare these days. That last INIT IPI is just in case the other processor did somehow start up, but is not working correctly. Sending it an INIT IPI should stop it from executing anything.
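The steps above can be sketched in C roughly as follows. This is only a sketch under stated assumptions: the `smp_ops` hooks (`icr_write`, `delay_us`, `ap_alive`) are hypothetical names standing in for your own LAPIC write routine, your timer, and the in-memory flag the trampoline sets. The 10 ms delay after INIT is the figure from the Intel MP specification; the 2 ms and 2 s waits are arbitrary picks for "a couple of milliseconds" and "a couple of seconds".

```c
#include <stdbool.h>
#include <stdint.h>

/* ICR low-dword fields, as laid out in the Intel SDM. */
#define ICR_DELIVERY_INIT    0x00000500u /* delivery mode 101b */
#define ICR_DELIVERY_STARTUP 0x00000600u /* delivery mode 110b */
#define ICR_LEVEL_ASSERT     0x00004000u /* bit 14 */

/* Hypothetical platform hooks (assumed names, not from any real kernel). */
struct smp_ops {
    void (*icr_write)(uint32_t high, uint32_t low); /* write ICR high, then low */
    void (*delay_us)(uint32_t us);                  /* busy-wait */
    bool (*ap_alive)(uint8_t apic_id);              /* flag set by the trampoline */
};

/* Destination APIC ID lives in bits 24-31 of the ICR high dword. */
uint32_t icr_high(uint8_t apic_id) { return (uint32_t)apic_id << 24; }

uint32_t icr_init(void) { return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT; }

uint32_t icr_startup(uint8_t vector)
{
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | vector;
}

/* Polls the alive flag every 100 us for up to total_us microseconds. */
bool wait_alive(const struct smp_ops *ops, uint8_t apic_id, uint32_t total_us)
{
    for (uint32_t waited = 0; waited < total_us; waited += 100) {
        if (ops->ap_alive(apic_id))
            return true;
        ops->delay_us(100);
    }
    return ops->ap_alive(apic_id);
}

/* Returns true if the AP reported in, false if it was written off. */
bool start_ap(const struct smp_ops *ops, uint8_t apic_id, uint8_t vector)
{
    ops->icr_write(icr_high(apic_id), icr_init());          /* step 1 */
    ops->delay_us(10000);                                   /* step 2 */
    ops->icr_write(icr_high(apic_id), icr_startup(vector)); /* step 3 */
    if (wait_alive(ops, apic_id, 2000))                     /* step 4 */
        return true;
    ops->icr_write(icr_high(apic_id), icr_startup(vector)); /* step 5 */
    if (wait_alive(ops, apic_id, 2000000))                  /* step 6 */
        return true;
    ops->icr_write(icr_high(apic_id), icr_init());          /* step 7 */
    return false;
}
```

Keeping the hardware access behind function pointers is just a convenience for the sketch; it lets the sequence be exercised with stubs before it touches a real LAPIC.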
Carpe diem!
iman
Member
Posts: 84
Joined: Wed Feb 06, 2019 10:41 am
Libera.chat IRC: ImAn

Re: Need some tips for multiple processing (MP) implementati

Post by iman »

nexos wrote:Next, I map all I/O APICs.
What if you haven't enabled paging? What is the reason for mapping the I/O APIC?
nexos wrote: passing the address of the trampoline code.
Would you please explain this further? I cannot grasp the concept of the trampoline code.

Thanks.
Iman.
8infy
Member
Posts: 185
Joined: Sun Apr 05, 2020 1:01 pm

Re: Need some tips for multiple processing (MP) implementati

Post by 8infy »

iman wrote:
nexos wrote:Next, I map all I/O APICs.
What if you haven't enabled paging? What is the reason for mapping the I/O APIC?
nexos wrote: passing the address of the trampoline code.
Would you please explain this further? I cannot grasp the concept of the trampoline code.

Thanks.
Iman.
If you don't have paging enabled, then you don't need to "map" the IOAPICs; you just write to the physical address from the MP tables.
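
Without paging, the access could look roughly like this. The IOREGSEL/IOWIN offsets (0x00 and 0x10) are the standard I/O APIC register window; the function names and the `base` parameter are hypothetical, and in a real kernel `base` would be the address from the MP tables (0xFEC00000 in the original post).

```c
#include <stdint.h>

#define IOAPIC_REGSEL 0x00 /* byte offset of the index register */
#define IOAPIC_WIN    0x10 /* byte offset of the data window */

/* Selects register `reg`, then writes `value` through the data window. */
void ioapic_write(volatile uint32_t *base, uint8_t reg, uint32_t value)
{
    base[IOAPIC_REGSEL / 4] = reg;
    base[IOAPIC_WIN / 4] = value;
}

/* Selects register `reg`, then reads it back through the data window. */
uint32_t ioapic_read(volatile uint32_t *base, uint8_t reg)
{
    base[IOAPIC_REGSEL / 4] = reg;
    return base[IOAPIC_WIN / 4];
}
```

With paging enabled, the only change would be that `base` has to be a virtual address mapped (uncached) to that physical page.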

Trampoline code is a small assembly stub that gets an AP processor from real mode to a state where it can execute tasks from the scheduler.
The page number of the trampoline code (its physical address shifted right by 12 bits) is put in the vector field of the ICR register of the BSP's LAPIC when the startup IPI is sent.
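
As a small illustration of that encoding, here is a sketch of how the vector could be derived. `sipi_vector_for` is a hypothetical helper name; it assumes the trampoline has already been copied to a 4 KiB-aligned physical address below 1 MiB, since the AP begins executing in real mode at vector * 0x1000.

```c
#include <stdint.h>

/* Returns the startup IPI vector for a trampoline at `phys`,
 * or -1 if the address cannot be expressed as one. */
int sipi_vector_for(uint32_t phys)
{
    if (phys & 0xFFFu)
        return -1;            /* must be 4 KiB aligned */
    if (phys >= 0x100000u)
        return -1;            /* must be below 1 MiB */
    return (int)(phys >> 12); /* AP starts at CS:IP = (vector << 8):0000 */
}
```

For example, a trampoline copied to physical address 0x8000 would be started with vector 0x08.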
Ready4Dis
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Need some tips for multiple processing (MP) implementati

Post by Ready4Dis »

Yes, each AP starts out in real mode and jumps to the location you tell it to (since it starts in real mode, that location must be below 1 MiB). This code must get the AP into pmode or long mode, set up the GDT/IDT, and then start running. The "trampoline", as it's been named, is just a short stub of code that is reachable in real mode and gets an AP from reset/INIT to a running condition. It can be a really short sequence or a complicated setup; it just depends on what you need. My trampoline code was/is pretty minimal.
nexos
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Need some tips for multiple processing (MP) implementati

Post by nexos »

Yes, @8infy and @Ready4Dis are correct. If you need guidance, please take a look at my OS.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
iman
Member
Posts: 84
Joined: Wed Feb 06, 2019 10:41 am
Libera.chat IRC: ImAn

Re: Need some tips for multiple processing (MP) implementati

Post by iman »

nexos wrote: If you need guidance, please take a look at my OS.
Thanks for sharing your OS code with me. It explains everything I needed very well.
iman
Member
Posts: 84
Joined: Wed Feb 06, 2019 10:41 am
Libera.chat IRC: ImAn

Re: Need some tips for multiple processing (MP) implementati

Post by iman »

8infy wrote:Trampoline code is a small assembly stub that gets an AP processor from real mode to a state where it can execute tasks from the scheduler.
Now it makes more sense to me. Basically, every AP must go through a procedure for entering pmode or long mode similar to the one the BSP already went through. Right?
nullplan
Member
Posts: 1790
Joined: Wed Aug 30, 2017 8:24 am

Re: Need some tips for multiple processing (MP) implementati

Post by nullplan »

iman wrote:Now it makes more sense to me. Basically, every AP must go through a procedure for entering pmode or long mode similar to the one the BSP already went through. Right?
Yes. The difference is that the APs only need to configure the CPU itself, not any external hardware. So you do need to get them to pmode, but don't need to reconfigure the PIC, that sort of thing. In SMP systems you can also assume that the other CPUs are the same kind as your BSP, so you only need to detect available features once.

My trampoline code uses a very small amount of actual code, but will initialize the IDTR and keep it up to date with the current operating mode, so if an exception occurs, the trampoline can report which exception and where. It is in fact older than the rest of my OS, and pretty self-contained. The host kernel only needs to fill in a few blanks at the start.
Carpe diem!
Octocontrabass
Member
Posts: 5572
Joined: Mon Mar 25, 2013 7:01 pm

Re: Need some tips for multiple processing (MP) implementati

Post by Octocontrabass »

nullplan wrote:In SMP systems you can also assume that the other CPUs are the same kind as your BSP, so you only need to detect available features once.
That's not always a safe assumption. Firmware has been known to disable features on some cores but not others. (It's also possible for different CPU models to be installed in a single multi-socket board, but you'll probably never see that in an x86 computer.)
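One conservative way to act on this, sketched here as an assumption rather than anything posted in this thread, is to read the feature bits on every core and only rely on the bits they all report. `common_features` and its array argument are hypothetical; on real hardware each entry would be filled in by that core from CPUID during its own startup.

```c
#include <stddef.h>
#include <stdint.h>

/* features[i] holds one feature bitmask as reported by CPU i.
 * Returns the bits set on every CPU, or 0 if the list is empty. */
uint32_t common_features(const uint32_t *features, size_t ncpus)
{
    uint32_t common = 0xFFFFFFFFu;

    for (size_t i = 0; i < ncpus; i++)
        common &= features[i]; /* drop anything a core lacks */

    return ncpus ? common : 0;
}
```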
iman
Member
Posts: 84
Joined: Wed Feb 06, 2019 10:41 am
Libera.chat IRC: ImAn

Re: Need some tips for multiple processing (MP) implementati

Post by iman »

nexos wrote:please take a look at my OS.
nullplan wrote:What particular thing are you having problems with?
Hi.
I tried to start the implementation from what you suggested. I ran into some difficulties in my code and would like to ask for your suggestions.
This is what I have done so far:
(1) BSP's LAPIC initialization
(2) disabling the PIC and configuring the I/O APIC
After that, I wanted to test whether the I/O APIC can deliver interrupts to the BSP's LAPIC. For this purpose I told the I/O APIC to remap IRQs 0 through 15 (the ISA IRQs) and installed the interrupt handlers for IRQ0, IRQ1, and IRQ15 once more (i.e. the PIT timer, keyboard, and spurious handlers).
I do not get any interrupt when I press a key on the keyboard.
Could you please have a look at my abridged code to see whether I did something wrong on the way to getting interrupts from the I/O APIC?

Code:

static void enable_bsp_lapic(BSP_CPU* cpu) {
	cli();
	
	/* mask and remap all IRQs */
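	/* the two 8259 PICs expose 16 IRQ lines; an unsigned char counter compared against 256 would never terminate */
	for(unsigned int i=0; i<16; i++)
		IRQ_set_mask(i);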

	all_irqs_remap();
	
	/* set MSR register to enable the local apic */
	unsigned int ECX = 0x1B;
	unsigned int EAX = (lapic_address | BIT(8) | BIT(11));
	unsigned int EDX = 0;
	set_msr(ECX, EAX, EDX);
	
	LapicWrite(Destination_Format_Register, 0xFFFFFFFF); // Put the APIC into flat delivery mode
	LapicWrite(Logical_Destination_Register, (LapicRead(Logical_Destination_Register) & 0x00FFFFFF)); // LDR mask
	LapicWrite(Logical_Destination_Register, (LapicRead(Logical_Destination_Register) | (0<<24))); // LDR
	LapicWrite(Task_Priority_Register, 0);
	
	/* local timer setting */
	LapicWrite(Timer_Divide_Configuration_Register, 0x03);
	LapicWrite(Local_Vector_Table_TIMER, 32 + IRQ0);  // IRQ0 = 0
	LapicWrite(Initial_Count_Register, 0xFFFFFFFF);

	/* set BSP CPU lapic */
	LapicWrite(Spurious_Interrupt_Vector_Register, (BIT(8) | (32 + 15))); // enable the APIC; spurious vector 47 matches the handler installed at IRQ15 + 32
	
	/* local INT0 and INT1 */
	LapicWrite(Local_Vector_Table_LINT0, (BIT(8) | BIT(9) | BIT(10) | BIT(15)));
	LapicWrite(Local_Vector_Table_LINT1, BIT(10));

	LapicWrite(Local_Vector_Table_Error, BIT(16));
	LapicWrite(Thermal_Sensor, BIT(16));
	LapicWrite(Performance_Counter_LVT, BIT(10));
	
	LapicWrite(Interrupt_Command_LOW_Register, 0x88500); // BCAST | INIT | LEVEL
	while(LapicRead(Interrupt_Command_LOW_Register) & 0x1000);
	    
	LapicWrite(Task_Priority_Register, 0);
	
	/* repeat local INT0 and INT1 */
	LapicWrite(Local_Vector_Table_LINT0, (BIT(8) | BIT(9) | BIT(10) | BIT(15)));
	LapicWrite(Local_Vector_Table_LINT1, BIT(10));
	
	/* initialize the LAPIC Timer */
	LapicWrite(Initial_Count_Register, 0xFFFFFFFF);
	
	/* install spurious, PIT and keyboard handlers */
	set_spurious_interrupt_handler();   // install the handler at IRQ15 + 32
	set_timer_interrupt_handler();       // install the handler at IRQ0 + 32
	set_keyboard_interrupt_handler(); // install the handler at IRQ1 + 32
	
	/* enable CPU again */
	sti();
}

//-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

static void ioapic_remapVector(unsigned char vector, unsigned int mapped, bool level, bool low, bool disabled) {
	if (disabled) mapped |= BIT(16);
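	if (level) mapped |= BIT(15); // trigger mode: 1 = level
	if (low) mapped |= BIT(13);   // pin polarity: 1 = active low

	// the LAPIC ID register already holds the ID in bits 24-31, which is where the destination field sits
	unsigned int apic_id = LapicRead(LAPIC_ID_Register) & 0xFF000000;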
	
	IOapicWrite(0x10 + vector * 2, mapped);
	IOapicWrite(0x11 + vector * 2, apic_id);
}

//-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

static void enable_ioapic(IOAPIC* ioapic, BSP_CPU* cpu) {
	/* disable pic */
	disable_pic();

	/* version */
	unsigned char MaxIndexRedirTab = (unsigned char)((IOapicRead(IOAPIC_VERSION) & 0x00FF0000) >> 16); // It is 23 in my case
	
	/* remap ISA irqs */
	for(unsigned char i=0; i<16; i++)
		ioapic_remapVector(i, i + 32, EDGE, HIGH, i==2);
	
	/* install spurious, PIT and keyboard handlers */
	set_spurious_interrupt_handler();   // install the handler at IRQ15 + 32
	set_timer_interrupt_handler();       // install the handler at IRQ0 + 32
	set_keyboard_interrupt_handler(); // install the handler at IRQ1 + 32

	/* activate IOAPIC via the IMCR => I have IMCRP = 0, and removing the following code does not make any difference at all */
	out(0x22, 0x70);
	out(0x23, (in(0x23) | 1));
}
After doing these steps, I still get no interrupts. This was a quick test to make sure that I am on the right track before setting up the trampoline code for the APs and moving on to multiprocessing.
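
For reference when checking the redirection entries, here is a minimal sketch of the 64-bit entry encoding, assuming the standard I/O APIC layout (polarity in bit 13, trigger mode in bit 15, mask in bit 16, physical destination in bits 56-63). The helper names are hypothetical and are not the functions from the code above.

```c
#include <stdbool.h>
#include <stdint.h>

#define IOAPIC_POLARITY_LOW  (1u << 13) /* 0 = active high, 1 = active low */
#define IOAPIC_TRIGGER_LEVEL (1u << 15) /* 0 = edge, 1 = level */
#define IOAPIC_MASKED        (1u << 16) /* 1 = interrupt masked */

/* Low dword: vector plus flags (fixed delivery, physical destination). */
uint32_t redir_low(uint8_t vector, bool level, bool active_low, bool masked)
{
    uint32_t entry = vector;

    if (level)
        entry |= IOAPIC_TRIGGER_LEVEL;
    if (active_low)
        entry |= IOAPIC_POLARITY_LOW;
    if (masked)
        entry |= IOAPIC_MASKED;
    return entry;
}

/* High dword: destination APIC ID in bits 24-31 (bits 56-63 of the entry). */
uint32_t redir_high(uint8_t dest_apic_id)
{
    return (uint32_t)dest_apic_id << 24;
}

/* Register index of the low dword for a given input pin. */
uint8_t redir_index_low(uint8_t pin)
{
    return (uint8_t)(0x10 + pin * 2);
}
```

An unmasked, edge-triggered, active-high entry for vector 33 is therefore just 0x21 in the low dword, written to register index 0x12 for pin 1.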

Best.
Iman.