[SOLVED] Help With Arm64 MMU Setup

CorkiMain
Posts: 8
Joined: Wed Aug 07, 2024 10:13 am
Libera.chat IRC: CorkiMain

Post by CorkiMain »

Hello, I'm looking for some advice on my MMU setup. This part of my kernel is written in Arm64 assembly. Right now, I'm debugging on a Raspberry Pi 5 with GDB, OpenOCD, and a Pi Debug Probe. The Raspberry Pi 5 has four Cortex-A76 cores.

Problem Explanation
Here's the boot process for the kernel:
1. The kernel boots in EL2
2. It sets control registers and drops down to EL1
3. Then the MMU is set up, which includes building the page tables and programming TCR_EL1, MAIR_EL1, TTBR0_EL1, and SCTLR_EL1.
4. The primary core tries to wake the three secondary cores
5. The secondary cores wake up in EL3
6. The secondary cores try to set their stack pointers by reading them from memory. The primary core allocated the stacks and stored these pointers with the MMU on.
7. The secondary cores read a 0 for their stack pointers instead of the allocated value
8. The next instruction that pushes a value onto the stack causes an exception

I think the primary core's writes of the stack pointers are still sitting in its cache after it allocates the stacks. The secondary cores then read a 0 because the values never made it out of the cache to memory.

System Requirements
I need the cache on because my kernel uses LDAXR and STLXR. The Arm Architecture A-Profile Reference Manual states that global monitors do not work with non-cacheable memory. The MAIR_EL1 register and the shareability page attribute set the kernel memory as Outer Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hints, Write allocation hints, and as non-transient. The Arm Architecture A-Profile Reference Manual says that this memory setup architecturally guarantees that the global monitor will work. If I change MAIR_EL1 to have the kernel be non-cacheable, I can load the stack pointers but then exceptions are thrown later when running LDAXR instructions.
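For readers following along, here is a rough sketch of how the memory type described above is usually encoded. The attribute byte values and descriptor bit positions come from the architecture's MAIR_EL1 and stage 1 descriptor definitions; the slot numbers (`ATTR_IDX_*`) and the helper names `build_mair` and `kernel_page_attrs` are made up for illustration, since the post does not show its actual register values.

```c
#include <stdint.h>

/* MAIR_ELx attribute byte encodings. */
#define MAIR_DEVICE_nGnRnE 0x00u /* Device-nGnRnE */
#define MAIR_NORMAL_NC     0x44u /* Normal, Inner and Outer Non-cacheable */
#define MAIR_NORMAL_WB     0xFFu /* Normal, Inner and Outer Write-Back,
                                    non-transient, read and write allocate */

/* Hypothetical slot assignment: the original post does not say which
   MAIR_EL1 indices the kernel uses. */
enum {
    ATTR_IDX_NORMAL_WB = 0,
    ATTR_IDX_DEVICE    = 1,
    ATTR_IDX_NORMAL_NC = 2
};

/* Lower attribute fields of a stage 1 block/page descriptor. */
#define PTE_ATTRINDX(i) ((uint64_t)(i) << 2) /* AttrIndx, bits [4:2] */
#define PTE_SH_NONE     ((uint64_t)0 << 8)   /* SH, bits [9:8]: Non-shareable */
#define PTE_SH_OUTER    ((uint64_t)2 << 8)   /* Outer Shareable */
#define PTE_SH_INNER    ((uint64_t)3 << 8)   /* Inner Shareable */
#define PTE_AF          ((uint64_t)1 << 10)  /* Access flag */

/* Place one 8-bit attribute into slot idx of MAIR_EL1. */
static uint64_t mair_field(unsigned idx, uint64_t attr)
{
    return attr << (8u * idx);
}

/* Value that would be written to MAIR_EL1 under the slot assignment above. */
uint64_t build_mair(void)
{
    return mair_field(ATTR_IDX_NORMAL_WB, MAIR_NORMAL_WB) |
           mair_field(ATTR_IDX_DEVICE, MAIR_DEVICE_nGnRnE) |
           mair_field(ATTR_IDX_NORMAL_NC, MAIR_NORMAL_NC);
}

/* Attribute bits for a kernel page: Write-Back Normal memory, Outer Shareable. */
uint64_t kernel_page_attrs(void)
{
    return PTE_ATTRINDX(ATTR_IDX_NORMAL_WB) | PTE_SH_OUTER | PTE_AF;
}
```

With this layout, switching a single page to non-cacheable would only mean changing its AttrIndx to `ATTR_IDX_NORMAL_NC`, without touching MAIR_EL1 itself.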

Help
I'm not sure what the best way to solve this issue is. I'm hoping I just need to set a value in a control register, or that there's a way to force the cache to write its contents back to memory. I appreciate any help or ideas! Let me know if any more information would be helpful.
Last edited by CorkiMain on Wed Nov 20, 2024 9:34 am, edited 1 time in total.
nullplan
Member
Posts: 1790
Joined: Wed Aug 30, 2017 8:24 am

Re: Help With Arm64 MMU Setup

Post by nullplan »

Does ARM64 not have cache coherency? Normally, any CPU asking for a cache line should cause any other CPU holding the same cache line to pipe up and write the line to memory first. Even the old PowerPC 603 had this, so I highly doubt that a modern CPU would be lacking it. This isn't a new problem, either, since PCI has made bus-mastering normal for devices other than CPUs.

But if you think the problem is the cache, then why don't you try flushing it? There should be a way to do that, right?
Carpe diem!
Octocontrabass
Member
Posts: 5571
Joined: Mon Mar 25, 2013 7:01 pm

Re: Help With Arm64 MMU Setup

Post by Octocontrabass »

Page table entries have space for you to specify the memory attributes for each page independently. You could set the memory attributes to bypass the cache for some pages.

There are system instructions for cache maintenance. You could use them to flush the data caches.
CorkiMain
Posts: 8
Joined: Wed Aug 07, 2024 10:13 am
Libera.chat IRC: CorkiMain

Re: Help With Arm64 MMU Setup

Post by CorkiMain »

Thanks for the help! I used the DC CVAC instruction to flush the cache for the virtual address that pointed to the stack pointers.
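
For anyone landing here later, a minimal sketch of that kind of fix follows, written as C with inline assembly rather than the poster's own assembly, which is not shown in the thread. DC CVAC cleans (writes back) one cache line by virtual address to the Point of Coherency, so a range has to be walked line by line and followed by a barrier. The function names `dcache_line_size` and `dcache_clean_range` are hypothetical, the `dsb sy` choice is a conservative assumption, and the non-Arm fallback exists only so the sketch compiles on other hosts.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest data cache line size in bytes, derived from CTR_EL0.DminLine
   (bits [19:16], log2 of the number of 4-byte words). */
static inline size_t dcache_line_size(void)
{
#if defined(__aarch64__)
    uint64_t ctr;
    __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
    return (size_t)4 << ((ctr >> 16) & 0xF);
#else
    return 64; /* placeholder so the sketch builds on a non-Arm host */
#endif
}

/* Clean every cache line covering [addr, addr + len) to the Point of
   Coherency, then wait for completion. Returns the number of lines
   visited. */
size_t dcache_clean_range(const void *addr, size_t len)
{
    size_t line = dcache_line_size();
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(line - 1); /* align down */
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (; p < end; p += line, lines++) {
#if defined(__aarch64__)
        __asm__ volatile("dc cvac, %0" : : "r"(p) : "memory");
#endif
    }
#if defined(__aarch64__)
    /* Make sure the cleans have completed before other cores are woken. */
    __asm__ volatile("dsb sy" : : : "memory");
#endif
    return lines;
}
```

The primary core would call something like `dcache_clean_range(stack_pointer_table, sizeof stack_pointer_table)` after storing the pointers and before waking the secondary cores, where `stack_pointer_table` stands in for whatever the kernel actually uses.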