
[SOLVED] Help With Arm64 MMU Setup

Posted: Tue Nov 19, 2024 4:56 pm
by CorkiMain
Hello, I'm looking for some advice with my MMU setup. This part of my kernel is written in Arm64 assembly. Right now, I'm debugging on a Raspberry Pi 5 with GDB, OpenOCD, and a Pi Debug Probe. The Raspberry Pi 5 has four Cortex-A76 cores.

Problem Explanation
Here's the boot process for the kernel:
1. The kernel boots in EL2
2. It sets control registers and drops down to EL1
3. Then the MMU is set up, which includes building the page tables and setting TCR_EL1, MAIR_EL1, TTBR0_EL1, and SCTLR_EL1.
4. The primary core tries to wake the three secondary cores
5. The secondary cores wake up in EL3
6. The secondary cores try to set their stack pointer. They do this by reading them from memory. The stack pointers were allocated by the primary core with the MMU on.
7. The secondary cores read a 0 for their stack pointers instead of the allocated value
8. The next instruction that pushes a value onto the stack causes an exception

I think that the primary core is putting the stack pointers in its cache after it allocates the stacks. Then the secondary cores read a 0 from memory because the stack pointers got put in a cache and not memory.

System Requirements
I need the cache on because my kernel uses LDAXR and STLXR. The Arm Architecture A-Profile Reference Manual states that global monitors do not work with non-cacheable memory. The MAIR_EL1 register and the shareability page attribute set the kernel memory as Outer Shareable, Inner Write-Back, Outer Write-Back Normal memory with Read allocation hints, Write allocation hints, and as non-transient. The Arm Architecture A-Profile Reference Manual says that this memory setup architecturally guarantees that the global monitor will work. If I change MAIR_EL1 to have the kernel be non-cacheable, I can load the stack pointers but then exceptions are thrown later when running LDAXR instructions.
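For reference, the attribute encoding described above can be sketched like this. It is a minimal C illustration rather than the kernel's actual assembly, and the helper names (`mair_wb_nontransient`, `mair_normal`, `mair_set_slot`) are hypothetical. It assumes the standard MAIR_ELx layout: eight 8-bit slots, with the outer policy in the high nibble and the inner policy in the low nibble, and `0b11RW` meaning Write-Back non-transient with read/write allocation hints.

```c
#include <stdint.h>

/* Common MAIR attribute bytes (architectural encodings). */
enum {
    MAIR_DEVICE_nGnRnE = 0x00, /* Device memory, strongest ordering     */
    MAIR_NORMAL_NC     = 0x44  /* Normal, Inner/Outer Non-cacheable     */
};

/* Cache-policy nibble for Normal memory: 0b11RW = Write-Back,
 * non-transient, with optional Read (R) and Write (W) allocate hints. */
static inline uint8_t mair_wb_nontransient(int read_alloc, int write_alloc)
{
    return (uint8_t)(0xC | (read_alloc ? 0x2 : 0) | (write_alloc ? 0x1 : 0));
}

/* Combine outer (bits [7:4]) and inner (bits [3:0]) policy nibbles. */
static inline uint8_t mair_normal(uint8_t outer, uint8_t inner)
{
    return (uint8_t)(((outer & 0xF) << 4) | (inner & 0xF));
}

/* Place an attribute byte into slot idx (0..7) of a MAIR value. */
static inline uint64_t mair_set_slot(uint64_t mair, unsigned idx, uint8_t attr)
{
    mair &= ~(0xFFULL << (idx * 8));
    return mair | ((uint64_t)attr << (idx * 8));
}
```

With both allocation hints set for inner and outer, the attribute byte works out to 0xFF. The Outer Shareable part is not in MAIR_EL1 at all; it comes from the SH field of each page table descriptor.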

Help
I'm not sure what's the best way to solve this issue. I'm hoping I need to set a value in a control register or that there's a way to force the cache to update the memory. I appreciate any help or ideas! Let me know if any more information would be helpful.

Re: Help With Arm64 MMU Setup

Posted: Tue Nov 19, 2024 8:16 pm
by nullplan
Does ARM64 not have cache coherency? Normally, any CPU asking for a cache line should cause any other CPU holding the same cache line to pipe up and write the line to memory first. Even the old PowerPC 603 had this, so I highly doubt that a modern CPU would be lacking it. This isn't a new problem, either, since PCI has made bus-mastering normal for devices other than CPUs.

But if you think the problem is the cache, then why don't you try flushing it? There should be a way to do that, right?

Re: Help With Arm64 MMU Setup

Posted: Tue Nov 19, 2024 8:40 pm
by Octocontrabass
Page table entries have space for you to specify the memory attributes for each page independently. You could set the memory attributes to bypass the cache for some pages.
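As a rough sketch of what that looks like in a stage 1 block or page descriptor: the AttrIndx field (bits [4:2]) picks one of the eight MAIR_EL1 slots, and the SH field (bits [9:8]) sets shareability. The macro and function names below are hypothetical, and which MAIR slot holds the non-cacheable attribute is an assumption about your setup, not something taken from your code.

```c
#include <stdint.h>

/* Stage 1 lower-attribute fields of a block/page descriptor. */
#define PTE_ATTRINDX(i) (((uint64_t)(i) & 0x7) << 2) /* MAIR slot, bits [4:2] */
#define PTE_SH_OUTER    (2ULL << 8)                  /* SH = 0b10             */
#define PTE_SH_INNER    (3ULL << 8)                  /* SH = 0b11             */
#define PTE_AF          (1ULL << 10)                 /* Access flag           */

/* Return desc with its AttrIndx field replaced by idx (0..7);
 * every other bit of the descriptor is left unchanged. */
static inline uint64_t pte_with_attrindx(uint64_t desc, unsigned idx)
{
    return (desc & ~PTE_ATTRINDX(7)) | PTE_ATTRINDX(idx);
}
```

For example, if slot 0 were Write-Back and slot 1 were Non-cacheable, `pte_with_attrindx(desc, 1)` would give the non-cacheable version of the same mapping. The TLB entry for that page has to be invalidated after the descriptor changes.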

There are system instructions for cache maintenance. You could use them to flush the data caches.

Re: Help With Arm64 MMU Setup

Posted: Wed Nov 20, 2024 9:33 am
by CorkiMain
Thanks for the help! I used the DC CVAC instruction to flush the cache for the virtual address that pointed to the stack pointers.
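In case it helps anyone who finds this later, here is a minimal sketch of that kind of fix in C with inline assembly (my kernel does it in plain Arm64 assembly, so treat this as an illustration only). DC CVAC cleans, that is writes back, one cache line by virtual address to the Point of Coherency without invalidating it. The function names are hypothetical, and the line size is read from CTR_EL0.DminLine rather than hard-coded. The inline assembly is only compiled on AArch64 targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest data cache line size in bytes. CTR_EL0.DminLine (bits [19:16])
 * holds log2 of the line size in 4-byte words. */
static inline size_t dcache_line_bytes(uint64_t ctr_el0)
{
    return (size_t)4 << ((ctr_el0 >> 16) & 0xF);
}

/* Round addr down to the start of its cache line (line is a power of two). */
static inline uintptr_t line_floor(uintptr_t addr, size_t line)
{
    return addr & ~((uintptr_t)line - 1);
}

/* Clean every data cache line covering [start, start + len) to the
 * Point of Coherency, then wait for the maintenance to complete. */
void clean_dcache_range(const void *start, size_t len)
{
#if defined(__aarch64__)
    uint64_t ctr;
    __asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
    size_t line = dcache_line_bytes(ctr);
    uintptr_t end = (uintptr_t)start + len;

    for (uintptr_t p = line_floor((uintptr_t)start, line); p < end; p += line)
        __asm__ volatile("dc cvac, %0" : : "r"(p) : "memory");

    __asm__ volatile("dsb sy" : : : "memory"); /* complete before continuing */
#else
    (void)start; /* no-op on other targets so the sketch still builds */
    (void)len;
#endif
}
```

The primary core would call something like this on the memory holding the stack pointers after writing them and before waking the secondary cores.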