
ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Mon Mar 24, 2014 10:48 pm
by Pancakes
I got TTBR0 and TTBR1 working, but my question is more of a what if.

The documentation states TTBR0 should be for user space and TTBR1 for kernel space basically.

My question is basically: what if I wanted to use TTBR0 for kernel space and TTBR1 for user space? I cannot seem to find the pros and cons of doing so.

The first mode (N = 1) lets bit 31 of the virtual address determine which page table is used: TTBR0 if bit 31 is zero and TTBR1 if bit 31 is one. This sounds simple enough. It splits the address space into 2GB for kernel space and 2GB for user space.
For N = 2; if VA[31:30] == 0b00 use TTBR0, otherwise use TTBR1
Now, suppose I use mode 2 (N = 2). If bits 31:30 are both zero it uses TTBR0, and if they are anything other than zero then it uses TTBR1. This means TTBR1 covers 3GB of the address space and TTBR0 covers 1GB.
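To make the selection rule concrete, here is a minimal sketch in C, assuming the short-descriptor format where N is the 3-bit TTBCR.N field (0..7). The function name `ttbr_for_va` is made up for illustration; it only models the rule quoted above and does not touch any hardware register.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 0 if a walk for `va` would start from
 * TTBR0 and 1 if it would start from TTBR1, for a given TTBCR.N.
 * With N == 0 every address uses TTBR0. Otherwise TTBR0 is used only
 * when the top N bits of the address, VA[31:32-N], are all zero.
 */
static int ttbr_for_va(uint32_t va, unsigned n)
{
    assert(n <= 7);              /* N is a 3-bit field */
    if (n == 0)
        return 0;                /* TTBR1 is never selected */
    return (va >> (32 - n)) != 0 ? 1 : 0;
}
```

With N = 1 the split is at 0x80000000 (2GB each). With N = 2 the split is at 0x40000000, so TTBR0 covers the low 1GB and TTBR1 covers the upper 3GB.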

My problem is that this seems odd: are we giving kernel space 3GB and user space 1GB? That is what happens if we go by the docs and use TTBR0 for user space:
The expected use for these two page table base registers is for TTBR1 to be used for operating system and I/O addresses. These do not change on context switches. TTBR0 is used for process specific addresses with each process maintaining a separate first level page table. On a context switch TTBR0 and the ContextID register are modified.
See, it says TTBR1 is to be used for the operating system, but with mode 2 we just gave TTBR1 the ability to map 3GB.

My point is I want to use TTBR1 for user space and TTBR0 for kernel space, but why do the docs seem to recommend to use TTBR1 for kernel space despite it having the ability to map 3GB?

Also, I am getting confused by this boundary issue. It seems that changing the value of the N field not only affects which TTBR is used but also the boundary for the table. What in the world is going on? I bet there is a piece of this puzzle I am missing...

<edit>
I think I understand the boundary issue now, but the docs recommending TTBR0 for user space (process specific) is confusing when it honestly seems like they wrote it backwards.
</edit>

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Mon Mar 24, 2014 10:49 pm
by Pancakes
Here is a link to the datasheet and the section if that helps anyone:
https://www.scss.tcd.ie/~waldroj/3d1/ar ... f#page=741

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Mon Mar 24, 2014 11:24 pm
by thepowersgang
You _could_ use the lower half of the address space for kernel space, but then you lose the advantage of a higher-half kernel.

From what I could glean, the idea of having two base registers is that TTBR0 acts as an "overlay" to TTBR1. If an accessed address falls outside the size of the area pointed to by TTBR0, then the lookup will be done in TTBR1. The idea of allowing the user space to be smaller than kernel space is to reduce the cost of allocating a user address space: instead of having to allocate 2 consecutive pages, you only need to allocate one (at the cost of having the user address space be 1GB).

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Tue Mar 25, 2014 7:53 am
by Owen
One common optimization is to have a small TTBR0 by default (say, 64MB) inlined into the process structure. You save allocating a full page table for small processes (and many processes are small).
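As a rough sketch of why a small TTBR0 is cheap, assuming the short-descriptor format (4-byte first-level entries, each covering 1MB, 16KB for a full table): the region TTBR0 covers and the size of its first-level table both shrink by half for each step of N. The helper names below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes of address space translated via TTBR0. */
static uint64_t ttbr0_region_bytes(unsigned n)
{
    assert(n <= 7);                      /* N is a 3-bit field */
    return (uint64_t)1 << (32 - n);      /* 4GB >> N */
}

/* Hypothetical helper: size of the TTBR0 first-level table in bytes. */
static uint32_t ttbr0_table_bytes(unsigned n)
{
    assert(n <= 7);
    return 16384u >> n;                  /* 16KB >> N */
}
```

Under these assumptions N = 1 gives a 2GB region with an 8KB table (two 4KB pages), N = 2 gives 1GB with a 4KB table (one page), and N = 6 gives 64MB with a 256-byte table, which is small enough to embed in a per-process structure.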

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Tue Mar 25, 2014 8:38 am
by Pancakes
Wow, thanks for the replies guys. Really appreciate the help on this. You all have given some very helpful answers.

If anyone else knows anything feel free to put in your 2 cents even if it is mostly opinion. I will compile the information and add it to the wiki so someone else can find it and not be confused like I was.

I took a look at my Linux box and counted the processes. I included some of the kernel threads and came up with a total of 107. So I multiplied that by 16KB and got 1.67MB. Now, 44 had brackets and no VSZ, so I just subtracted those out and have roughly 1MB. About 40 were under 64MB VSZ. So I come up with roughly 442KB.

That is not bad. So basically, if I were using the full-sized first level page table with 1MB sections I would consume 1.67MB in page table structures for the processes, and if I used a 64MB address space for anything under 64MB VSZ and a 16KB structure for anything over I would consume 442KB.

So 1670KB versus 442KB is a significant difference. It could actually be lower than that because some processes used even less than 64MB of VSZ.

Now, does anyone know of any performance reasons why using TTBR1 (full table) for user space might be a problem? I am just asking in case anyone out there might know.

You guys came up with some really good reasons. I think a second pair of eyes really helps!

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Tue Mar 25, 2014 6:42 pm
by Owen
Using TTBR1 for user space entails using TTBR0 for the kernel, which means you have a lower half kernel with all the attendant disadvantages.

Re: ARM TTBR0 and TTBR1.. BUT using them swapped OR .....

Posted: Tue Mar 25, 2014 9:04 pm
by Pancakes
Right, it seems NUMA, user applications not dependent on how much memory is in kernel space, and 32-bit user space code on a 64-bit CPU are going to be my main concerns. In that case let me try to do a higher half kernel.