ARM coarse table; sub-pages disabled; [SOLVED]

Pancakes
Member
Posts: 75
Joined: Mon Mar 19, 2012 1:52 pm

ARM coarse table; sub-pages disabled; [SOLVED]

Post by Pancakes »

OK, I am sorry guys. I have spent the last few days trying to figure out what in the world is going on, and the results have only left me more confused. I think I have tried just about every bit. Here is what I have going on.



I create a first-level table and identity-map it.

Code: Select all

	for (x = 0; x < 1024; ++x) {
		ktlb[x] = (x << 20) | TLB_AP_PRIVACCESS | TLB_SECTION;
	}
Basically, (x << 20) | 0xc02.
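For anyone following along, here is a minimal sketch of how that 0xc02 breaks down. The two macro values are my assumption, inferred only from the 0xc02 above (the real definitions are not shown in the post), and `section_identity` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the 0xc02 in the post. */
#define TLB_SECTION        0x002u  /* descriptor type, bits [1:0] = 0b10 */
#define TLB_AP_PRIVACCESS  0xc00u  /* AP field, bits [11:10] = 0b11 */

/* Build a first-level section descriptor that identity-maps megabyte `mb`. */
static uint32_t section_identity(uint32_t mb)
{
    assert(mb < 4096);  /* 4096 one-megabyte sections cover the 4GB space */
    return (mb << 20) | TLB_AP_PRIVACCESS | TLB_SECTION;
}
```

So the entry for the megabyte at 0x300000 would come out as 0x300c02: the base address in the top 12 bits and the type and permission bits in the low 12.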

That works great. No problems.

I decide well let me try a coarse table. This allows the electronic monkeys to come out.

Code: Select all

	ktlb[3] = (uint32)tlbsub | TLB_COARSE;
That appears to work. It knows I am using a coarse table 1KB in size.

Code: Select all

	for (x = 0; x < 4096; ++x)
		tlbsub[x] = 2 | (32 << 16) | 0x550;
Looks good. It should be mapping 4KB pages. At 1KB, the table has 256 4-byte entries.
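As a rough sketch of how a virtual address picks its entries (helper names are hypothetical, not from my kernel): the top 12 bits select the first-level entry, and the next 8 bits select one of the 256 coarse-table entries.

```c
#include <assert.h>
#include <stdint.h>

#define COARSE_ENTRIES 256u  /* 1KB table / 4 bytes per entry */

/* Index into the first-level table: one entry per 1MB. */
static uint32_t l1_index(uint32_t va)
{
    return va >> 20;
}

/* Index into the coarse table: one entry per 4KB inside that 1MB. */
static uint32_t l2_index(uint32_t va)
{
    uint32_t idx = (va >> 12) & 0xffu;
    assert(idx < COARSE_ENTRIES);
    return idx;
}
```

Since each entry covers 4KB, 256 entries cover 256 * 4KB = 1MB, which is exactly the region behind one first-level entry such as ktlb[3].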

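Before I turn on paging, I do this as a debugging aid. Every word in the region holds its own physical address, so a read through any mapping tells me where it landed.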

Code: Select all

	a = (uint32*)(32 << 16);
	for (x = 0; x < 1024 * 1024; ++x) {
		a[x] = (uintptr)&a[x];
	}

Now I turn on paging with sub-pages disabled.

Code: Select all

	arm4_tlbsetmode(2);
	arm4_tlbset1((uintptr)utlb);
	arm4_tlbset0((uintptr)ktlb);
	/* set that all domains are checked against the TLB entry access permissions */
	arm4_tlbsetdom(0x55555555);
	/* enable the MMU (0x1) and disable subpages (0x800000) */
	arm4_tlbsetctrl(arm4_tlbgetctrl() | 0x1 | (1 << 23));
I have tried arm4_tlbsetmode(0) and arm4_tlbsetmode(1) just FYI. Just checked.
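To make the magic numbers above easier to check, here is a small sketch of how they can be built up. The macro and function names are hypothetical; the bit positions are the ones I understand from the ARMv6 control register (M is bit 0, XP is bit 23) and the domain register (2 bits per domain, 0b01 meaning "client", i.e. checked against the entry's permissions).

```c
#include <assert.h>
#include <stdint.h>

#define CTRL_MMU_ENABLE (1u << 0)   /* M bit */
#define CTRL_XP         (1u << 23)  /* XP bit: subpage AP bits disabled */

enum { DOMAIN_NO_ACCESS = 0, DOMAIN_CLIENT = 1, DOMAIN_MANAGER = 3 };

/* Put the same 2-bit access mode into all 16 domain fields. */
static uint32_t dacr_all(uint32_t mode)
{
    assert(mode <= 3);
    uint32_t v = 0;
    for (int d = 0; d < 16; ++d)
        v |= mode << (2 * d);
    return v;
}
```

With these, dacr_all(DOMAIN_CLIENT) gives the 0x55555555 passed to arm4_tlbsetdom, and CTRL_MMU_ENABLE | CTRL_XP is the 0x1 | (1 << 23) ORed into the control register.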

Now, I should have 4KB pages mapped... monkeys say otherwise.

Here is a debugging loop.

Code: Select all

	for (x = 0; x < 256; ++x) {
		b = (uint32*)0x300000 + x * 1024;
		ksprintf(buf, "small[%x]:%x\n", x, b[0]);
		kserdbg_puts(buf);
	}
You would expect:

Code: Select all

small[0]:0x200000
small[1]:0x200400
small[2]:0x200800
small[3]:0x200c00
small[4]:0x200000
small[5]:0x200400
small[6]:0x200800
small[7]:0x200c00
Wrong... I get:

Code: Select all

small[0]:0x200000
small[1]:0x200000
small[2]:0x200000
small[3]:0x200000
small[4]:0x200000
small[5]:0x200000
small[6]:0x200000
small[7]:0x200000
It seems to have mapped 1KB pages.. Well don't worry the monkeys in my head are at it again. It gets worse.

Let's try a slightly different test. Notice the + 4 at the end of the assignment to b.

Code: Select all

	for (x = 0; x < 256; ++x) {
		b = (uint32*)0x300000 + x * 1024 + 4;
		ksprintf(buf, "small[%x]:%x\n", x, b[0]);
		kserdbg_puts(buf);
	}

Code: Select all

small[0]:0x200010
small[1]:0x200010
small[2]:0x200010
small[3]:0x200010
small[4]:0x200010
small[5]:0x200010
small[6]:0x200010
small[7]:0x200010
If I add 8 I get 0x20.. and so forth... I have tried all sorts of bits. The only thing I can accomplish is getting 1KB pages mapped and then I get this crazy 0x10 physical offset for each 0x4 bytes of virtual offset. Not to mention I only have 256 entries so I can not map the entire 1MB region.

So I got two problems. Why am I getting that odd offset and why am I unable to map 4KB pages instead of 1KB pages?

I must be setting a bit somewhere wrong, but I can not seem to find it.
Last edited by Pancakes on Sat Mar 29, 2014 4:28 am, edited 1 time in total.

Re: ARM coarse table; sub-pages disabled; im freaking confus

Post by Pancakes »

Let me update with some links to help anyone who might spot my problem. I know many of you might not have done this before, but some of you are really smart and might figure it out in like 5 minutes, which would be great.

First Level Table: https://www.scss.tcd.ie/~waldroj/3d1/ar ... f#page=727

Second Level Table: https://www.scss.tcd.ie/~waldroj/3d1/ar ... f#page=731

Registers: https://www.scss.tcd.ie/~waldroj/3d1/ar ... f#page=739

Start Of VMSA: https://www.scss.tcd.ie/~waldroj/3d1/ar ... f#page=701

Re: ARM coarse table; sub-pages disabled; im freaking confus

Post by Pancakes »

Update:

I figured out the 0x10 offset. I feel like an idiot.

Code: Select all

    b = (uint32*)0x300000 + x * 1024 + 4;    /* WRONG */
    b = (uint32*)(0x300000 + x * 1024 + 4);  /* CORRECT */
I can not believe I was doing that. Anyway, maybe I will figure out the coarse table problem next!!
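For anyone who hits the same thing, here is a small sketch of why the cast position matters. Adding n to a `uint32_t *` moves it by n elements, which is n * 4 bytes. The function names are hypothetical and only model the byte offsets; the last one measures the effect on an ordinary array so it can be run without an MMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset produced by `(uint32_t *)base + n`: n counts elements. */
static size_t offset_cast_first(size_t n)
{
    return n * sizeof(uint32_t);
}

/* Byte offset produced by `(uint32_t *)(base + n)`: n counts bytes. */
static size_t offset_cast_last(size_t n)
{
    return n;
}

/* Same effect measured on a real array. */
static size_t measured_step(void)
{
    static uint32_t words[8];
    assert(sizeof(uint32_t) == 4);
    uint32_t *p = words + 4;  /* "+ 4" applied to a uint32_t pointer */
    return (size_t)((char *)p - (char *)words);
}
```

That lines up with both symptoms: "+ 4" became 16 bytes (the 0x10 offset), and "x * 1024" became a 4096-byte step, so each iteration of the first test read the first word of the next 4KB page. Since every entry mapped the same physical page, every read came back as 0x200000.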

Re: ARM coarse table; sub-pages disabled; im freaking confus

Post by Pancakes »

Oh... I fear the monkeys have gone to get bananas..

Apparently everything is working..

Code: Select all

	for (x = 0; x < 256; ++x) {
		tlbsub[x] = 2 | (16 << 16) | 0x550;
	}
	
	tlbsub[1] = 2 | (0x101000) | 0x550;
Output:

Code: Select all

/* tlbsub[0] ... mapped 4KB page at 0x100000 */
small[00]=0x100000
small[04]=0x100400
small[08]=0x100800
small[0c]=0x100c00
/* tlbsub[1] ... mapped 4KB page at 0x101000 */
small[10]=0x101000
small[14]=0x101400
small[18]=0x101800
small[1c]=0x101c00
/* .... */
small[20]=0x100000
small[24]=0x100400
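That output is what a plain 4KB small-page walk should give. Below is a minimal software model of it, as a sketch only: the function names are hypothetical, the 0x550 flags are copied from my code and treated as opaque, and a return value of 0 is used here to mean "fault or not a small page".

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE 2u
#define FLAGS      0x550u  /* copied from the post, treated as opaque */

/* Translate `va` through a 256-entry coarse table; 0 means fault here. */
static uint32_t translate_small(const uint32_t *table, uint32_t va)
{
    assert(table != 0);
    uint32_t desc = table[(va >> 12) & 0xffu];
    if ((desc & 2u) == 0)
        return 0;  /* fault or large page: not handled in this sketch */
    return (desc & 0xfffff000u) | (va & 0xfffu);
}

/* Rebuild the table from the post, then translate one address. */
static uint32_t lookup_in_post_table(uint32_t va)
{
    uint32_t table[256];
    for (unsigned x = 0; x < 256; ++x)
        table[x] = SMALL_PAGE | (16u << 16) | FLAGS;
    table[1] = SMALL_PAGE | 0x101000u | FLAGS;
    return translate_small(table, va);
}
```

For example, 0x301c00 goes through tlbsub[1] and lands on 0x101c00, while 0x302000 goes through tlbsub[2] and wraps back to 0x100000, the same as the small[1c] and small[20] lines.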
Well, I still have a problem, because I can not seem to get 64K pages (large pages) to work.

Code: Select all

tlbsub[1] = 1 | (0x101000) | 0x550;
I changed the 2 to a 1 which should specify a large page (64KB)... oddly it changes nothing. It acts the same no matter if I use a 1 or a 2.

I will just reply when I figure it out, LOL

Re: ARM coarse table; sub-pages disabled; im freaking confus

Post by Pancakes »

When using a 64K entry versus a 4K entry, the only benefits I see are:
  • It is easier to map in 64K of memory because you just duplicate the same entry 16 times in a tight loop (no need to continually offset by 0x1000 each cycle of the loop)
  • Could kind of provide an indicator of a continuous 64K of memory, but you could easily mess that up by say changing the n-th entry to a 4K mapping entry.
Maybe I am missing something though...

For example. Straight from the source code of QEMU:

http://git.qemu.org/?p=qemu.git;a=blob; ... 13;hb=HEAD

Code: Select all

        /* Lookup l2 entry.  */
        table = (desc & 0xfffffc00) | ((address >> 10) & 0x3fc);
        desc = ldl_phys(cs->as, table);
        ap = ((desc >> 4) & 3) | ((desc >> 7) & 4);
        switch (desc & 3) {
        case 0: /* Page translation fault.  */
            code = 7;
            goto do_fault;
        case 1: /* 64k page.  */
            phys_addr = (desc & 0xffff0000) | (address & 0xffff);
            xn = desc & (1 << 15);
            *page_size = 0x10000;
            break;
        case 2: case 3: /* 4k page.  */
            phys_addr = (desc & 0xfffff000) | (address & 0xfff);
            xn = desc & 1;
            *page_size = 0x1000;
            break;
        default:
            /* Never happens, but compiler isn't smart enough to tell.  */
            abort();
        }
        code = 15;
    }
It selects the entry in the table with table = (desc & 0xfffffc00) | ((address >> 10) & 0x3fc). The index always has 4K granularity and does not depend on whether an entry is 64K, so you have to populate all 16 entries just like the ARM manual specifies.
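Here is a sketch of what that replication looks like in practice. The helper names (`map_large`, `translate_large`, `demo_large`) are hypothetical, and the attribute bits are left as a caller-supplied `flags` value (0 in the demo) because their layout for large pages is not something I want to guess at here. The translation mirrors the "64k page" case of the QEMU code above.

```c
#include <assert.h>
#include <stdint.h>

#define LARGE_PAGE 1u

/* Write the 16 identical descriptors that one 64KB large page needs. */
static void map_large(uint32_t table[256], uint32_t va, uint32_t pa,
                      uint32_t flags)
{
    assert((va & 0xffffu) == 0 && (pa & 0xffffu) == 0);  /* 64KB aligned */
    uint32_t first = (va >> 12) & 0xffu;  /* a multiple of 16 */
    for (uint32_t i = 0; i < 16; ++i)
        table[first + i] = (pa & 0xffff0000u) | flags | LARGE_PAGE;
}

/* Same lookup as the quoted walk, reduced to the large-page case. */
static uint32_t translate_large(const uint32_t *table, uint32_t va)
{
    uint32_t desc = table[(va >> 12) & 0xffu];
    if ((desc & 3u) != LARGE_PAGE)
        return 0;  /* 0 means "no large page here" in this sketch */
    return (desc & 0xffff0000u) | (va & 0xffffu);
}

/* Map one large page at 0x310000 -> 0x120000, then translate `va`. */
static uint32_t demo_large(uint32_t va)
{
    uint32_t table[256] = {0};
    map_large(table, 0x310000u, 0x120000u, 0);
    return translate_large(table, va);
}
```

Any address in 0x310000..0x31ffff hits one of the 16 copies and gets the same 64KB base, with the low 16 bits of the address as the offset. If one of the copies were missing, a software walker like the one above would simply fault for that 4KB slice.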

I want to get inside the head of the designers and understand why. I know there is a reason. Nobody spends millions of dollars to make it work a certain way just because the monkeys came out and decided to play.

Re: ARM coarse table; sub-pages disabled; im freaking confus

Post by Pancakes »

What was confusing me is how QEMU was handling the situation.

It appears that real hardware may enforce the 16-entry replication in order to reduce memory accesses and the space used in the TLB. QEMU does not enforce this and will happily work with blank entries inside the 64K mapping, or even with 4K mappings mixed in.

But I am still waiting to confirm this. QEMU is what was confusing me, because it did not enforce the replication. Once I wrapped my head around it, with some help from someone, it made sense that QEMU skips the check for performance reasons and that real hardware could behave differently.
Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: ARM coarse table; sub-pages disabled; [SOLVED]

Post by Owen »

Real hardware won't enforce it, but it will assume it.

You are right on with your TLB utilisation (and faster filling) guess. Hardware isn't required to do anything special with large pages... But it may, and probably does.

Re: ARM coarse table; sub-pages disabled; [SOLVED]

Post by Pancakes »

Thanks for your input, Owen. It helps a lot to get some affirmation. Anyone else, feel free to put in your 2 cents.