invlpg

Post by matthias »

Ok, I have this virtual address (say 0xc00000) of a page I want to invalidate, is this the right way?

Code:

asm("invlpg %0" : "=g" (virtual_addr));
Is this the right way to invalidate the page in the TLB?

This is why I'm asking:

Code:

addr_t* lol2 = (addr_t*)0xa0000000;

	puts("Allocating page\n");
	addr_t* lol = (addr_t*)AllocPage();
	puts("Mapping page\n");
	MapPageIn((addr_t)lol, 0xa0000000, (addr_t)page_directory, 0);
	puts("Clearing page\n");
	ClearPage(lol);

	*lol2  = 0xaabbccdd;

	printf("lol = 0x%08x\n", *lol);

	MapPageOut((addr_t)lol2, (addr_t)page_directory);

	/* fault */
	*lol2  = 0xaabbccdd;
It does not fault.

Code:

addr_t MapPageOut(addr_t virtual, addr_t cr3)
{
	/* assign pointers */
	addr_t* pg_dir = (addr_t*)cr3;
	addr_t* pg_tab;

	/* calculate pde and pte indices */
	size_t pde = virtual / LOOKUP_VALUE_L3;
	size_t pte = (virtual % LOOKUP_VALUE_L3) / PAGE_SIZE;

	/* get pde */
	pg_tab = (addr_t*) (pg_dir[pde] & 0xfffff000);

	/* zero entry */
	pg_tab[pte] = 0;

	/* invlpg address */
	asm("invlpg %0" : "=g" (virtual));

	return pg_tab[pte] & 0xfffff000;
}
Am I missing something here? It does map the page in, but mapping it out fails :?
The source of my problems is in the source.
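For reference, the index arithmetic in MapPageOut can also be written with shifts and masks. This is a minimal sketch that assumes classic 32-bit paging with 4 KiB pages and 1024-entry tables, i.e. that LOOKUP_VALUE_L3 is 0x400000 and PAGE_SIZE is 0x1000. The thread never shows those definitions, and the helper names pde_index and pte_index are made up for illustration.

```c
#include <stdint.h>

/* Assumed layout: 32-bit paging, 4 KiB pages, 1024 entries per table. */
#define PAGE_SHIFT 12u     /* log2(4096)                        */
#define PDE_SHIFT  22u     /* one directory entry covers 4 MiB  */
#define INDEX_MASK 0x3ffu  /* 10 index bits per level           */

/* Index into the page directory for a virtual address. */
static inline uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> PDE_SHIFT;
}

/* Index into the page table selected by that directory entry. */
static inline uint32_t pte_index(uint32_t vaddr)
{
    return (vaddr >> PAGE_SHIFT) & INDEX_MASK;
}
```

Under these assumptions the test address 0xa0000000 falls into directory entry 640, table entry 0.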

Post by gaf »

Shouldn't there be another colon in your inline code? The syntax is asm("statements" : output_registers : input_registers : clobbered_registers).

Code:

asm("invlpg %0" : :  "0" (virtual_address));
cheers,
gaf

Post by matthias »

It does work when I do it my way: if I write to lol2 before the mapping I get a page fault, which is OK. After mapping in I don't get a page fault and I can access the page through both lol and lol2, which is also OK. But when I map the memory out it doesn't seem to invlpg the page (not OK), or does it invlpg the wrong address (i.e. a coding fault)? Anyway, I can still access lol2 even when it is supposed to be mapped out.

Post by matthias »

OK, I've tested my MapPageOut function and it does seem to clear the right entry, so that's fine. The only remaining problem is that the translation is still in the TLB. I will test my code using your previous suggestion ;) Uh, yeah:

Code:

mm.c:154: error: matching constraint references invalid operand number

Post by gaf »

My bad.. I normally use concrete registers in my inline assembly. You'll also have to add parentheses around the operand, as invlpg takes a memory operand and not a plain register.

Code:

asm("invlpg (%0)"
    : /* output registers */
    : /* input  registers */
      "r"(address)
    : /* clobbered registers */
   );
I hope you now understand why the extra colon is needed. If not, you might have a look at this brief tutorial (the section called "Extended inline assembly" should be especially interesting).

regards,
gaf

Post by matthias »

Uh yeah:

Code:

mm.c:153: error: parse error before ')' token
I'll stick to my own code, which I've written according to a tutorial on osdever: http://www.osdever.net/tutorials/gccasm ... ?the_id=68

I'll test it again and now instead of using invlpg I will try reloading cr3 ;)

like this:

http://www.osdev.org/phpBB2/viewtopic.php?p=11354#11354

Post by matthias »

Yep, this works. Well, actually it doesn't -_- It does map the page out, but:

Code:

addr_t MapPageOut(addr_t virtual, addr_t cr3)
{
	/* assign pointers */
	addr_t* pg_dir = (addr_t*)cr3;
	addr_t* pg_tab;

	/* calculate pde and pte indices */
	size_t pde = virtual / LOOKUP_VALUE_L3;
	size_t pte = (virtual % LOOKUP_VALUE_L3) / PAGE_SIZE;

	/* get pde */
	pg_tab = (addr_t*) (pg_dir[pde] & 0xfffff000);

	addr_t ret_addr = pg_tab[pte] & 0xfffff000;

	/* zero entry */
	pg_tab[pte] = 0;

	/* invlpg address */
	//asm("invlpg %0" : "=g" (virtual));

	asm("movl %cr3, %eax");
	asm("movl %eax, %cr3");

	return ret_addr;
}
This function doesn't return the right physical address, but when I use invlpg it does return the right physical address :?

Post by gaf »

Argh.. If gcc doesn't like a blank colon at the end you might of course remove it:

Code:

asm("invlpg (%0)" : :  "r"(address));
I would really suggest that you have another look at either of the tutorials, as you still seem to have some problems with the extended gas syntax. They both explain it the same way, and the tutorial you linked to is actually quite clear about it:

Code:

asm("foo %1,%2,%0" : "=r" (output) : "r" (input1), "r" (input2));
To put it into words: the assembler code is followed by a colon, after which all output operands are listed. You use this section if you want to move the result(s) of your assembler code to a local C variable. It is followed by another colon that starts the list of input operands. Here you enumerate all values that must be loaded before the assembler code runs. After the input section there may be a third section listing all registers that get overwritten by the assembler code and appear in neither the input nor the output section. This list lets gcc know which registers the code clobbers.
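As an illustration of the three sections, here is a small sketch that can be compiled in user space. The function name add_ints is hypothetical, and the plain-C fallback exists only so the example also builds on non-x86 targets. Note the matching constraint "0": it ties an input to output operand 0, so it only makes sense when such an output exists, which is presumably what the "matching constraint references invalid operand number" error earlier in the thread was about.

```c
/* Adds two ints. On x86 the addition goes through extended inline asm. */
static int add_ints(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    int sum;
    __asm__("addl %2, %0"
            : "=r"(sum)       /* outputs: %0 is written by the asm        */
            : "0"(a), "r"(b)  /* inputs: %1 shares %0's register, %2 is b */
            : "cc");          /* clobbers: addl modifies the flags        */
    return sum;
#else
    return a + b;             /* fallback for non-x86 builds */
#endif
}
```

Calling add_ints(40, 2) yields 42 on either path.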
matthias wrote: I'll test it again and now instead of using invlpg I will try reloading cr3
Depending on your processor, rewriting cr3 with the same value might not have any effect at all. If the cpu is smart, it detects that there's no change and omits the rewrite. You can force it to flush the whole TLB by first resetting cr3 to zero before writing the old value to it. In my opinion it's however still nonsense, as there's really no need to flush the whole TLB. Rebuilding the working set afterwards comes at the expense of several hundred TLB misses, making a TLB flush the most expensive operation possible on an x86. Note that we're really talking about several thousand cycles. You do want to avoid it at all costs..

cheers,
gaf

Post by matthias »

Hmm, ok, I didn't intend to use the cr3 method, it was just for testing ;)

Code:

asm("invlpg (%0)" : :  "r"(address));
It works :D But I still have the same problem as with the cr3 method: the address returned by MapPageOut is wrong. With my version of invlpg it did return the right address, but with yours and with the cr3 method it returns something else. Any idea why? I need this address to free the page ;)

It seems to return the virtual address :?

[Image]

Output of the following code:

Code:

	addr_t* lol2 = (addr_t*)0xa0000000;

	puts("Allocating page\n");
	addr_t* lol = (addr_t*)AllocPage();
	printf("0x%08x\nMapping page\n",(addr_t)lol);
	MapPageIn((addr_t)lol, 0xa0000000, (addr_t)page_directory, 3);
	puts("Clearing page\n");
	ClearPage(lol);

	*lol2  = 0xaabbccdd;

	printf("lol = 0x%08x\n", *lol);

	addr_t test = MapPageOut((addr_t)lol2, (addr_t)page_directory);

	printf("test = 0x%08x\n", test);

	malloc(4097);

	/* fault */
	*lol2  = 0xaabbccdd;
Looks like a stack problem?

edit:

I've left out the invlpg command and now it returns the right physical address. I think I know how to solve this, I'll update
Last edited by matthias on Sun Aug 20, 2006 12:56 pm, edited 1 time in total.

Post by gaf »

It's quite easy to explain what goes wrong in your cr3 code:
On the x86 the eax register is used to pass the return value to the caller. The compiler thus keeps ret_addr in this register, so that there's no need to move the variable later on. Your assembler code however also uses eax when it caches cr3, and thus destroys the register's contents. As you haven't included eax in a clobber list, gcc doesn't know about the problem and assumes that ret_addr is still in eax:

Code:

addr_t ret_addr = pg_tab[pte] & 0xfffff000;  // eax = phys address

asm("movl %cr3, %eax");   // eax = cr3
asm("movl %eax, %cr3");

return ret_addr;  // return eax
I'm not quite sure why the invlpg code doesn't work either. If the returned value is really the virtual address, the compiler might for some reason have chosen to use eax ("ret_addr") for the assembly code. I however don't see why it would do that..
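The effect of a clobber list can be tried out in user space. In the sketch below, frame_of is a hypothetical stand-in for the tail of MapPageOut: the asm statement deliberately overwrites eax, and because eax is named as clobbered the compiler has to keep ret_addr somewhere safe. Without the clobber the compiler is free to leave ret_addr in eax, although whether it actually does so depends on its register allocation.

```c
#include <stdint.h>

/* Masks off the flag bits of a page table entry, with an asm statement
 * in between that trashes eax on purpose. */
static uint32_t frame_of(uint32_t entry)
{
    uint32_t ret_addr = entry & 0xfffff000u;
#if defined(__i386__) || defined(__x86_64__)
    /* Extended asm needs %% before register names. The "eax" clobber
     * tells the compiler that the register does not survive. */
    __asm__ volatile("movl $0, %%eax" : : : "eax");
#endif
    return ret_addr;
}
```

For the cr3 reload itself, one option under the same assumptions would be to let gcc pick the scratch register through an "=r" output operand instead of hard-coding eax.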

Maybe one of these works for you:

Code:

asm("invlpg %%eax" : : "a"(address));
asm("invlpg %0" : : "r"(address) : "memory");
asm("invlpg %0" : : "m"(*((uint*)address)));
If they also fail, please post the disassembly of your function ("objdump -d memory.o > disasm.asm" -> search for the function name). It should make it much easier to figure out what exactly goes wrong..

regards,
gaf

Post by matthias »

gaf wrote: I'm not quite sure why the invlpg code doesn't work either. If the returned value is really the virtual address, the compiler might for some reason have chosen to use eax ("ret_addr") for the assembly code. I however don't see why it would do that..
That was my idea as well.

Code:

000001f0 <_MapPageOut>:
 1f0:	55                   	push   %ebp
 1f1:	89 e5                	mov    %esp,%ebp
 1f3:	53                   	push   %ebx
 1f4:	8b 5d 08             	mov    0x8(%ebp),%ebx
 1f7:	8b 45 0c             	mov    0xc(%ebp),%eax
 1fa:	89 d9                	mov    %ebx,%ecx
 1fc:	89 da                	mov    %ebx,%edx
 1fe:	c1 e9 16             	shr    $0x16,%ecx
 201:	81 e2 ff ff 3f 00    	and    $0x3fffff,%edx
 207:	c1 ea 0c             	shr    $0xc,%edx
 20a:	8b 0c 88             	mov    (%eax,%ecx,4),%ecx
 20d:	81 e1 00 f0 ff ff    	and    $0xfffff000,%ecx
 213:	8b 04 91             	mov    (%ecx,%edx,4),%eax
 216:	c7 04 91 00 00 00 00 	movl   $0x0,(%ecx,%edx,4)
 21d:	25 00 f0 ff ff       	and    $0xfffff000,%eax
 222:	0f 01 3b             	invlpg (%ebx)
 225:	5b                   	pop    %ebx
 226:	5d                   	pop    %ebp
 227:	c3                   	ret    
 228:	90                   	nop    
 229:	8d b4 26 00 00 00 00 	lea    0x0(%esi,1),%esi
I'll try those other invlpg commands now ;)

The only one that compiled was:
asm("invlpg %0" : : "m"(*((uint*)address)));

and it did the same thing as before:

Code:

000001f0 <_MapPageOut>:
 1f0:	55                   	push   %ebp
 1f1:	89 e5                	mov    %esp,%ebp
 1f3:	53                   	push   %ebx
 1f4:	8b 5d 08             	mov    0x8(%ebp),%ebx
 1f7:	8b 45 0c             	mov    0xc(%ebp),%eax
 1fa:	89 d9                	mov    %ebx,%ecx
 1fc:	89 da                	mov    %ebx,%edx
 1fe:	c1 e9 16             	shr    $0x16,%ecx
 201:	81 e2 ff ff 3f 00    	and    $0x3fffff,%edx
 207:	c1 ea 0c             	shr    $0xc,%edx
 20a:	8b 0c 88             	mov    (%eax,%ecx,4),%ecx
 20d:	81 e1 00 f0 ff ff    	and    $0xfffff000,%ecx
 213:	8b 04 91             	mov    (%ecx,%edx,4),%eax
 216:	c7 04 91 00 00 00 00 	movl   $0x0,(%ecx,%edx,4)
 21d:	25 00 f0 ff ff       	and    $0xfffff000,%eax
 222:	0f 01 3b             	invlpg (%ebx)
 225:	5b                   	pop    %ebx
 226:	5d                   	pop    %ebp
 227:	c3                   	ret    
 228:	90                   	nop    
 229:	8d b4 26 00 00 00 00 	lea    0x0(%esi,1),%esi

Post by matthias »

Ok, making ret_addr volatile did the trick, but:

Code:

000001f0 <_MapPageOut>:
 1f0:	55                   	push   %ebp
 1f1:	89 e5                	mov    %esp,%ebp
 1f3:	53                   	push   %ebx
 1f4:	51                   	push   %ecx
 1f5:	8b 4d 08             	mov    0x8(%ebp),%ecx
 1f8:	8b 45 0c             	mov    0xc(%ebp),%eax
 1fb:	89 cb                	mov    %ecx,%ebx
 1fd:	89 ca                	mov    %ecx,%edx
 1ff:	c1 eb 16             	shr    $0x16,%ebx
 202:	81 e2 ff ff 3f 00    	and    $0x3fffff,%edx
 208:	c1 ea 0c             	shr    $0xc,%edx
 20b:	8b 1c 98             	mov    (%eax,%ebx,4),%ebx
 20e:	81 e3 00 f0 ff ff    	and    $0xfffff000,%ebx
 214:	8b 04 93             	mov    (%ebx,%edx,4),%eax
 217:	25 00 f0 ff ff       	and    $0xfffff000,%eax
 21c:	89 45 f8             	mov    %eax,0xfffffff8(%ebp)
 21f:	c7 04 93 00 00 00 00 	movl   $0x0,(%ebx,%edx,4)
 226:	0f 01 39             	invlpg (%ecx)
 229:	8b 45 f8             	mov    0xfffffff8(%ebp),%eax
 22c:	5a                   	pop    %edx
 22d:	5b                   	pop    %ebx
 22e:	5d                   	pop    %ebp
 22f:	c3                   	ret
As you can see, it adds a lot of overhead to my function -_-

Post by gaf »

Hmm, I just had a look at the two functions. The only difference I could find is that the latter saves eax to the stack around the invlpg, while the version without volatile doesn't. Although both versions use eax to cache the return value, there actually shouldn't be any need to save the register, as invlpg doesn't seem to overwrite it:

Code:

(with volatile)
mov    %eax,0xfffffff8(%ebp)   // store  ret_addr
movl   $0x0,(%ebx,%edx,4)      // pg_tab[pte] = 0
invlpg (%ecx)                
mov    0xfffffff8(%ebp),%eax   // restore ret_addr
You can also use the "memory" version I posted last time to make sure that eax gets saved around the assembler code. Just add the parentheses that I always seem to forget :). It's probably the better solution, as it will work for all variables and not just those that are declared volatile:

Code:

asm("invlpg (%0)" : : "r"(address) : "memory");
I'm using the very same code on my kernel and will have a closer look at it later today. In case that I find out anything I'll let you know..
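Pulling the pieces of the thread together, a MapPageOut-style helper might look like the sketch below. It is not the poster's final code: unmap_entry, flush_tlb_entry, PTE_FRAME_MASK and the KERNEL_BUILD macro are names invented here, and outside a kernel build the flush is replaced by a counter so that the logic can be exercised in user space. The points it tries to show are reading the old entry before zeroing it, and issuing invlpg with the address in a register plus a "memory" clobber.

```c
#include <stdint.h>

#define PTE_FRAME_MASK 0xfffff000u  /* upper 20 bits hold the frame address */

static unsigned  flush_count;   /* hosted stand-in: number of flushes   */
static uintptr_t last_flushed;  /* hosted stand-in: last flushed address */

/* Invalidate the TLB entry for one virtual address. */
static void flush_tlb_entry(uintptr_t vaddr)
{
#ifdef KERNEL_BUILD  /* hypothetical macro: only a kernel may run invlpg */
    __asm__ volatile("invlpg (%0)" : : "r"(vaddr) : "memory");
#else
    last_flushed = vaddr;
    flush_count++;
#endif
}

/* Clear one page table entry and return the frame it used to map. */
static uint32_t unmap_entry(uint32_t *page_table, uint32_t pte, uintptr_t vaddr)
{
    /* Read the old frame first; after the store below it would be 0. */
    uint32_t old_frame = page_table[pte] & PTE_FRAME_MASK;

    page_table[pte] = 0;     /* mark the page as not present  */
    flush_tlb_entry(vaddr);  /* drop the stale translation    */
    return old_frame;
}
```

The returned frame address is what a caller would hand back to the physical page allocator.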

regards,
gaf

Post by matthias »

In the wiki on mega-tokyo they use:
mega-tokyo wrote: PGFLUSHTLB
Invalidates the TLB (Translation Lookaside Buffer) entry for one specific virtual address: the next memory reference to the page will be forced to re-read the PDE and PTE from main memory. It must be issued every time you update one of those tables. m points to a logical address, not a physical or virtual one: an offset for your ds segment. Note that *m is used, not just m: if you use m here, you invalidate the address of the m variable (not what you want!).

Code:

static __inline__
void pgFlushOneTlb(void *m)
{
  asm volatile("invlpg %0"::"m" (*m));
}
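The difference between m and *m that the wiki text warns about can be observed without invlpg, since lea simply reports the address a memory operand refers to. The sketch below assumes gcc or clang on x86 (with a plain-C fallback elsewhere); ea_of_pointee and ea_of_variable are hypothetical names.

```c
#include <stdint.h>

/* Address the asm sees when the operand is the pointee: "m"(*p). */
static uintptr_t ea_of_pointee(const void *p)
{
    uintptr_t ea;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("lea %1, %0" : "=r"(ea) : "m"(*(const char *)p));
#else
    ea = (uintptr_t)p;   /* fallback: same meaning in plain C */
#endif
    return ea;
}

/* Address the asm sees when the operand is the pointer variable: "m"(p). */
static uintptr_t ea_of_variable(const void *p)
{
    uintptr_t ea;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("lea %1, %0" : "=r"(ea) : "m"(p));
#else
    ea = (uintptr_t)&p;  /* fallback: same meaning in plain C */
#endif
    return ea;
}
```

With "m"(*p) the instruction sees the address stored in p. With "m"(p) it sees the address of the variable p itself, typically a stack slot, which is what an invlpg written that way would invalidate.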