Page 1 of 2
[SOLVED] Unexpected behaviour
Posted: Wed Jun 20, 2012 1:27 am
by nop
Hi to everyone!
This innocent code takes the value of the CR2 register and puts it in the provided variable.
Code: Select all
static inline void store_cr2(u64_t *cr2)
{
bochs_bp();
__asm__ __volatile__(
"movq %%cr2,%%rax\n\t"
:"=A"(*cr2));
}
As you can easily guess this is invoked in my do_page_fault ISR.
What I didn't expect was is that, in long mode, only the low four bytes are copied and the remaining are all set.
To be more explicit if the faulting address is 0xc0010000 what is copied in RAX is 0xFFFFFFFFc0010000.
Just think what happens to my mmap routines when they want to get the PML4 offset of the linear address: in the case above they get 511 rather than 0.
This gave me headaches because I looked elsewhere for days to find where the problem was.
In long mode the CR2 is 64 bits wide but the internal debugger of Bochs shows it 32 bits wide while CPU is running in long mode.
What am I missing?? I'm cross compiling in a 32 bit environment with gcc 4.x
Thank you in advance!
Teo
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 2:04 am
by iansjack
It may just be me, but I find it a bit confusing that you are using "cr2" as both a register name and a variable name. What happens if you use:
Code: Select all
static inline void store_cr2(long *cr2val)
{
__asm__ __volatile__("movq %%cr2,%%rax;"
"movq %%rax, %0;"
:"=m"(*cr2val));
}
instead?
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 3:21 am
by qw
The problem is in the "=A". The destination is somewhere in memory, so it must be "=m" (as in iansjack's code).
BTW this is a bit more efficient:
Code: Select all
static inline uint64_t get_cr2(void)
{
uint64_t rval;
__asm__ __volatile__ ("movq %%cr2, %0" : "=a" (rval));
return rval;
}
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 3:37 am
by gerryg400
You also need to clobber something to stop the function from being re-ordered when its called, like this
Code: Select all
static __inline__ unsigned long rd_cr2() {
unsigned long rval;
__asm__ __volatile__ ("mov %%cr2, %0" : "=r" (rval) : : "memory");
return rval;
}
This one works in both 32bit and 64bit programmes.
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 4:41 am
by qw
Gerryg400, in this case that is not necessary, because the asm instruction has side effects: "=r" (rval).
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 4:44 am
by gerryg400
Ooops, yes, true.
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 6:25 pm
by jbemmel
You are seeing canonical 64-bit addresses:
http://en.wikipedia.org/wiki/X86-64#Can ... _addresses
x86-64 mode currently does not use all 64 available bits; the upper bits are required to be set to 1 (in order to keep OS designers from using them for internal purposes, hindering future expansion)
Re: Unexpected behaviour
Posted: Wed Jun 20, 2012 11:52 pm
by nop
Thanks for all your prompt replies!
Hobbes wrote:The problem is in the "=A". The destination is somewhere in memory, so it must be "=m" (as in iansjack's code).
BTW this is a bit more efficient:
Code: Select all
static inline uint64_t get_cr2(void)
{
uint64_t rval;
__asm__ __volatile__ ("movq %%cr2, %0" : "=a" (rval));
return rval;
}
This is exactly my first implementation (if I'm not wrong it's also in the asm inline example of the wiki), the code I posted in the opening of the thread was a (wrong) variant - I was exploring the problem looking for a solution: the "=A" was because I wanted to make sure to use a 64 bits register.
jbemmel wrote:You are seeing canonical 64-bit addresses:
http://en.wikipedia.org/wiki/X86-64#Can ... _addresses
x86-64 mode currently does not use all 64 available bits; the upper bits are required to be set to 1 (in order to keep OS designers from using them for internal purposes, hindering future expansion)
I thought the same, but actually I think this isn't the case: the faulting address doesn't have the 47th bit set, is in the lower half; however I'm thinking of a quirk that I'm missing related to this
To recap: the following code
Code: Select all
static inline u64_t get_cr2()
{
u64_t val = 0;
bochs_bp(); // Bochs magic breakpoint
__asm__ __volatile__(
"nop\n\t"
"movq %%cr2,%0\n\t"
"nop\n\t"
:"=r"(val));
return val;
}
moves the value 0xFFFFFFFFc0010000 to the destination register chosen (R12 in my case) when CR2 contains 0xc0010000.
Is this a standard behaviour? I don't know what to think, both bochs and qemu behave the same way and my code fails to track the page directory table of the faulting address to make the proper checks - I have to dump physical address (puke)
What happens in your long mode OS when you pagefault in the lower half?
In any way I'm taking again the Intel manuals in my hands - if it is a standard behaviour (as it probably is) shame on me for having missed this for all that time
Thanks again!
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 12:37 am
by iansjack
Look at the code that is actually generated by the variants. That way you can see what is really happening and which is the most efficient implementation. Personally I do this sort of thing as an assembler routine rather than using inline assembler. I can assure you that "mov %cr2, %rax" does exactly what you would expect it to.
It might also help to single-step the assembler code in a debugger to see exactly what is happeing to the various memory locations and registers. This will give a better understanding of the result you are getting.
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 1:32 am
by qw
nop wrote:This is exactly my first implementation (if I'm not wrong it's also in the asm inline example of the wiki), the code I posted in the opening of the thread was a (wrong) variant - I was exploring the problem looking for a solution: the "=A" was because I wanted to make sure to use a 64 bits register.
"=A" does not mean RAX but RDX:RAX (or EDX:EAX in 32-bits code). If you want to force the use of RAX, use "=a" instead of "=r". However, for optimization, you'd better leave the decision what register is used to the compiler.
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 1:58 am
by iansjack
Just as a matter of interest, here is the code (omitting the "bochs_bp()" call) generated by the OP version:
Code: Select all
000000000040049c <store_cr2>:
40049c: 55 push %rbp
40049d: 48 89 e5 mov %rsp,%rbp
4004a0: 48 89 7d f8 mov %rdi,-0x8(%rbp)
4004a4: 0f 20 d0 mov %cr2,%rax
4004a7: 48 8b 45 f8 mov -0x8(%rbp),%rax
4004ab: 48 89 10 mov %rdx,(%rax)
4004ae: 5d pop %rbp
4004af: c3
and here is the version produced by the version I suggested:
Code: Select all
000000000040049c <store_cr2>:
40049c: 55 push %rbp
40049d: 48 89 e5 mov %rsp,%rbp
4004a0: 48 89 7d f8 mov %rdi,-0x8(%rbp)
4004a4: 48 8b 45 f8 mov -0x8(%rbp),%rax
4004a8: 0f 20 d0 mov %cr2,%rax
4004ab: 48 89 00 mov %rax,(%rax)
4004ae: 5d pop %rbp
4004af: c3
The original version is ignoring what it read from %cr2 and instead is putting the value of %rdx, on entry to the routine, into the input parameter which is not what is required. (You'd have to inspect the code at a deeper level to know what this value actually is.) My version puts the value of %cr2 into the parameter which is what you want.
Moral - if inline assembler isn't working as intended inspect the generated code.
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 3:26 am
by gerryg400
nop wrote:What happens in your long mode OS when you pagefault in the lower half?
In my OS cr2 contains the expected value (i.e. 0xc0010000). It really looks like you are moving cr2 to the EAX register and that it's being sign extended. As iansjack said look at the asm.
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 5:03 am
by iansjack
Actually, examining the code more carefully, my example doesn't give the correct result either.
It should be:
Code: Select all
static inline void store_cr2(long *cr2val)
{
__asm__ __volatile__("movq %%cr2,%%rax;"
"movq %%rax, %0;"
:"=m"(*cr2val)
:
:"rax");
}
producing the assembler
Code: Select all
000000000040049c <store_cr2>:
40049c: 55 push %rbp
40049d: 48 89 e5 mov %rsp,%rbp
4004a0: 48 89 7d f8 mov %rdi,-0x8(%rbp)
4004a4: 48 8b 55 f8 mov -0x8(%rbp),%rdx
4004a8: 0f 20 d0 mov %cr2,%rax
4004ab: 48 89 02 mov %rax,(%rdx)
4004ae: 5d pop %rbp
4004af: c3 retq
which, at last, gives the desired result.
Morals, and red faces, all round!
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 6:33 am
by Combuster
Hobbes wrote:However, for optimization, you'd better leave the decision what register is used to the compiler.
For optimisation, return the value instead of writing an argument by reference. Eliminates bad pointer possibilities for free as well.
Re: Unexpected behaviour
Posted: Thu Jun 21, 2012 6:44 am
by iansjack
I agree with your second point, but optimisation? (Not that it was you who raised the subject of optimisation.) Why worry about optimisation in a function like this that is only going to be called occasionally. In fact, why make it inline. I would just write a simple assembler function:
and leave it at that. You may have the overhead of a function call (but you've eliminated a lot of unnecessary instructions), but so what? You'll declare the function in a header file as returning a long value, so the compiler knows that %rax is going to be clobbered.