[SOLVED] Unexpected behaviour

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

[SOLVED] Unexpected behaviour

Post by nop »

Hi to everyone!

This innocent code takes the value of the CR2 register and puts it in the provided variable.

Code: Select all

static inline void store_cr2(u64_t *cr2)
{
    bochs_bp();
    __asm__ __volatile__(
            "movq %%cr2,%%rax\n\t"
            :"=A"(*cr2));
}

As you can easily guess this is invoked in my do_page_fault ISR.
What I didn't expect was is that, in long mode, only the low four bytes are copied and the remaining are all set.

To be more explicit if the faulting address is 0xc0010000 what is copied in RAX is 0xFFFFFFFFc0010000.
Just think what happens to my mmap routines when they want to get the PML4 offset of the linear address: in the case above they get 511 rather than 0.
This gave me headaches because I looked elsewhere for days to find where the problem was.
In long mode the CR2 is 64 bits wide but the internal debugger of Bochs shows it 32 bits wide while CPU is running in long mode.
What am I missing?? I'm cross compiling in a 32 bit environment with gcc 4.x

Thank you in advance!

Teo
Last edited by nop on Sat Jun 23, 2012 12:00 pm, edited 1 time in total.
OS development is the intelligent alternative to drugs
User avatar
iansjack
Member
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Unexpected behaviour

Post by iansjack »

It may just be me, but I find it a bit confusing that you are using "cr2" as both a register name and a variable name. What happens if you use:

Code: Select all

static inline void store_cr2(long *cr2val)
{
 	__asm__ __volatile__("movq %%cr2,%%rax;"
            "movq %%rax, %0;"
            :"=m"(*cr2val));
}
instead?
User avatar
qw
Member
Member
Posts: 792
Joined: Mon Jan 26, 2009 2:48 am

Re: Unexpected behaviour

Post by qw »

The problem is in the "=A". The destination is somewhere in memory, so it must be "=m" (as in iansjack's code).

BTW this is a bit more efficient:

Code: Select all

static inline uint64_t get_cr2(void)
{
    uint64_t rval;
    __asm__ __volatile__ ("movq %%cr2, %0" : "=a" (rval));
    return rval;
}
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Unexpected behaviour

Post by gerryg400 »

You also need to clobber something to stop the function from being re-ordered when its called, like this

Code: Select all

static __inline__ unsigned long rd_cr2() {

    unsigned long rval;
    __asm__ __volatile__ ("mov %%cr2, %0" : "=r" (rval) : : "memory");
    return rval;
}
This one works in both 32bit and 64bit programmes.
If a trainstation is where trains stop, what is a workstation ?
User avatar
qw
Member
Member
Posts: 792
Joined: Mon Jan 26, 2009 2:48 am

Re: Unexpected behaviour

Post by qw »

Gerryg400, in this case that is not necessary, because the asm instruction has side effects: "=r" (rval).
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Unexpected behaviour

Post by gerryg400 »

Ooops, yes, true.
If a trainstation is where trains stop, what is a workstation ?
jbemmel
Member
Member
Posts: 53
Joined: Fri May 11, 2012 11:54 am

Re: Unexpected behaviour

Post by jbemmel »

You are seeing canonical 64-bit addresses: http://en.wikipedia.org/wiki/X86-64#Can ... _addresses

x86-64 mode currently does not use all 64 available bits; the upper bits are required to be set to 1 (in order to keep OS designers from using them for internal purposes, hindering future expansion)
nop
Posts: 20
Joined: Sat Jan 07, 2012 7:42 am
Location: Italy

Re: Unexpected behaviour

Post by nop »

Thanks for all your prompt replies!
Hobbes wrote:The problem is in the "=A". The destination is somewhere in memory, so it must be "=m" (as in iansjack's code).

BTW this is a bit more efficient:

Code: Select all

static inline uint64_t get_cr2(void)
{
    uint64_t rval;
    __asm__ __volatile__ ("movq %%cr2, %0" : "=a" (rval));
    return rval;
}
This is exactly my first implementation (if I'm not wrong it's also in the asm inline example of the wiki), the code I posted in the opening of the thread was a (wrong) variant - I was exploring the problem looking for a solution: the "=A" was because I wanted to make sure to use a 64 bits register.
jbemmel wrote:You are seeing canonical 64-bit addresses: http://en.wikipedia.org/wiki/X86-64#Can ... _addresses

x86-64 mode currently does not use all 64 available bits; the upper bits are required to be set to 1 (in order to keep OS designers from using them for internal purposes, hindering future expansion)
I thought the same, but actually I think this isn't the case: the faulting address doesn't have the 47th bit set, is in the lower half; however I'm thinking of a quirk that I'm missing related to this

To recap: the following code

Code: Select all

static inline u64_t get_cr2()
{
	u64_t val = 0;

	bochs_bp(); // Bochs magic breakpoint
	__asm__ __volatile__(
			"nop\n\t"
			"movq %%cr2,%0\n\t"
			"nop\n\t"
			:"=r"(val));
	return val;
}
moves the value 0xFFFFFFFFc0010000 to the destination register chosen (R12 in my case) when CR2 contains 0xc0010000.
Is this a standard behaviour? I don't know what to think, both bochs and qemu behave the same way and my code fails to track the page directory table of the faulting address to make the proper checks - I have to dump physical address (puke)
What happens in your long mode OS when you pagefault in the lower half?
In any way I'm taking again the Intel manuals in my hands - if it is a standard behaviour (as it probably is) shame on me for having missed this for all that time

Thanks again!
OS development is the intelligent alternative to drugs
User avatar
iansjack
Member
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Unexpected behaviour

Post by iansjack »

Look at the code that is actually generated by the variants. That way you can see what is really happening and which is the most efficient implementation. Personally I do this sort of thing as an assembler routine rather than using inline assembler. I can assure you that "mov %cr2, %rax" does exactly what you would expect it to.

It might also help to single-step the assembler code in a debugger to see exactly what is happeing to the various memory locations and registers. This will give a better understanding of the result you are getting.
User avatar
qw
Member
Member
Posts: 792
Joined: Mon Jan 26, 2009 2:48 am

Re: Unexpected behaviour

Post by qw »

nop wrote:This is exactly my first implementation (if I'm not wrong it's also in the asm inline example of the wiki), the code I posted in the opening of the thread was a (wrong) variant - I was exploring the problem looking for a solution: the "=A" was because I wanted to make sure to use a 64 bits register.
"=A" does not mean RAX but RDX:RAX (or EDX:EAX in 32-bits code). If you want to force the use of RAX, use "=a" instead of "=r". However, for optimization, you'd better leave the decision what register is used to the compiler.
User avatar
iansjack
Member
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Unexpected behaviour

Post by iansjack »

Just as a matter of interest, here is the code (omitting the "bochs_bp()" call) generated by the OP version:

Code: Select all

000000000040049c <store_cr2>:
  40049c:	55                   	push   %rbp
  40049d:	48 89 e5             	mov    %rsp,%rbp
  4004a0:	48 89 7d f8          	mov    %rdi,-0x8(%rbp)
  4004a4:	0f 20 d0             	mov    %cr2,%rax
  4004a7:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
  4004ab:	48 89 10             	mov    %rdx,(%rax)
  4004ae:	5d                   	pop    %rbp
  4004af:	c3
and here is the version produced by the version I suggested:

Code: Select all

000000000040049c <store_cr2>:
  40049c:	55                   	push   %rbp
  40049d:	48 89 e5             	mov    %rsp,%rbp
  4004a0:	48 89 7d f8          	mov    %rdi,-0x8(%rbp)
  4004a4:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
  4004a8:	0f 20 d0             	mov    %cr2,%rax
  4004ab:	48 89 00             	mov    %rax,(%rax)
  4004ae:	5d                   	pop    %rbp
  4004af:	c3
The original version is ignoring what it read from %cr2 and instead is putting the value of %rdx, on entry to the routine, into the input parameter which is not what is required. (You'd have to inspect the code at a deeper level to know what this value actually is.) My version puts the value of %cr2 into the parameter which is what you want.

Moral - if inline assembler isn't working as intended inspect the generated code.
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Unexpected behaviour

Post by gerryg400 »

nop wrote:What happens in your long mode OS when you pagefault in the lower half?
In my OS cr2 contains the expected value (i.e. 0xc0010000). It really looks like you are moving cr2 to the EAX register and that it's being sign extended. As iansjack said look at the asm.
If a trainstation is where trains stop, what is a workstation ?
User avatar
iansjack
Member
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Unexpected behaviour

Post by iansjack »

Actually, examining the code more carefully, my example doesn't give the correct result either. :oops:

It should be:

Code: Select all

static inline void store_cr2(long *cr2val)
{
   __asm__ __volatile__("movq %%cr2,%%rax;"
            "movq %%rax, %0;"
            :"=m"(*cr2val)
            :
            :"rax");
}
producing the assembler

Code: Select all

000000000040049c <store_cr2>:
  40049c:	55                   	push   %rbp
  40049d:	48 89 e5             	mov    %rsp,%rbp
  4004a0:	48 89 7d f8          	mov    %rdi,-0x8(%rbp)
  4004a4:	48 8b 55 f8          	mov    -0x8(%rbp),%rdx
  4004a8:	0f 20 d0             	mov    %cr2,%rax
  4004ab:	48 89 02             	mov    %rax,(%rdx)
  4004ae:	5d                   	pop    %rbp
  4004af:	c3                   	retq   
which, at last, gives the desired result.

Morals, and red faces, all round!
User avatar
Combuster
Member
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Contact:

Re: Unexpected behaviour

Post by Combuster »

Hobbes wrote:However, for optimization, you'd better leave the decision what register is used to the compiler.
For optimisation, return the value instead of writing an argument by reference. Eliminates bad pointer possibilities for free as well. :wink:
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
User avatar
iansjack
Member
Member
Posts: 4711
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Unexpected behaviour

Post by iansjack »

I agree with your second point, but optimisation? (Not that it was you who raised the subject of optimisation.) Why worry about optimisation in a function like this that is only going to be called occasionally. In fact, why make it inline. I would just write a simple assembler function:

Code: Select all

get_cr2:
   mov %cr2, %rax
   ret
and leave it at that. You may have the overhead of a function call (but you've eliminated a lot of unnecessary instructions), but so what? You'll declare the function in a header file as returning a long value, so the compiler knows that %rax is going to be clobbered.
Post Reply