untwisted wrote: Well, the first line of that code chunk is the offending line, but it doesn't matter WHAT that line is (we originally thought it was a SYSCALL, then someone moved code around and it appeared to be a SYSRET). I padded the function with some NOPs and noticed it's just that particular address.
After more testing it HAS to be a problem with the way we're trying to get syscall / sysret working. I *assumed* that we were making it back to supervisor mode because that's where the page fault is happening, but I'm not so sure anymore. Basically we're in limbo between our SYSCALL / syscall handler.
Try hardcoding a byte with the hex value 0x48 right before the sysret instruction and see what happens.
General protection fault in 64 bit long mode
untwisted wrote: Ok, I've set the byte right before the sysret to 0x48 using a hex editor, and ran it. Nothing different that I can tell.
Just seeing if it was an issue with ensuring a 64-bit sysret.
Am I missing something with this idea?
PS: You shouldn't be changing anything with a hex editor, this will throw off address calculations. *ALWAYS* re-assemble/compile your code after any change.
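For anyone wondering why 0x48 in particular: it is the REX.W prefix, and it is what separates the 64-bit form of SYSRET (48 0F 07) from the 32-bit form (0F 07). Below is a minimal sketch in D that checks a byte buffer for the 64-bit encoding. The names `sysret32`, `sysret64` and `isSysret64` are made up for illustration and are not from the project.

```d
import std.stdio;

// Opcode bytes for SYSRET. The 0x48 REX.W prefix selects the 64-bit form.
immutable ubyte[] sysret32 = [0x0F, 0x07];
immutable ubyte[] sysret64 = [0x48, 0x0F, 0x07];

/// Returns true if the bytes starting at `offset` encode a 64-bit SYSRET.
/// (Hypothetical helper, for illustration only.)
bool isSysret64(const(ubyte)[] code, size_t offset)
{
    return offset + 3 <= code.length
        && code[offset] == 0x48      // REX.W
        && code[offset + 1] == 0x0F  // two-byte opcode escape
        && code[offset + 2] == 0x07; // SYSRET
}

void main()
{
    assert(isSysret64(sysret64, 0));
    assert(!isSysret64(sysret32, 0));
    writeln("encoding checks passed");
}
```

Writing `sysretq` in the source and re-assembling should give you the prefixed form without touching the binary by hand.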
- Combuster
To be honest, I'm getting the idea that the actual cause of the problem hasn't been properly traced. For every page fault you know what address is involved and what code caused it. For the code that caused it you can determine the corresponding line of source code and the arguments it uses, and based on that you can check whether the values are as expected. So far I only get the idea that you are trying to make the error shut up without figuring out the true source.
So, could you please provide more info:
1: on how the syscall handler is being set up
2: on what the (virtual) memory layout of the OS is
3: what that first page fault is about.
4: a compiled image that we can test ourselves.
5: and if possible, some way to view all the source code
We're not trying to make it just shut up, and we are tracing it (to the best of my knowledge) properly. We're all new to this and there's a very good possibility that we're doing things just plain wrong. Thanks for the help though.
1: Right now there isn't much in the syscall handler. Here is the code. All we intend it to do is loop between syscall and sysret, just to make sure everything is working.
This code is toward the end of our main function.
2: We're running a flat 1:1 memory model right now using 2 MB pages. We haven't really done anything with virtual memory yet.
3: The first fault is due to a page not present; it occurs on a write (during a print statement) and in supervisor mode.
4: http://www.pittgeeks.org/projects/files/paganos.iso
5: http://xomb.googlecode.com/svn/trunk/
Code:
    // CPUID leaf 0x8000_0001, bit 11 of EDX: SYSCALL/SYSRET available.
    if(!(cpuid(0x8000_0001) & 0b1000_0000_0000))
    {
        kprintfln("Your computer is not cool enough, we need SYSCALL and SYSRET.");
        asm { cli; hlt; }
    }
    const ulong STAR = 0x003b_0010_0000_0000;
    const uint STARHI = STAR >> 32;
    const uint STARLO = STAR & 0xFFFFFFFF;
    lstar.setHandler(&sysCallHandler);
    asm
    {
        // Set the STAR register.
        "movl $0xC0000081, %%ecx" ::: "ecx";
        "movl %0, %%edx" :: "i" STARHI : "edx";
        "movl %0, %%eax" :: "i" STARLO : "eax";
        "wrmsr";
        // Set the SF_MASK register. Top should be 0, bottom is our mask,
        // but we're not masking anything (yet).
        "xorl %%eax, %%eax" ::: "eax";
        "xorl %%edx, %%edx" ::: "edx";
        "movl $0xC0000084, %%ecx" ::: "ecx";
        "wrmsr";
        // Jump to user mode.
        "movq $testUser, %%rcx" ::: "rcx";
        "movq $0, %%r11" ::: "r11";
        "sysretq";
    }
}
void sysCallHandler()
{
    asm
    {
        naked;
        "nop";
        "nop";
        "nop";
        "sysretq";
    }
}
extern(C) void testUser()
{
    asm
    {
        naked;
        "nop";
        "nop";
        "nop";
        // Note: sysretq in the handler returns to the instruction after this
        // syscall, and a naked function has no epilogue to stop it there.
        "syscall";
    }
}
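To make the STAR value above easier to check, here is a rough sketch in D that decodes it the way the architecture manuals describe: SYSCALL loads CS from STAR[47:32] (RPL cleared) and SS from that value plus 8, while a 64-bit SYSRET loads CS from STAR[63:48] + 16 and SS from STAR[63:48] + 8, both with RPL 3. `StarSelectors` and `decodeStar` are hypothetical names for illustration, not part of the project.

```d
import std.stdio;

enum ulong STAR = 0x003b_0010_0000_0000;

// Hypothetical container for the four selectors implied by a STAR value.
struct StarSelectors
{
    ushort syscallCS, syscallSS, sysretCS, sysretSS;
}

StarSelectors decodeStar(ulong star)
{
    immutable ushort callBase = cast(ushort)((star >> 32) & 0xFFFF);
    immutable ushort retBase  = cast(ushort)((star >> 48) & 0xFFFF);

    StarSelectors s;
    s.syscallCS = cast(ushort)(callBase & 0xFFFC);    // RPL forced to 0
    s.syscallSS = cast(ushort)(callBase + 8);
    s.sysretCS  = cast(ushort)((retBase + 16) | 3);   // 64-bit SYSRET, RPL 3
    s.sysretSS  = cast(ushort)((retBase + 8) | 3);
    return s;
}

void main()
{
    immutable s = decodeStar(STAR);
    writefln("SYSCALL: CS=%#x SS=%#x", s.syscallCS, s.syscallSS);
    writefln("SYSRET:  CS=%#x SS=%#x", s.sysretCS, s.sysretSS);

    // The halves that go into EDX:EAX for wrmsr.
    immutable uint hi = cast(uint)(STAR >> 32);
    immutable uint lo = cast(uint)(STAR & 0xFFFF_FFFF);
    assert(((cast(ulong) hi << 32) | lo) == STAR);
}
```

Whatever selectors come out of this need matching descriptors in the GDT, in that order, or the return to user mode will fault.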
I really appreciate the help. It was my understanding that if a page is in memory, but doesn't have its present bit set it would fault (minor fault) and the bit would be set automagically. After writing that last sentence I'm second guessing myself though. If this sounds incredibly noobish, I'm sorry, we're all new at this
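On the present-bit question: as far as the architecture manuals go, the CPU never sets the present bit by itself (it only updates the accessed and dirty bits), so a not-present fault stays a fault until the handler fixes the entry. A small sketch in D of decoding the page fault error code, assuming the handler receives it as an integer; `PageFaultInfo` and `decodePageFault` are illustrative names only. The faulting address itself is in CR2.

```d
import std.stdio;

// Hypothetical decoded view of the low bits of a page fault error code.
struct PageFaultInfo
{
    bool protectionViolation; // bit 0: 0 = page not present, 1 = protection violation
    bool write;               // bit 1: 0 = read, 1 = write
    bool user;                // bit 2: 0 = supervisor mode, 1 = user mode
}

PageFaultInfo decodePageFault(ulong errorCode)
{
    PageFaultInfo info;
    info.protectionViolation = (errorCode & 0x1) != 0;
    info.write = (errorCode & 0x2) != 0;
    info.user  = (errorCode & 0x4) != 0;
    return info;
}

void main()
{
    // "Page not present, on a write, in supervisor mode" corresponds to 0x2.
    immutable info = decodePageFault(0x2);
    assert(!info.protectionViolation);
    assert(info.write);
    assert(!info.user);
    writeln("decoded as not-present / write / supervisor");
}
```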
One of the members of the group set SS to null at the top of our main function and managed to avoid the GPF, but then got a different error because the return address for iretq was not in canonical form. He then ANDed the top 32 bits with FFFFFF to see if that would fix it, and now our code loops forever, jumping between user mode and supervisor mode.
I'm not sure what that means really, as I also thought that return addresses were pushed onto the stack by the CPU, and I would have assumed that the CPU would ensure that the address was in canonical form.
This is all so frustrating! I wish the documentation was better.
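For reference, "canonical" just means that the upper bits of the address are copies of the highest implemented bit. A minimal sketch in D, assuming 48-bit virtual addresses (so bits 63 to 47 must be all zeros or all ones); `isCanonical` is a made-up helper name.

```d
import std.stdio;

/// True if `addr` is canonical, assuming 48-bit virtual addresses:
/// bits 63..47 must be all zeros or all ones.
bool isCanonical(ulong addr)
{
    immutable ulong top = addr >> 47; // the upper 17 bits
    return top == 0 || top == 0x1FFFF;
}

void main()
{
    assert(isCanonical(0x0000_7FFF_FFFF_FFFF));  // top of the lower half
    assert(isCanonical(0xFFFF_8000_0000_0000));  // bottom of the upper half
    assert(!isCanonical(0x0000_8000_0000_0000)); // just inside the hole
    writeln("canonical checks passed");
}
```

A check like this in the handler, before the return, can at least tell you whether the value was already bad when it reached the stack.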
Am I the only one who finds the isr_common handling inside a C function weird or very dangerous? You depend on the compiler not to insert any code before your asm block, and you should never do that. Write isr_common in asm. That probably helps already.
Author of COBOS
Well, I think we've got this one licked. We got a new set of eyes on the project and he sat down and went line by line in the asm dump. He found a few bugs that we've since cleaned up (one being that we cast our longs to ints by accident in our atoi function). The GPE / page fault were coming from lstar being set incorrectly causing an invalid rIP to be pushed on to the stack.
Thanks for all of your help guys. Hopefully I'll be able to offer some help to the next noob that comes along like myself
Edit: Just as an aside, the problem came with our inline asm. For some reason, the assembler wasn't paying attention to our clobber list and then wrote over registers that we needed to preserve.
It has since been fixed and the code committed to svn.
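For the next person who hits this: a 64-bit handler address has to reach LSTAR (MSR 0xC0000082) intact, split across EDX:EAX for wrmsr. The sketch below, in D, only illustrates how an accidental narrowing to 32 bits drops the upper half; it is not the project's actual code, and the address used is a made-up example.

```d
import std.stdio;

void main()
{
    // Hypothetical higher-half handler address, for illustration only.
    immutable ulong handler = 0xFFFF_8000_0010_2000;

    // Correct split for wrmsr: low half goes in EAX, high half in EDX.
    immutable uint lo = cast(uint)(handler & 0xFFFF_FFFF);
    immutable uint hi = cast(uint)(handler >> 32);
    assert(((cast(ulong) hi << 32) | lo) == handler);

    // An accidental long-to-int narrowing silently loses the upper 32 bits.
    immutable uint truncated = cast(uint) handler;
    assert(cast(ulong) truncated != handler);

    writefln("hi=%#x lo=%#x truncated=%#x", hi, lo, truncated);
}
```

Reading the MSR back with rdmsr right after writing it is a cheap way to confirm the value that actually landed there.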