OSDev.org

Posted: **Thu Apr 23, 2009 2:07 pm**

I'm currently working on devising a method of redirecting execution flow on-the-fly to another location in an address space.
Although there is a technique that works fairly well on x86 (considering it's an ugly hack, of course), doing it natively on x86_64 is much more complex. Because an unconditional relative jump takes at most a signed 32-bit immediate, it can only reach about ±2 GB from the instruction, so that method won't work for jumping to a heap-allocated memory region in the larger x86_64 address space. Therefore I have two options:

1) Embed an encoded indirect absolute jump to a 64 bit memory region ala: "rex.W jmpq *0x0(%rip); .quad $sym;"

2) Embed two encoded instructions: "movabs $sym, %r11; jmp *%r11;"
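
For reference, here is a minimal sketch of how the two options above encode to bytes. The function names are hypothetical, and the encodings assume 64-bit mode, where a near indirect jump already defaults to a 64-bit operand, so the redundant REX.W prefix from option 1 is left out.

```python
import struct


def encode_jmp_rip_indirect(target: int) -> bytes:
    """Option 1: jmp *0x0(%rip) followed by the 8-byte absolute target."""
    # FF /4 with ModRM 0x25 selects [rip + disp32]; a disp32 of 0 points at
    # the quadword stored immediately after the instruction.
    return b"\xff\x25" + struct.pack("<i", 0) + struct.pack("<Q", target)


def encode_movabs_jmp_r11(target: int) -> bytes:
    """Option 2: movabs $target, %r11 ; jmp *%r11."""
    # 49 = REX.W+REX.B, BB = B8 + (r11 & 7), then the 64-bit immediate.
    movabs = b"\x49\xbb" + struct.pack("<Q", target)
    # 41 = REX.B, FF /4 with ModRM 0xE3 selects register-direct r11.
    jmp_r11 = b"\x41\xff\xe3"
    return movabs + jmp_r11
```

Option 1 takes 14 bytes and clobbers no register; option 2 takes 13 bytes and overwrites %r11.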

According to the ELF x86-64 ABI spec, %r11 isn't used for passing function arguments, nor is it guaranteed to be preserved across function calls, so I am thinking the second option is the better choice. That still leaves some other major issues.

1) Direct call/jmp instructions are relative to the instruction pointer, so their immediates will need to be fixed up when the code moves. The math involved isn't the problem, the instruction offsets are. If I have to extend a call/jmp instruction to use %r11 like above, it will clobber the instructions that follow it. This can probably be mitigated by disassembling each instruction in a given function, copying each one as normal, and offsetting the later instructions to cope. It's ugly, but it may work.

2) The stack. Return addresses are stored in the stack frame, and I don't believe that the value encoded is going to be valid once a function is relocated. I don't know what to do here.
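
For the first issue, the displacement math can be sketched as follows. This is only an illustration under stated assumptions: it handles just the 5-byte `call rel32` (0xE8) and `jmp rel32` (0xE9) forms, and `relocate_rel32` is a hypothetical helper name, not part of any existing library.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def relocate_rel32(insn: bytes, old_addr: int, new_addr: int) -> bytes:
    """Re-encode a call/jmp rel32 so it still reaches its target from new_addr."""
    if len(insn) != 5 or insn[0] not in (0xE8, 0xE9):
        raise ValueError("this sketch only handles 5-byte call/jmp rel32")
    (old_disp,) = struct.unpack("<i", insn[1:])
    # rel32 is measured from the end of the instruction, hence the + 5.
    target = old_addr + 5 + old_disp
    new_disp = target - (new_addr + 5)
    if not INT32_MIN <= new_disp <= INT32_MAX:
        # Out of reach: the instruction would have to be widened to an
        # absolute form, which is what shifts the following instructions.
        raise OverflowError("target is out of rel32 range from the new address")
    return insn[:1] + struct.pack("<i", new_disp)
```

When the `OverflowError` case is hit, the copied instruction needs a longer absolute encoding, and every later offset in the relocated function has to account for the extra bytes.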

Any ideas on how I could go about doing this? I know it's very complex and I may not have explained it well but if you have any thoughts, I'd like to hear them. I'm running a native 64 bit linux system, in case you were wondering.

Posted: **Thu Apr 23, 2009 7:21 pm**

You could make the code jump to a closer location, then a farther one. For instance, if the code you want to change has:

```asm
some_jump:
    jmp relative_offset
```

Somewhere within 2GB of that instruction write:

```asm
long_jump:
    mov r11, far_away
    jmp r11
```

and then change the original jump to:

```asm
some_jump:
    jmp long_jump
```
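
As a rough sketch of the two-step jump, the bytes to write at each location could be computed like this. The helper names are hypothetical, and the sketch assumes `some_jump` is a 5-byte `jmp rel32` and that `long_jump` is free, executable memory within rel32 range of it.

```python
import struct


def rel32_jmp(src: int, dst: int) -> bytes:
    """Encode jmp rel32 located at src that lands on dst."""
    disp = dst - (src + 5)  # displacement is relative to the next instruction
    if not -2**31 <= disp < 2**31:
        raise OverflowError("dst is not within rel32 range of src")
    return b"\xe9" + struct.pack("<i", disp)


def far_jump_stub(target: int) -> bytes:
    """Encode mov r11, target ; jmp r11."""
    return b"\x49\xbb" + struct.pack("<Q", target) + b"\x41\xff\xe3"


def plan_island_patch(some_jump: int, long_jump: int, far_away: int) -> dict:
    """Map each address to the bytes that should be written there."""
    return {
        some_jump: rel32_jmp(some_jump, long_jump),
        long_jump: far_jump_stub(far_away),
    }
```

The stub at `long_jump` should be written first, so the patched `some_jump` never points at unfinished code.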

If you are working with C code you may want to try something like -mcmodel=large, which will allow function calls to anywhere in the 64-bit address space. GCC usually implements it with something like:

```asm
mov rax, far_away
call rax
```

This is probably your best bet unless you are using this to modify code that you don't have access to in its original form.

If you don't mind me asking, what do you need to rewrite short jumps for? You aren't doing anything involving code injection, are you?

Posted: **Thu Apr 23, 2009 8:13 pm**

JohnnyTheDon wrote: If you don't mind me asking, what do you need to rewrite short jumps for? You aren't doing anything involving code injection, are you?

Actually, you hit the nail on the head with that last one. I'm working on a live binary (currently ELF only) reverse engineering library. I'm looking to redirect execution flow at run time, using a plethora of methods. PLT and GOT hijacking, sandboxing, syscall interception, function signature and symbol reconstruction and a couple other neat things I've been working on over the past few months. Yes, this has to be done in a blind method, I don't want to modify any source code at all. It's also BSD licensed.

Posted: **Fri Apr 24, 2009 2:08 am**

Hi,

syntropy wrote:Actually, you hit the nail on the head with that last one. I'm working on a live binary (currently ELF only) reverse engineering library. I'm looking to redirect execution flow at run time, using a plethora of methods. PLT and GOT hijacking, sandboxing, syscall interception, function signature and symbol reconstruction and a couple other neat things I've been working on over the past few months. Yes, this has to be done in a blind method, I don't want to modify any source code at all. It's also BSD licensed.

In that case, I'd look into the CPU's debugging features. Most newer CPUs (Pentium 4 or later) can generate a debug exception on every control transfer: set the "single-step on branches" flag in the "MSR_DEBUGCTLA" MSR, then enable single stepping by setting TF in EFLAGS. The nice thing here is that the exception fires on every control transfer (including CALL/RET, interrupts, Jcc, JMP, etc.) and you don't need to modify the original code. There are also "last branch MSRs" that can be used to determine where the code is branching from and where it's branching to.

You could also configure the CPU so that during each control transfer the CPU writes "from_EIP" and "to_EIP" in a buffer in RAM for you (it's called a "Branch Trace Store")...
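
To make the flags concrete, here is a small sketch that only computes the register values involved. It assumes the usual layout where LBR is bit 0 and BTF is bit 1 of the debug-control MSR (index 0x1D9) and TF is bit 8 of EFLAGS. Actually writing the MSR takes `wrmsr` in ring 0, which this sketch does not attempt, and `enable_branch_stepping` is a hypothetical name.

```python
IA32_DEBUGCTL = 0x1D9    # MSR index; called MSR_DEBUGCTLA on Pentium 4
DEBUGCTL_LBR = 1 << 0    # record last-branch from/to addresses
DEBUGCTL_BTF = 1 << 1    # turn single-step into single-step on branches
EFLAGS_TF = 1 << 8       # trap flag: raise a debug exception after stepping


def enable_branch_stepping(debugctl: int, eflags: int, record_lbr: bool = True):
    """Return the (debugctl, eflags) values that request a trap on each branch."""
    debugctl |= DEBUGCTL_BTF
    if record_lbr:
        debugctl |= DEBUGCTL_LBR
    return debugctl, eflags | EFLAGS_TF
```

As far as I recall from the Intel manuals, the CPU clears BTF when it delivers the debug exception, so a handler would have to set it again before resuming.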

Cheers,

Brendan

Posted: **Fri Apr 24, 2009 8:43 am**

That sounds very interesting. I saw something about branch trace store in my last Linux kernel recompile but I never looked into it.
Basically, this part of the library backs up a function prologue to the heap (the trampoline) and overwrites the original function's prologue with an unconditional jump to a function in a loaded shared library. On x86 this was dead easy because the virtual address space was 32 bits, but on 64-bit systems the shared libraries are loaded at a much higher address than the basic 0xe9 (rel32) unconditional jump can reach. Last I checked, from_EIP and to_EIP were read-only, so I don't think I could use them to modify execution flow, but I may be wrong.
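
A minimal sketch of that prologue hook on x86_64 follows, with several loud assumptions: the saved prologue is at least 13 bytes, ends on an instruction boundary, and contains no RIP-relative or relative-branch instructions (otherwise it would need fixing up when copied). `build_hook` and its parameters are hypothetical names, not the library's API.

```python
import struct


def abs_jmp_r11(target: int) -> bytes:
    """movabs $target, %r11 ; jmp *%r11 (13 bytes, clobbers r11)."""
    return b"\x49\xbb" + struct.pack("<Q", target) + b"\x41\xff\xe3"


def abs_jmp_rip(target: int) -> bytes:
    """jmp *0x0(%rip) ; .quad target (14 bytes, clobbers nothing)."""
    return b"\xff\x25\x00\x00\x00\x00" + struct.pack("<Q", target)


def build_hook(func_addr: int, prologue: bytes, hook_addr: int):
    """Return (patch, trampoline) for hooking the function at func_addr."""
    stub = abs_jmp_r11(hook_addr)
    if len(prologue) < len(stub):
        raise ValueError("saved prologue is too short to hold the jump")
    # Pad with NOPs so the patch covers exactly the instructions it replaces.
    patch = stub + b"\x90" * (len(prologue) - len(stub))
    # The trampoline replays the saved prologue, then resumes the original
    # function just past the patched region.
    trampoline = prologue + abs_jmp_rip(func_addr + len(prologue))
    return patch, trampoline
```

The jump back uses the RIP-relative form so that no register is clobbered in the middle of the original function, while the entry patch can use %r11 because it is scratch at a function call boundary.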

Execution flow redirection on-the-fly
