machine code ? and nasm

Posted: Wed May 19, 2010 8:40 am
by Sam111
Ok, I am a little confused.

Somebody said that the machine code for

Code: Select all

or eax,-1
xor eax,0f7fb79ffh
call eax

This becomes 83 C8 FF 35 FF 79 FB F7 FF D0.
But when I do

Code: Select all

[BITS 32]

or eax,-1
xor eax,0f7fb79ffh
call eax
and assemble it with nasm -f bin filename.asm, a hex editor shows this machine code:

Code: Select all

0D FF FF FF FF 35 FF 79 FB F7 FF D0
Why the difference?
I am on an Intel Pentium 4 Dell Dimension 4600 running the latest Ubuntu.

Re: machine code ? and nasm

Posted: Wed May 19, 2010 8:43 am
by qw
By default, NASM generates the long version (0D FF FF FF FF) of "OR EAX, -1". If you want the short version (83 C8 FF) you must explicitly write "OR EAX, BYTE -1".

BTW Funny piece of code... What's its use?
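
As a side note for readers, the two encodings can be reproduced with a minimal Python sketch. It is not from the thread: the helper names are made up for illustration, and it only models the two opcode forms discussed here (0D id and 83 /1 ib). It also works out what the three-instruction sequence leaves in EAX, which suggests one plausible purpose: building the address 08048600h without any 00 byte in the instruction stream.

```python
import struct

def hexdump(code: bytes) -> str:
    return " ".join(f"{b:02X}" for b in code)

def or_eax_imm32(value: int) -> bytes:
    """Long form: opcode 0D followed by a 32-bit little-endian immediate."""
    return b"\x0d" + struct.pack("<I", value & 0xFFFFFFFF)

def or_eax_imm8(value: int) -> bytes:
    """Short form: opcode 83, ModRM C8 (/1 = OR, rm = EAX), then a signed byte."""
    if not -128 <= value <= 127:
        raise ValueError("immediate does not fit in a signed byte")
    return b"\x83\xc8" + struct.pack("<b", value)

def sign_extend8(byte: int) -> int:
    """How the CPU widens the imm8 to 32 bits before using it."""
    return (byte - 0x100 if byte & 0x80 else byte) & 0xFFFFFFFF

long_form = or_eax_imm32(-1)
short_form = or_eax_imm8(-1)
print(hexdump(long_form))    # 0D FF FF FF FF
print(hexdump(short_form))   # 83 C8 FF

# Both forms OR the same 32-bit value into EAX.
assert sign_extend8(short_form[-1]) == 0xFFFFFFFF

# or eax,-1 forces EAX to FFFFFFFFh whatever it held before;
# the xor then turns that into the address handed to call eax.
eax = 0xFFFFFFFF ^ 0xF7FB79FF
print(f"{eax:08X}")          # 08048600
```

A plain mov eax, 08048600h would assemble to B8 00 86 04 08, which contains a zero byte; the or/xor pair avoids that in either encoding.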

Re: machine code ? and nasm

Posted: Wed May 19, 2010 9:11 am
by Sam111
Is there any difference between using the long version and the short version? Both do exactly the same thing, correct?

This code is going to be part of my shellcode (I am trying to learn how to create a buffer overflow exploit).

Anyway, I have a few more machine code questions.

When I assemble

Code: Select all

call 8050404h
it gives me this machine code:

Code: Select all

E8 FF 03 05 08
But when I assemble

Code: Select all

call 0000:8050404h
I get this machine code, which looks correct because it contains the right address:

Code: Select all

9A 04 04 05 08 00 00 
What I want to know is: what is the difference between call 8050404h and call 0000:8050404h? In NASM's eyes they are completely different machine code.

Thanks

Re: machine code ? and nasm

Posted: Wed May 19, 2010 9:20 am
by JamesM
Sam111 wrote:What I want to know is: what is the difference between call 8050404h and call 0000:8050404h? In NASM's eyes they are completely different machine code.
Read the manual. The latter is a FAR CALL. The former is a NEAR CALL, which is relatively addressed (not absolute): the operand is a signed integer that is added to the address of the instruction following the call.
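
For readers following along, both byte sequences from the question can be rebuilt by hand. The Python sketch below is an illustration only, not something posted in the thread. The function names are hypothetical, and it assumes the call is the first instruction of a flat binary with the default origin of 0, which is what nasm -f bin produces when no ORG is given.

```python
import struct

def hexdump(code: bytes) -> str:
    return " ".join(f"{b:02X}" for b in code)

def call_near_rel32(target: int, call_addr: int) -> bytes:
    """E8 rel32: the operand is target minus the address of the next instruction."""
    next_ip = call_addr + 5                 # E8 plus four operand bytes
    rel = (target - next_ip) & 0xFFFFFFFF   # wrap to 32 bits
    return b"\xe8" + struct.pack("<I", rel)

def call_far_ptr16_32(selector: int, offset: int) -> bytes:
    """9A ptr16:32: absolute 32-bit offset first, then the 16-bit selector."""
    return b"\x9a" + struct.pack("<IH", offset, selector)

# Assumption: the call sits at address 0 (first instruction, origin 0).
near = call_near_rel32(0x8050404, call_addr=0)
far = call_far_ptr16_32(0x0000, 0x8050404)

print(hexdump(near))  # E8 FF 03 05 08
print(hexdump(far))   # 9A 04 04 05 08 00 00
```

The far form stores the address verbatim, which is why it "looks correct"; the near form stores a distance, so its bytes change whenever the call instruction itself moves.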

Re: machine code ? and nasm

Posted: Wed May 19, 2010 9:36 am
by Solar
Sam111 wrote:Is there any difference between using the long version and the short version? Both do exactly the same thing, correct?
Not quite. They compute the same result, but one uses a 32-bit operand and the other an 8-bit operand (sign-extended to 32 bits). That has some impact on cache lines, memory usage, and access and decoding speed. There is also the limitation that you can easily switch the long version to using e.g. 1032 as the operand, which does not fit in the short version's signed byte.

Re: machine code ? and nasm

Posted: Wed May 19, 2010 11:05 am
by Sam111
The latter is a FAR CALL. The former is a NEAR CALL, that is relatively addressed (not absolute) - the operand is a signed integer to add to the current PC location.
I guess what I am getting at is: is it OK to use far calls in user mode?
call 0000:32-bit address (where the 32-bit address is the exact memory location you want to jump to)

What I am worried about is what the CS selector should be set to, because I would think I would get a protection violation if I used the wrong CS selector in user mode.

Also for my near call machine code

Code: Select all

E8 FF 03 05 08
if a signed 4-byte integer follows the E8, shouldn't it be 8050404h rather than 80503FFh? I don't get why the operand is

Code: Select all

FF 03 05 08
?

Either way, it seems near calls can reach all the same memory addresses as far calls. The only difference is that a far call can change the CS selector. So to call the address 8050404h (i.e. jump to 8050404h), should I use a near or a far call?

I would think you almost always want a near call.
But then, of course, is this valid code to get me to the address 8050404h?

Code: Select all

E8 FF 03 05 08
Thanks

Re: machine code ? and nasm

Posted: Wed May 19, 2010 12:42 pm
by Selenic
Sam111 wrote:Either way, it seems near calls can reach all the same memory addresses as far calls. The only difference is that a far call can change the CS selector. So to call the address 8050404h (i.e. jump to 8050404h), should I use a near or a far call?

I would think you almost always want a near call.
But then, of course, is this valid code to get me to the address 8050404h?

Code: Select all

E8 FF 03 05 08
There's one other difference between near and far calls which you're forgetting: as mentioned before, near calls are relative. In other words, that code says 'call .+0x080503FF'. Notice that there's a difference of 5 between the two values; this is because the assembler uses offsets from the start of the .text section (which is assumed to be zero until it is linked), and this is the first instruction, which is 5 bytes long (remember that branches and calls are always relative to the *end* of the instruction). So it is indeed the correct encoding for a call located at offset 0.

Re: machine code ? and nasm

Posted: Wed May 19, 2010 1:34 pm
by Gigasoft
When you use a far call, the segment value to use depends on the operating system. On Windows, it is 1bh in user mode and 8 in kernel mode. A far call also pushes the old value of CS on the stack, so if the called function returns, it must do so using the RETF instruction (or if you know it's returning to the same segment, a RET instruction with 4 added to the byte count), and it must add 4 to the offset of any parameters it accesses.

Remember that you can only use a relative near call or jump if you know the distance between the call instruction and the called location. If you don't, you must load the absolute address into a register and use the register as the operand for call or jmp.
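
A short Python sketch may make this concrete (an illustration under stated assumptions, not code from the thread). The load addresses in the loop are hypothetical examples, and the helper names are invented. It shows that the rel32 operand needed to reach 8050404h depends on where the call instruction ends up, while the mov eax / call eax pair encodes the same bytes everywhere.

```python
import struct

TARGET = 0x8050404

def hexdump(code: bytes) -> str:
    return " ".join(f"{b:02X}" for b in code)

def rel32_for(call_addr: int, target: int = TARGET) -> int:
    """Operand of E8 rel32 for a call located at call_addr."""
    return (target - (call_addr + 5)) & 0xFFFFFFFF

def call_via_eax(target: int = TARGET) -> bytes:
    """mov eax, imm32 (B8 id) followed by call eax (FF D0)."""
    return b"\xb8" + struct.pack("<I", target) + b"\xff\xd0"

# Hypothetical places the code might run from: a flat binary at 0,
# a typical-looking image address, and somewhere on a stack.
for addr in (0x00000000, 0x08048000, 0xBFFFF000):
    print(f"call at {addr:08X}h needs rel32 = {rel32_for(addr):08X}h")

# The register form carries the absolute address, so it does not
# care where it is executed from.
print(hexdump(call_via_eax()))  # B8 04 04 05 08 FF D0
```

For injected code the execution address is usually not known in advance, which is why the register form (or the or/xor variant from the first post) is the practical choice.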

Re: machine code ? and nasm

Posted: Wed May 19, 2010 2:39 pm
by Sam111
A far call also pushes the old value of CS on the stack, so if the called function returns, it must do so using the RETF instruction (or if you know it's returning to the same segment, a RET instruction with 4 added to the byte count), and it must add 4 to the offset of any parameters it accesses.
So if I use a far call, I should return with RETF. But what is the difference between RETF and RET N, where N is the number of bytes to pop off the stack?

Also, if I do

Code: Select all

mov eax , physical address 
call eax
would this be the equivalent of a far call? Basically, I just want to know the best way to call or jump to any fixed address I choose.
Notice that there's a difference of 5 between the two
OK, but if the .text section is at 0 by default, then won't I have
jmp 0 + 8050404h, which is the same as jmp 8050404h?

This jump instruction is located at the beginning of my file...
Or is it always
jmp (start of .text) + (length of this instruction in bytes = 5) + operand?

Also, I'm curious: is there a way to find out what the CS selector is in user mode under a given operating system? What happens on a non-Windows machine such as Mac or Linux? Do they all use the same CS selector as Windows? Is there an easy way to find out the user-mode CS selector of your OS?

Thanks for the help

Re: machine code ? and nasm

Posted: Wed May 19, 2010 3:57 pm
by Gigasoft
So if I use a far call, I should return with RETF. But what is the difference between RETF and RET N, where N is the number of bytes to pop off the stack?
The RETF instruction restores both EIP and CS, while RET only restores EIP.

Code: Select all

mov eax , physical address
call eax
would this be the equivalent of a far call? Basically, I just want to know the best way to call or jump to any fixed address I choose.
No, call eax is a near call, and does not push CS. This is the way to call or jump to a fixed address from an unknown location when you don't need to change CS. Either way, the destination location is not the physical address, but the offset within the CS segment (on Windows, Linux and Mac OS this is based at 0, so the offset is the virtual address).
Also, I'm curious: is there a way to find out what the CS selector is in user mode under a given operating system? What happens on a non-Windows machine such as Mac or Linux? Do they all use the same CS selector as Windows? Is there an easy way to find out the user-mode CS selector of your OS?
To get the current value of CS, just use a "mov something, cs" instruction.

Re: machine code ? and nasm

Posted: Wed May 19, 2010 5:20 pm
by Sam111
So then, apart from going in and out of protected mode, why would you ever use far jumps or calls? In user mode I can't find a reason.

And my function starts at 0x8050404 when I print its address.
I would think this is the physical address, so a relative address won't work.

So

Code: Select all

mov eax , 8050404h
call eax
is not going to call the fixed address 0x8050404 but a relative address? I am a little confused about the difference: if the .text section is taken to be at offset 0, then won't relative and physical addresses correspond, provided the OS loads the code at the exact location it should, with no relocation needed?

It seems that if RETF only does the extra popping of CS off the stack, then ret 2 should be the same thing.

Re: machine code ? and nasm

Posted: Wed May 19, 2010 5:50 pm
by gerryg400
Sam111, you will find the answers to a lot of your sensible questions in the Intel or AMD manuals. The manuals will also give you the language that you need to ask the forum your real questions. Statements like..
I am a little confused about the difference: if the .text section is taken to be at offset 0, then won't relative and physical addresses correspond, provided the OS loads the code at the exact location it should, with no relocation needed?
are very frustrating. It's like string theory. It's so wrong I'm not even sure that it's wrong.

You need to understand the definition of things like near, far, physical address, relative address, absolute address and relocation. Until these things are 100% clear in your mind it's going to be very difficult for you.

- gerryg400

Re: machine code ? and nasm

Posted: Wed May 19, 2010 6:16 pm
by Gigasoft
So then, apart from going in and out of protected mode, why would you ever use far jumps or calls? In user mode I can't find a reason.
In 16-bit Windows applications, there are many segments, each based at different addresses, and far calls are used a lot. OS/2 also uses different code segments. I also have some far jumps in a Mac OS X emulator I'm working on, where I need to have a value in CS that is different from 1bh so that Windows will leave my GS register alone, but the other code segment I'm setting up is the same as the default one.
is not going to call the fixed address 0x8050404 but a relative address? I am a little confused about the difference: if the .text section is taken to be at offset 0, then won't relative and physical addresses correspond, provided the OS loads the code at the exact location it should, with no relocation needed?
No, only the immediate forms of jmp and call are relative. All jumps and calls using register or memory operands are absolute, and with call eax the next instruction to execute will be the one whose address is in eax. When a relative address is used, the operand is added to the address of the byte directly following the instruction. For example, an instruction that jumps to itself would be EB FE or E9 FB FF FF FF (the first one is a short jump, which uses a signed byte as the relative address).
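
The two self-jump encodings can be checked with a small decoder. This is a minimal Python sketch for illustration only; the function name is made up, and it handles just the two relative jmp opcodes mentioned above (EB rel8 and E9 rel32).

```python
import struct

def decode_relative_jump(code: bytes, addr: int) -> int:
    """Return the target of a short (EB rel8) or near (E9 rel32) jmp located at addr."""
    if code[0] == 0xEB:
        (rel,) = struct.unpack_from("<b", code, 1)   # signed byte
        length = 2
    elif code[0] == 0xE9:
        (rel,) = struct.unpack_from("<i", code, 1)   # signed dword
        length = 5
    else:
        raise ValueError("not a relative jmp opcode handled by this sketch")
    # The operand is added to the address of the byte following the instruction.
    return (addr + length + rel) & 0xFFFFFFFF

here = 0x00401000  # arbitrary example address
short_target = decode_relative_jump(bytes.fromhex("EBFE"), here)
near_target = decode_relative_jump(bytes.fromhex("E9FBFFFFFF"), here)
print(f"{short_target:08X} {near_target:08X}")  # 00401000 00401000
```

FE is -2 and FFFFFFFB is -5, exactly the lengths of the two instructions, so each one lands back on its own first byte.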

Instructions do not use physical addresses directly. Memory operands are specified by an offset within a segment, which is added to the segment base to produce a virtual address. This is also true of relative calls and jumps: here the operand is added to the offset of the next instruction to produce the target offset, which is loaded into EIP. EIP and the CS base together determine the virtual address of the next instruction. The GDTR and IDTR registers contain virtual addresses. The CR3 register, the page directory and the page tables are the only places where physical addresses are used. However, when paging is disabled, physical addresses equal virtual addresses.
It seems that if RETF only does the extra popping of CS off the stack, then ret 2 should be the same thing.
Call far and retf always use 4 bytes for CS in 32-bit mode. It's not the same thing if CS changes. With RETF, CS is restored, but with RET 4, it is not.
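
A toy model can illustrate the difference. The Python sketch below is a deliberate simplification, not something from the thread: the stack is a list of 32-bit slots, privilege checks are ignored, and the selector 23h for the second code segment is a hypothetical value chosen for the example (1Bh is the Windows user-mode value mentioned earlier).

```python
def far_call(cpu, stack, selector, offset, return_eip):
    """CALL ptr16:32: push CS (in a 4-byte slot), push the return EIP, load CS:EIP."""
    stack.append(cpu["cs"])
    stack.append(return_eip)
    cpu["cs"], cpu["eip"] = selector, offset

def retf(cpu, stack):
    """RETF: pop EIP, then pop CS."""
    cpu["eip"] = stack.pop()
    cpu["cs"] = stack.pop()

def ret_n(cpu, stack, n):
    """RET n: pop EIP, then discard n bytes (n // 4 slots in this model)."""
    cpu["eip"] = stack.pop()
    del stack[len(stack) - n // 4:]

def run(return_instruction):
    cpu = {"cs": 0x1B, "eip": 0x1000}
    stack = []
    # Hypothetical far call from 1B:1000 to 23:2000, returning to 1007.
    far_call(cpu, stack, selector=0x23, offset=0x2000, return_eip=0x1007)
    return_instruction(cpu, stack)
    return cpu, stack

cpu_retf, stack_retf = run(retf)
cpu_ret4, stack_ret4 = run(lambda cpu, stack: ret_n(cpu, stack, 4))

print(hex(cpu_retf["cs"]), hex(cpu_retf["eip"]), stack_retf)  # 0x1b 0x1007 []
print(hex(cpu_ret4["cs"]), hex(cpu_ret4["eip"]), stack_ret4)  # 0x23 0x1007 []
```

Both variants leave the stack balanced and resume at the same offset, but only RETF puts the old selector back into CS. RET 2 would additionally leave two bytes of the 4-byte CS slot on the stack.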

Re: machine code ? and nasm

Posted: Thu May 20, 2010 1:51 pm
by Sam111
OK, I understand how RVA and VA work now.

But my problem is this: what if you are writing shellcode that has to return to a function in another exe program?

Say I execute test_exploit and it feeds the shellcode to the exploit.exe program. How can the shellcode jump back to the print function in my test_exploit code? It would need to jump to or call the exact address in memory where my print function is located. This could potentially be a different value from the physical address.

I get that if my shellcode were going to call a function in the same exe program, you could just use relative addresses/offsets.
But if you are calling a function in a separate exe/process, then I don't think RVA or VA will work if the program is relocated.

That is why I can only get my shellcode to execute on the stack it overflowed, and I can call functions in that program by calling the RVA address. I would think the functions that you want to call outside your program would have to be at fixed physical or VA addresses.

If the addresses of the outside functions change even a little bit, it will screw up the entire process.
I am also wondering whether it is even legal to call a function in a different process if the process isn't a .dll, .so, or static library.
I don't think the operating system lets you just jmp to a random piece of memory that is not in your original program's allocated address space... unless the memory is marked as shared (like a dll or .so, etc.).

So I think shellcode is limited to accessing functions in the program it is exploiting and/or any dll/.so shared libraries it can load on the OS. Correct me if I am wrong.

If that is all true, then the only way to get the print function that exists in test_exploit to execute is to create a .so/dll too,
have my shellcode load it into memory, and jump to the function's address in the dll.

Either way, the main question is how to jmp to or call functions outside of your exe program (i.e. in another exe/process).

Thanks, and sorry if I have been asking simple questions.
I will try to be better about this. The only thing is that it is confusing: RVA/VA will only work if you are jumping to a function in the same .text segment, not to another exe's .text segment (unless of course they are both copied on top of each other, overwriting each other).

Re: machine code ? and nasm

Posted: Thu May 20, 2010 6:20 pm
by Gigasoft
Applications do not know or care about physical addresses. What matters is the (absolute) linear offset. Remember that in modern operating systems, CS, DS, ES and SS are all based at 0, so the offsets are the same as the virtual addresses.

When you use a relative jmp or call instruction, the operand is relative to the next instruction, not the beginning of the segment. This isn't an RVA. An RVA is an offset relative to the image base, and RVAs are only used inside executable files. So, relative jumps can't take you from the stack to a fixed address, since the location of the stack is often unknown.
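
The distinction can be put into numbers with a brief Python sketch. It is only an illustration: the image bases, the RVA 8404h and the stack address are hypothetical values picked so that the arithmetic lines up with the 8050404h address used earlier in the thread.

```python
MASK = 0xFFFFFFFF

def rva_to_va(image_base: int, rva: int) -> int:
    """An RVA is resolved against the base the image was mapped at."""
    return (image_base + rva) & MASK

def rel32_target(call_addr: int, rel32: int) -> int:
    """A rel32 operand is resolved against the instruction after the call."""
    return (call_addr + 5 + rel32) & MASK

# Same RVA, two hypothetical image bases: the function moves with the image.
va_default = rva_to_va(0x08048000, 0x8404)
va_rebased = rva_to_va(0x10000000, 0x8404)
print(f"{va_default:08X} {va_rebased:08X}")   # 08050404 10008404

# Same rel32 operand, two hypothetical locations of the call instruction:
# inside the image it reaches the function, on the stack it goes elsewhere.
from_image = rel32_target(0x08048000, 0x83FF)
from_stack = rel32_target(0xBFFFF000, 0x83FF)
print(f"{from_image:08X} {from_stack:08X}")   # 08050404 C0007404
```

The operand 83FFh is only right when the call runs from the spot it was computed for, which is the situation shellcode on a stack cannot rely on.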

You can never call a function in another process, since in the virtual address space of a process, only the modules that have been loaded into that process are present, and the same module may even be mapped at different virtual addresses in the two processes.

So, the usual course of action for injected code is to cause some other module to be loaded, either by starting a new process or by loading a DLL/SO into the current process. On some operating systems, a URL can be used as the filename.