Questions about NASM and Assembly in General

yolan51 · Post by **yolan51** » Sat Oct 29, 2016 4:51 pm

Hi people,

It's my first time here. Normally I would have posted my question on the NASM forum, but their email server doesn't seem able to send my verification email at the moment, so I thought I would ask here. First, I'd like to say that I've been programming for about 5 years now, mostly in C, C++, VB, HTML and Java. I'm not a great programmer, just decent I'd say, but I intend to get better, which is why I'm trying to figure out how assembly works. I handle pointers, structured data and OOP pretty well, and I have a good knowledge of the compiling/linking process. I lacked some knowledge of OS development, so over the past two weeks I have read/listened to all of these twice:

http://www.nasm.us/xdoc/2.12.02/html/na ... ection-2.1
https://www.youtube.com/watch?v=ZJPvKSrfMfs
https://www.youtube.com/watch?v=YvZhgRO7hL4
https://www.youtube.com/watch?v=1rnA6wp ... gsTJS9MZ6M
https://www.cs.bham.ac.uk/~exr/lectures ... os-dev.pdf
http://asmtutor.com/
http://cs.lmu.edu/~ray/notes/nasmtutorial/
http://wiki.osdev.org/Main_Page (the topics I was concerned about, such as going from 32-bit to 64-bit long mode)

(Talking about 64-bit NASM)
So here are my questions. First of all, why are there so many ways to make the exit system call? I saw two different ways. In the NASM tutorial:

Code:

        mov     eax, 60                 ; system call 60 is exit
        xor     rdi, rdi                ; exit code 0
        syscall

On asmtutor:

Code:

    mov     ebx, 0      ; return 0 status on exit - 'No Errors'
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h

So I am really confused. Are there any other ways to call exit? Why isn't there just one way to do it? And is there a better or more conventional way of making system calls?
Does it depend on what you use in your programs? I'm really confused about that.

My next question is about the push and pop instructions:

I've read/heard that "push" moves the (esp or ebp) register by either +4 or -4 bytes, which was confusing too. Which one is it: +4 or -4?

My last question is about how I can write simple debugging/printing functions, and whether that is doable at all. In other languages, when I had difficulty grasping what was happening in either a struct or a recursive loop, I used print to figure out what was going on, by printing either the addresses or the values of the variables; then I could figure things out by myself, I guess.

Thanks in advance, and sorry if my English is bad; I'm not a native English speaker.

Sorry also if my questions seem noobish.

Have a nice day.

alexfru · Post by **alexfru** » Sat Oct 29, 2016 5:47 pm

yolan51 wrote: So here are my questions. First of all, why are there so many ways to make the exit system call? I saw two different ways. In the NASM tutorial:
Code:

        mov     eax, 60                 ; system call 60 is exit
        xor     rdi, rdi                ; exit code 0
        syscall
On asmtutor:
Code:

    mov     ebx, 0      ; return 0 status on exit - 'No Errors'
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
So I am really confused. Are there any other ways to call exit? Why isn't there just one way to do it? And is there a better or more conventional way of making system calls?
Does it depend on what you use in your programs? I'm really confused about that.

The latter is normally used in 32-bit applications. The syscall/sysenter/sysret/sysexit instructions appeared later, at about the same time 64-bit x86 CPUs made their debut (see the update below). Prior to that, the int instruction was the most common way for user-mode code to transition into the kernel so that it could do some work on behalf of the user code.

Update: on Intel CPUs the syscall/sysret pair is strictly for 64-bit mode (AMD CPUs also accept it in 32-bit code). The sysenter/sysexit pair is normally for 32-bit mode (both user and kernel), and it appeared in the Pentium II.
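To make the 64-bit convention concrete, here is a minimal sketch of a complete program for x86-64 Linux that makes two system calls with the syscall instruction. The file name, the `msg` and `len` labels and the build commands are only illustrative assumptions; the comments note the 32-bit `int 80h` counterparts for comparison.

```nasm
; hello64.asm: minimal sketch, assuming x86-64 Linux.
; Assumed build commands: nasm -f elf64 hello64.asm && ld -o hello64 hello64.o

        global  _start

        section .data
msg     db      "hello", 10             ; text plus a newline
len     equ     $ - msg                 ; byte count, computed by the assembler

        section .text
_start:
        ; 64-bit convention: number in rax, arguments in rdi, rsi, rdx.
        ; (The 32-bit int 80h convention uses eax for the number and
        ; ebx, ecx, edx for the arguments, with write = 4 and exit = 1.)
        mov     eax, 1                  ; system call 1 is write
        mov     edi, 1                  ; first argument: fd 1 (stdout)
        lea     rsi, [rel msg]          ; second argument: buffer address
        mov     edx, len                ; third argument: number of bytes
        syscall                         ; note: clobbers rcx and r11

        mov     eax, 60                 ; system call 60 is exit
        xor     edi, edi                ; exit code 0
        syscall
```

The system call numbers differ between the two conventions, so the number and the instruction have to be chosen as a matching pair.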

yolan51 wrote: My next question is about push and pop instructions :

Download and read the CPU manual.

Brendan · Post by **Brendan** » Sat Oct 29, 2016 8:17 pm

Hi,

yolan51 wrote:So I am really confused. Are there any other ways to call exit? Why isn't there just one way to do it? And is there a better or more conventional way of making system calls?

For calling a kernel API (on "80x86 PC"), there are about 5 options an OS developer could choose from:

Use an exception (slowest, but "int3" is best possible option for code size)
Use a software interrupt (second slowest)
Use a call gate
Use SYSENTER (not supported by older CPUs, not supported in 64-bit by AMD)
Use SYSCALL (not supported by older CPUs, not supported in 32-bit by Intel)

Of these, Linux originally used the software interrupt (on old CPUs), and when they added support for 64-bit they decided to use SYSCALL for that. However, (as far as I know) there's also support for SYSENTER and SYSCALL in 32-bit code for more recent versions of Linux, if the CPU supports it.

Of course all this is a mess. So that software doesn't need to be compiled differently for different CPUs (with different methods of accessing the kernel's API) they also added a "virtual dynamic shared object" thing, where you call some shared code and it calls the kernel using the best method the CPU and kernel support. This has the disadvantage of adding the overhead of a normal/near call to everything. However, it's also much more future-proof, and also means that some kernel API calls that don't actually require CPL=0 can be shifted into the "virtual dynamic shared object" itself (which avoids the overhead of switching to CPL=0 and back). A typical example of this is kernel API function/s to get the current time, if you're running on a CPU where Linux can use TSC as a stable time reference.

The best possible way is for code to be compiled/optimised specifically for the specific kernel and CPU; where you could get the benefits of a "virtual dynamic shared object" thing without the "extra normal/near call" overhead; and where shorter options ("int 0x80") could be used in code that isn't executed often (to reduce code size where performance doesn't matter) and where the faster but larger options are used in code that is executed often. This requires a different approach to software - specifically, you want something like "source code or byte-code that's compiled to native when installed", which most Linux systems don't do; so I doubt the best possible way will ever be supported by Linux (or will ever be viable for Linux).

yolan51 wrote:I've read/heard that "push" moves the (esp or ebp) register by either +4 or -4 bytes, which was confusing too. Which one is it: +4 or -4?

The stack grows towards lower addresses; so for "push" the CPU subtracts from the stack pointer, and for "pop" the CPU adds to the stack pointer. The size depends on which CPU mode and what you're pushing/popping.
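As a rough illustration, assuming 64-bit mode on Linux (where a plain push or pop moves 8 bytes), the sketch below replaces "push rax" with its two-step equivalent and then uses a real pop to read the value back. Apart from the fact that sub changes the flags and push does not, the two steps behave like the single instruction. The program layout and the value 42 are only assumptions for the example.

```nasm
; pushdemo.asm: minimal sketch, assuming x86-64 Linux.
; Assumed build commands: nasm -f elf64 pushdemo.asm && ld -o pushdemo pushdemo.o

        global  _start

        section .text
_start:
        mov     rax, 42

        ; manual equivalent of "push rax" in 64-bit mode
        sub     rsp, 8                  ; the stack pointer moves down first
        mov     [rsp], rax              ; then the value is stored at the new top

        ; a real pop reads back what the manual push stored
        pop     rdi                     ; rdi = 42, rsp is back where it started

        mov     eax, 60                 ; system call 60 is exit
        syscall                         ; exit status is the value in rdi (42)
```

Running it and then checking the exit status in the shell (for example with `echo $?`) should show 42. In 32-bit code the same pattern would use esp and a step of 4, and in 16-bit code sp and a step of 2.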

yolan51 wrote:My last question is about how I can write simple debugging/printing functions, and whether that is doable at all. In other languages, when I had difficulty grasping what was happening in either a struct or a recursive loop, I used print to figure out what was going on, by printing either the addresses or the values of the variables; then I could figure things out by myself, I guess.

It's extremely likely that you can make simple debugging/printing functions (e.g. after learning more, if necessary). The real questions are how much you still need to learn about assembly language, how much you need to learn about the algorithms you'd want to use, and how much you'd need to learn about the environment your code would be executed in. All of these things depend on how much you already know, which is information I don't have.
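As one possible starting point, here is a minimal sketch of such a helper, assuming a 64-bit Linux user-space program assembled with NASM. The names `print_hex` and `hexbuf` are made up for this example; the routine converts the value in rdi to 16 hexadecimal digits and writes them to stderr with the write system call.

```nasm
; printhex.asm: sketch of a debug helper, assuming x86-64 Linux.
; Assumed build commands: nasm -f elf64 printhex.asm && ld -o printhex printhex.o

        global  _start

        section .bss
hexbuf  resb    17                      ; 16 hex digits plus a newline

        section .text

; print_hex (hypothetical helper): writes rdi as 16 hex digits to stderr.
; Clobbers rax, rcx, rdx, rsi, rdi and r11.
print_hex:
        lea     rsi, [rel hexbuf]       ; rsi = next free byte in the buffer
        mov     ecx, 16                 ; digits left to produce
.next:
        rol     rdi, 4                  ; rotate the highest nibble into the low 4 bits
        mov     eax, edi
        and     eax, 0x0F               ; isolate that nibble (0..15)
        cmp     al, 10
        jb      .digit
        add     al, 'a' - '0' - 10      ; adjust 10..15 so they end up as 'a'..'f'
.digit:
        add     al, '0'                 ; convert to an ASCII character
        mov     [rsi], al
        inc     rsi
        dec     ecx
        jnz     .next
        mov     byte [rsi], 10          ; trailing newline

        mov     eax, 1                  ; system call 1 is write
        mov     edi, 2                  ; fd 2 (stderr)
        lea     rsi, [rel hexbuf]
        mov     edx, 17                 ; 16 digits plus the newline
        syscall
        ret

_start:
        mov     rdi, 0x1234ABCD
        call    print_hex               ; prints 000000001234abcd
        mov     rdi, rsp
        call    print_hex               ; prints the current stack pointer

        mov     eax, 60                 ; system call 60 is exit
        xor     edi, edi                ; exit code 0
        syscall
```

Because the helper overwrites several registers, any value that is still needed after the call has to be saved first (for example with push and pop around the call).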

Cheers,

Brendan

yolan51 · Post by **yolan51** » Sun Oct 30, 2016 8:39 am

Thanks to both of you.

Your explanation, Brendan, was very much appreciated.

I pretty much have to learn everything about assembly. I'm playing around with small pieces of code, like hello world and small functions included from another asm file, changing values to see what happens, in a virtual Ubuntu OS. But so far it hasn't led me anywhere other than figuring out that assembly doesn't follow conventional programming logic: sometimes I can swap values and get the same result, and other times I get a compilation error that doesn't say anything about what went wrong in particular. So I'm at a bit of a stalemate at the moment.

I'm trying to figure a way out of this. But seriously, even though I know all the register sizes and have a fair understanding of how the stack works, there is something not connecting. It seems like there are so many ways to build things that I'm confused about how to use the registers and which ones to use (figuring out their inner workings).

Cya,

Yolan

iansjack · Post by **iansjack** » Sun Oct 30, 2016 9:16 am

You would find it very instructive to single-step your code in a debugger. That way you can get a good idea of how the various instructions and addressing modes work.

Ycep · Post by **Ycep** » Tue Nov 01, 2016 8:46 am

Yolan,
Push is used to push an item onto the stack, i.e. to put a four-byte value into the memory pointed to by the SP/ESP/RSP register and decrease the SP/ESP/RSP value by 4.

It really does nothing except this:

Code:

mov dword ptr[esp], (pushed value)
sub esp, 4

The stack grows downwards, so your answer is -4.
The SP/ESP/RSP register contains the stack address.

Pop is used to remove an item from the stack.
A stack, in computer science, is a type of list in which elements can be pushed and popped.
In this case, the stack elements are 4-byte integers.

Hope this helps.

crunch · Post by **crunch** » Tue Nov 01, 2016 2:27 pm

Lukand wrote:Yolan,
Push is used to push an item onto the stack, i.e. to put a four-byte value into the memory pointed to by the SP/ESP/RSP register and decrease the SP/ESP/RSP value by 4.

It really does nothing except this:
Code:
mov dword ptr[esp], (pushed value)
sub esp, 4
Hope this helps.

This is incorrect. The stack pointer is decremented first, then the value is stored at [esp]. Your order of operations is reversed.

Love4Boobies · Post by **Love4Boobies** » Wed Nov 02, 2016 5:31 am

You are both correct, just for different CPUs.

They changed the semantics after the 8086.

iansjack · Post by **iansjack** » Wed Nov 02, 2016 6:00 am

They can't both be right. The 8086 didn't have an ESP register.

Also note that the decrement is not always 4.

Ycep · Post by **Ycep** » Wed Nov 02, 2016 8:24 am

iansjack,
please, can't you see that I put SP/ESP/RSP just above that code?
I thought it would be pointless to put it again...

iansjack · Post by **iansjack** » Wed Nov 02, 2016 9:07 am

I wasn't going to correct your statement that push puts a 4-byte value at the location pointed to by SP/ESP/RSP, but as you mention it....

No - I'll leave you to discover the two inaccuracies in that statement.

Ycep · Post by **Ycep** » Wed Nov 02, 2016 11:12 am

You have already told me, thanks to your good eye.

Love4Boobies · Post by **Love4Boobies** » Thu Nov 03, 2016 1:57 am

iansjack wrote:They can't both be right. The 8086 didn't have an ESP register.

Also note that the decrement is not always 4.

I was talking about the order of the operations. But you knew that, you were just contradicting for the sake of contradicting. :p

Antti · Post by **Antti** » Thu Nov 03, 2016 2:21 am

Now that the discussion is active again and we are at it, we could try to clarify the issue. Nothing that has been said makes any sense of the statement "They changed the semantics after the 8086."

The semantics of "push sp" have changed, but were we talking about that?

Love4Boobies · Post by **Love4Boobies** » Thu Nov 03, 2016 4:14 am

Nope.

OSDev.org
