Questions about NASM and Assembly in General

yolan51
Posts: 2
Joined: Sat Oct 29, 2016 2:56 pm
Libera.chat IRC: yolan

Post by yolan51 »

Hi people,

This is my first time here. Normally I would have asked my question on the NASM forum, but their email server seems unable to send my verification email at the moment, so I thought I would ask here. First, I'd like to say that I've been programming for about 5 years now, mostly in C, C++, VB, HTML, and Java. I'm not a good programmer, just decent I'd say, but I intend to become better, which is why I'm trying to figure out how assembly works. I handle pointers, structured data, and OOP pretty well, and I have a good knowledge of the compiling/linking process. I lacked knowledge of OS development, so I have read/watched all of these twice over the past two weeks:

http://www.nasm.us/xdoc/2.12.02/html/na ... ection-2.1
https://www.youtube.com/watch?v=ZJPvKSrfMfs
https://www.youtube.com/watch?v=YvZhgRO7hL4
https://www.youtube.com/watch?v=1rnA6wp ... gsTJS9MZ6M
https://www.cs.bham.ac.uk/~exr/lectures ... os-dev.pdf
http://asmtutor.com/
http://cs.lmu.edu/~ray/notes/nasmtutorial/
http://wiki.osdev.org/Main_Page (the topics I was interested in, such as going from 32-bit mode to 64-bit long mode)

(Talking about 64-bit NASM)
So here are my questions: first of all, why are there so many ways to make the exit system call? I saw two different ways. In the NASM tutorial:

Code: Select all

 
        mov     eax, 60                 ; system call 60 is exit
        xor     rdi, rdi                ; exit code 0
        syscall 
On asmtutor:

Code: Select all

 
    mov     ebx, 0      ; return 0 status on exit - 'No Errors'
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
So I am really confused. Are there any more ways to call exit? Why isn't there only one way to do it? And is there a better/conventional way of making a syscall?
Does that depend on what you use in your programs? I'm really confused about that.

My next question is about the push and pop instructions:

I've read/heard that "push" moves the (esp or ebp) register by either +4 or -4 bytes, which was confusing too. Which one is it, +4 or -4?

My last question is about how I can make simple debugging/printing functions, and whether that's doable or not. In other languages, when I had difficulty grasping what was happening in either a struct or a recursive loop, I used print to figure out what was happening, by printing either the addresses or the values of the variables. Then I could figure things out by myself, I guess.

Thanks in advance, and sorry if my English is bad; I'm not a native English speaker.

Sorry also if my questions seem noob-like.

Have a nice day.
alexfru
Member
Posts: 1111
Joined: Tue Mar 04, 2014 5:27 am

Re: Questions about NASM and Assembly in General

Post by alexfru »

yolan51 wrote: So here are my questions: first of all, why are there so many ways to make the exit system call? I saw two different ways. In the NASM tutorial:

Code: Select all

 
        mov     eax, 60                 ; system call 60 is exit
        xor     rdi, rdi                ; exit code 0
        syscall 
On asmtutor:

Code: Select all

 
    mov     ebx, 0      ; return 0 status on exit - 'No Errors'
    mov     eax, 1      ; invoke SYS_EXIT (kernel opcode 1)
    int     80h
So I am really confused. Are there any more ways to call exit? Why isn't there only one way to do it? And is there a better/conventional way of making a syscall?
Does that depend on what you use in your programs? I'm really confused about that.
The latter is normally used in 32-bit applications. The syscall/sysenter/sysret/sysexit instructions appeared late, at about the same time as 64-bit x86 CPUs made their debut (see the update below). Before that, the int instruction was the most common way for user-mode code to transition into the kernel so that the kernel could do some work on behalf of the user code.

Update: on Intel CPUs the syscall/sysret pair is strictly for 64-bit mode (AMD CPUs also support it in 32-bit mode). The sysenter/sysexit pair is normally for 32-bit mode (both user and kernel) and it first appeared in the Pentium II.
yolan51 wrote: My next question is about the push and pop instructions:
Download and read the CPU manual.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Questions about NASM and Assembly in General

Post by Brendan »

Hi,
yolan51 wrote:So I am really confused. Are there any more ways to call exit? Why isn't there only one way to do it? And is there a better/conventional way of making a syscall?
For calling a kernel API (on an "80x86 PC") there are about 5 options an OS developer could choose from:
  • Use an exception (slowest, but "int3" is best possible option for code size)
  • Use a software interrupt (second slowest)
  • Use a call gate
  • Use SYSENTER (not supported by older CPUs, not supported in 64-bit by AMD)
  • Use SYSCALL (not supported by older CPUs, not supported in 32-bit by Intel)
Of these, Linux originally used the software interrupt (on old CPUs), and when they added support for 64-bit they decided to use SYSCALL for that. However, (as far as I know) there's also support for SYSENTER and SYSCALL in 32-bit code for more recent versions of Linux, if the CPU supports it.

Of course all this is a mess. So that software doesn't need to be compiled differently for different CPUs (with different methods of accessing the kernel's API) they also added a "virtual dynamic shared object" thing, where you call some shared code and it calls the kernel using the best method the CPU and kernel support. This has the disadvantage of adding the overhead of a normal/near call to everything. However, it's also much more future-proof, and also means that some kernel API calls that don't actually require CPL=0 can be shifted into the "virtual dynamic shared object" itself (which avoids the overhead of switching to CPL=0 and back). A typical example of this is kernel API function/s to get the current time, if you're running on a CPU where Linux can use TSC as a stable time reference.

The best possible way is for code to be compiled/optimised specifically for the specific kernel and CPU; where you could get the benefits of a "virtual dynamic shared object" thing without the "extra normal/near call" overhead; and where shorter options ("int 0x80") could be used in code that isn't executed often (to reduce code size where performance doesn't matter) and where the faster but larger options are used in code that is executed often. This requires a different approach to software - specifically, you want something like "source code or byte-code that's compiled to native when installed", which most Linux systems don't do; so I doubt the best possible way will ever be supported by Linux (or will ever be viable for Linux).
yolan51 wrote:I've read/heard that "push" moves the (esp or ebp) register by either +4 or -4 bytes, which was confusing too. Which one is it, +4 or -4?
The stack grows towards lower addresses; so for "push" the CPU subtracts from the stack pointer, and for "pop" the CPU adds to the stack pointer. The size depends on which CPU mode and what you're pushing/popping.
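As a rough illustration of the point above, here is a minimal sketch for 64-bit Linux that measures how far a single push moves the stack pointer and returns that distance as the exit status. The file name, the build commands, and the choice of registers are assumptions made for the example, not something taken from the tutorials linked in this thread.

```nasm
; push_demo.asm (hypothetical file name), 64-bit Linux
; assumed build: nasm -f elf64 push_demo.asm && ld -o push_demo push_demo.o
        global  _start

        section .text
_start:
        mov     rbx, rsp        ; remember the stack pointer before the push
        push    rax             ; 64-bit push: rsp is decreased by 8, then [rsp] = rax
        sub     rbx, rsp        ; rbx = old rsp - new rsp
        pop     rax             ; pop: rax = [rsp], then rsp is increased by 8
        mov     rdi, rbx        ; use the measured distance as the exit status
        mov     eax, 60         ; system call 60 is exit
        syscall
```

Running the program and then `echo $?` in the shell should show 8, because in 64-bit mode a push of a 64-bit register moves rsp down by 8 bytes. The same experiment in a 32-bit program would be expected to show 4.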
yolan51 wrote:My last question is about how I can make simple debugging/printing functions, and whether that's doable or not. In other languages, when I had difficulty grasping what was happening in either a struct or a recursive loop, I used print to figure out what was happening, by printing either the addresses or the values of the variables. Then I could figure things out by myself, I guess.
It's extremely likely that you can make simple debugging/printing functions (e.g. after learning more, if necessary). The real questions are how much you still need to learn about assembly language, how much you need to learn about the algorithms you'd want to use, and how much you'd need to learn about the environment your code would be executed in. All of these things depend on how much you already know, which is information I don't have.
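To give an idea of what such a function could look like, here is a minimal sketch for 64-bit Linux that prints a register value as 16 hexadecimal digits using the write system call. The routine name `print_hex`, the buffer name `buf`, the file name, and the build commands are all hypothetical choices for this example; it is one possible approach under those assumptions, not the only or the best one.

```nasm
; print_hex.asm (hypothetical file name), 64-bit Linux
; assumed build: nasm -f elf64 print_hex.asm && ld -o print_hex print_hex.o
        global  _start

        section .bss
buf:    resb    17                      ; 16 hex digits + newline

        section .text
; print_hex: write the value in rdi to stdout as 16 hex digits and a newline
; clobbers rax, rcx, rdx, rsi, rdi, r11 (syscall itself overwrites rcx and r11)
print_hex:
        mov     rdx, rdi                ; value to print
        lea     rsi, [rel buf]          ; rsi = write position in the buffer
        mov     ecx, 16                 ; number of digits left
.next:
        rol     rdx, 4                  ; bring the top nibble into the low 4 bits
        mov     eax, edx
        and     eax, 0xF                ; isolate that nibble (0..15)
        cmp     al, 10
        jb      .digit
        add     al, 'a' - 10 - '0'      ; adjust so that 10..15 end up as 'a'..'f'
.digit:
        add     al, '0'                 ; 0..9 become '0'..'9'
        mov     [rsi], al
        inc     rsi
        dec     ecx
        jnz     .next
        mov     byte [rsi], 10          ; terminating newline
        mov     eax, 1                  ; system call 1 is write
        mov     edi, 1                  ; file descriptor 1 is stdout
        lea     rsi, [rel buf]          ; start of the text
        mov     edx, 17                 ; number of bytes to write
        syscall
        ret

_start:
        mov     rdi, rsp                ; example: print the current stack pointer
        call    print_hex
        mov     rdi, 0x1234abcd         ; example: print a constant
        call    print_hex
        mov     eax, 60                 ; system call 60 is exit
        xor     edi, edi                ; exit code 0
        syscall
```

The first line of output depends on where the kernel places the stack; the second line should read 000000001234abcd. Since the routine clobbers several registers, a caller that needs them afterwards would have to save them first (for example with push and pop around the call).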


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
yolan51
Posts: 2
Joined: Sat Oct 29, 2016 2:56 pm
Libera.chat IRC: yolan

Re: Questions about NASM and Assembly in General

Post by yolan51 »

Thanks to both of you.

Your explanations, Brendan, were very much appreciated.

I pretty much have to learn everything about assembly. I'm playing around with small pieces of code, like hello world and small functions included from another asm file, changing values to see what happens in a virtual Ubuntu OS. But so far it hasn't led me anywhere other than figuring out that assembly doesn't follow conventional programming logic: sometimes I can switch values and get the same result, and other times I get a compilation error that doesn't say anything about what went wrong in particular. So I'm a bit in a stalemate at the moment.

I'm trying to figure out a way out of this. But seriously, even if I know all the register sizes and have a fair understanding of how the stack works, something is not connecting. It seems like there are so many ways to build stuff that I'm confused about how to use the registers and which one to use (figuring out their inner workings).

Cya,

Yolan
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Questions about NASM and Assembly in General

Post by iansjack »

You would find it very instructive to single-step your code in a debugger. That way you can get a good idea of how the various instructions and addressing modes work.
Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Re: Questions about NASM and Assembly in General

Post by Ycep »

Yolan,
Push is used to push an item onto the stack, i.e. to put a four-byte value into the memory pointed to by the SP/ESP/RSP register and decrease the SP/ESP/RSP value by 4.

It really does nothing except this:

Code: Select all

mov dword ptr[esp], (pushed value)
sub esp, 4
The stack grows downwards, so your answer is -4.
The SP/ESP/RSP register contains the stack address.

Pop is used to remove an item from the stack.
A stack, in computer science, is a type of list in which elements can be pushed and popped.
In this case, the stack elements are 4-byte integers.

Hope this helps.
crunch
Member
Posts: 81
Joined: Wed Aug 31, 2016 9:53 pm
Libera.chat IRC: crunch
Location: San Diego, CA

Re: Questions about NASM and Assembly in General

Post by crunch »

Lukand wrote:Yolan,
Push is used to push an item onto the stack, i.e. to put a four-byte value into the memory pointed to by the SP/ESP/RSP register and decrease the SP/ESP/RSP value by 4.

It really does nothing except this:

Code: Select all

mov dword ptr[esp], (pushed value)
sub esp, 4
Hope this helps.
This is incorrect. The stack pointer is decremented first, then [esp] = value. Your order of operations is reversed.
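For illustration, here is a minimal 32-bit Linux sketch of the order described above: the stack pointer is decremented first, and only then is the value stored at the new top of the stack. The file name, the build commands, and the value 42 are assumptions made for the example. Note that this is only a rough equivalent of push, since `sub` changes the flags while a real push does not.

```nasm
; manual_push.asm (hypothetical file name), 32-bit Linux
; assumed build: nasm -f elf32 manual_push.asm && ld -m elf_i386 -o manual_push manual_push.o
        global  _start

        section .text
_start:
        mov     eax, 42         ; value to place on the stack
        sub     esp, 4          ; step 1: move the stack pointer down by 4
        mov     dword [esp], eax ; step 2: store the value at the new top of the stack
        pop     ebx             ; a real pop reads the value back: ebx = [esp], then esp += 4
        mov     eax, 1          ; invoke SYS_EXIT (kernel opcode 1)
        int     80h             ; exit status is taken from ebx
```

If the hand-written sequence matches what pop expects, `echo $?` after running the program should show 42. Storing the value before the subtraction would instead overwrite whatever was previously on top of the stack and leave the new slot unwritten.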
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Questions about NASM and Assembly in General

Post by Love4Boobies »

You are both correct, except for different CPUs. :) They changed the semantics after the 8086.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Questions about NASM and Assembly in General

Post by iansjack »

They can't both be right. The 8086 didn't have an ESP register.

Also note that the decrement is not always 4.
Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Re: Questions about NASM and Assembly in General

Post by Ycep »

iansjack,
please, can't you see that I put SP/ESP/RSP just above that code?
I thought it would be pointless to put it again...
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Questions about NASM and Assembly in General

Post by iansjack »

I wasn't going to correct your statement that push puts a 4-byte value at the location pointed to by SP/ESP/RSP, but as you mention it....

No - I'll leave you to discover the two inaccuracies in that statement.
Ycep
Member
Posts: 401
Joined: Mon Dec 28, 2015 11:11 am

Re: Questions about NASM and Assembly in General

Post by Ycep »

You have already told me, thanks to your good eye.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Questions about NASM and Assembly in General

Post by Love4Boobies »

iansjack wrote:They can't both be right. The 8086 didn't have an ESP register.

Also note that the decrement is not always 4.
I was talking about the order of the operations. But you knew that, you were just contradicting for the sake of contradicting. :p
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Questions about NASM and Assembly in General

Post by Antti »

Now that the discussion is active again and we are at it, we could try to clarify the issue. Nothing has been said that makes any sense of the statement "They changed the semantics after the 8086."

The semantics of "push sp" have changed, but were we talking about that?
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Questions about NASM and Assembly in General

Post by Love4Boobies »

Nope.