What assembler syntax is often used to create an OS
Hello everyone. Sorry to create a stupid post, but I would like to ask the professionals: what is the most commonly used assembly syntax for creating an OS? And please tell me why people say that AT&T syntax is "not for people". Thanks to all who respond!
Re: What assembler syntax is often used to create an OS
cytorak87 wrote: What is the most commonly used assembly syntax to create an OS.
Most OSes aren't written in assembly. If you're asking about the most popular syntax for the small bits of assembly used while writing an OS in a higher-level language, I don't know if anyone has done a survey to find out.
I prefer NASM syntax, although it isn't an option for inline assembly. For inline assembly, I usually use AT&T syntax just because it's the default. (I don't write enough inline assembly to bother with switching to Intel syntax.)
cytorak87 wrote: And please tell me why they say that AT&T syntax is not for people?
People sometimes need visual cues to remember how the base, index, and scale are used to calculate effective addresses. People sometimes need to look up instructions in Intel's (or AMD's) manual. AT&T syntax has unhelpful punctuation in effective addresses and changes many of the instruction mnemonics.
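To make the base/index/scale point concrete, here is a small C sketch of the address calculation that both notations describe. The function name `effective_address` is made up for illustration. It only models the arithmetic and is not how any assembler is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* AT&T:  disp(base, index, scale)      e.g.  8(%rbx, %rcx, 4)
 * Intel: [base + index*scale + disp]   e.g.  [rbx + rcx*4 + 8]
 * Both spell the same address: base + index * scale + disp. */
static uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned scale, int32_t disp)
{
    /* x86 can only encode scale factors of 1, 2, 4, or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* The displacement is sign-extended before it is added. */
    return base + index * scale + (uint64_t)(int64_t)disp;
}
```

In the Intel form the arithmetic is written out. In the AT&T form you have to remember which slot between the commas is which.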
Re: What assembler syntax is often used to create an OS
There are essentially two schools of thought on the matter: Those that use NASM/FASM or similar for their assembler files are OK with using a tool outside of the binutils to build their OS and like Intel syntax more (there are technical reasons for the latter, but it still comes down to preference). Then there are those that use GAS for their assembler files because they don't want to impose a dependency outside of the already necessary binutils on their users, and they don't mind AT&T syntax. That would be my school.
AT&T syntax can be more specific than Intel syntax. For example, you can dereference a register by specifying it in a Mod/RM byte, or by specifying it in a SIB byte. There is almost never a reason for that, but in the TLS spec, it does say that some instructions should be encoded this way, so they will have the correct length so the linker can replace those instructions when it sees an optimization opportunity. In AT&T syntax, there is a difference between "(%rax)" and "(,%rax)", while in Intel syntax both translate to "[rax]".
But outside of these esoteric examples, there is little reason to use AT&T syntax beyond "it's what GAS uses". Therefore my assembler files for x86 are written in AT&T syntax. No other architecture (to my knowledge, anyway) does something like that.
Carpe diem!
Re: What assembler syntax is often used to create an OS
I have sometimes seen NASM syntax used for pure-assembly OSes. I haven't yet had the luck of seeing an OS coded entirely in GAS :P.
Anyway, it's just preference: whether you think "mov ax, bx" means "move ax to bx" (GAS) or "move bx to ax" (NASM). It's just personal choice.
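A quick way to see the operand order in practice is GCC-style inline assembly, which uses AT&T syntax by default. This is only a sketch: the function name `copy_with_mov` is made up, it assumes an x86-64 target with GCC or Clang and the default AT&T dialect, and it falls back to plain C elsewhere.

```c
#include <stdint.h>

/* Copies src into a new variable with a single mov instruction. */
static uint64_t copy_with_mov(uint64_t src)
{
    uint64_t dst;
#if defined(__x86_64__) && defined(__GNUC__)
    /* AT&T order: source first, destination last (%1 = src, %0 = dst).
     * The same instruction in Intel/NASM order would read: mov dst, src */
    __asm__("movq %1, %0" : "=r"(dst) : "r"(src));
#else
    dst = src;  /* portable fallback so the sketch still builds elsewhere */
#endif
    return dst;
}
```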
:-)
Re: What assembler syntax is often used to create an OS
nullplan wrote: In AT&T syntax, there is a difference between "(%rax)" and "(,%rax)", while in Intel syntax both translate to "[rax]".
There is a difference between "[rax]" and "[rax*1]" in Intel/NASM syntax too. GAS always treats these as distinct encodings, while NASM translates both to "[rax]" unless you specify "[nosplit rax*1]".
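For anyone wondering what the two encodings look like at the byte level, here is a hand-assembled sketch in C for a 64-bit load into rax. The helper names are made up for illustration, and the byte values follow the ModRM/SIB layout in the Intel manual rather than the output of any particular assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack a ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits). */
static uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
    return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

/* Pack a SIB byte: log2(scale) (2 bits), index (3 bits), base (3 bits). */
static uint8_t sib(unsigned scale_log2, unsigned index, unsigned base)
{
    return (uint8_t)(((scale_log2 & 3u) << 6) | ((index & 7u) << 3) | (base & 7u));
}

enum { RAX = 0, RM_USE_SIB = 4, SIB_NO_BASE = 5 };

/* Load from [rax] with the address encoded in ModRM alone: 48 8B 00. */
static size_t encode_modrm_only(uint8_t out[8])
{
    out[0] = 0x48;                      /* REX.W prefix: 64-bit operand */
    out[1] = 0x8B;                      /* MOV r64, r/m64 */
    out[2] = modrm(0, RAX, RAX);        /* mod=00, rm=rax */
    return 3;
}

/* Load from [rax*1] with rax as SIB index and no base:
 * 48 8B 04 05 00 00 00 00. */
static size_t encode_sib_index(uint8_t out[8])
{
    out[0] = 0x48;
    out[1] = 0x8B;
    out[2] = modrm(0, RAX, RM_USE_SIB); /* rm=100 means a SIB byte follows */
    out[3] = sib(0, RAX, SIB_NO_BASE);  /* scale 1, index rax, no base */
    out[4] = out[5] = out[6] = out[7] = 0; /* no base forces a disp32 */
    return 8;
}
```

Both sequences read the same memory, but one is 3 bytes and the other is 8. That length difference is what matters when a spec wants an instruction to have a fixed size.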
Re: What assembler syntax is often used to create an OS
Linux uses AT&T syntax, while (I think) Windows uses Intel syntax. I personally use Intel syntax, though, because you have to type a lot less.
Re: What assembler syntax is often used to create an OS
It really is just personal preference. But for x86 and AMD64 I personally prefer the Intel syntax (the NASM variation specifically), since it better corresponds to the actual opcode bytes being produced.
It's also the syntax used by the Intel and AMD manuals, which also helps.
But as others in this thread have said, most OSes contain only a few fragments of assembly, and the rest is written in a high-level language like C.
Re: What assembler syntax is often used to create an OS
Intel syntax is nicer for humans to write. However, since the amount of assembly in most OSes is so tiny, that does not justify introducing another x86-only dependency. More importantly though, you have to use AT&T in inline asm anyway, so it's nice to have consistent asm code.
Note that GNU AS also supports an Intel-like syntax, but you cannot reliably use it in inline asm (try to name a global variable "rax" in C code and see what happens with -masm=intel).
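If inline asm has to build under both dialect settings, GCC's asm templates accept `{att|intel}` alternatives. The sketch below uses that feature. The function name `add_one` is made up, it assumes x86-64 with GCC or a compatible Clang, and it does nothing about the symbol-name clash mentioned above.

```c
#include <stdint.h>

/* Adds 1 to x using one instruction written for both assembler dialects. */
static uint64_t add_one(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* Inside { | } the compiler keeps the left half for AT&T output
     * and the right half for Intel output:
     *   AT&T:  addq $1, %0
     *   Intel: add  %0, 1   */
    __asm__("add{q $1, %0| %0, 1}" : "+r"(x) : : "cc");
#else
    x += 1;  /* portable fallback so the sketch still builds elsewhere */
#endif
    return x;
}
```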
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: What assembler syntax is often used to create an OS
These days I prefer AT&T syntax, even if I can't quite quantify why. I originally learned Intel syntax a long time ago, but decided to switch to AT&T for my OS, and it just felt more natural. The fact that GAS inline uses it, and that other architectures don't support Intel syntax, made it seem like a good idea. I also preferred the uniformity of staying in a single toolchain.
There definitely are some areas where AT&T can be problematic. You can easily mix up the base and index registers, while in Intel syntax it's very obvious what you intend. But from a syntactical point of view, it seems more logical to me that registers are prefixed with %. They aren't variables or memory references, so I think they should be different.
Everyone has their preferences, and there's really no right or wrong. You're the one who has to write it and support it. You want to choose a tool that works with you, not fights you.
Re: What assembler syntax is often used to create an OS
Octocontrabass wrote: There is a difference between "[rax]" and "[rax*1]" in Intel/NASM syntax too. GAS always treats these as distinct encodings, while NASM translates both to "[rax]" unless you specify "[nosplit rax*1]".
Interesting. This is the first time I've seen that particular keyword (I only knew "strict" before). So I guess you can do everything with NASM you can do with GAS. Well, almost. I did try one thing the other day and could not figure it out. But some background first.
In musl (and some other places), you will often find invocations of a macro called "weak_alias". I wanted to know how exactly it works. It creates a new symbol, aliases it with an existing symbol, and makes it weak. For the C language, it sets the linkage to "external", and the sum of all of this causes the compiler to emit references to the symbol as external calls. The linker will then bind those references to the weak definitions, unless it encountered a strong definition of the same symbol. So, this could, for example, be used in the definition of exit(), to call all the atexit() functions, but only if atexit() is even linked in. If I resolve the macro and the typeof(), it basically looks like this in C:
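```c
#include <stdlib.h>  /* declares exit() and _Exit() */

static void dummy(void) {}
/* Weak alias: resolves to dummy() unless a strong definition is linked in. */
extern void __funcs_on_exit(void) __attribute__((weak, alias("dummy")));

_Noreturn void exit(int code) {
    __funcs_on_exit();
    _Exit(code);
}
```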
```asm
    .text
dummy:
    retq
    .global __funcs_on_exit
    .weak __funcs_on_exit
    .set __funcs_on_exit, dummy
    .global exit
exit:
    pushq %rdi
    callq __funcs_on_exit
    popq %rdi
    jmpq _Exit
```
```nasm
global __funcs_on_exit:weak
dummy:
__funcs_on_exit:
    ret
global exit:function
extern _Exit
exit:
    push rdi
    call __funcs_on_exit
    pop rdi
    jmp _Exit
```
Unfortunately, NASM ends up binding the __funcs_on_exit internally, not emitting a relocation.
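The weak-alias mechanism described above can also be tried in isolation. The sketch below assumes an ELF target with GCC or Clang (the `alias` attribute is not available everywhere). The macro follows the description above (new symbol, aliased to an existing one, made weak), and the names `run_exit_hooks` and `fallback_calls` are made up for illustration.

```c
/* Declare `new` as a weak symbol that aliases the existing symbol `old`. */
#define weak_alias(old, new) \
    extern __typeof__(old) new __attribute__((weak, alias(#old)))

static int fallback_calls;  /* counts how often the fallback actually ran */

/* Default implementation, used when nothing stronger is linked in. */
static void dummy(void)
{
    fallback_calls++;
}

/* run_exit_hooks is an external, weak symbol that currently means dummy.
 * A strong definition in another object file would override it at link
 * time without any change to the callers. */
weak_alias(dummy, run_exit_hooks);

static void shutdown_path(void)
{
    run_exit_hooks();  /* emitted as a call to the external symbol */
}
```

With no strong definition present, calling `shutdown_path()` bumps `fallback_calls`, which shows that the weak symbol resolved to `dummy`.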
Carpe diem!
Re: What assembler syntax is often used to create an OS
nullplan wrote: Unfortunately, NASM ends up binding the __funcs_on_exit internally, not emitting a relocation.
Support for weak symbols was added pretty recently, so I'd guess that's a bug in NASM.
Re: What assembler syntax is often used to create an OS
Thanks everyone for the answers! You all helped me a lot.
Re: What assembler syntax is often used to create an OS
I think there is a third alternative: I use MASM/TASM/WASM syntax. It's a bit like NASM, but it has a more intuitive way of distinguishing between loading the address of a variable (offset) and loading its value ([]). NASM is problematic since you don't need to specify this, and then it defaults to reading the address, which breaks a lot of code written in MASM syntax.
Re: What assembler syntax is often used to create an OS
rdos wrote: I think there is a third alternative: I use MASM/TASM/WASM syntax. It's a bit like NASM but uses more intuitive ways of making a difference between loading the address of a variable (offset) and the value []. Nasm is problematic since you don't need to define this and then it defaults to reading the address, which breaks a lot of code written with MASM-syntax.
This does not compute. Code written for MASM will not work directly in NASM? Yes, and code written in C# will not work directly with Java, what is your point? The two are different. Admittedly small differences, but small differences can add up to a lot. The difference between 40°C and 41°C is also very small, but if those are your body temperatures, you'd know the difference.
Carpe diem!
Re: What assembler syntax is often used to create an OS
rdos wrote: I think there is a third alternative: I use MASM/TASM/WASM syntax. It's a bit like NASM but uses more intuitive ways of making a difference between loading the address of a variable (offset) and the value []. Nasm is problematic since you don't need to define this and then it defaults to reading the address, which breaks a lot of code written with MASM-syntax.
nullplan wrote: This does not compute. Code written for MASM will not work directly in NASM? Yes, and code written in C# will not work directly with Java, what is your point? The two are different. Admittedly small differences, but small differences can add up to a lot. The difference between 40°C and 41°C is also very small, but if those are your body temperatures, you'd know the difference.
I think the point is that MASM/TASM/WASM syntax has existed far longer than NASM syntax, and so it was the NASM guys that got it wrong and broke stuff.
And the target is the same (the x86 processor), so your comparison between C# and Java does not compute.