
x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 9:28 am
by DevNoteHQ
Hi!

x86_64-ISR-Register:
So I have a struct for storing register values during an interrupt:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t r15;
		uint64_t r14;
		uint64_t r13;
		uint64_t r12;
		uint64_t r11;
		uint64_t r10;
		uint64_t r9;
		uint64_t r8;
		uint64_t rbp;
		uint64_t rdi;
		uint64_t rsi;
		uint64_t rdx;
		uint64_t rcx;
		uint64_t rbx;
		uint64_t rax;

		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
I'm doing ISRs with the new gcc attribute:

Code: Select all

__attribute__((interrupt)) void LVT_Timer(CPU::State *state)
{
	Interrupt::APIC::Write(APIC_EOI, 0);
}
But after compilation, GCC generates the following code:
[Image: ISR asm code]
Why doesn't GCC save r12-r15? And where can I find information about that? The GCC documentation only says that I can find it in the processor's manual. https://gcc.gnu.org/onlinedocs/gcc-7.2. ... Attributes - search for "interrupt".



syscall arguments:
Since I have an x86_64 kernel, I'd like to use the syscall instruction for my system calls. The handler should look like this:

Code: Select all

	void* Handler(uint64_t iCall, void* OtherArguments)
	{
		return (*Handlers[iCall])(OtherArguments);
	}
I might need to change it so that I have an array of objects with function pointers and a + operator, so that I can add syscalls dynamically.
My problem is: how do I pass arguments? Especially long arguments like a string or a char array? And most importantly: how do I pass them in C/C++?
As far as I know, Linux passes arguments through the stack. But how does Linux handle arrays as arguments?
The simplest and most secure way would be to pass a pointer to an area that holds all the arguments (=> struct pointer) and let (*Handlers[iCall]) do the unpacking, wouldn't it? But how do I handle different structs with different sizes?

So could I do something like this:
Usermode:

Code: Select all

namespace System
{
	void* Syscall(uint64_t iCall, void* OtherArguments)
	{
		asm volatile("syscall");
		//return?
	}
}
Kernelmode:

Code: Select all

void Handler(uint64_t iCall, void* OtherArguments)
	{
		(*Handlers[iCall])(OtherArguments);
		//return (*Handlers[iCall])(OtherArguments); <= That's stupid, see EDIT2
	}
I'll have to change void* to struct* or something like that...

EDIT:
Or I could do something like this, right?

Code: Select all

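typedef struct
{
	uint64_t value1, value2;
} __attribute__((__packed__)) Testing;

// Each handler casts the opaque pointer back to its own argument struct.
void* Test(void* OtherArguments)
{
	Testing *Pointer = static_cast<Testing*>(OtherArguments); // C++ needs an explicit cast from void*
	Pointer->value1 = 0x1;
	return nullptr; // a non-void function must return a value
}

// Register the handler during kernel initialisation:
Handlers[0] = &Test;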
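A minimal sketch of the struct-pointer idea, written as plain C that runs in user space rather than as kernel code. Each handler casts the opaque pointer back to its own argument struct, so the dispatcher never needs to know the struct sizes. The names syscall_dispatch, sys_add, sys_fill and both argument structs are made up for illustration; a real kernel would also have to check that the pointer refers to memory the calling process is allowed to access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument blocks: every syscall defines its own layout. */
struct sys_add_args  { uint64_t a, b, result; };
struct sys_fill_args { uint8_t *buf; uint64_t len; uint8_t value; };

/* Hypothetical call numbers; SYS_COUNT is the table size. */
enum { SYS_ADD, SYS_FILL, SYS_COUNT };

typedef int64_t (*syscall_fn)(void *args);

static int64_t sys_add(void *p)
{
    struct sys_add_args *args = p;   /* the handler knows its own struct */
    args->result = args->a + args->b;
    return 0;
}

static int64_t sys_fill(void *p)
{
    struct sys_fill_args *args = p;
    for (uint64_t i = 0; i < args->len; i++)
        args->buf[i] = args->value;
    return (int64_t)args->len;       /* number of bytes written */
}

static const syscall_fn handlers[] = {
    [SYS_ADD]  = sys_add,
    [SYS_FILL] = sys_fill,
};
static_assert(sizeof handlers / sizeof handlers[0] == SYS_COUNT,
              "handler table and call numbers must stay in sync");

/* Validate the call number, then forward the opaque pointer unchanged. */
int64_t syscall_dispatch(uint64_t call, void *args)
{
    if (call >= SYS_COUNT || args == NULL)
        return -1;                   /* unknown call or missing block */
    return handlers[call](args);
}
```

Calling syscall_dispatch(SYS_ADD, &args) with a = 2 and b = 3 stores 5 in args.result, and an out-of-range call number returns -1 instead of indexing past the table.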
EDIT2:
I just saw that I wanted to exit the syscall with a normal return... fail^^
So how do I write the sysret? Would it be sufficient to do the following (without saving the stack pointers):

Code: Select all

void Handler(uint64_t iCall, void* OtherArguments)
	{
		asm volatile("push %rcx");
		(*Handlers[iCall])(OtherArguments);
		asm volatile("pop %rcx");
		asm volatile("sysret");
	}

Re: x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 10:52 am
by mariuszp
GCC doesn't save r12-r15 because it's expected that functions preserve them anyway. So if a function emitted by GCC isn't saving them, it means it's also not modifying them. It also assumes that any functions it calls will preserve r12-r15.

If you need all the registers, you must write your ISRs in assembly (it's my recommendation either way).

Re: x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 1:22 pm
by DevNoteHQ
mariuszp wrote:GCC doesn't save r12-r15 because it's expected that functions preserve them anyway. So if a function emitted by GCC isn't saving them, it means it's also not modifying them. It also assumes that any functions it calls will preserve r12-r15.

If you need all the registers, you must write your ISRs in assembly (it's my recommendation either way).
Well, I don't see any need to use r12-r15, so I'll just remove them from the list.

Is there a reason why it doesn't save rbp?

EDIT: Or is rbp saved by the CPU and I need to move it to a different position in the struct?

Re: x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 2:43 pm
by iansjack
Check the x86_64 ABI. Parameters are passed in registers (in general - but it's a little more complicated than that).

Re: x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 3:42 pm
by DevNoteHQ
iansjack wrote:Check the x86_64 ABI. Parameters are passed in registers (in general - but it's a little more complicated than that).
I should have done that before asking the question. I already downloaded that PDF months ago :mrgreen:

EDIT:
"Registers %rbp, %rbx and %r12 through %r15 “belong” to the calling function and the called function is required to preserve their values. In other words, a called function must preserve these registers’ values for its caller. Remaining registers “belong” to the called function. If a calling function wants to preserve such a register value across a function call, it must save the value in its local stack frame."

Re: x86_64-ISR-Registers and syscall arguments

Posted: Sun Oct 22, 2017 8:29 pm
by Brendan
Hi,
DevNoteHQ wrote:So i have a struct for storing register values during an interrupt:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t r15;
		uint64_t r14;
		uint64_t r13;
		uint64_t r12;
		uint64_t r11;
		uint64_t r10;
		uint64_t r9;
		uint64_t r8;
		uint64_t rbp;
		uint64_t rdi;
		uint64_t rsi;
		uint64_t rdx;
		uint64_t rcx;
		uint64_t rbx;
		uint64_t rax;

		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
I'm doing ISRs with the new gcc attribute:

Code: Select all

__attribute__((interrupt)) void LVT_Timer(CPU::State *state)
{
	Interrupt::APIC::Write(APIC_EOI, 0);
}
That's not right. For 64-bit 80x86, the structure should be:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
This is (literally) what the CPU itself pushes on the stack when it starts an interrupt, and nothing more.

This means that you don't have access to the interrupted code's registers; which is relatively useless for exception handlers because you typically do want to know what the registers were when the exception happened (even if it's just for dumping useful information when something crashes) and may want to modify the registers before returning (e.g. for things like emulating unsupported instruction in the "invalid opcode" exception handler). Also note that there's some cases where you need special handling - e.g. for page fault you may need "CR2 from before a potential second page fault could overwrite it", for NMI and machine check exception you might want/need insane hackery (to guard against things like "NMI (or machine check) immediately after SYSCALL when kernel stack is still dodgy"), etc. The compiler's "__attribute__((interrupt))" won't be useful for any of these things either.

For normal IRQs, for modern computers IRQs mostly need to be dynamically assigned (for MSI and IO APIC) and may be shared (for PCI in general), and for these reasons typically you have a common IRQ handler that uses an "IRQ number" parameter to do its job (e.g. to find a list of ISRs that are sharing the IRQ and call each higher level ISR in that list). The compiler's "__attribute__((interrupt))" can't do this (you need a different "IRQ handling stub" in assembly to set the "IRQ number" parameter before calling the common IRQ handler) so it's not particularly useful for normal IRQs either.
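The common IRQ handler described above could look roughly like the following user-space C sketch. It keeps a small list of higher-level ISRs per IRQ number and calls each one in turn. The table sizes and the names irq_register, irq_dispatch and demo_counting_isr are assumptions for illustration; in a kernel, the per-vector assembly stub would supply the irq argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_IRQS   16   /* hypothetical number of IRQ lines     */
#define MAX_SHARED  4   /* hypothetical limit of ISRs per line  */

/* A higher-level ISR returns true if its device raised the interrupt. */
typedef bool (*isr_fn)(void *ctx);

struct isr_entry { isr_fn fn; void *ctx; };

static struct isr_entry isr_table[NUM_IRQS][MAX_SHARED];
static size_t isr_count[NUM_IRQS];

/* Attach an ISR to an IRQ at runtime; fails if the line's list is full. */
bool irq_register(unsigned irq, isr_fn fn, void *ctx)
{
    if (irq >= NUM_IRQS || fn == NULL || isr_count[irq] == MAX_SHARED)
        return false;
    isr_table[irq][isr_count[irq]++] = (struct isr_entry){ fn, ctx };
    return true;
}

/* Common handler: walk every ISR sharing this IRQ.
   Returns how many of them claimed the interrupt. */
size_t irq_dispatch(unsigned irq)
{
    size_t claimed = 0;
    if (irq >= NUM_IRQS)
        return 0;
    for (size_t i = 0; i < isr_count[irq]; i++)
        if (isr_table[irq][i].fn(isr_table[irq][i].ctx))
            claimed++;
    return claimed;
}

/* Demo ISR: counts invocations through its context pointer. */
bool demo_counting_isr(void *ctx)
{
    ++*(int *)ctx;
    return true;
}
```

Registering demo_counting_isr twice on the same IRQ and dispatching that IRQ once runs both entries, which is the shared-IRQ case.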


Cheers,

Brendan

Re: x86_64-ISR-Registers and syscall arguments

Posted: Mon Oct 23, 2017 11:31 am
by DevNoteHQ
Brendan wrote:Hi,
DevNoteHQ wrote:So i have a struct for storing register values during an interrupt:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t r15;
		uint64_t r14;
		uint64_t r13;
		uint64_t r12;
		uint64_t r11;
		uint64_t r10;
		uint64_t r9;
		uint64_t r8;
		uint64_t rbp;
		uint64_t rdi;
		uint64_t rsi;
		uint64_t rdx;
		uint64_t rcx;
		uint64_t rbx;
		uint64_t rax;

		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
I'm doing ISRs with the new gcc attribute:

Code: Select all

__attribute__((interrupt)) void LVT_Timer(CPU::State *state)
{
	Interrupt::APIC::Write(APIC_EOI, 0);
}
That's not right. For 64-bit 80x86, the structure should be:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
This is (literally) what the CPU itself pushes on the stack when it starts an interrupt, and nothing more.

This means that you don't have access to the interrupted code's registers; which is relatively useless for exception handlers because you typically do want to know what the registers were when the exception happened (even if it's just for dumping useful information when something crashes) and may want to modify the registers before returning (e.g. for things like emulating unsupported instruction in the "invalid opcode" exception handler). Also note that there's some cases where you need special handling - e.g. for page fault you may need "CR2 from before a potential second page fault could overwrite it", for NMI and machine check exception you might want/need insane hackery (to guard against things like "NMI (or machine check) immediately after SYSCALL when kernel stack is still dodgy"), etc. The compiler's "__attribute__((interrupt))" won't be useful for any of these things either.

For normal IRQs, for modern computers IRQs mostly need to be dynamically assigned (for MSI and IO APIC) and may be shared (for PCI in general), and for these reasons typically you have a common IRQ handler that uses an "IRQ number" parameter to do its job (e.g. to find a list of ISRs that are sharing the IRQ and call each higher level ISR in that list). The compiler's "__attribute__((interrupt))" can't do this (you need a different "IRQ handling stub" in assembly to set the "IRQ number" parameter before calling the common IRQ handler) so it's not particularly useful for normal IRQs either.


Cheers,

Brendan


GCC pushes the general purpose registers onto the stack (see the picture). In a normal stub handler you would just manually push those on the stack, so why does it make a difference?

Example:

Code: Select all

intr_stub:
  ; save the register file
  push r15
  push r14
  push r13
  push r12
  push r11
  push r10
  push r9
  push r8
  push rbp
  push rdi
  push rsi
  push rdx
  push rcx
  push rbx
  push rax

  ; check if we are switching from user mode to supervisor mode
  mov rax, [rsp + 152]
  and rax, 0x3000
  jz .supervisor_enter

  ; restore the kernel's GS base if we are going from user to supervisor mode
  swapgs

.supervisor_enter:
  ; increment mask count as we configure all interrupts to mask IF
  ; automatically in the IDT
  inc qword [gs:8]

  ; call the C routine for dispatching an interrupt
  cld          ; amd64 SysV ABI states the DF must be cleared by the caller
  mov rdi, rsp ; first argument points to the processor state
  mov rbp, 0   ; terminate stack traces here
  call intr_dispatch
And the state looks like this:

Code: Select all

/* CPU state passed to intr_dispatch() (and various other places) */
typedef struct
{
  /* the register file */
  uint64_t regs[15];

  /* the error code and interrupt id */
  uint64_t id;
  uint64_t error;

  /* these are pushed automatically by the CPU */
  uint64_t rip;
  uint64_t cs;
  uint64_t rflags;
  uint64_t rsp;
  uint64_t ss;
} __attribute__((__packed__)) cpu_state_t;

#define RAX 0
#define RBX 1
#define RCX 2
#define RDX 3
#define RSI 4
#define RDI 5
#define RBP 6
/* RSP is stored in a separate field */
#define R8  7
#define R9  8
#define R10 9
#define R11 10
#define R12 11
#define R13 12
#define R14 13
#define R15 14
So where is the difference?

Code from https://github.com/grahamedgecombe/arc

EDIT: So the correct struct should be:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t rax;
		uint64_t rdx;
		uint64_t rcx;
		uint64_t rbx;
		uint64_t rsi;
		uint64_t rdi;
		uint64_t r8;
		uint64_t r9;
		uint64_t r10;
		uint64_t r11;

		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
Because the last pushed item = first item in the struct.
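The push-order rule can be checked with a small simulation in plain C. It models a stack that grows downward and then reads the memory at the stack pointer as a struct. The three-register struct and the sim_* helpers are invented for this sketch, and the rule only helps when the push order is fixed, as in a hand-written stub.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical three-register save area, lowest address first. */
struct saved { uint64_t rax, rbx, rcx; };

/* Simulated stack that grows toward lower addresses, like on x86. */
struct sim_stack { uint64_t mem[8]; uint64_t *sp; };

/* Start with an empty stack: sp points one past the highest slot. */
void sim_init(struct sim_stack *s)
{
    s->sp = s->mem + 8;
}

/* Equivalent of `push`: decrement sp, then store the value. */
void sim_push(struct sim_stack *s, uint64_t value)
{
    *--s->sp = value;
}

/* Copy whatever sp currently points at into the struct layout. */
struct saved sim_view(const struct sim_stack *s)
{
    struct saved out;
    memcpy(&out, s->sp, sizeof out);
    return out;
}
```

Pushing the values for rcx, rbx and rax in that order and then calling sim_view yields a struct whose first field is the rax value, the one pushed last.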

Re: x86_64-ISR-Registers and syscall arguments

Posted: Mon Oct 23, 2017 12:24 pm
by MichaelPetch
DevNoteHQ wrote:So where is the difference?
If you are asking what the difference is between the interrupt attribute and hand coding: the way GCC implemented the interrupt attribute doesn't guarantee that any or all of the general purpose registers will ever be pushed on the stack, let alone in any specific order. The only thing you can be sure of is:

Code: Select all

   typedef struct{
      uint64_t   rip;
      uint64_t   cs;
      uint64_t   rflags;
      uint64_t   rsp;
      uint64_t   ss;
   } __attribute__((__packed__)) State;
The code in your picture suggests to me that you aren't compiling with optimizations. I'd expect a non-optimized build to push volatile registers on the stack whether that is needed or not. An optimizer may be smart enough to realize that only a few volatile registers change, and push only those.

The moral of the story is this: if you want a structure of general purpose registers as part of your interrupt handler, you can't rely on the interrupt attribute giving you that in the manner you expect. The only way of making sure it is exactly what you want is to write your own stub handler. I felt that the GCC interrupt attribute for the x86/x86-64 targets was done in a poor fashion and is pretty much useless for many cases.

Re: x86_64-ISR-Registers and syscall arguments

Posted: Mon Oct 23, 2017 12:43 pm
by MichaelPetch
As an example in C if I have this code:

Code: Select all

#include <stdint.h>

typedef struct{
    uint64_t r15;
    uint64_t r14;
    uint64_t r13;
    uint64_t r12;
    uint64_t r11;
    uint64_t r10;
    uint64_t r9;
    uint64_t r8;
    uint64_t rbp;
    uint64_t rdi;
    uint64_t rsi;
    uint64_t rdx;
    uint64_t rcx;
    uint64_t rbx;
    uint64_t rax;

    uint64_t   rip;
    uint64_t   cs;
    uint64_t   rflags;
    uint64_t   rsp;
    uint64_t   ss;
} __attribute__((__packed__)) State;

volatile uint64_t tmp11; /*mark volatile to avoid compiler optimizing this variable away if it thinks it is not needed*/
__attribute__((interrupt)) void LVT_Timer(State *state)
{
    tmp11 = state->r11;
}
We don't do much of anything in this except store state->r11 into a global variable. If we compile this with optimizations using

Code: Select all

x86_64-elf-gcc -c -mgeneral-regs-only -mno-red-zone -ffreestanding -O3 interrupt.c
I get an object file with code that looks like this:

Code: Select all

0000000000000000 <LVT_Timer>:
   0:   50                      push   %rax
   1:   48 8b 44 24 28          mov    0x28(%rsp),%rax
   6:   48 89 05 00 00 00 00    mov    %rax,0x0(%rip)        # d <LVT_Timer+0xd>
                        9: R_X86_64_PC32        tmp11-0x4
   d:   58                      pop    %rax
   e:   48 cf                   iretq
As expected the compiler with optimizations only pushed one of the volatile (caller saved) registers on the stack. The only register it knows it changed. mov 0x28(%rsp),%rax attempts to read the R11 variable off the stack via the State structure, but is referencing something that was never actually pushed.
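To see what that load actually touches, the offsets can be checked at compile time. This is a sketch that assumes, as GCC documents for the interrupt attribute, that the handler's pointer argument points at the CPU-pushed interrupt frame. The 0x28 in the disassembly is then 8 bytes for the pushed RAX plus offset 0x20 into the frame. The Frame type is a name made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 20-field layout used in the example. */
typedef struct {
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    uint64_t rip, cs, rflags, rsp, ss;
} __attribute__((__packed__)) State;

/* What the CPU pushes in 64-bit mode for a vector without an error code. */
typedef struct {
    uint64_t rip, cs, rflags, rsp, ss;
} __attribute__((__packed__)) Frame;

/* r11 is the fifth field of State, so the compiler reads offset 0x20 ... */
static_assert(offsetof(State, r11) == 0x20, "r11 is the fifth field");
/* ... which, in the frame that is really on the stack, is the saved SS. */
static_assert(offsetof(Frame, ss) == 0x20, "ss is the fifth pushed qword");
static_assert(sizeof(Frame) == 40, "five 8-byte slots");
```

Under that assumption, state->r11 lands on the slot holding the interrupted code's SS, and every State field past the fifth would read memory beyond the frame.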

Re: x86_64-ISR-Registers and syscall arguments

Posted: Mon Oct 23, 2017 2:04 pm
by DevNoteHQ
MichaelPetch wrote:As an example in C if I have this code:

Code: Select all

typedef struct{
    uint64_t r15;
    uint64_t r14;
    uint64_t r13;
    uint64_t r12;
    uint64_t r11;
    uint64_t r10;
    uint64_t r9;
    uint64_t r8;
    uint64_t rbp;
    uint64_t rdi;
    uint64_t rsi;
    uint64_t rdx;
    uint64_t rcx;
    uint64_t rbx;
    uint64_t rax;

    uint64_t   rip;
    uint64_t   cs;
    uint64_t   rflags;
    uint64_t   rsp;
    uint64_t   ss;
} __attribute__((__packed__)) State;

volatile uint64_t tmp11; /*mark volatile to avoid compiler optimizing this variable away if it thinks it is not needed*/
__attribute__((interrupt)) void LVT_Timer(State *state)
{
    tmp11 = state->r11;
}
We don't do much of anything in this except store state->r11 into a global variable. If we compile this with optimizations using

Code: Select all

x86_64-elf-gcc -c -mgeneral-regs-only -mno-red-zone -O3 interrupt.c
I get an object file with code that looks like this:

Code: Select all

0000000000000000 <LVT_Timer>:
   0:   50                      push   %rax
   1:   48 8b 44 24 28          mov    0x28(%rsp),%rax
   6:   48 89 05 00 00 00 00    mov    %rax,0x0(%rip)        # d <LVT_Timer+0xd>
                        9: R_X86_64_PC32        tmp11-0x4
   d:   58                      pop    %rax
   e:   48 cf                   iretq
As expected the compiler with optimizations only pushed one of the volatile (caller saved) registers on the stack. The only register it knows it changed. mov 0x28(%rsp),%rax attempts to read the R11 variable off the stack via the State structure, but is referencing something that was never actually pushed.
Okay, then of course it has no use at all :mrgreen:
Thank you all for the help!

But what about the other topic?
Can I do something like this:
Userland:

Code: Select all

namespace System
{
   void* Syscall(uint64_t iCall, void* OtherArguments)
   {
      asm volatile("syscall");
      //return?
   }
}
System:

Code: Select all

void Handler(uint64_t iCall, void* OtherArguments)
{
      asm volatile("push %rcx");
      void* ret = (*Handlers[iCall])(OtherArguments);
      asm volatile("pop %rcx");
      //somehow put ret into a defined register or push it to the stack maybe.
      asm volatile("sysret");
}

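typedef struct
{
   uint64_t value1, value2;
} __attribute__((__packed__)) Testing;

// Each handler casts the opaque pointer back to its own argument struct.
void* Test(void* OtherArguments)
{
   Testing *Pointer = static_cast<Testing*>(OtherArguments); // C++ needs an explicit cast from void*
   Pointer->value1 = 0x1;
   return nullptr; // a non-void function must return a value
}

// Register the handler during kernel initialisation:
Handlers[0] = &Test;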
Since uint64_t iCall and void* OtherArguments are both class INTEGER (the ABI classifies pointers as INTEGER), they'll get passed in registers RDI and RSI, right?
And does GCC optimize away the arguments of "void* Syscall(uint64_t iCall, void* OtherArguments)" because they are never used in the function? Does the volatile keyword work in a function definition?

Re: x86_64-ISR-Registers and syscall arguments

Posted: Mon Oct 23, 2017 4:12 pm
by Brendan
Hi,
DevNoteHQ wrote:
Brendan wrote:That's not right. For 64-bit 80x86, the structure should be:

Code: Select all

namespace CPU
{
	typedef struct{
		uint64_t   rip;
		uint64_t   cs;
		uint64_t   rflags;
		uint64_t   rsp;
		uint64_t   ss;
	} __attribute__((__packed__)) State;
}
This is (literally) what the CPU itself pushes on the stack when it starts an interrupt, and nothing more.

This means that you don't have access to the interrupted code's registers; which is relatively useless for exception handlers because you typically do want to know what the registers were when the exception happened (even if it's just for dumping useful information when something crashes) and may want to modify the registers before returning (e.g. for things like emulating unsupported instruction in the "invalid opcode" exception handler). Also note that there's some cases where you need special handling - e.g. for page fault you may need "CR2 from before a potential second page fault could overwrite it", for NMI and machine check exception you might want/need insane hackery (to guard against things like "NMI (or machine check) immediately after SYSCALL when kernel stack is still dodgy"), etc. The compiler's "__attribute__((interrupt))" won't be useful for any of these things either.

For normal IRQs, for modern computers IRQs mostly need to be dynamically assigned (for MSI and IO APIC) and may be shared (for PCI in general), and for these reasons typically you have a common IRQ handler that uses an "IRQ number" parameter to do its job (e.g. to find a list of ISRs that are sharing the IRQ and call each higher level ISR in that list). The compiler's "__attribute__((interrupt))" can't do this (you need a different "IRQ handling stub" in assembly to set the "IRQ number" parameter before calling the common IRQ handler) so it's not particularly useful for normal IRQs either.
GCC pushes the general purpose registers onto the stack (see the picture). In a normal stub handler you would just manually push those on the stack, so why does it make a difference?
When compiling that specific piece of code, that specific version of GCC with those specific command line args (optimisation settings, etc) happened to push almost half of the registers you need. There is no guarantee whatsoever that trivial changes to anything (the code being compiled, GCC, etc) won't completely change the amount and order of what the compiler's code happened to feel like pushing on the stack.

The only guarantee is what GCC's manual says; GCC's manual says "see CPU vendor's documentation", and the CPU vendor's documentation does not include random trash that the compiler felt like pushing by accident.


Cheers,

Brendan

Re: x86_64-ISR-Registers and syscall arguments

Posted: Tue Oct 24, 2017 7:36 am
by DevNoteHQ
Brendan wrote:Hi,

GCC pushes the general purpose registers onto the stack (see the picture). In a normal stub handler you would just manually push those on the stack, so why does it make a difference?

When compiling that specific piece of code, that specific version of GCC with those specific command line args (optimisation settings, etc) happened to push almost half of the registers you need. There is no guarantee whatsoever that trivial changes to anything (the code being compiled, GCC, etc) won't completely change the amount and order of what the compiler's code happened to feel like pushing on the stack.

The only guarantee is what GCC's manual says; GCC's manual says "see CPU vendor's documentation", and the CPU vendor's documentation does not include random trash that the compiler felt like pushing by accident.


Cheers,

Brendan
Thanks! I've changed my handler to assembly now.

Some more questions:
"swapgs" swaps the current value of gs with MSR_GS_BASE, right? Do I need to set MSR_GS_BASE or is that done automatically?

Is there an instruction for reading/writing TSS_RSP0? Or do I need to do it manually?
Do I have to switch rsp from the application stack to the kernel stack when entering an interrupt/syscall, or does that not matter?

Thanks a lot,
DevNoteHQ

Re: x86_64-ISR-Registers and syscall arguments

Posted: Tue Oct 24, 2017 9:30 am
by Korona
MSR_GS_BASE is the GS base address. swapgs swaps MSR_GS_BASE and MSR_KERNEL_GS_BASE. Both registers need to be set up manually.

There is no instruction to access the kernel stack base from the TSS. swapgs is meant to solve exactly that problem. Use swapgs to load a known GS base and then load the kernel RSP from the GS segment.

All this applies only to syscall. Interrupts that arrive in user mode automatically load RSP from the TSS. It's probably a good idea to swapgs on entry from user mode anyway, in order to be able to rely on a consistent GS segment for per-CPU data in your kernel.
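As a rough illustration of the swapgs approach, here is one possible per-CPU block together with compile-time checks of the offsets an entry stub would rely on. The struct name, its fields and the offsets are assumptions for this sketch and are not taken from any particular kernel. The assembly appears only as a comment because it cannot run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU block. While user code runs, its address would
   sit in MSR_KERNEL_GS_BASE (0xC0000102), ready for swapgs. */
struct percpu {
    uint64_t kernel_rsp;  /* offset 0: top of this CPU's kernel stack  */
    uint64_t user_rsp;    /* offset 8: scratch slot for the user's RSP */
};

/* A syscall entry stub could then begin like this (NASM syntax):
 *
 *     swapgs                  ; GS base now points at struct percpu
 *     mov [gs:8], rsp         ; stash the user RSP in percpu.user_rsp
 *     mov rsp, [gs:0]         ; switch to percpu.kernel_rsp
 *
 * The checks below keep the C layout and those hard-coded offsets in sync.
 */
static_assert(offsetof(struct percpu, kernel_rsp) == 0, "stub uses [gs:0]");
static_assert(offsetof(struct percpu, user_rsp) == 8, "stub uses [gs:8]");
```

Keeping the offsets under static_assert means a reordered struct fails to compile instead of silently loading the wrong stack pointer.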

Re: x86_64-ISR-Registers and syscall arguments

Posted: Tue Oct 24, 2017 11:22 am
by DevNoteHQ
Korona wrote:MSR_GS_BASE is the GS base address. swapgs swaps MSR_GS_BASE and MSR_KERNEL_GS_BASE. Both registers need to be set up manually.

There is no instruction to access the kernel stack base from the TSS. swapgs is meant to solve exactly that problem. Use swapgs to load a known GS base and then load the kernel RSP from the GS segment.

All this applies only to syscall. Interrupts that arrive in user mode automatically load RSP from the TSS. It's probably a good idea to swapgs on entry from user mode anyway, in order to be able to rely on a consistent GS segment for per-CPU data in your kernel.
Oh... Well, now I understand what https://github.com/grahamedgecombe/arc/ ... /cpu-get.s does in his code...
For some reason I thought GS_BASE was only used to determine the current privilege level. But that is done by the CS register.
Knowing that the GS register can be used for what I want clears up so much confusion in my head :mrgreen:

Re: x86_64-ISR-Registers and syscall arguments

Posted: Tue Oct 24, 2017 11:38 am
by Octocontrabass
DevNoteHQ wrote:But what about the other topic?
Can i do something like this:
Userland:

Code: Select all

namespace System
{
   void* Syscall(uint64_t iCall, void* OtherArguments)
   {
      asm volatile("syscall");
      //return?
   }
}
You can have a function like this, but if you want it to actually work, you need to tell GCC how to pass parameters and accept return values. For example:

Code: Select all

void *Syscall( uint64_t iCall, void *OtherArguments)
{
    void *retval;
    // "rcx" and "r11" are clobbered by the syscall instruction itself
    asm volatile( "syscall" : "=a"(retval) : "D"(iCall), "S"(OtherArguments) : "rcx", "r11", "memory" );
    return retval;
}
This assumes your syscall handler returns a value in RAX, expects iCall in RDI, expects OtherArguments in RSI, accesses memory belonging to your program (for example, by dereferencing the OtherArguments pointer), and doesn't modify the values in any registers besides RAX, RCX, and R11 (the syscall instruction itself overwrites RCX with the return address and R11 with RFLAGS). You can change the constraints and clobbers to fit your kernel's syscall handler.