I am not sure whether this will help, but I think I have found the culprit: it is an intentional workaround. From the sources, I see that gdb takes care to treat system calls differently, for a variety of reasons. That makes sense, because gdb always assumes it is debugging a user-mode process, whether the target is remote or not.

One of the issues is that some kernels, when instructed to single-step a system call, apparently trap into the debugger not on the instruction immediately following the int or sysenter, but on the one after it. The debugger works around this by temporarily relocating the next instruction one byte further and inserting a nop in its place. Depending on a billion factors, which I gave up trying to reverse engineer, the debugger will then either use some target feature to get past the nop (hopefully single-stepping), or insert a temporary breakpoint at the relocated address and resume the target.

When displaced stepping (the setting above) is not used, which includes the case where it is set to auto and the threads (i.e. the CPUs, for QEMU) operate in all-stop mode, the target will resume and skip over the syscall. At least, that is my understanding from reading the sources. I am not sure whether this will work, but you can try it out.
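If you want to try it, something along these lines should be enough. This is only a sketch: it assumes QEMU was started with -s -S so that its gdbstub listens on localhost:1234 (that address is my assumption, adjust it to your setup), and whether displaced stepping actually takes effect still depends on your gdb build supporting it for the target architecture.

```gdb
# Assumption: QEMU was started with -s -S, so its gdbstub is on localhost:1234.
target remote localhost:1234

# Check the current setting (auto only enables it in non-stop mode).
show displaced-stepping

# Force displaced stepping on, even in all-stop mode.
set displaced-stepping on

# Optional: print gdb's own debugging output for displaced stepping,
# so you can see whether it is really being used.
set debug displaced on

# Single-step one instruction, e.g. while stopped on the int/sysenter.
stepi
```

If stepi still runs past the syscall with the setting forced on, the debug output should at least show whether gdb attempted a displaced step at all.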