If you look at traditional network stack designs, almost all of them use some kind of buffer (like sk_buff in Linux) to split the transfer into Ethernet-frame-sized pieces. This requires network buffers large enough to hold the headers of all the protocol layers, which adds complexity. It also requires that the data be copied into the network buffers.
So far I haven't seen any Ethernet HW that allows chained memory transfers. By that I mean that you can set several chained buffers as the source of one Ethernet frame. This would let you decouple the protocol headers from the data. From a SW point of view, you could then preallocate only the headers and chain them together with the data to form one frame. The data would not need to be copied to a frame buffer and could remain where the user program has it, which improves performance.
Obviously this only applies to sending data. You still need buffers for receiving, since you don't know the destination before the data arrives.
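To make the chained-buffer idea above concrete, here is a minimal sketch in C. Everything in it (`tx_seg`, `tx_frame`, `tx_frame_chain`, `tx_frame_len`, `TX_MAX_SEGS`) is a hypothetical name made up for illustration and does not describe any real NIC or driver interface. It only shows that a frame can be described as an ordered list of pointers, so the user data is referenced rather than copied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transmit segment: one contiguous region the NIC would read. */
struct tx_seg {
    const void *addr;
    size_t len;
};

#define TX_MAX_SEGS 4   /* assumed limit, chosen only for this sketch */

/* Hypothetical frame descriptor: an ordered chain of segments = one frame. */
struct tx_frame {
    struct tx_seg seg[TX_MAX_SEGS];
    size_t nsegs;
};

/* Append a region to the chain. Only the pointer and length are stored,
 * nothing is copied. Returns 0 on success, -1 if the chain is full. */
static int tx_frame_chain(struct tx_frame *f, const void *addr, size_t len)
{
    assert(f != NULL && addr != NULL);
    if (f->nsegs == TX_MAX_SEGS)
        return -1;
    f->seg[f->nsegs].addr = addr;
    f->seg[f->nsegs].len = len;
    f->nsegs++;
    return 0;
}

/* Total number of bytes the chained segments add up to. */
static size_t tx_frame_len(const struct tx_frame *f)
{
    size_t total = 0;
    for (size_t i = 0; i < f->nsegs; i++)
        total += f->seg[i].len;
    return total;
}
```

A sender would chain its preallocated Ethernet, IP and UDP headers and then the user's buffer, and hand the resulting `tx_frame` to the driver. Whether real hardware accepts such a chain, and how many segments it allows, depends on the device.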
For me this is significant because such a feature would change the SW design of the network stack quite a lot. Still, I haven't seen such a design, and current network stacks seem to be designed for the lowest common denominator.
1. Have you seen any network HW that allows this kind of chained memory operation for one frame?
2. Do you think such a HW design would be beneficial for the efficiency of network stacks?
Chained memory network frame transfers
Re: Chained memory network frame transfers
I don't think NICs should support bad system design decisions. It's possible to design a zero-copy network stack if you first ask for a buffer (adding the size of the headers in the process), and then add the data in the other direction until you are back at the hardware level, at which point you just link it to the NIC hardware buffer.
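One way to read the "ask for a buffer first" approach is a buffer with reserved headroom: the payload is written once into its final position, and each layer prepends its header in front of it. The sketch below is an assumption about what such an interface could look like. `pkt`, `pkt_init`, `pkt_put`, `pkt_push` and `PKT_CAPACITY` are hypothetical names, loosely modelled on the way Linux's `skb_reserve`/`skb_put`/`skb_push` work.

```c
#include <assert.h>
#include <stddef.h>

#define PKT_CAPACITY 2048   /* assumed buffer size for this sketch */

/* Hypothetical packet buffer: payload goes in first, headers are
 * prepended later into the headroom reserved in front of it. */
struct pkt {
    unsigned char buf[PKT_CAPACITY];
    size_t head;   /* offset of the first valid byte */
    size_t tail;   /* offset one past the last valid byte */
};

/* Start an empty packet with `headroom` bytes reserved for headers. */
static void pkt_init(struct pkt *p, size_t headroom)
{
    assert(p != NULL && headroom <= PKT_CAPACITY);
    p->head = p->tail = headroom;
}

/* Reserve `len` payload bytes at the tail. Returns where to write them,
 * or NULL if the buffer is too small. */
static void *pkt_put(struct pkt *p, size_t len)
{
    if (len > PKT_CAPACITY - p->tail)
        return NULL;
    void *dst = p->buf + p->tail;
    p->tail += len;
    return dst;
}

/* Prepend `len` header bytes. Returns where to write the header,
 * or NULL if the headroom is used up. */
static void *pkt_push(struct pkt *p, size_t len)
{
    if (len > p->head)
        return NULL;
    p->head -= len;
    return p->buf + p->head;
}
```

The application would call `pkt_init` with the summed header sizes (for example 14 + 20 + 8 for Ethernet, IPv4 and UDP), write its data through `pkt_put`, and then UDP, IP and Ethernet would each call `pkt_push` on the way down. The bytes between `head` and `tail` then form one contiguous frame with no payload copy inside the stack.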
It's much the same issue as with filesystems, where the "everything is a handle" concept means that the producer provides the buffer rather than asking for it.
Re: Chained memory network frame transfers
Modern hardware supports what you are looking for; it is called "scatter-gather" I/O. On VMs, at least virtio-net can be used for testing.
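The same gather idea is visible from user space through POSIX `writev()`, which takes an array of `struct iovec` and writes the regions back to back in one call. The sketch below is only an analogy at the system-call level, not NIC programming: `send_gathered` and `gather_demo` are made-up helper names, and a pipe stands in for the network device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write a header and a payload that live in separate buffers with a single
 * writev() call. Returns the number of bytes written, or -1 on error. */
static ssize_t send_gathered(int fd, const void *hdr, size_t hdr_len,
                             const void *payload, size_t payload_len)
{
    assert(hdr != NULL && payload != NULL);
    struct iovec iov[2];
    iov[0].iov_base = (void *)hdr;       /* iov_base is not const-qualified */
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len = payload_len;
    return writev(fd, iov, 2);
}

/* Demo: push a 4-byte header and a payload through a pipe and check that
 * the reader sees them as one contiguous message. Returns 0 on success. */
static int gather_demo(void)
{
    static const char hdr[4] = { 'H', 'D', 'R', ':' };
    static const char payload[] = "user data";   /* includes the NUL */
    char out[sizeof hdr + sizeof payload];
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    ssize_t n = send_gathered(fds[1], hdr, sizeof hdr, payload, sizeof payload);
    ssize_t m = (n == (ssize_t)sizeof out) ? read(fds[0], out, sizeof out) : -1;
    close(fds[0]);
    close(fds[1]);
    if (m != (ssize_t)sizeof out)
        return -1;
    return memcmp(out, "HDR:user data", sizeof out) == 0 ? 0 : -1;
}
```

A scatter-gather capable NIC does the equivalent in hardware: the driver hands it a list of address/length pairs and the device fetches them by DMA to assemble one frame.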
However, SG is not a panacea: as soon as IP fragmentation for UDP is involved, you need to split packets anyway (unless the NIC supports IP fragmentation offloading). The same is true for TCP, both for regular operation (due to the receive window) and for retransmission.
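As a rough illustration of when the UDP splitting mentioned above kicks in, the following sketch counts the IPv4 fragments needed for one UDP datagram. It assumes a 20-byte IPv4 header without options and ignores the 65535-byte limit on the IP total length. The function name `udp_ipv4_fragment_count` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8 };  /* IPv4 header without options */

/* Number of IPv4 packets needed to carry one UDP datagram with
 * `payload_len` bytes of user data over a link with the given MTU. */
static size_t udp_ipv4_fragment_count(size_t payload_len, size_t mtu)
{
    assert(mtu >= IPV4_HDR_LEN + 8);

    /* Most data a single IP packet can carry on this link. */
    size_t max_ip_payload = mtu - IPV4_HDR_LEN;
    /* Every fragment except the last must carry a multiple of 8 bytes. */
    size_t per_frag = (max_ip_payload / 8) * 8;
    /* The UDP header travels once, in the first fragment. */
    size_t ip_payload = UDP_HDR_LEN + payload_len;

    if (ip_payload <= max_ip_payload)
        return 1;                       /* fits unfragmented */

    /* The final fragment may hold up to max_ip_payload bytes; everything
     * before it is cut into per_frag-sized pieces. */
    size_t rest = ip_payload - max_ip_payload;
    return 1 + (rest + per_frag - 1) / per_frag;
}
```

With a 1500-byte MTU, 1472 bytes of UDP payload still fit in one frame, 1473 bytes need two, and an 8192-byte datagram needs six. Each of those fragments needs its own IP header, which is why a single header-plus-payload chain is no longer enough.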
In any case, you probably want to put all headers into one SG buffer (and avoid using one buffer per layer of headers). Otherwise the NIC's performance will suffer since it needs to do more DMA transactions to fetch each packet.
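The advice about keeping all headers in one SG buffer could look like the sketch below: the Ethernet, IPv4 and UDP headers share one contiguous block, so a frame is described by two entries (headers, payload) instead of four. `frame_hdrs`, `sg_entry` and `sg_build` are hypothetical names, and the layout assumes an IPv4 header without options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All protocol headers of one Ethernet/IPv4/UDP frame in a single
 * contiguous block. Plain byte arrays keep the layout free of padding. */
struct frame_hdrs {
    uint8_t eth[14];
    uint8_t ipv4[20];   /* assumes no IP options */
    uint8_t udp[8];
};

_Static_assert(sizeof(struct frame_hdrs) == 42,
               "headers must be contiguous without padding");

/* Hypothetical scatter-gather entry: address and length of one region. */
struct sg_entry {
    const void *addr;
    size_t len;
};

/* Describe a frame as [all headers][payload]. Nothing is copied.
 * Returns the number of entries filled in, which is always 2. */
static size_t sg_build(struct sg_entry sg[2], const struct frame_hdrs *h,
                       const void *payload, size_t payload_len)
{
    assert(sg != NULL && h != NULL && payload != NULL);
    sg[0].addr = h;
    sg[0].len = sizeof *h;
    sg[1].addr = payload;
    sg[1].len = payload_len;
    return 2;
}
```

Each layer still fills in only its own member of `frame_hdrs`, so the layering in software is kept while the device has fewer regions to fetch per packet.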
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].