I wonder how the TCP/IP stack in an operating system works (or should work).
Is each layer (ethernet [sending] / ip / tcp / icmp / arp ... and so on) a separate process / thread?
For example, the Ethernet layer (incoming packets) is already driven by the interrupt generated by the network card.
If so, then the processes / threads should be appropriately prioritized to achieve good transmission performance.
Am I heading in the right direction?
Network stack and Threads
- CorruptedByCPU
- Member
- Posts: 78
- Joined: Tue Feb 11, 2014 4:59 pm
https://blackdev.org/ - system programming, my own 64 bit kernel and software.
Re: Network stack and Threads
I don’t see any reason to run these layers in different threads. Each packet you receive needs to be “handled” by each of these layers just once. And since the layers are chained together sequentially, with the output of one layer handed over as the input of the next, I don’t think there would be any advantage in multiple threads.
I guess the only scenario I can see where multiple threads make any sense would be if you were trying to achieve the absolute lowest latency for sending and receiving individual packets. You could have each thread handle one layer of the TCP/IP stack, but you'd probably get the same low latency results by just having each incoming packet processed entirely by a different thread out of a thread pool, which would be much easier to implement.
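To make the two options above concrete, here is a minimal sketch. It assumes a made-up text frame format ("eth|ip|tcp|payload") purely for illustration; real headers are binary, and every function name here is hypothetical. Each layer just calls the next one, so a whole frame is processed on whichever thread picked it up, and the thread pool variant only changes who calls handle_frame.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, simplified frame layout: one text field per layer header,
# separated by "|", e.g. "eth|ip|tcp|payload".

def tcp_input(segment):
    header, _, payload = segment.partition("|")
    return payload if header == "tcp" else None   # drop anything that is not TCP

def ip_input(packet):
    header, _, rest = packet.partition("|")
    return tcp_input(rest) if header == "ip" else None

def ethernet_input(frame):
    header, _, rest = frame.partition("|")
    return ip_input(rest) if header == "eth" else None

def handle_frame(frame):
    """Run one received frame through every layer, in order, on the calling thread."""
    return ethernet_input(frame)

def handle_frames_pooled(frames, workers=4):
    """Thread pool variant: each frame is still processed start to finish by one thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_frame, frames))
```

Note that pool.map returns results in input order, but the frames themselves may be processed concurrently, which is where ordering questions for a real stack would come from.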
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: Network stack and Threads
Handling the whole TCP/IP stack in the interrupt routine might be faster, but it's a nightmare to debug, so I use a kernel thread per network card instead. Applications creating sockets will typically create their own thread to handle the socket. That way they can block waiting for data and so don't need complex asynchronous IO.
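A rough user-space sketch of that arrangement, assuming Python threads stand in for kernel threads and a queue stands in for the receive ring. The class and method names are invented for illustration, and the "protocol processing" is just a placeholder that upper-cases the frame.

```python
import queue
import threading

class NicWorker:
    """One worker thread per (simulated) network card."""

    def __init__(self, name):
        self.name = name
        self.rx_ring = queue.Queue()     # filled by the "interrupt handler"
        self.socket_rx = queue.Queue()   # drained by the application thread
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def on_interrupt(self, frame):
        # Keep the interrupt path tiny: just hand the frame to the worker thread.
        self.rx_ring.put(frame)

    def _run(self):
        while True:
            frame = self.rx_ring.get()   # sleeps until a frame is queued
            if frame is None:            # shutdown sentinel
                break
            # Placeholder for the real Ethernet/IP/TCP processing.
            self.socket_rx.put(frame.upper())

    def recv(self, timeout=1.0):
        # Blocking receive, as an application's socket thread would call it.
        return self.socket_rx.get(timeout=timeout)

    def stop(self):
        self.rx_ring.put(None)
        self.thread.join()
```

Because all protocol work happens in an ordinary thread, it can be stepped through and logged like any other code, which is the debugging benefit mentioned above.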
Re: Network stack and Threads
Hi,
akasei wrote:
I wonder how it works (or should) the TCP/IP stack in the operating system.
Is each layer (ethernet [sending] / ip / tcp / icmp / arp ... and so on) a separate process / thread
That depends on the OS. For a true micro-kernel you'd want processes in user-space, but you might not follow the "7 layer OSI model" (e.g. it might be one process per network card driver plus one process for everything else). For a monolithic kernel it depends on what the kernel provides - e.g. kernel threads, some kind of light-weight alternative like deferred procedure calls (Windows) or tasklets (Linux), "nothing", etc.
Also note that sometimes you want to join two or more network devices together to create a virtual network device (sort of like how you might join 2 or more storage devices with a RAID layer to get a virtual storage device); and sometimes a network device is not used for TCP/IP at all (e.g. assigned to a virtual machine, using a different protocol stack like IPX); and sometimes you want two or more separate TCP/IP stacks (e.g. gateway/firewall with one TCP/IP stack for the public side and separate TCP/IP stack for the "internal LAN" side, with special forwarding between the TCP/IP stacks, like maybe a set of NAT rules).
For flexibility you'd want a modular design, possibly where network drivers provide a simple (and standardised) low level "send/receive packet" interface so it's easy to "mix and match" middle layers and upper layers. However, for performance you probably don't want that (e.g. a simple "send/receive packet" interface can be too simple to allow "offload engines" in high speed NICs to be used), so you end up with a complicated mess as you try to make compromises between flexibility/modularity and performance.
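As an illustration of the "simple send/receive packet" interface and the mix-and-match idea, here is a small sketch. The NetDevice, Loopback and Bond names are hypothetical, and the round-robin policy is only one possible way to join devices. It deliberately shows the simple interface, so it has no way to expose offload features, which is exactly the trade-off described above.

```python
from abc import ABC, abstractmethod
from collections import deque

class NetDevice(ABC):
    """Hypothetical low-level interface every network driver would implement."""

    @abstractmethod
    def send(self, frame):
        ...

    @abstractmethod
    def receive(self):
        """Return the next received frame, or None if nothing is waiting."""

class Loopback(NetDevice):
    """No hardware at all: whatever is sent comes straight back."""

    def __init__(self):
        self._queue = deque()

    def send(self, frame):
        self._queue.append(frame)

    def receive(self):
        return self._queue.popleft() if self._queue else None

class Bond(NetDevice):
    """Virtual device that spreads outgoing frames over its members round-robin."""

    def __init__(self, members):
        self._members = list(members)
        self._next = 0

    def send(self, frame):
        self._members[self._next].send(frame)
        self._next = (self._next + 1) % len(self._members)

    def receive(self):
        for dev in self._members:
            frame = dev.receive()
            if frame is not None:
                return frame
        return None
```

Upper layers only ever see a NetDevice, so a Bond can be swapped in for a single card without the IP layer noticing.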
akasei wrote:
For example, the Ethernet layer (incoming packets) is already managed by the interrupt generated from the network card.
You should think of a device driver as "glue between two interfaces", where one of the interfaces is "software side" (e.g. how the driver talks to the kernel or processes) and you are responsible for designing the "software side interface" for your OS, and one of the interfaces is "hardware side" (how the driver talks to the device itself) where typically you have no control over the design at all (it's designed by the hardware manufacturer). For the glue in the middle, the device driver developer needs to be free to do whatever makes sense for their specific device, and (unless you're writing a driver) you have no reason to make assumptions about what might or might not make sense.
For example: in some cases (10 Gbit/s+ Ethernet cards) under heavy load an IRQ handler can struggle to keep up, so (to avoid the overhead of IRQs) you end up switching to polling without using IRQs; in some cases (loopback device) there isn't any device involved; in some cases the device driver ends up being a "no hardware access" child that communicates with a parent device driver (e.g. USB ethernet adapter driver that talks to a USB controller driver); and in some cases a device can be using more than one IRQ (e.g. a group of 16 IRQs and/or one IRQ per CPU).
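The switch from IRQs to polling under load could look roughly like the toy model below. This is only a sketch under simple assumptions: the first frame raises an interrupt, the handler masks further interrupts, and a poll routine with a fixed budget drains the ring and re-enables interrupts once it is empty. PollingNic, budget and the other names are made up for the example and do not describe any real driver.

```python
from collections import deque

class PollingNic:
    """Toy model of a driver that falls back to polling while frames keep coming."""

    def __init__(self, budget=4):
        self.ring = deque()       # stand-in for the hardware receive ring
        self.irq_enabled = True
        self.budget = budget      # max frames handled per poll() call
        self.delivered = []

    def frame_arrives(self, frame):
        """Return True if this arrival raised an interrupt."""
        self.ring.append(frame)
        if self.irq_enabled:
            self.irq_enabled = False   # handler masks IRQs and schedules a poll
            return True
        return False                   # polling already pending: no IRQ overhead

    def poll(self):
        """Drain up to `budget` frames; go back to IRQ mode once the ring is empty."""
        done = 0
        while self.ring and done < self.budget:
            self.delivered.append(self.ring.popleft())
            done += 1
        if not self.ring:
            self.irq_enabled = True
        return done
```

Under light load every frame costs one interrupt; under heavy load a burst costs one interrupt followed by a few poll calls.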
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
- CorruptedByCPU
- Member
- Posts: 78
- Joined: Tue Feb 11, 2014 4:59 pm
Re: Network stack and Threads
Ok, those answers gave me something to think about. Thank You.
https://blackdev.org/ - system programming, my own 64 bit kernel and software.
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: Network stack and Threads
What I recommend is that you make it single threaded to begin with. The reason is that it isn't easy to identify the places where parallelism is beneficial, and the potential blockers aren't obvious either once you start looking at the details. The one exception I can recommend is the driver itself, which is easily identified as a blocker, so send/receive should be asynchronously decoupled from the rest of the stack.
After that it starts to become difficult. The IP layer is pretty much a pass-through after a few checks, but making it multi-threaded makes it possible for network packets to reach the sub-protocols out of order. Some protocols can handle this, but the software above them must be able to handle it as well. Sockets are an obvious place where you can try to branch out into threads, but the question is how much of a benefit that is.
My suggestion: don't do any premature optimization. Make a single-threaded stack and then identify what you can do in parallel and where the blockers are. As for separating the layers into processes, you can do that, but the trend is to consolidate the network stack into one process for speed reasons. Having a separate process for each layer hurts performance badly.
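If you do later spread packet processing over several threads, one common way to avoid the reordering problem described above is to send every packet of the same flow to the same worker. The sketch below assumes a flow is identified by the usual 5-tuple and picks CRC32 as the hash purely for convenience; pick_worker and dispatch are hypothetical names, and this is not something the posts above prescribe.

```python
import zlib

def pick_worker(src_ip, src_port, dst_ip, dst_port, protocol, num_workers):
    """Map a flow (5-tuple) to a worker index; the same flow always maps to the same worker."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{protocol}".encode()
    return zlib.crc32(key) % num_workers

def dispatch(packets, num_workers):
    """Append each packet to its flow's worker queue, preserving arrival order per flow."""
    queues = [[] for _ in range(num_workers)]
    for pkt in packets:
        queues[pick_worker(*pkt["flow"], num_workers)].append(pkt)
    return queues
```

Packets of different flows can still overtake each other, but within one TCP connection the order seen by the TCP layer matches the order of arrival.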
- CorruptedByCPU
- Member
- Posts: 78
- Joined: Tue Feb 11, 2014 4:59 pm
Re: Network stack and Threads
Thank You All.
Now I'm able to prepare a really fast 64bit IP stack.
https://blackdev.org/ - system programming, my own 64 bit kernel and software.
-
- Member
- Posts: 396
- Joined: Wed Nov 18, 2015 3:04 pm
- Location: San Jose San Francisco Bay Area
- Contact:
Re: Network stack and Threads
I work for Cisco but am not a networking expert. I know that a lot of vendors, even network card vendors, do their TCP processing in hardware to speed things up. This is called TCP offloading, and it is used in high-performance network switching to reach multi-gigabit-per-second rates.
key takeaway after spending yrs on sw industry: big issue small because everyone jumps on it and fixes it. small issue is big since everyone ignores and it causes catastrophy later. #devilisinthedetails