jamesread wrote:Hi all,
I want to design a new OS built from the ground up to prioritise network activity, so that applications such as web servers and web crawlers are high-performance by default. I'm hoping to start a stimulating conversation about what design choices are appropriate for such a goal.
You should learn about appliances. There are many, and they are specialized hardware, because the weak link is usually not the software but the hardware. For example,
Avalanche can generate HTTP requests at a very high rate (hundreds of thousands of connections per second), filling up a 100G network line, and some
Cisco web server appliances can serve pages through six 10G ports at once.
So true high performance means specialized hardware.
jamesread wrote:You may just think I'd be better off modifying an existing system to achieve this goal. That's OK. But I am sure that there could be gains from designing the system from the ground up especially for purpose.
I really don't think there's much you could add to the existing solutions.
jamesread wrote:Here's my thinking so far. Web servers have seen massive performance gains by switching from a threaded model to an event driven model using for example epoll in Linux. Nginx outperforms Apache using this model. But is there further work that can be done to increase throughput
The Linux kernel used to ship a built-in
webserver; it was obsoleted, but
khttpd still exists for maximum performance. It is fast because it needs no processes, no threads, not even epoll: there is zero kernel-space/user-space transition overhead at all. That's the fastest software solution there can be. The price of that performance is terrible security, though (everything runs at ring 0, meaning a crafted HTTP request could trigger a buffer overflow directly inside the kernel).
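To make the event-driven model from the quote concrete, here's a minimal sketch of a single-threaded epoll accept loop, the pattern nginx is built around (error handling is trimmed, and the port number and the fixed response are arbitrary examples; khttpd avoids even these syscalls by living in the kernel):
Code: Select all
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* plain blocking listener socket; a production server would also
     * set O_NONBLOCK and check every return value */
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                 /* arbitrary example port */
    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, SOMAXCONN);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listener, &ev);

    struct epoll_event events[64];
    for (;;) {
        /* one syscall reports any number of ready sockets, instead of
         * one blocked thread per connection */
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listener) {
                int client = accept(listener, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            } else {
                char buf[4096];
                if (read(events[i].data.fd, buf, sizeof(buf)) <= 0) {
                    close(events[i].data.fd);
                    continue;
                }
                /* a real server would parse the request here;
                 * this sketch just sends a canned answer */
                const char *resp = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok";
                write(events[i].data.fd, resp, strlen(resp));
                close(events[i].data.fd);
            }
        }
    }
}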
jamesread wrote:and number of parallel connections supported by redesigning the OS to prioritise network communications over and above all other work on the system?
Now that's a completely different question. It's not the priority that matters (most servers don't run anything other than a webserver anyway), and there are RT versions of the Linux kernel you can use to minimize scheduling overhead (combined with khttpd, latency becomes practically zero). The problem is rather the number of parallel connections. The proper terminology is "
number of concurrent connections", google that. It is limited by many different factors:
1. the NIC throughput
2. the MTU size, which determines how often a NIC IRQ is needed for a stream within a connection (larger frames, which Cisco calls jumbo frames, mean fewer interrupts)
3. the number of IRQs per second possible on the hardware (see "interrupts limit" within the Linux kernel, and also read
this)
4. the size of the TCP session table (if it's full, a webserver can't accept any more new connections, read
this)
5. the number of available file descriptors (a webserver uses at least one per connection, and without threading and forking this is a global limit, even with khttpd; see the sketch after this list)
6. the amount of RAM (how much of the webpage content can be cached)
7. the speed of the storage (how slow it is to read content into the cache)
...etc. However, all of these are
either hardware-related (not much an OS can do about them) or
already run-time configurable in mainstream OSes (run "sysctl -a" under Linux, for example, to list the tunables).
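As an illustration of point 5, and of how much is tunable without touching the OS at all, here's a user-space sketch that raises the process's file descriptor limit at run time (the same limit that "ulimit -n" reports; the hard limit still caps it):
Code: Select all
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    /* raise the soft limit as far as the hard limit allows */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    return 0;
}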
Furthermore, using a dedicated cache server is also common practice. With a low TTL (1 second), users won't notice a thing, but it takes a huge load off the CPU. High-traffic sites often use clusters instead of a single server, with a quite complex architecture:
Code: Select all
clients -- load balancer -- cacheserver(s) -- application server(s)
With VMs like AWS, the load balancer can start new cache and application server instances dynamically if the load increases, meaning high performance is achieved by running many machines in parallel. The cheapest load balancer could be round-robin DNS with multiple A records (see the sketch below), but more serious (and more expensive) solutions use
VRRP (see also
RFC 5798), which operates at the ARP (MAC address) level and provides very low latency in routing.
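The rotation behind round-robin DNS is trivial to express in code; a DNS server with multiple A records does essentially this for each query (the backend addresses below are made-up examples):
Code: Select all
#include <stdio.h>

/* hypothetical pool of application server addresses */
static const char *backends[] = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };
#define NBACKENDS (sizeof(backends) / sizeof(backends[0]))

static const char *next_backend(void)
{
    static unsigned counter = 0;    /* a real DNS server rotates per query */
    return backends[counter++ % NBACKENDS];
}

int main(void)
{
    /* six requests cycle through the pool twice */
    for (int i = 0; i < 6; i++)
        printf("request %d -> %s\n", i, next_backend());
    return 0;
}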
Anyway, I don't think you can come up with a better solution than what's already implemented in Linux. (I don't want to ruin your enthusiasm, but the fact is, Linux serves almost every high-traffic site, so it has been put to the test under real-world conditions for decades now. Everything worth trying to improve performance has already been tried with Linux, hardware-backed solutions included.)
Knowing this, I don't want to say you shouldn't try, but be prepared: you have to be very, very experienced to even have a chance of outperforming a properly configured khttpd. I strongly suggest studying the subject by researching the phrases and terminology I've used in this post.
Cheers,
bzt