Systems Thinking about Communications - some related material:
Systems design for protocols - see "Trading Packet Headers for Packet Processing" by Chandranmenon, G.P. and Varghese, G.; and for reducing state set-up costs, see "A Model, Analysis, and Protocol Framework for Soft State-based Communication" by Raman and McCanne.
Protocol implementations can be sped up by batching (amortizing a fixed cost over multiple items - e.g. processing several packets in one interrupt service routine) and by pipelining (e.g. Integrated Layer Processing). But beware: these can hurt if the processing loop exceeds the instruction cache size (or, likewise, if the packet header data exceeds the data cache). This is not just diminishing returns; it can be counter-intuitive. We can also amalgamate different processing stages and improve not just the implementation but the design - two elegant examples are TCP/IP header compression and header prediction; see RFC 1144 for a beautiful piece of design and implementation in this regard.
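The core idea behind RFC 1144 header compression can be sketched in a few lines: most header fields are constant (or change predictably) within a connection, so the sender need only transmit the fields that differ from the previous packet. A toy sketch (the field names and dict encoding here are illustrative, not RFC 1144's actual bit-level format):

```python
def compress(prev, cur):
    """Return only the fields of `cur` that differ from `prev`."""
    return {k: v for k, v in cur.items() if prev.get(k) != v}

def decompress(prev, delta):
    """Reconstruct the full header from the previous one plus the delta."""
    hdr = dict(prev)
    hdr.update(delta)
    return hdr

prev = {"src": "10.0.0.1", "dst": "10.0.0.2", "seq": 1000, "ack": 500, "win": 8192}
cur = dict(prev, seq=1100)      # only the sequence number advanced

delta = compress(prev, cur)     # just {'seq': 1100} - a few bytes, not ~40
assert decompress(prev, delta) == cur
```

Both ends keep the previous header as shared state, which is exactly the sort of per-connection soft state whose set-up and loss-recovery costs the Raman/McCanne paper analyses.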
Statistical multiplexing (see the lectures on telephone, internet and ATM for discussion) can be great, if you know the statistics of the traffic at the various levels (of scale in time, space and aggregation). As we pointed out, this is very true for voice (whether telephony or Skype/VoIP) but less so for data: we know how a single TCP flow works, but not how many behave together; worse, with a mix of client/server, client/proxy and P2P traffic, we really don't have a good handle on the "traffic matrix" any more, if we ever did!
Virtualization (of hardware/OS - see Xen; or of the network - see VPNs) is how computer science hides multiplexing (sharing): the control plane allocates a share and schedules it, and each share (slice/multiplex) then sees a dedicated resource "apparently all its own". Reality (the accuracy of that dedicated share) depends on the accuracy of the scheduler - as in all things CS, "your mileage may vary" :-). To work well, the scheduler has to model the resource being virtualized, and the required accuracy of that model trades off against its efficiency, depending on the statistics of the workload - see, easy!
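A minimal sketch of how a virtualizing scheduler manufactures "apparently dedicated" shares: hand out quanta of one real resource so that each slice's received service tracks its weighted entitlement. This is only an illustration of the principle (real hypervisor schedulers, e.g. Xen's credit scheduler, are far more elaborate; the VM names are made up):

```python
from collections import Counter

def weighted_schedule(weights, n_quanta):
    """Deficit-style scheduler: each quantum goes to the slice whose
    received service lags its proportional entitlement the most."""
    served = Counter()
    total_w = sum(weights.values())
    for t in range(1, n_quanta + 1):
        # entitlement so far = weight fraction * quanta elapsed
        lag = {s: (w / total_w) * t - served[s] for s, w in weights.items()}
        served[max(lag, key=lag.get)] += 1
    return served

shares = weighted_schedule({"vm_a": 2, "vm_b": 1, "vm_c": 1}, 400)
# vm_a ends up with about half the quanta, vm_b and vm_c about a quarter each
```

Each slice "sees" a CPU of roughly its weighted fraction; how closely that illusion holds over short timescales is exactly the scheduler-accuracy question raised above.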