Tuesday, November 25, 2025

Principles of Communications Week 8 25-27/11/2025

tue: traffic management timescales, users, large and small, demands, short and long...

signaling and soft state


thu: systems design patterns


Monday, November 17, 2025

Principles of Communications Week 7 18-20/11/2025

 This week we'll cover


  • qjump[qj] - data center networking - a big special corner case of traffic scheduling.
Note qj puts the token bucket pacer in a hypervisor, so it is a regulator and a policer in one. It could also be offloaded to a smart NIC if you had one, or just put in the guest OS if you trust it.
  • optimisation - routes for traffic, and traffic for routes
The start of traffic engineering by steering traffic by weights - note the hill descent is a classic old school AI technique.  And the origin of why tcp does AIMD is also a type of optimisation, of joint source and network utility!
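To make the qjump remark about a pacer that is "a regulator and a policer in one" concrete, here is a minimal token bucket sketch. The class name, the rate and burst numbers, and the explicit `now` argument are all made up for illustration and are not taken from the qj paper; a real pacer would sit in the hypervisor or NIC and read a real clock. It also assumes no packet is larger than the bucket depth.

```python
class TokenBucket:
    """Token bucket: tokens accrue at `rate` bytes/s, capped at `burst` bytes."""

    def __init__(self, rate, burst):
        self.rate = rate      # refill rate, bytes per second
        self.burst = burst    # bucket depth, bytes
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the last refill, seconds

    def _refill(self, now):
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def police(self, now, size):
        """Policer view: True if the packet conforms (tokens are consumed), else drop."""
        self._refill(now)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False

    def earliest_send(self, now, size):
        """Regulator view: the time at which a packet of `size` bytes may be released."""
        self._refill(now)
        if size <= self.tokens:
            return now
        return now + (size - self.tokens) / self.rate
```

The same state serves both roles: `police` drops what does not conform, while `earliest_send` tells a pacer how long to hold the packet instead.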

Optimisation is a very large topic in itself, and underpins many of the ideas in machine learning when it comes to training - ideas like stochastic gradient descent (SGD) are seen in how we assign traffic flows to routes, here. In contrast, the decentralised, implicit optimisation that a collection of TCP or "TCP friendly" flows use is more akin to federated learning, which is another whole topic in itself.
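A small sketch of that decentralised, implicit optimisation: two AIMD flows sharing one bottleneck drift towards equal rates without ever talking to each other. This is a toy under strong assumptions (all flows see the same congestion signal in the same round, and the function name and parameter values are invented for illustration), not a model of any real TCP.

```python
def aimd(x, capacity, alpha=1.0, beta=0.5, rounds=1000):
    """Synchronised AIMD: every flow sees the same congestion signal each round."""
    x = list(x)
    for _ in range(rounds):
        if sum(x) > capacity:
            x = [beta * r for r in x]    # multiplicative decrease on congestion
        else:
            x = [r + alpha for r in x]   # additive increase otherwise
    return x

# Two flows starting far apart on a link of capacity 100 (units are arbitrary).
rates = aimd([1.0, 80.0], capacity=100.0)
```

The additive step leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, which is why the flows end up with (nearly) equal shares.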

Why a log function? Maybe see Bernoulli on risk

Why proportional fairness? from social choice theory!
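One way to see what the log function buys you: maximising the sum of log(rate) over all flows, subject to link capacities, gives the proportionally fair allocation. The sketch below solves a tiny instance with a price-based iteration, where each link raises its price when overloaded and each flow picks the rate that suits the total price on its path. The topology, step size and iteration count are assumptions chosen for illustration.

```python
def proportional_fair(routes, capacity, step=0.05, iters=20000):
    """Price-based iteration for: maximise sum(log x_f) subject to link capacities.

    routes:   dict flow -> list of links the flow crosses
    capacity: dict link -> capacity
    """
    price = {l: 1.0 for l in capacity}
    for _ in range(iters):
        # Each flow maximises log(x) - x * (sum of prices on its path), so x = 1 / path price.
        rate = {f: 1.0 / sum(price[l] for l in links) for f, links in routes.items()}
        for l in capacity:
            load = sum(rate[f] for f, links in routes.items() if l in links)
            # Overloaded links get more expensive, underused links get cheaper.
            price[l] = max(1e-6, price[l] + step * (load - capacity[l]))
    return rate

# One long flow crossing links A and B, plus one short flow on each link.
routes = {"long": ["A", "B"], "short_A": ["A"], "short_B": ["B"]}
rates = proportional_fair(routes, {"A": 1.0, "B": 1.0})
```

On this instance the long flow settles near 1/3 and each short flow near 2/3, whereas a max-min fair split would give every flow 1/2: the flow that uses two scarce resources is charged for both.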

Are people prepared to pay more for more bandwidth? One famous Index Experiment says yes.

0. See Computer Systems Modeling for why the delay grows quickly as load approaches capacity.

1. see IB Distributed Systems for clock synch

2. see prev year's Cloud Computing (II) module for a bit more about data centers&platforms.
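For note 0, a quick numerical feel for why delay blows up near capacity. This assumes the simplest textbook queue (M/M/1: Poisson arrivals, exponential service, one server), which may not be the model used in the other course; the rates are made-up numbers.

```python
def mm1_mean_delay(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: load must be below capacity")
    return 1.0 / (service_rate - arrival_rate)

# Link serving 1000 packets/s, at 50%, 90% and 99% load.
for arrivals in (500.0, 900.0, 990.0):
    print(arrivals / 1000.0, mm1_mean_delay(arrivals, 1000.0))
```

Going from 90% to 99% load multiplies the mean delay by ten in this model, even though the offered load only rose by a tenth.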


[qj] The qj paper gives lots more details of setup, for anyone interested (the figures in the paper are clickable and take you to software and switch/network configurations and data).

Tuesday, November 11, 2025

Principles of Communications Week 6 11-13/11/2025

 We've revisited flow and congestion control - one way of visualising the progress of an adaptive flow&congestion control protocol is the time sequence diagram:-




but note this is a massive over-simplification as really what you see here is an ideal with only one source and (apparently) only one bottleneck queue. In reality, in FIFO queues, traffic from multiple sources mixes and interferes (causing high variance in delay, hence round trip time, and very unpredictable loss). If we had Round Robin (by flow) queues, things might be a bit better, but how much? That is what we look at under Scheduling, and with the Generalised Processor Sharing model, we can see how close we can get to some ideal of "isolation" between flows.
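As a rough picture of the "isolation" ideal: with equal weights, a fluid fair-share scheduler in the spirit of Generalised Processor Sharing gives each flow the max-min fair share of the link, so a greedy flow cannot take capacity from a modest one. The sketch below computes that share by water-filling; the function name and the demand figures are illustrative assumptions, and real packet schedulers only approximate this fluid model.

```python
def max_min_share(demands, capacity):
    """Water-filling: equal-weight fluid fair share of one link's capacity."""
    alloc = {}
    remaining = capacity
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        satisfied = {f: d for f, d in pending.items() if d <= share}
        if not satisfied:
            # Everyone left wants more than an equal split, so split equally.
            for f in pending:
                alloc[f] = share
            break
        for f, d in satisfied.items():
            alloc[f] = d          # small demands are met in full
            remaining -= d        # and their unused share is redistributed
            del pending[f]
    return alloc

shares = max_min_share({"a": 1.0, "b": 4.0, "c": 10.0}, capacity=9.0)
```

Here flows a and b get what they asked for, and the greedy flow c is held to what is left, regardless of how hard it pushes.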

With FIFO queues, RTTs and rates (as estimated from data or ack packet inter-arrival times) are going to be varying fairly chaotically, as the ensemble of flows at any bottleneck will not be coordinated in any special way: they all have different RTTs, and perhaps different performance senders, maybe different packet sizes, possibly different inter-packet timing at transmit time, etc.


A combination of open loop (admission control) and closed loop (feedback) would allow sources to simply operate at a flat rate (the requested rate) or adapt above it as capacity is made free...

The trade off in flow control and scheduling is all about overall system complexity.



Next, we'll wrap up on queue management, then move on to the special (but important) case of data center networks and latency control!




Tuesday, November 04, 2025

Principles of Communications Week 5 4-6/11/2025

This week, we've got two random[ref] examples


1. random telephone routing

c.f. triangles

can we greedily find the tandem? (only) if you want to check the reasoning behind that part of DAR!

2. random drop congestion - seriously, today's lecture is mainly revision of IB congestion/flow control...

c.f.tcp arena 

how many TCPs are there, really? again, only here for the keen background reader!
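For the random telephone routing example (item 1 above), the "sticky random" idea can be sketched in a few lines: try the direct link, fall back to the currently remembered tandem, and only pick a new random tandem when a call is lost. The class and argument names are hypothetical, and the sketch leaves out details such as trunk reservation and whether reselection should exclude the tandem that just failed.

```python
import random

class DarPair:
    """Sticky-random tandem choice for one (source, destination) pair."""

    def __init__(self, tandems, rng=None):
        self.tandems = list(tandems)
        self.rng = rng or random.Random()
        self.current = self.rng.choice(self.tandems)

    def route(self, direct_free, tandem_free):
        """direct_free: bool; tandem_free: function(tandem) -> bool.

        Returns "direct", the tandem used, or None if the call is lost.
        """
        if direct_free:
            return "direct"
        if tandem_free(self.current):
            return self.current          # success: stick with this tandem
        # Call lost: forget the failed choice and pick a tandem at random for next time.
        self.current = self.rng.choice(self.tandems)
        return None
```

No node needs global load information: a tandem that keeps working keeps being used, and one that blocks gets replaced.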


ref: there's a very nice study of the general applicability of random choices in this Harvard paper if you want some more background reading...

separately, in a discussion with a supervisee/supervisor, I realised some people may be interested in looking at real code to reinforce their understanding - for IB material, I strongly recommend Rich Stevens' TCP/IP Illustrated Volumes 1 and 2, but for this course, there's not really any such nice single text - an alternative might be this:-

There's a neat network simulator called NS3 which has real code in it - it is used both for routing experiments and for transport - the package also has a nice visualisation system, so when you configure a simulation with a topology of routers and hosts and links, you can record the traces of events (packets) and later 'play it back' with a nice layout - see
for where to get it & documentation including examples...

Simulations are interesting as they often (in this case) include fragments of real code from real systems, so they are also test harnesses. However, more fundamentally, packet-based event driven simulators are implemented with much the same design pattern as packet switching systems (i.e. routers) - effectively the simulator is an aggregation of all the event schedules for all the entities in the system configuration being simulated - there are basically two types of events:
- timers - both for when to send periodic route updates/keepalives, and for tcp/quic retransmit type actions
- packets arriving to go to the next stage (whether next layer or next hop...)
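The two event types above fit in a very small event loop. This is a minimal sketch of the pattern, not how NS3 itself is written: the `Simulator` class, the handler names and the 0.5 s link delay are all invented for illustration.

```python
import heapq
import itertools

class Simulator:
    def __init__(self):
        self.now = 0.0
        self._queue = []                # heap of (time, seq, handler, args)
        self._seq = itertools.count()   # tie-breaker: equal-time events run in FIFO order

    def schedule(self, delay, handler, *args):
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), handler, args))

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)

log = []
sim = Simulator()

def on_timer(name, period):
    log.append((sim.now, "timer", name))
    sim.schedule(period, on_timer, name, period)    # re-arm the periodic timer

def on_packet(node, hops_left):
    log.append((sim.now, "packet", node))
    if hops_left > 0:
        # Hand the packet to the next hop after a (made up) 0.5 s link delay.
        sim.schedule(0.5, on_packet, node + 1, hops_left - 1)

sim.schedule(1.0, on_timer, "keepalive", 1.0)
sim.schedule(0.25, on_packet, 0, 2)
sim.run(until=2.0)
```

A router's control loop has the same shape, except that the clock is real and the packet events come from interfaces rather than from a heap.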

Associated with events is some state (e.g. forwarding information base searched by destination address in IP header, or label switched path indexed by packet label in MPLS shim header) - or in transport protocols, the whole bunch of information about sequence numbers, windows and congestion, etc.
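As an example of the state a packet-arrival event consults, here is a toy forwarding information base with longest-prefix match on the destination address. The `Fib` class, the prefixes and the interface names are illustrative assumptions; the linear scan is for clarity only, since real routers use tries or dedicated lookup hardware.

```python
import ipaddress

class Fib:
    """Toy forwarding table: longest-prefix match by linear scan."""

    def __init__(self):
        self.routes = []    # list of (network, next_hop)

    def add(self, prefix, next_hop):
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [(net.prefixlen, hop) for net, hop in self.routes if addr in net]
        if not matches:
            return None
        # The most specific (longest) matching prefix wins.
        return max(matches, key=lambda m: m[0])[1]

fib = Fib()
fib.add("0.0.0.0/0", "default")
fib.add("10.0.0.0/8", "if0")
fib.add("10.1.0.0/16", "if1")
```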

Event driven code tends to look a bit different from classic application layer code (that said, proxies/caches/servers can look similar).

From the router perspective, there's the event/timer driven distributed routing control protocol, and separately there's the actual packet forwarding processes/threads ... the former (as I mentioned before) is often in the pattern of a unix daemon/service, whereas the latter is much lower level (and often has to interface to custom hardware for supporting fast lookup and even switch fabric interfaces) - there are open source routers other than just linux based (home broadband) ones (e.g. www.nongnu.org/quagga) but I think this might be a lot to read with few extra lessons..


Tuesday, October 28, 2025

Principles of Communications Week 4 28-30/10/2025

 This week we wrap up BGP - looking at abstractions of the algorithm (The Stable Paths Problem in Interdomain Routing) and concrete realisations of problems in implementations.

While you may find the stable paths model helpful in removing noise from BGP complexity, I am not so sure it's a great abstraction for thinking about how actually to resolve the problem(s) (non-convergence etc). A nicer approach (by the same lead person, Tim Griffin) is meta routing, which is very powerful and general, but would need an entire other course to discuss and I just put it here for background in case anyone is interested!
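For anyone who wants to poke at the stable paths model, here is a small sketch: each node ranks its permitted paths to destination 0 and, when activated, picks the best path consistent with what its next hop currently uses. The two-node instance is modelled on the small example usually called DISAGREE; the function names and the one-node-at-a-time activation order are assumptions made for this illustration.

```python
def best_available(node, ranked, current):
    """Highest-ranked permitted path whose tail is the next hop's current path."""
    for path in ranked[node]:                   # most preferred first
        next_hop, tail = path[1], path[1:]
        if next_hop == 0 or current.get(next_hop) == tail:
            return path
    return None

def run_spp(ranked, order, max_rounds=50):
    """Activate nodes one at a time until nobody wants to change path."""
    current = {n: None for n in ranked}
    for _ in range(max_rounds):
        changed = False
        for n in order:
            choice = best_available(n, ranked, current)
            if choice != current[n]:
                current[n], changed = choice, True
        if not changed:
            return current      # stable path assignment reached
    return None                 # gave up: no stable state seen in max_rounds

# Each node prefers to go via the other rather than directly to 0.
ranked = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
stable = run_spp(ranked, order=[1, 2])
other = run_spp(ranked, order=[2, 1])
```

The two activation orders end in two different stable assignments, which is a small taste of why convergence in BGP depends on timing as well as on policy.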


Then we'll next make a start on Multicast Routing. Two compelling applications were tv/radio broadcast and software distribution. However, application layer overlays (Content Distribution Networks) have subsumed those needs - a great example is how Zoom coordinates multiparty sessions - this talk by their CEO is instructive. Also interesting is this paper on Netflix's content distribution approach.

Sunday, October 19, 2025

Principles of Communications Week 3 21-23/10/2025

 Interdomain routing - BGP - key 4 slides - 124 126 131 132

Moving from intra-domain (within one autonomous system/routing domain/internet service provider) to inter-domain, the key change is from policy within a domain (as used for steering traffic in centralised routing or in mpls or segment routing) to policy between multiple autonomous (i.e. independent) domains, which may have conflicts and often require some level of information hiding (protecting knowledge of their customers' needs from competitors). So while connectivity is the minimum requirement, there's often no shared goal in terms of what is "optimal" (i.e. what routing metric to use) - we'll see that in default cases, for traffic engineering, and for various tie-breaking reasons, metrics creep in as an implicit part of BGP routing, but not in any way consistently.
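A heavily simplified sketch of how policy and tie-breaking combine when a BGP speaker picks among candidate routes for one prefix. The `Route` fields and the ordering below cover only a few familiar steps (local preference, AS path length, MED, router id); the real decision process has more steps, MED is normally only comparable between routes from the same neighbouring AS, and router ids are compared numerically rather than as strings. All values are invented.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    local_pref: int     # set by local policy; higher is preferred
    as_path: tuple      # sequence of AS numbers towards the origin
    med: int            # hint from the neighbour; lower is preferred
    router_id: str      # final arbitrary tie-breaker (simplified to a string)

def preference_key(r):
    # Policy first, then the implicit "metrics" that creep in as tie-breakers.
    return (-r.local_pref, len(r.as_path), r.med, r.router_id)

def best_route(candidates):
    return min(candidates, key=preference_key)

a = Route("203.0.113.0/24", 100, (65001, 65002), 0, "10.0.0.1")
b = Route("203.0.113.0/24", 200, (65003, 65004, 65005), 0, "10.0.0.2")
c = Route("203.0.113.0/24", 200, (65006,), 0, "10.0.0.3")
```

Route b beats a despite its longer AS path, because local policy is consulted first; path length only matters once policy has nothing more to say, as between b and c.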








The last thing on BGP is going to follow from Traffic Engineering and asks two questions: what really is the model BGP implements, to better understand how it works (and goes wrong)? And what engineering tweaks can we do to make it actually operate better in practice? We'll finish up on this on Oct 28th.

Tuesday, October 14, 2025

Principles of Communications Week 2 14-16/10/2025

1 Centralised (!) Routing - Fibbing (see fibbing paper for more details - esp. figure 9/10 on failure/recovery modes) - in particular, the paper uses fail-open and fail-close to refer to whether a path made up by fibbing persists in the event of a controller failure: in some cases this persistence is needed, and in others the path needs to be removed! [1,2]

2 Stateful Routing - MPLS. Multi-protocol label switching has a long history (probably started in Cambridge!). It simplifies switch/router design in terms of forwarding, at the cost of increasing complexity in the control plane - possibly in routing, but more crucially, signaling. Signaling protocols have a very long history (from railways over 200 years ago). One interesting computer science dimension that arose from signaling is the concept of mutual exclusion. The first algorithms for avoiding contention for a limited resource are direct descendants of the P and V flags used to prevent two trains entering the same section of track....

In a sense, these two ideas (central and stateful) can be reconciled via "soft state" protocols (see last lecture).
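A minimal sketch of the soft state idea: installed state carries a lifetime, periodic refresh messages renew it, and anything that stops being refreshed simply lapses, with no explicit teardown needed. The class name, the lifetime and the keys are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class SoftStateTable:
    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.expiry = {}        # key -> time at which the entry lapses

    def refresh(self, key, now):
        """Install or renew state; called for every refresh message received."""
        self.expiry[key] = now + self.lifetime

    def expire(self, now):
        """Drop entries whose refreshes have stopped arriving; return the removed keys."""
        dead = [k for k, t in self.expiry.items() if t <= now]
        for k in dead:
            del self.expiry[k]
        return dead

table = SoftStateTable(lifetime=3.0)
table.refresh("path-1", now=0.0)
table.refresh("path-2", now=0.0)
table.refresh("path-1", now=2.0)    # path-1 keeps being refreshed, path-2 does not
```

If a central controller is the one sending the refreshes, its failure removes the state it installed after one lifetime, which is one way to get the "needs to be removed" behaviour without extra machinery.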

Note also: MPLS involves a "shim" layer between IP and lower levels. Segment routing may use that, or may just use IPv6 routing options directly. Recall layering from the IB networking course. It is often not a pure picture - IP tunnels are another example of extra layers between this and that. MPLS can also simplify switch & router port processing (and possibly, if the switch is "cell switched", scheduling the forwarding of packets across the switch fabric - again, recall router architecture from the IB networking course). Segment Routing is a re-think of MPLS, which can use IPv6 routing options as labels, and then use IP routing updates to distribute the label information to (amongst others, upstream) neighbours. There's a nice slidepack SR explainer from CERN which shows the interaction with routing... Note segment routing with IPv6 dispenses with the potential hardware speedup of having 20 bit MPLS labels for forwarding, so one assumes router NICs and Processors may have ASIC support for v6 header processing!
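To show why a short fixed-size label is attractive for forwarding, here is a sketch of one 32-bit MPLS shim entry (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL) and a label-swap step that is just an exact-match table lookup. The function names, the table contents and the port name are illustrative assumptions, and TTL expiry and label stacks deeper than one entry are ignored.

```python
import struct

def pack_shim(label, tc=0, bottom=True, ttl=64):
    """Encode one 4-byte shim entry in network byte order."""
    assert 0 <= label < 2 ** 20, "labels are 20 bits"
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def unpack_shim(data):
    (word,) = struct.unpack("!I", data[:4])
    return {"label": word >> 12, "tc": (word >> 9) & 0x7,
            "bottom": bool((word >> 8) & 0x1), "ttl": word & 0xFF}

def forward(shim, lfib):
    """Label swap: exact-match lookup on the incoming label, no prefix matching."""
    entry = unpack_shim(shim)
    out_label, out_port = lfib[entry["label"]]
    return out_port, pack_shim(out_label, entry["tc"], entry["bottom"], entry["ttl"] - 1)

lfib = {100: (200, "port3")}    # incoming label -> (outgoing label, outgoing port)
port, out = forward(pack_shim(100, ttl=10), lfib)
```

Compare this with a longest-prefix match on a 128-bit IPv6 address: the lookup here is a direct index on 20 bits, which is the hardware speedup the note says SRv6 gives up.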

Optional background reading...

2. Another dimension of signaling is that it requires a level of access control, authentication and authorisation not typically present in pure datagram networks like traditional IP. For a measure of how bad it can get, look no further than the old digital telephone network signaling system number 7 (SS7), which is more complex than the whole TCP/IP regular data stack (see report on vulnerabilities in SS7). RSVP (which serves a similar function for signaling for MPLS if you don't just rely on routing!) is about as bad.

1. Further work was done based on the fibbing idea:

Basically, it leverages FIBbing-like mechanisms to optimize (oblivious) TE in IP networks.

This paper is about the hardness of configuring ECMP for TE. This is actually based on a very cute hardness amplification proof technique.