tue: traffic management timescales, users, large and small, demands, short and long...
signaling and soft state
thu: systems design patterns
This week we'll cover
Optimisation is a very large topic in itself, and underpins many of the ideas in machine learning when it comes to training - ideas like stochastic gradient descent (SGD) are seen in how we assign traffic flows to routes, here. In contrast, the decentralised, implicit optimisation that a collection of TCP or "TCP friendly" flows use is more akin to federated learning, which is another whole topic in itself.
Why a log function? Maybe see Bernoulli on risk
Why proportional fairness? from social choice theory!
Are people prepared to pay more for more bandwidth? One famous Index Experiment says yes.
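To make the log function and proportional fairness questions concrete, here is one standard way of writing the problem down, in the style of Kelly's network utility maximisation. The symbols are assumptions for illustration: $x_r$ is the rate of flow $r$, $w_r$ a weight (e.g. willingness to pay), and $c_l$ the capacity of link $l$.

```latex
\begin{aligned}
\text{maximise}   \quad & \sum_{r} w_r \log x_r \\
\text{subject to} \quad & \sum_{r \,:\, l \in r} x_r \le c_l \quad \text{for every link } l, \\
                        & x_r \ge 0 \quad \text{for every flow } r .
\end{aligned}
```

With all $w_r = 1$, the maximiser $x^*$ is the proportionally fair allocation: for any other feasible allocation $x$, the sum of proportional changes is not positive,

```latex
\sum_{r} \frac{x_r - x_r^{*}}{x_r^{*}} \le 0 .
```

The log gives diminishing returns (doubling a rate is worth the same increment whatever the starting rate), which is where the link to Bernoulli on risk comes in.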
0. See Computer Systems Modeling for why the delay grows quickly as load approaches capacity.
1. see IB Distributed Systems for clock synch
2. see prev year's Cloud Computing (II) module for a bit more about data centers&platforms.
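As a rough illustration of footnote 0, the sketch below uses the textbook M/M/1 queue, where mean time in system is 1/(service rate - arrival rate). The M/M/1 model and the 1000 packets/s link are assumptions chosen for illustration, not something taken from the course slides.

```python
def mm1_mean_delay(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (waiting plus service) for an M/M/1 queue, in seconds."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("need 0 <= arrival_rate < service_rate for a stable queue")
    return 1.0 / (service_rate - arrival_rate)


service_rate = 1000.0  # packets per second (hypothetical link)
for load in (0.5, 0.9, 0.99):
    delay_ms = 1000.0 * mm1_mean_delay(load * service_rate, service_rate)
    print(f"load {load:.2f}: mean delay about {delay_ms:.1f} ms")
```

Going from 50% to 90% load multiplies the mean delay by five, and the last few percent before capacity cost far more than that.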
We've revisited flow and congestion control - one way of visualising the progress of an adaptive flow&congestion control protocol is the time sequence diagram:-
but note this is a massive over-simplification, as really what you see here is an ideal with only one source and (apparently) only one bottleneck queue. In reality, in FIFO queues, traffic from multiple sources mixes and interferes (causing high variance in delay, hence round trip time, and very unpredictable loss). If we had Round Robin (by flow) queues, things might be a bit better, but how much? That is what we look at under Scheduling, and with the Generalised Processor Sharing model, we can see how close we can get to some ideal of "isolation" between flows.
With FIFO queues, RTTs and rates (as estimated from data or ack packet inter-arrival times) are going to vary fairly chaotically, as the ensemble of flows at any bottleneck will not be coordinated in any special way: they all have different RTTs, and perhaps senders of different performance, maybe different packet sizes, possibly different inter-packet timing at transmit time, etc.
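For comparison with the FIFO picture, here is a minimal sketch of the "isolation" ideal: with equal weights, the fluid Generalised Processor Sharing model gives each flow a max-min fair share of the link, so a greedy flow cannot take capacity away from a modest one. The function name and the example demands are hypothetical, and this is the fluid ideal only, not a packet scheduler.

```python
def max_min_fair(capacity: float, demands: dict[str, float]) -> dict[str, float]:
    """Water-filling: share capacity equally, capping each flow at its demand."""
    allocation = {}
    remaining = capacity
    # Serve the smallest demands first; whatever they leave unused is re-shared.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair_share = remaining / len(pending)
        flow, demand = pending.pop(0)
        allocation[flow] = min(demand, fair_share)
        remaining -= allocation[flow]
    return allocation


print(max_min_fair(10.0, {"a": 2.0, "b": 4.0, "c": 10.0}))
```

Here flows a and b get everything they ask for, and the greedy flow c is held to what is left. In a FIFO queue, by contrast, c's share would depend on how aggressively it sends.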
This week, we've got two random[ref] examples
1. random telephone routing
c.f. triangles
can we greedily find the tandem? (only) if you want to check the reasoning behind that part of DAR!
2. random drop congestion - seriously, today's lecture is mainly revision of IB congestion/flow control...
c.f. tcp arena
how many TCPs are there, really? again, only here for the keen background reader!
ref: there's a very nice study of general applicability of random choices in this harvard paper if you want some more background reading...
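For the first example, here is a minimal sketch of the sticky random tandem idea behind DAR (Dynamic Alternative Routing), under some simplifying assumptions: a fully connected network, equal link capacities, and no call departures. The class and parameter names are hypothetical. A call tries the direct link first; if that is full it tries the current tandem, which is accepted only if both legs have more free circuits than the trunk reservation level; on failure the call is lost and the tandem is re-drawn at random (in this sketch the draw may return the same node again).

```python
import random
from itertools import combinations


class DarNetwork:
    """Toy fully connected network; call departures are not modelled."""

    def __init__(self, nodes, capacity, reservation, rng=None):
        self.nodes = list(nodes)
        self.reservation = reservation  # trunk reservation level per link
        self.rng = rng or random.Random()
        self.free = {frozenset(pair): capacity for pair in combinations(self.nodes, 2)}
        self.tandem = {}  # current (sticky) tandem per (src, dst) pair

    def _random_tandem(self, src, dst):
        return self.rng.choice([n for n in self.nodes if n not in (src, dst)])

    def route_call(self, src, dst):
        """Return the path used for the call, or None if the call is lost."""
        direct = frozenset((src, dst))
        if self.free[direct] > 0:
            self.free[direct] -= 1
            return [src, dst]
        if (src, dst) not in self.tandem:
            self.tandem[(src, dst)] = self._random_tandem(src, dst)
        via = self.tandem[(src, dst)]
        legs = [frozenset((src, via)), frozenset((via, dst))]
        # Overflow calls must leave `reservation` circuits for direct traffic.
        if all(self.free[leg] > self.reservation for leg in legs):
            for leg in legs:
                self.free[leg] -= 1
            return [src, via, dst]
        # Call lost: stop using this tandem and pick another at random.
        self.tandem[(src, dst)] = self._random_tandem(src, dst)
        return None


net = DarNetwork(nodes=[0, 1, 2], capacity=1, reservation=0)
print(net.route_call(0, 1), net.route_call(0, 1), net.route_call(0, 1))
```

The point of the design is that a tandem that works is kept, so no global state or search is needed: failures alone drive the random re-selection.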
separately, in a discussion with a supervisee/supervisor, I realised some people may be interested in looking at real code to reinforce their understanding - for IB material, I strongly recommend Rich Stevens' TCP/IP Illustrated Volumes 1 and 2, but for this course, there's not really any such nice single text - an alternative might be this:-
This week we wrap up BGP - looking at abstractions of the algorithm (The Stable Paths Problem in Interdomain Routing) and concrete realisations of problems in implementations.
While you may find the stable paths model helpful in removing noise from BGP complexity, I am not so sure it's a great abstraction for thinking about how actually to resolve the problem(s) (non-convergence etc). A nicer approach (by the same lead person, Tim Griffin) is meta routing, which is very powerful and general, but would need an entire other course to discuss, and I just put it here for background in case anyone is interested!
Then we'll next make a start on Multicast Routing. Two compelling applications were TV/radio broadcast and software distribution. However, application layer overlays (Content Distribution Networks) have subsumed those needs - a great example is how Zoom coordinates multiparty sessions - this talk by their CEO is instructive. Also interesting is this paper on Netflix's content distribution approach.
Interdomain routing - BGP - key 4 slides - 124 126 131 132
Moving from intra-domain (within one autonomous system/routing domain/internet service provider) to inter-domain, the key change is from policy within a domain (as used for steering traffic in centralised routing or in MPLS or segment routing) to policy between multiple autonomous (i.e. independent) domains who may have conflicts and often require some level of information hiding (protecting knowledge of their customers' needs from competitors). So while connectivity is the minimum requirement, there's often no shared goal in terms of what is "optimal" (i.e. what routing metric to use) - we'll see that in default cases, for traffic engineering, and for various tie-breaking reasons, metrics creep in as an implicit part of BGP routing, but not in any way consistently.
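The stable paths model is small enough to explore by brute force, which may help when checking the non-convergence arguments. In the sketch below each node ranks its permitted paths to destination 0, and an assignment is stable if every node holds its highest-ranked path that is consistent with what its next hop currently holds. The two instances are my reconstruction of the DISAGREE example and of a three-node variant of BAD GADGET, so treat the exact rankings as assumptions rather than as copied from the paper.

```python
from itertools import product

DEST = 0


def best_choice(node, ranked, assignment):
    """Highest-ranked permitted path consistent with the next hop's current path."""
    for path in ranked[node]:
        next_hop = path[1]
        if next_hop == DEST or assignment[next_hop] == path[1:]:
            return path
    return ()  # no usable path


def stable_solutions(ranked):
    """Enumerate every assignment in which no node wants to change its path."""
    nodes = sorted(ranked)
    options = [list(ranked[n]) + [()] for n in nodes]
    stable = []
    for combo in product(*options):
        assignment = dict(zip(nodes, combo))
        if all(best_choice(n, ranked, assignment) == assignment[n] for n in nodes):
            stable.append(assignment)
    return stable


# Paths are tuples of nodes ending at DEST, listed most preferred first.
DISAGREE = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
BAD_GADGET = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}

print(len(stable_solutions(DISAGREE)), "stable solutions for DISAGREE")
print(len(stable_solutions(BAD_GADGET)), "stable solutions for BAD_GADGET")
```

DISAGREE has two stable solutions (so the outcome depends on timing), while the cyclic preferences in the second instance leave none, which is the abstract form of a BGP system that never converges.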
1 Centralised (!) Routing - Fibbing (see fibbing paper for more details - esp. figure 9/10 on failure/recovery modes) - in particular, fail-open and fail-close are used in the paper to refer to the persistence of a path made up by fibbing in the event of a controller failure, where in some cases this is needed, and in others it needs to be removed! [1,2]
2 Stateful Routing - MPLS. Multi-protocol label switching has a long history (probably started in Cambridge!). It simplifies switch/router design in terms of forwarding, at the cost of increasing complexity in the control plane -- possibly in routing, but more crucially, signaling. Signaling protocols have a very long history (from railways over 200 years ago). One interesting computer science dimension that arose from signaling is the concept of mutual exclusion. The first algorithms for avoiding contention for a limited resource are direct descendants of the P and V flags used to prevent two trains entering the same section of track....
In a sense, these two ideas (central and stateful) can be reconciled via "soft state" protocols (see last lecture).
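As a reminder of what "soft state" means in practice, here is a minimal sketch: state is installed by a refresh message and silently disappears unless refreshed within a lifetime, so a crashed controller or signaling peer needs no explicit teardown. The class name, the 30 second lifetime and the path identifiers are hypothetical.

```python
class SoftStateTable:
    """Entries live only as long as they keep being refreshed."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._expiry = {}  # key -> time at which the entry lapses

    def refresh(self, key, now: float) -> None:
        """Install or re-arm an entry (same message does both)."""
        self._expiry[key] = now + self.lifetime

    def expire(self, now: float) -> list:
        """Drop entries whose refreshes have stopped; return what was dropped."""
        stale = sorted(k for k, t in self._expiry.items() if t <= now)
        for key in stale:
            del self._expiry[key]
        return stale

    def active(self, now: float) -> list:
        return sorted(k for k, t in self._expiry.items() if t > now)


table = SoftStateTable(lifetime=30.0)
table.refresh("path-1", now=0.0)
table.refresh("path-2", now=0.0)
table.refresh("path-1", now=20.0)   # path-2's owner has gone quiet
print(table.expire(now=35.0), table.active(now=35.0))
```

The lifetime is the knob: short lifetimes clean up quickly after a failure but cost more refresh traffic.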
Note also: MPLS involves a "shim" layer between IP and lower levels. Segment routing may use that, or may just use IPv6 routing options directly. Recall layering from IB networking course. It is often not a pure picture - IP tunnels are another example of extra layers between this and that. MPLS can also simplify switch & router port processing (and possibly, if the switch is "cell switched", the scheduling of packets across the switch fabric - again, recall router architecture from IB networking course). Segment Routing is a re-think of MPLS, which can use IPv6 routing options as labels, and then use IP routing updates to distribute the label information to (amongst others, upstream) neighbours. There's a nice slidepack SR explainer from CERN which shows the interaction with routing... Note segment routing with IPv6 dispenses with the potential hardware speedup of having 20 bit MPLS labels for forwarding, so one assumes router NICs and Processors may have ASIC support for v6 header processing!
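To see why the shim is cheap to process, here is a sketch that packs and unpacks one 32-bit label stack entry, using the usual layout of a 20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag and 8-bit TTL. The function names and default values are hypothetical, and real forwarding does this in hardware rather than Python.

```python
import struct


def pack_shim(label: int, traffic_class: int = 0,
              bottom_of_stack: bool = True, ttl: int = 64) -> bytes:
    """Encode one MPLS label stack entry as 4 bytes in network byte order."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= traffic_class < 8 or not 0 <= ttl < 256:
        raise ValueError("traffic_class is 3 bits, ttl is 8 bits")
    word = (label << 12) | (traffic_class << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def unpack_shim(data: bytes) -> dict:
    """Decode the first label stack entry found at the start of data."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "traffic_class": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


entry = pack_shim(label=16)
print(entry.hex(), unpack_shim(entry))
```

A 20-bit label is small enough to index a table directly, which is the hardware speedup that a 128-bit IPv6 segment identifier gives up.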
Optional background reading...
2. Another dimension of signaling is that it requires a level of access control, authentication and authorisation not typically present in pure datagram networks like traditional IP. For a measure of how bad it can get, look no further than the old digital telephone network signaling system number 7 (SS7), which is more complex than the whole TCP/IP regular data stack (see report on vulnerabilities in SS7). RSVP (which serves a similar function for signaling for MPLS if you don't just rely on routing!) is about as bad.
1. Further work was done based on the fibbing idea: