Today we finished with systems design patterns, and a course summary.
For revision, maybe take a look at example supervision questions and past exam papers
This week we covered the multiple time scales of traffic engineering and signalling: from packet-time scheduling, through end-to-end/RTT-scale control (admission control/call setup, or congestion control/queue management), up to network-wide provisioning and its interaction with pricing/utility, and the consequences for possible accountability of end-user identity/payment systems...
Then we looked at a toolkit of system design patterns, with particular attention to communications systems, but with broad applicability in many domains/sectors/walks of life...
This week we're covering, firstly, data centers (through the lens of QJump and its clever "fan-in" trick), and secondly, optimisation of routes and rates. Both topics relate to the practice and theory of quite a few machine learning systems/platforms.
The mix of application platforms we see in the data center lecture is classic MapReduce (as represented by Apache Hadoop) and stream processing (as represented by Microsoft's Naiad)[2], as well as memcached (an in-memory store cache, widely used to reduce storage access latency and increase throughput, e.g. for the aforesaid platforms), and PTP (a more precise clock synchronisation system than the widely used Network Time Protocol[1]). For an example of the use of MapReduce, consider Google's original PageRank in Hadoop. For the interested reader, this paper on timely dataflow is perhaps useful background.
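To make the PageRank-as-MapReduce idea concrete, here is a toy, in-memory sketch (hypothetical code, not the actual Hadoop job): the map phase has each page emit a share of its rank to each out-neighbour, and the reduce phase sums the shares and applies the damping factor.

```python
# Toy PageRank expressed as MapReduce-style map and reduce phases.
# A sketch only -- a real Hadoop job shards these phases across machines.

DAMPING = 0.85

def map_phase(graph, ranks):
    """Each page emits a share of its rank to each out-neighbour."""
    for node, out_links in graph.items():
        for dest in out_links:
            yield dest, ranks[node] / len(out_links)

def reduce_phase(graph, contributions):
    """Sum the contributions per page and apply the damping factor."""
    totals = {node: 0.0 for node in graph}
    for dest, share in contributions:
        totals[dest] += share
    n = len(graph)
    return {node: (1 - DAMPING) / n + DAMPING * totals[node]
            for node in graph}

def pagerank(graph, iterations=50):
    ranks = {node: 1.0 / len(graph) for node in graph}
    for _ in range(iterations):
        ranks = reduce_phase(graph, map_phase(graph, ranks))
    return ranks

# A tiny 3-page web: A and B link to each other and to C; C links back to A.
web = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A"]}
ranks = pagerank(web)
```

The page names and graph are made up; the point is only that each phase touches one key at a time, which is exactly what lets Hadoop distribute the work.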
Optimisation is a very large topic in itself, and underpins many of the ideas in machine learning when it comes to training - ideas like stochastic gradient descent (SGD) are seen in how we assign traffic flows to routes here. In contrast, the decentralised, implicit optimisation that a collection of TCP (or "TCP-friendly") flows performs is more akin to federated learning, which is another whole topic in itself.
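For readers who haven't met SGD, here is a minimal sketch of the idea itself (fitting a line to noise-free samples), not the route-assignment formulation from lecture: pick one random sample per step and follow the gradient of its error.

```python
import random

# Minimal stochastic gradient descent: fit y = w*x + b to samples of the
# line y = 2x + 1. One randomly chosen sample per step, squared-error loss.
random.seed(0)
data = [(x / 10, 2 * (x / 10) + 1) for x in range(-20, 21)]

w, b, lr = 0.0, 0.0, 0.1
for step in range(5000):
    x, y = random.choice(data)     # the "stochastic" part: one sample
    err = (w * x + b) - y          # prediction error on that sample
    w -= lr * 2 * err * x          # gradient of err**2 w.r.t. w
    b -= lr * 2 * err              # gradient of err**2 w.r.t. b
```

With consistent (noise-free) data, w and b converge to 2 and 1; with real data the same loop merely hovers near the optimum, which is why step-size schedules matter.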
Why a log function? Maybe see Bernoulli on risk.
Why proportional fairness? from social choice theory!
Are people prepared to pay more for more bandwidth? One famous experiment, INDEX, says yes.
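To see what maximising a sum of log utilities actually gives, here is a toy (hypothetical) two-link network: one long flow crosses both links, one short flow sits on each, and every link has capacity 1. A brute-force search recovers the classic proportionally fair answer: the long flow gets 1/3, each short flow gets 2/3.

```python
import math

# Proportional fairness = maximise sum of log(rate_i) subject to capacity.
# Flow 0 uses links A and B; flow 1 uses only A; flow 2 uses only B.
# Both links have capacity 1, so rate0 + rate1 <= 1 and rate0 + rate2 <= 1.
# At the optimum both links are full, so rate1 = rate2 = 1 - rate0 and
# we can search over rate0 alone.

def utility(rate0):
    rate_short = 1.0 - rate0
    return math.log(rate0) + 2 * math.log(rate_short)

_, rate0 = max((utility(r / 10000), r / 10000) for r in range(1, 10000))
```

Note that max-min fairness would give everyone 1/2 here; the log utility trades the long flow down because it consumes two links' worth of resources.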
0. See Computer Systems Modeling for why the delay grows quickly as load approaches capacity.
1. see IB Distributed Systems for clock synch
2. see last year's Cloud Computing (II) module for a bit more about data centers&platforms.
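On footnote 0: for a single M/M/1 queue the mean time in system is 1/(mu - lambda), so delay blows up as the load rho = lambda/mu approaches 1. A quick numeric sketch (unit service rate assumed):

```python
# Mean time in system for an M/M/1 queue: W = 1 / (mu - lambda).
# With service rate mu = 1 packet per unit time, watch W as load rises.

MU = 1.0

def mean_delay(load):
    """load = lambda/mu; must be < 1 for the queue to be stable."""
    lam = load * MU
    return 1.0 / (MU - lam)

delays = {load: mean_delay(load) for load in (0.5, 0.9, 0.99)}
# At 50% load W = 2; at 90% load W = 10; at 99% load W = 100.
```

Doubling capacity at 99% load cuts delay by far more than a factor of two, which is why provisioning headroom is cheaper than it looks.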
This week we covered congestion control (with a brief mention of the latest ideas, CUBIC and BBR) and scheduling (work conserving, max-min fairness, and round robin with various tweaks, as well as queue management, with another appearance of "randomness" as a useful trick).
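As a caricature of classic TCP congestion control (a sketch of the AIMD rule only, not CUBIC or BBR): grow the window by one packet per RTT, and halve it on loss. Here "loss" is simulated by a fixed bottleneck capacity.

```python
# Additive increase, multiplicative decrease (AIMD), the classic TCP rule:
# +1 packet per RTT without loss, halve the window on loss. In this toy,
# a loss happens whenever the window exceeds a fixed bottleneck capacity.

CAPACITY = 32          # packets the bottleneck can carry per RTT
cwnd = 1.0
trace = []

for rtt in range(200):
    trace.append(cwnd)
    if cwnd > CAPACITY:        # queue overflows: a packet is dropped
        cwnd = cwnd / 2        # multiplicative decrease
    else:
        cwnd = cwnd + 1        # additive increase

# The window saws between roughly CAPACITY/2 and CAPACITY forever.
```

That sawtooth is the signature of AIMD, and smoothing it out (while staying fair to competing flows) is much of what CUBIC and BBR are about.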
My analogy for the overall system that is TCP + IP is that it is like a bicycle, with a chain connecting two sets of gears - you can change gear at each end to change rate/speed, but the wheels slip (forwarding) and the chain is made of soggy bread (end to end rate).
IP forwarding+scheduling is like the bit where the rubber hits the road, and the foot on the pedal is like the sender, while the force backwards from friction, skidding and effort is the flow & congestion control.
Now loosely couple the chains of 10 billion bikes and riders and roads, and run them anything from 1 to a billion RPM...
and make the road surface out of jelly. And take a chain link out, or put one in, every now and then (re-routing)...
and many of the bikes are penny farthings but some are jetskis (i.e. not everyone is running the same algorithm, either at IP level or end-to-end/TCP level).
We covered mobile and telephony routing - the key interesting idea is Dynamic Alternative Routing, a.k.a. sticky random routing, where the Gibbens paper covers the max-flow analysis, and this paper gives the statistical background on the triangle problem.
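The sticky random idea itself is tiny; here is a hypothetical sketch for one node pair in a fully meshed telephone network: use the direct link if it has a free circuit, otherwise try the currently remembered two-hop alternative, and re-pick that alternative at random only when it fails too. (Link occupancies are modelled as independent coin flips, so the sticky state only illustrates the control flow here; it starts to pay off when link loads differ.)

```python
import random

# Sticky random routing (Dynamic Alternative Routing), sketched for calls
# from A to B. Each link is fully occupied with probability BUSY_PROB,
# independently -- a crude stand-in for real circuit occupancy.

random.seed(1)
ALTERNATIVES = ["C", "D", "E"]   # possible transit nodes for A -> B
BUSY_PROB = 0.4

def link_free():
    return random.random() > BUSY_PROB

sticky = "C"                     # currently remembered alternative
accepted = blocked = 0
for _ in range(10_000):
    if link_free():                      # direct link A-B has a circuit
        accepted += 1
    elif link_free() and link_free():    # both hops via the sticky node
        accepted += 1
    else:
        blocked += 1                     # call lost; re-pick on failure only
        sticky = random.choice([n for n in ALTERNATIVES if n != sticky])

# Direct-only blocking would be ~40%; with one two-hop alternative it is
# ~0.4 * (1 - 0.6 * 0.6) = ~26% under these independent-link assumptions.
```

The interesting behaviour in the Gibbens analysis comes from trunk reservation and correlated loads, neither of which this sketch models.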
We then covered (essentially, revision of last years networks) flow and congestion control. There's an important link to control theory here which is covered in the Computer Systems Modeling module next term.
This week, we'll finish up BGP, and cover multicast routing.
General lessons from BGP - information hiding can be harmful to decentralised algorithms, but information hiding may be a necessary dimension of some distributed systems for business reasons (commercial-in-confidence/competitive data).
Are there other ways to retain decentralised or federated operation while also retaining confidentiality? Perhaps using secure multiparty computation (not covered in this course!). There are a number of other federated services emerging in the world (applications like Mastodon and Matrix, and also federated machine learning systems like Flower), so this probably needs a good solution.
Multicast has a great future behind it - some neat thinking, but largely replaced by application-layer content distribution networks (e.g. Netflix, YouTube, Apple/Microsoft software update distribution) and the move away from simultaneous mass consumption of video/audio. For more on the limitations of multicast, this paper is a good (quick) read. One key objection to multicast is that, like broadcast, it has great power (reach) but also great potential for harm - as a tool for launching denial-of-service attacks on many people at once, it could be rather handy for bad people. Source-specific multicast was one way to limit this risk (so that ISPs could police those sources) - indeed, it would be fun if one could have only source-specific unicast, so one would only have to receive packets from designated senders!
Finally, multicast IP has the same packet loss characteristics as normal IP, so you need to put an end-to-end reliable protocol on top of it for applications like software distribution (though perhaps not for video/audio etc.). This isn't as simple as using TCP (or QUIC), as the acknowledgement mechanisms would not work well for large groups: each data packet sent to a group of N recipients would cause a flood of N acks, which would likely overwhelm the sender or the sender's network access. So you need new multicast transport protocols that manage reliability differently. This is another challenge for using IP multicast - although in some limited deployments (e.g. data centers) it might be a good fit for some applications.
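One classic way to manage reliability differently - and another appearance of randomness as a useful trick - is SRM-style NACK suppression: each receiver that misses a packet waits a random time before complaining, and stays quiet if it hears someone else's NACK first. A hypothetical sketch with an idealised broadcast channel:

```python
import random

# NACK suppression with random timers (the idea behind SRM-style reliable
# multicast). All N receivers miss the same packet. Each draws a random
# timer; a receiver sends a NACK only if its timer fires before another
# receiver's NACK could have reached it. Compare with naive per-packet
# ACKs, where the sender would receive N control messages per packet.

random.seed(42)
N = 1000            # receivers that missed the packet
TIMER_MAX = 1.0     # timers drawn uniformly from [0, TIMER_MAX]
PROPAGATION = 0.05  # one-way delay for a NACK to reach everyone else

timers = [random.uniform(0, TIMER_MAX) for _ in range(N)]
first = min(timers)
nacks_sent = sum(1 for t in timers if t < first + PROPAGATION)
# Only the receivers whose timers fire within one propagation delay of the
# earliest timer send a NACK -- far fewer than N control messages.
```

The expected number of NACKs scales with N * PROPAGATION / TIMER_MAX rather than N, so stretching the timer range trades latency for suppression.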
In the analog world (e.g. radio, or audio) broadcast is obviously both good and bad - spark gap generators jam all the radios within considerable range, and white noise generators mess with everyone's hearing nearby...
Are application-layer overlays a solution to other "in network" alleged enhancements? Possibly - for example, Cloudflare obviate the need for always-on IPv4 or IPv6 addresses. This also wasn't covered in this course, but might be of interest: their work is described here, and other papers of theirs may be of current interest.
This week, we'll start on BGP/Policy Routing, getting up to traffic engineering, including the amusingly named "hot and cold potato". Next week we will wrap up BGP covering semantics & performance.
A great overview of BGP if you want an alternate source.
This week we started on more fundamentals of routing, which can be
centralised - we gave an example, using SDN controllers to add FIBs into a routing domain. In "fibbing" this is done in a hybrid distributed/centralised way, so that by default a distributed routing algorithm (typically link state routing) runs, but is then enhanced by central control. At one extreme, everything could be done through central control and direct control of routers via OpenFlow, with robustness achieved by replicating the controllers.
decentralised - one decentralised approach is source routing, where routers are completely dumb with regard to paths, and just forward packets based on fields in the packet. One example of this is the use of loose source routes in IPv6 in Segment Routing.
distributed - classic Link State Routing is distributed, and does not distinguish any special node. Of course, link metrics might be configured centrally to influence path selection (see traffic engineering).
federated - the Internet is a network of networks, where each autonomous system runs its own routing protocol (any of the paradigms here works), but to connect networks together and find paths between them, BGP is the classic example of federated routing.
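In the distributed (link state) case, every router floods its local link costs and then runs the same shortest-path computation on the resulting map. A minimal Dijkstra over a toy topology (node names and costs are made up):

```python
import heapq

# Dijkstra's shortest-path computation, as each router runs it in link
# state routing once flooding has given it the full topology map.

def dijkstra(graph, source):
    """graph: {node: {neighbour: link_cost}}. Returns (costs, predecessors)."""
    dist = {source: 0}
    prev = {}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue                      # stale queue entry, skip it
        for neighbour, cost in graph[node].items():
            alt = d + cost
            if alt < dist.get(neighbour, float("inf")):
                dist[neighbour] = alt     # found a cheaper path
                prev[neighbour] = node
                heapq.heappush(queue, (alt, neighbour))
    return dist, prev

topology = {
    "u": {"v": 2, "w": 5},
    "v": {"u": 2, "w": 1, "x": 4},
    "w": {"u": 5, "v": 1, "x": 1},
    "x": {"v": 4, "w": 1},
}
dist, prev = dijkstra(topology, "u")
```

The predecessor map is what actually populates the FIB: router u's next hop towards x is read off by walking prev back from x.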
We'll start on BGP details next week.
The Control Plane versus the Data Plane
Routing is part of the control plane, whereas forwarding is part of the data plane - the control plane is where the management sits (think light switches or thermostats), whereas the data plane is where the user's business lies (power, pressure, data...).
Multi-protocol label switching (MPLS) and Segment Routing are a mixture of control and data. MPLS requires other protocols' help to install forwarding state (labels), and Segment Routing requires other protocols to tell end systems what segment addresses to put in packets. Those other protocols might be pure management protocols (centralised), or just piggyback on distributed routing (adding information to link state advertisements), or use some new protocol to obtain map/Internet graph data. They then also need additional action in the data plane (to put a label in the MPLS header of a user's data packet, or a segment source route list in the IP header of a user's data packet).
So MPLS is not a routing protocol, but rather addresses the forwarding problem, and interacts with other protocols to install labels - this can be done via the Label Distribution Protocol (LDP) or signalling protocols like RSVP (which amount to source routing in some senses). In Segment Routing (SR), which is an IP-layer analog of MPLS, we can also choose to distribute segment labels via additional link state update fields, or we can use lists of intermediate router addresses in IPv6 to achieve source routing.
Note MPLS operates "below" IP in the protocol stack, and hence can support multiple network layers for forwarding (hence its name), while SR operates "within" the IP layer.
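The MPLS data plane itself is just a label lookup plus a push/swap/pop action at each hop. A toy label-switched path (labels, router names, and table layout are all made up for illustration; a real control plane such as LDP or RSVP-TE would install these tables):

```python
# Toy MPLS data plane: each label-switching router (LSR) maps an incoming
# label to (action, outgoing label, next hop). The ingress pushes the
# first label; the last LSR pops it (penultimate-hop popping), so the
# packet leaves the label-switched path as a plain IP packet.

label_tables = {
    "ingress": {None: ("push", 17, "lsr1")},     # unlabelled IP packet enters
    "lsr1":    {17:   ("swap", 42, "lsr2")},
    "lsr2":    {42:   ("pop", None, "egress")},  # penultimate-hop pop
}

def forward(packet_label, router):
    """Follow the label-switched path until a router with no table."""
    path = [router]
    while router in label_tables:
        action, out_label, next_hop = label_tables[router][packet_label]
        packet_label = out_label        # push/swap/pop all amount to
        router = next_hop               # rewriting the label field here
        path.append(router)
    return packet_label, path

final_label, path = forward(None, "ingress")
```

Note that no IP lookup happens between ingress and egress - that is the sense in which MPLS sits "below" IP and can carry multiple network layers.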
The main way we illustrate that neither MPLS nor SR is itself a routing protocol is that both have ad hoc methods to deal with switch/router outages and recovery from outages (or a priori protection).
Today sees the start of the 2023/2024 year, and for Part II students taking Principles of Communications I'll be noting progress and also adding occasional related reading and corrections on this blog.
If you want to revise anything to warm up for the course, I suggest a quick re-read of last year's Computer Networks course!
This week we'll just make a start on routing.
Glossary of Terms: from ISOC. Some acronyms come from the 7 layer model of the communications stack including terms like PHY (short for physical, so not really an acronym).
For an influential source on how we really "traverse" the Internet today, Cloudflare's head of research writes a very good blog.
For a very nice talk about how Internet Access technology works, this description of the evolution of last mile access is very up to date and watchable.