Monday, November 28, 2022

Principles of Communications Week 9 L16 29th Nov, 2022 - Systems Design pt II and Wrap.

Solving for Systems:-

  • multiplexing
  • pipelining
  • batching 
  • exploiting locality (spatial and temporal)
  • exploiting commonality 
  • hierarchy 
  • using binding with indirection 
  • virtualization (multiplexing, indirection, binding) 
  • exploiting randomization 
  • using soft state versus explicit state exchange 
  • employing hysteresis 
  • separating data and control 
  • extensibility


and measure, measure, measure and characterise.

Thursday, November 24, 2022

lovely explainer of the internet, at 5 levels of complexity

Jim Kurose explains the Internet to five different levels of complexity.

Tuesday, November 22, 2022

Principles of Communications Week 8 L14/L15 22nd&24th Nov, 2022 - Traffic Management and Systems Design

Traffic Management

This is mainly about the timescale decomposition of traffic management components: packet transit/RTT times, flow setup times, traffic matrix time variations, and long-term demand variation (usually growth).

Also, novel protocol deployments create new demand patterns.

This all creates requirements for empirical input, e.g. of user utilities for elastic and inelastic traffic demands, and for behaviour in response to off-peak (or congestion) type time-varying charging for resource use. A great example of recent work on how things change, sometimes quite quickly, is this paper about the change in application demands during the pandemic.

Signaling protocol complexity - signaling couples multiple components - end systems and users, to routers/switches, schedulers and admission control, and even routing - so signaling is a mess, and to date very little of it has been deployed in the Internet. Most traffic management decisions are made using bespoke (ad hoc) measurement tools and techniques.

One of the big arguments recently is how to divide up a budget (whether congestion or delay) amongst the hops (e.g. AS hops) in a path. Given some overall resource constraint (say, e2e delay must be less than 300ms), each hop will want to maximise the delay it can impose, so it can carry more bursty user traffic (the same applies to RED-based ECN triggers) - we need incentive matching!
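A toy, back-of-the-envelope illustration of the incentive problem (the hop names and numbers are invented): if each hop over-asks and the allocation rule just scales requests proportionally to fit the budget, the biggest over-asker gets the biggest share.

```python
# Toy illustration (made-up numbers) of splitting an end-to-end delay budget
# across AS hops. Each hop "asks" for as much of the budget as it can, because
# a larger per-hop delay allowance lets it carry more bursty traffic.

E2E_BUDGET_MS = 300.0   # overall constraint: end-to-end delay must stay below this

# hypothetical per-hop requests (ms) - each hop over-asks
requests = {"AS1": 150.0, "AS2": 200.0, "AS3": 120.0}

total_requested = sum(requests.values())
print(f"sum of requests = {total_requested} ms > budget {E2E_BUDGET_MS} ms")

# naive fix: scale every request proportionally so the sum meets the budget
allocation = {hop: r * E2E_BUDGET_MS / total_requested for hop, r in requests.items()}
for hop, granted in allocation.items():
    print(f"{hop}: asked {requests[hop]:.0f} ms, granted {granted:.1f} ms")

# The incentive problem: under proportional scaling, the more a hop over-asks,
# the more it is granted - which is why the allocation rule needs to be
# incentive compatible, not merely feasible.
```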

Systems Design

Some things change demand surprisingly quickly - the impact of the pandemic on increased demand for interactive video/audio has already been mentioned, but traffic sources also shifted from workplaces to homes in the daytime. A newer change is decentralised social media like Mastodon and Matrix, which have many p2p servers (Mastodon today has around 4,000, serving 8M people - the user base is growing at around 1M a week!)

Interesting social/legal/regulatory constraints range from the simple (the EU mandates free data roaming) or even trivial (like USB-C phone charging sockets!) to the subtle - the Digital Markets Act requires any large service provider to open up APIs for interoperation. New work in internet standards means there will be open protocols that allow all systems (not just e-mail and web servers, but messaging and social media etc.) to interwork - presumably also video conferencing.

Technically, this is actually trivial (e.g. most video and audio use the same coding - indeed, in WebRTC they have to use the same protocols too). The trickier piece is interworking key management for security, especially for group communications.

The pipelining example led to the replacement of HTTP 1.0/TCP with HTTP 3/QUIC, which allows arbitrary ordering of packets delivered over UDP (but still with reliability, flow and congestion control, and e2e privacy) - this gives browsers far more freedom to render material from multiple sources/media.
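A toy sketch (not real QUIC, and with made-up arrival times) of why that freedom matters: with one ordered byte stream, a loss near the head stalls delivery of everything queued behind it, whereas with independent per-object streams only the affected object waits. Here the "css" object is assumed to sit first on the single ordered stream.

```python
# Minimal sketch of head-of-line blocking: one ordered stream versus
# independent per-object streams. Arrival times (seconds) are invented;
# the "css" object suffers a retransmission and its last packet is late.
arrivals = {
    "css": [0.01, 0.02, 0.50],   # retransmitted packet arrives at t=0.50
    "js":  [0.03, 0.04, 0.05],
    "img": [0.06, 0.07, 0.08],
}

# single ordered stream (assume css is queued first): nothing behind the
# stalled object is delivered to the application until its hole is filled
stall = max(arrivals["css"])
single_stream_ready = {obj: max(stall, max(t)) for obj, t in arrivals.items()}

# independent streams: each object is ready as soon as its own packets arrive
multi_stream_ready = {obj: max(t) for obj, t in arrivals.items()}

for obj in arrivals:
    print(f"{obj}: ordered stream ready at {single_stream_ready[obj]:.2f}s, "
          f"independent streams ready at {multi_stream_ready[obj]:.2f}s")
```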

And this will keep changing and changing as people create new applications and services, and new communications technology - all that requires measurement, measurement and also measurement.

Tuesday, November 15, 2022

Principles of Communications Week 7 L12/L13 15th&17th Nov, 2022 - Data Centers & Optimisation

 Data Centers - example of Queue Jump


1. Data centers have regular topology and software can be centrally managed

2. the fan-in factor of traffic, sometimes known as TCP incast, can and does cause spiky delays, which very badly reduce the performance of distributed computations, clock synchronisation and in-memory disk caches, all of which degrades the throughput of data centers

3. if we can differentiate traffic, we can use different schedulers to treat flows with low latency requirements differently from those with high throughput but no particular latency-bound needs.

4. in a 3-hop data center, a small number of priority queues will work - as long as the sum of traffic from sources of a given priority is controlled, based on computing the occupancy of that priority class and its delay impact, then there is capacity left for lower priorities, and still very short queues too... Sources can be policed to ensure higher classes don't starve out lower priorities, and this can be done below the app in the OS, or below the OS in a hypervisor - see the sketch after this list.
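A minimal sketch of those two pieces, with illustrative class counts and parameters (a real system such as QJump derives its rate caps from the fan-in and hop count): a token-bucket policer per priority class, plus a strict-priority scheduler that always serves the highest non-empty class first.

```python
# Sketch of per-class policing plus strict-priority scheduling.
# All parameters here are illustrative, not taken from any real deployment.
from collections import deque

class TokenBucket:
    """Caps a class's rate so a high-priority class cannot starve lower ones."""
    def __init__(self, rate_pkts_per_s, burst):
        self.rate, self.burst = rate_pkts_per_s, burst
        self.tokens, self.last = burst, 0.0

    def allow(self, now):
        # refill tokens for the elapsed time, up to the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False          # over rate: packet would be demoted or dropped

class PriorityScheduler:
    def __init__(self, levels=2):
        self.queues = [deque() for _ in range(levels)]   # 0 = highest priority

    def enqueue(self, pkt, level):
        self.queues[level].append(pkt)

    def dequeue(self):
        for q in self.queues:                # strict priority: scan top down
            if q:
                return q.popleft()
        return None
```

In practice the policing sits below the application, in the OS or a hypervisor, exactly as described in point 4 above.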

Optimisation:- multipath routes, decentralised rates.

from an entirely different perspective, we can treat the routing and rate control problems as optimisation challenges - in this approach, we can use gradient descent methods for assigning flows to routes, and distributed optimisation (via feedback control and increase/decrease searching for optimal utility) to compute rates.

Note well - the optimisation of routes can work at any level of aggregation, and hence is suitable for traffic engineering, and is largely a centralised technique.  The formulation is also agnostic about multipath routing, so is suitable in the presence of live use of redundant paths and load balancers, and is consistent with end-to-end multipath protocols (e.g. multipath TCP or QUIC). One tends to think of the route optimisation approaches as being for longer-term matching of traffic to paths (but they could also suit OpenFlow-controlled traffic at the individual flow level, obviously, though this is not used in the Internet today). Of course, gradient descent methods are widely used in training in machine learning and AI.

n.b. finding minimum of a function reminder
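As a quick refresher (a toy example, not course code): gradient descent just steps against the gradient until the iterate stops moving.

```python
# Reminder: to find a (local) minimum of f, repeatedly step against its gradient.
def gradient_descent(grad, x0, step=0.1, tol=1e-8, max_iters=10_000):
    x = x0
    for _ in range(max_iters):
        x_next = x - step * grad(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# example: f(x) = (x - 3)^2, so f'(x) = 2(x - 3); the minimum is at x = 3
print(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))   # ~3.0
```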

The distributed, asynchronous, non-coordinated optimisation of rates (i.e. TCP congestion control or equivalents) is also applicable to other distributed machine learning. The rate adaptation is also suitable for multipath end-to-end protocols (i.e. MPTCP). So the rate optimisation techniques operate on the round-trip time timescales.
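A hedged sketch of that decentralised rate computation, in the style of a Kelly-type "primal" algorithm for network utility maximisation - the topology, capacities and price function below are invented purely for illustration. Each source updates its rate using only the congestion price it sees along its own path, with no coordination between sources.

```python
# Sketch (illustrative parameters) of decentralised rate optimisation:
# each source maximises w_s * log(x_s) minus the congestion price along
# its own path, adjusting its rate from its own feedback alone.
links = {"L1": 10.0, "L2": 10.0}                          # link capacities
routes = {"A": ["L1"], "B": ["L1", "L2"], "C": ["L2"]}    # source -> path
w = {"A": 1.0, "B": 1.0, "C": 1.0}                        # utility weights
x = {s: 1.0 for s in routes}                              # initial rates
STEP = 0.05

def link_price(load, capacity):
    # simple convex penalty: price grows as the link approaches capacity
    return max(0.0, load - 0.9 * capacity) * 5.0

for _ in range(2000):
    load = {l: sum(x[s] for s, path in routes.items() if l in path) for l in links}
    for s, path in routes.items():
        price = sum(link_price(load[l], links[l]) for l in path)
        # gradient step on w_s*log(x_s) - x_s*price, i.e. derivative w_s/x_s - price
        x[s] = max(0.01, x[s] + STEP * (w[s] / x[s] - price))

print({s: round(rate, 2) for s, rate in x.items()})
```

The single-path source sharing each link ends up with a higher rate than the two-link source, which is the usual proportional-fairness outcome; in the real network the "price" arrives implicitly as loss or ECN marks on round-trip timescales.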

Tuesday, November 08, 2022

Principles of Communications Week 6 L10/L11 8&10 Nov, 2022 - Scheduling and Queue Management

I am going to loop back around to flow and congestion control, because these things go together like strawberries and cream, or meta and verse:-)

Flow control can be open loop (call setup with a traffic descriptor and an associated admission control algorithm), or closed loop, based on feedback (dupack, packet loss/timeout, or explicit congestion notification).

Packet forwarding can be FIFO, or Round Robin, or weighted according to some request (management, setup, payment, etc - out-of-band). If it is round robin, we get fairness quite naturally, and some degree of protection against misbehaviour of other flows. If the flows are policed (because they gave a descriptor in open loop, or because they implement congestion avoidance and control), then we get more protection against misbehaviour of other flows of packets. If the router implements an Active Queue Management scheme (like RED), as well as a schedule (like round robin), then we get more protection (even against our own misbehaviour).
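As a concrete illustration of the AQM piece, here is a minimal sketch of the RED drop decision (the thresholds and weight are illustrative, not recommended values): the average queue length, not the instantaneous one, drives early random drops or ECN marks, signalling senders before the queue overflows.

```python
# Sketch of the RED (Random Early Detection) drop decision: keep an EWMA of
# the queue length and drop (or ECN-mark) arrivals with a probability that
# rises between min_th and max_th. Parameters are illustrative only.
import random

class REDQueue:
    def __init__(self, min_th=5, max_th=15, max_p=0.1, wq=0.002):
        self.min_th, self.max_th, self.max_p, self.wq = min_th, max_th, max_p, wq
        self.avg = 0.0
        self.queue = []

    def enqueue(self, pkt):
        # exponentially weighted moving average of the instantaneous queue length
        self.avg = (1 - self.wq) * self.avg + self.wq * len(self.queue)
        if self.avg < self.min_th:
            drop_p = 0.0
        elif self.avg >= self.max_th:
            drop_p = 1.0
        else:
            drop_p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        if random.random() < drop_p:
            return False       # dropped (or marked) as an early congestion signal
        self.queue.append(pkt)
        return True

    def dequeue(self):
        return self.queue.pop(0) if self.queue else None
```

Combined with a per-flow schedule like round robin, this is what gives the extra protection mentioned above, even against our own misbehaviour.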

A neat example of use of scheduling in a different layer is in the new web protocol, QUIC, in browsers - this paper illustrates nicely how one can improve rendering of pages by changing the order that components flow through the layers of HTTP, QUIC, to/from UDP/IP...

Tuesday, November 01, 2022

Principles of Communications Week 5 L8/L9 1&3 Nov, 2022 - Mobile&Random Routing + Open&Closed Loop Flow Control

We've looked at unicast (one direction), multicast (some directions), broadcast (all directions), mentioned anycast (any direction), and mobile (one level of redirection). We've also looked at metric v. random based route choice, and some policy interactions with traffic engineering (preferences that differ from policy or metric).

Note - slide 228 lists R (trunk reservation) with the opposite sense to r on the y-axes of the graphs showing the diminishing return of the impact of trunk reservation.


Next, we look at open-loop and closed-loop flow control.  History of feedback controllers is ancient!

A fun thing to look at is the BSD TCP kernel code - this book by Rich Stevens is a walk-through of it!

Wednesday, October 26, 2022

Principles of Communications Week 4 L6/7 25&27 Oct, 2022 - BGP Abstraction + Multicast

 This week, we finished up looking at BGP by covering:

  1. the stable paths abstraction - what does path vector+policy do, algorithmically?
  2. real world dynamics and engineering for stability - the latter performance challenges show that the information hiding goals of inter-domain routing have not really been achieved very thoroughly! So intra-domain dynamics get exported to the world, because of hot potato and MEDs.
As well as this, BGP requires a lot of external machinery for specifying policy, and for guarding against misinformation (by definition, harder to do with a system which has a goal of information hiding).


Then we looked at Internet Multicast - this also has some interesting challenges, which have prevented widespread adoption:

  1. potential use for leveraging DDoS attacks
  2. lack of a policy/interdomain/business model (who causes downstream traffic?)
In addition, multicast created headaches for high speed switch/router hardware designers

The general lesson here is that global-scale systems entail complex, multi-factor considerations - don't believe everything out there running the world is actually based on completely sound design. Nevertheless, multicast has application in special cases/limited domains, such as backbones for TV distribution and, more especially, data center networks.

----admin:

Aside: just to support this, the university information service video recording system failed to capture audio & slides for the 20.10.22 lecture, but luckily last year's recordings are still available, and roughly from the same dates (2021 recordings) e.g. 2021-10-19 2021-10-21 2021-10-26 

Apologies for that!

Thursday, October 20, 2022

Principles of Communications Week 3 L4/5 18&20 Oct, 2022 - Interdomain routing and BGP.

We started looking at the requirements for Interdomain routing - it has to provide connectivity between Autonomous Routing Domains which may be optimising paths for irreconcilable metrics/goals, so it can, at best, find paths that fit policies - it also respects business relationships - again, injected through policy/configuration somewhere (vendor/product specific tools!). Mistakes can be made. These are mapped into the various hacks to allow BGP to do some traffic engineering, like backup paths, load balancing and so on.

BGP is a path vector protocol, which has a plus point of hiding information, by default. And also has the minus point of hiding information by default:-) Being path vector (rather than just distance vector) BGP is at least loop free. However, it is still a diffusing computation, rather than a fixed epoch cycle, so news can travel slowly (or not at all). 
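The loop-freedom property is easy to see in a sketch: a path-vector speaker simply rejects any advertisement whose AS path already contains its own AS number (the standard rule, shown here with made-up AS numbers).

```python
# Minimal sketch of path-vector loop avoidance: an AS refuses any advertised
# path that already contains its own AS number.
def accept_route(my_asn, advertised_as_path):
    return my_asn not in advertised_as_path

# AS 65003 receives two advertisements for the same prefix
print(accept_route(65003, [65001, 65002]))          # True  -> usable path
print(accept_route(65003, [65001, 65003, 65002]))   # False -> would loop, rejected
```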

Remember, what you announce is what other people look at to see how to get to things in, or via, your network/AS.

It is a classic political-technical space, where simple algorithmic solutions present themselves, but are dismissed because they don't include the competitive/incentive sides.

There have also been alternative approaches proposed to information hiding (advertise everything, but encrypt it and only give keys to people allowed to use this particular AS on a path) - but that wouldn't scale.

Useful further reading/notes from this MIT BGP book.

BGP has been around for some 30 years, and when it was introduced, was probably the world's first planetary-scale programmable system. This is not necessarily a good thing, when the language is obscure, and the runtime is compromised.

Friday, October 14, 2022

Principles of Communications 2022. some high level questions

someone kindly reminded me I did have some high-level problems/questions you could tackle when looking at the lecture material - Questions to ask yourself about the materials in Princ of Comm

Tuesday, October 11, 2022

Principles of Communications Week 2 L2 L3@LT2 11am, 11&13 Oct, 2022 - Central Routing and MPLS&Segment Routing

 Recommend revisiting notes from IB networks course 2021-2022 where you can find link-state and distance vector from page 25 onwards!

So underlying distance vector (and path vector, see later when we discuss BGP and Inter-domain routing) is the notion of diffusing computations! Link-state is much simpler: in each epoch, we flood map data, and then everyone does a synchronised computation of shortest paths from themselves to all other nodes/destinations - Fibbing relies on this, although the offline/centralised computation that produces additional virtual nodes can be anything - and this is also true of the paths computed for MPLS and segment routing and then distributed via CR-LDP or RSVP or other control protocols. One reference approach is in the paper linked from the course materials web page on DeFo... but there are many, once you open the network up to being re-programmed!

The key difference between Fibbing and classical SDN is that Software Defined Network controllers use OpenFlow (effectively an RPC like protocol) to add/modify FIBs in specific routers, whereas Fibbing simply joins in the Link State flooding protocol that all the routers use to update each other with the map (put into the RIB, then used to compute FIB entries, e.g. via Dijkstra Shortest Path algorithm). Hence the virtual nodes/and edges that the Fibbing controller adds behave just like real ones - appearing and disappearing as they are added or fail/remove, or the controller(s) adding them become unreachable (or reachable again after repair/reboot) - this is where the resilience comes from....that, and replicating the controllers.
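A small sketch of that pipeline - flooded map in, Dijkstra out - with one hypothetical virtual node of the kind a Fibbing controller might inject (the topology and costs are invented): because the virtual node appears in the same map every router floods, re-running the same shortest-path computation moves the traffic, with no per-router RPCs needed.

```python
# Sketch: shortest paths from the flooded link-state map via Dijkstra, with a
# hypothetical virtual node "V" injected Fibbing-style to pull traffic for D
# onto the path through C. Topology and costs are made up.
import heapq

def dijkstra(graph, src):
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 2},
    "C": {"A": 2, "D": 4},
    "D": {"B": 2, "C": 4},
}
print(dijkstra(graph, "A"))   # D is reached at cost 3, i.e. via B

# hypothetical virtual node injected into the flooded map: it looks like a
# cheap detour through C, so every router's Dijkstra now prefers that side
graph["C"]["V"] = 0
graph["V"] = {"D": 0}
print(dijkstra(graph, "A"))   # D now at cost 2, pulled onto the path via C
```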

Note for this course: Hierarchy is good for scaling, but doesn't always work when a network wants diversity - so we could have geographic, topological or organisational hierarchy, but we might want redundancy without common mode failures, so would connect to multiple providers "upstream", which undermines any notion of address aggregation in the forwarding tables! Adding in widespread use of NATs (network address translators) and the world is a lot more complex than we'd like!

Thursday, October 06, 2022

Principles of Communications Week 1 L1@LT2 11am, 6 Oct, 2022 - Introduction and Start on Routing

 Introduction to the course/outline of what will be covered.

Introduction to routing - up to the basic idea of routing in general - not just networks, and not just packets and circuits, but also cars and spaceships. Not just for shortest path, but also policy and energy.

Next week will cover features of current Internet Routing, and Centralised Routing, and start on Policy Routing.

This might be a good time to quickly revise IB Networking material (at least on shortest path/Dijkstra, and link state/distance vector!).

Friday, September 30, 2022

Explainable AI & Quantum

 Much "AI" in use today is basic machine learning, which is frequently simply statistics, which have been in use since the first actuarial and ordnance tables were devised to predict risk or targets...so several hundred years (or longer if the Greeks or Phoencians or Minoans used same math they had for astronomy for landing greek fire on other ships accurately, or insure their ships' cargo against storm damage...


As for autonomy, this occurred as soon as someone built a feedback loop: my favourite paper on this is James Clerk Maxwell's Royal Society paper On Governors, back in 1868.



For "black box" (as in "inexplicable" AI)- this is true of any system so complex that few or no single human understands all of it - so pretty much any smart phone (without even getting into what goes on in the camera s/w) - nobody could both design the chip and write

the OS (actually i think i know one person, but he's probably the last).



A "deep learning" (aka neural net) is usually explainable if someone just spends the time and energy (it is computationally pricey) - two techniques


1/ Shapley values - for example here



2/ energy landscapes - as in this

https://www.pnas.org/doi/full/10.1073/pnas.1919995117


Roughly, you can think about computing significant changes in the entropy of the net at any step in training, and then, using Shapley values on the input features, identify what caused that change in the neural network (analogous to change-point detection, with thresholds etc.) and then export the feature list + decision as a new branch in a decision tree or random forest (for example).
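To make the Shapley part concrete, here is a small, self-contained sketch that computes exact Shapley values for a toy model by brute force over feature coalitions (only feasible for a handful of features; real explainers such as the shap library approximate the same quantity). The model and baseline below are invented purely for illustration.

```python
# Exact Shapley values for a toy model by brute force over feature coalitions.
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Average marginal contribution of each feature over all coalitions;
    features outside the coalition are held at their baseline value."""
    n = len(x)
    values = [0.0] * n

    def eval_coalition(S):
        masked = [x[j] if j in S else baseline[j] for j in range(n)]
        return model(masked)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                values[i] += weight * (eval_coalition(set(S) | {i}) - eval_coalition(set(S)))
    return values

# toy "model": a weighted sum with an interaction term between features 0 and 1
model = lambda f: 2.0 * f[0] + 1.0 * f[1] + 0.5 * f[0] * f[1]
print(shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0]))
# -> [2.25, 1.25, 0.0]: the irrelevant third feature gets zero attribution
```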


So interestingly, once you have built an explanation for a neural net, you can often replace the thing with a random forest or other directly (i.e. self-explaining) approach - this was sort of obvious too, once people realised you could massively compress neural nets (even using lossy compression algorithms, like video ones), suggesting most of the links and weights were redundant.


So here's the quantum bit (pun intended) - the problem with computing Shapley values and energy landscapes at every step in the training iteration is that it is very expensive (compared to the training itself), so if we have to do it often, this is unsustainable.


However, these (especially the energy landscape) might be amenable to computation by an analog quantum computer (described herein), perhaps making this affordable. Analog quantum computers are available, and indeed have been applied to expensive problems like the transfer function of a 3D space to multiple radios (see the Princeton work on this).

Tuesday, June 21, 2022

the philosophical problems of matter transportation and time travel

There's a great book by Harry Harrison called One Step From Earth, where he goes through pretty much every possible philosophical interpretation of how matter transporters could/might work, including that they might just copy a person rather than "move" them...


There's also at least one short story by Philip K. Dick where every time someone time "travels", they are actually sending a copy of themselves to the next time - the story ends in havoc as the population of the world ends up being dominated by millions of different-age copies of the first time traveller...


The problem with this is that there's a point where philosophy meets physics or chemistry or biology.


As in the play, of course, your cells are replaced over time, so that you are not the same set of components that you were (say) 7 years ago - but this is also true at the level of particles, only far faster....


And also at the level of cognition (given brain plasticity, and how it is believed memory really works, we are continually modifying who we "are" as we experience new things, even just recalling old things)...



On a more interesting take about paradox, Behold the Man, by Michael Moorcock was pretty ingenious...and for paradox busting par excellence, Robert Heinlein's classic By His Bootstraps is the business...


So taking the memory model as a template for the philosophical challenge of continuous identity, it seems to me that there are really several problems:


  • Fidelity

if something is a high-fidelity copy of the previous instance of a person, and the previous version is replaced, then from everyone else's point of view, this is the same person.


  • Memory

if the next instance of a person has the same (or very similar) memories to the previous instance, then they can delude themselves that they are the same person, as they will have the illusion of continuous existence - this is actually no different from how vision works, where your visual cortex has to make an apparently coherent, spatially and temporally continuous visual space out of what your eyes/retina detect, despite that input being intermittent and imperfect...


  • Consciousness

this is very tricky, since the locus of attention moves ahead (anticipation etc.) as well as behind (memory) the current moment... however, if the brain is just a machine, then it is reasonable for the model it runs of the world to include prediction, and that model itself is copied from instance to instance of the person, providing the illusion of continued consciousness...

Thursday, April 14, 2022

onboard, board, offboard, outboard & knowledge base

there was an interesting internal tech talk recently at the Turing Institute by a fairly recent addition to the research engineering group, who had a lot of previous experience with knowledge-base technologies in various different organisations, and was being mildly critical of the system that had evolved here.


One thing struck me about this was that however you construct such a system, much of it (like an iceberg) is not in the visible components, but is reflective of how people use/navigate/update the knowledge, which is a shared delusion (like William Gibson's depiction of cyberspace in Neuromancer) - not in a bad way, but the longer the system exists, the harder it is for new people to acclimatise to it. Large parts of the structural information used by people to work with it are in their heads, not online.

so the system could automatically document how different kinds of users use it, by keeping breadcrumb/paper trails (you can of course do this in a wiki) and then doing some kind of statistical analysis to surface common, distinct modes/patterns explicitly. This could even be done in a privacy-preserving way by combining federated learning (e.g. in client-side tools, or browsers etc.) with differential privacy, perhaps - see the sketch below....
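A minimal sketch of that last idea, with invented path names and an illustrative privacy budget: each client adds Laplace noise to its local navigation-path counts before sharing (differential privacy), and only noisy counts are aggregated centrally. Real federated learning would train a shared model rather than just sum counts, but the privacy mechanism has the same shape.

```python
# Minimal sketch (all names and parameters illustrative): clients perturb their
# local navigation-path counts with Laplace noise before sharing, so the
# aggregator only ever sees differentially-private counts.
import random
from collections import Counter

EPSILON = 1.0                       # per-client privacy budget (illustrative)

def noisy_counts(local_paths, epsilon=EPSILON):
    counts = Counter(local_paths)
    # sensitivity of a count is 1, so Laplace noise of scale 1/epsilon;
    # the difference of two Exp(epsilon) variates is Laplace(0, 1/epsilon)
    return {p: c + random.expovariate(epsilon) - random.expovariate(epsilon)
            for p, c in counts.items()}

def aggregate(client_reports):
    total = Counter()
    for report in client_reports:
        for path, c in report.items():
            total[path] += c
    return total

clients = [
    ["home->search->page", "home->search->page", "home->faq"],
    ["home->search->page", "home->faq", "home->faq"],
]
print(aggregate([noisy_counts(c) for c in clients]).most_common())
```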


a project for an intern?

Tuesday, April 12, 2022

metaphorical computing considered lazy

there's a story that originally ECT was discovered as a way to treat manic people, after someone observed that chimps in captivity who got that way, but also had epileptic fits, were calmer after a fit. So then, realising you could induce something that looked like a fit in chimps, and therefore likely in people, the treatment was born, and many people suffered from this ludicrous idea for decades - I heard that more recently ECT has been somewhat rehabilitated and isn't used as a means to control unruly patients but actually has therapeutic value, but the origin tale is still alarming.


so what about other ideas that are based in leaky reasoning, for example...

artificial neural networks as a way to build classifiers? not with anything like the same node degree distribution or mechanism for firing whatsoever - so why would one build so many ANNs, almost none of which bear any resemblance to what goes on in our heads?

evolutionary programming (e.g. GP/GA) as a way to do optimisation? but note evolution is about natural selection of anything that fits the niche in the environment - that doesn't make it an optimisation at all, just a choice.

bio-inspired search, e.g. based on ants trailing pheromones? as with evolution, this is a blind process that assumes nothing about the setup, and is mind-bogglingly wasteful.


Are there actually any vaguely sustainable ways of tackling these tasks (classifiers, optimisation, search) - of course there are...