Wednesday, September 19, 2018

comprehensible AI - or explicable ML, or understandable computerised statistics

How do we know that the underlying machine that does automated decision making isn't wrong?



Well, most machine learning (ML) is about as sophisticated as the stuff people used to do with
SPSS (then S, then R, and sometimes Matlab or Python libraries), and
consists of linear regression - occasionally something a tiny bit cleverer like naive Bayes or random forests. Verifying the code, and its properties after various data have been fed in, is very simple. Indeed, the vast majority of things people are doing with computers doing "AI" (not artificial intelligence, just stuff that does statistics on data, dynamically, and usually gets better at it) are things humans did with pencil, paper, and tables hundreds of years ago, and with mechanical calculators 100+ years ago.
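To make that concrete, here's a minimal sketch (assuming numpy and scikit-learn, which aren't named above) of why such models are easy to audit: every learned parameter is sitting there to be inspected once the data have been fed in.

    # A minimal sketch of auditing a "classical" model: the whole decision
    # procedure is a handful of numbers you can read off directly.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                  # three input features
    y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

    model = LinearRegression().fit(X, y)

    # Nothing hidden: inspect the learned parameters and any prediction.
    print("coefficients:", model.coef_)
    print("intercept:   ", model.intercept_)
    print("prediction for [1, 1, 1]:", model.predict(np.array([[1.0, 1.0, 1.0]])))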




People don't use deep learning (e.g. convolutional neural nets) in these systems
for much outside of some image classification (it's a cat!)... in
practice, at least as far as we can tell, there's no need, plus the
training time, data, and costs (energy/processing) for neural networks
are awful, and they are badly thrown by adversarial input.
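To see what "badly thrown by adversarial input" means, here is a toy, numpy-only illustration (a hypothetical linear classifier, not any real system) of an FGSM-style attack: a nudge that is small per feature accumulates across many features and flips the output.

    # Toy illustration of adversarial fragility (fast-gradient-sign style).
    import numpy as np

    rng = np.random.default_rng(0)
    d = 200
    w = rng.normal(size=d)            # weights of a toy logistic classifier
    x = 0.05 * np.sign(w)             # an input the model classifies confidently

    def predict_proba(x):
        return 1.0 / (1.0 + np.exp(-(w @ x)))   # sigmoid of the score

    p = predict_proba(x)
    print("original prediction:", p)            # close to 1.0

    # Gradient of the logistic loss (true label 1) w.r.t. the input is (p - 1) * w;
    # stepping along its sign is the classic fast-gradient-sign perturbation.
    epsilon = 0.06
    x_adv = x + epsilon * np.sign((p - 1.0) * w)
    print("adversarial prediction:", predict_proba(x_adv))   # now below 0.5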

If you must use things like that, and are interested in their
properties, one approach (actually from the Turing Institute),
now deployed in Google's code, is described here:
https://ai.googleblog.com/2018/09/the-what-if-tool-code-free-probing-of.html
It generalises to other systems, and has been scaled up by folks at DeepMind
to cope with systems with many dimensions:
https://arxiv.org/abs/1802.08139
Google is also using a slightly different approach:
https://beenkim.github.io/
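The What-If Tool itself is interactive and code-free; the sketch below is not its API, just the same counterfactual idea done by hand on a made-up scikit-learn model: take one example, sweep a single feature, and watch the prediction move.

    # Hand-rolled counterfactual probing in the spirit of the What-If Tool.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 4))
    y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)   # a made-up decision rule
    model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

    example = X[0].copy()
    feature_to_probe = 0
    for value in np.linspace(-2, 2, 9):
        probe = example.copy()
        probe[feature_to_probe] = value             # the "what if" edit
        p = model.predict_proba(probe.reshape(1, -1))[0, 1]
        print(f"feature {feature_to_probe} = {value:+.2f} -> P(class 1) = {p:.2f}")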

That's not verification in the sense a computer scientist would mean - it's
empirical/evidential (there are some more complex approaches for
that sort of thing)...

You'd definitely need to keep checking the system's behaviour as it
is trained further, as a neural net can suddenly switch its behaviour if the
input is significantly novel - an illustration of this is in Wales's
work in Cambridge on energy landscapes - see
https://www.ch.cam.ac.uk/group/wales/publications
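One cheap way to "keep checking" (a sketch only, and nothing to do with the energy-landscape analysis above) is to flag inputs that sit far outside the training data before trusting the model's answer on them, e.g. by Mahalanobis distance with an assumed threshold:

    # Flag novel inputs before trusting the model on them.
    import numpy as np

    rng = np.random.default_rng(2)
    X_train = rng.normal(size=(1000, 5))       # whatever the model was trained on

    mean = X_train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))

    def is_novel(x, threshold=5.0):
        """Mahalanobis distance from the training distribution."""
        d = x - mean
        return float(np.sqrt(d @ cov_inv @ d)) > threshold

    print(is_novel(rng.normal(size=5)))      # typical input -> False
    print(is_novel(np.full(5, 10.0)))        # wildly novel input -> True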

Or you can break the system down based on a model
and generate training sets (or partition the neural net), using either GANs or segmentation:
https://deepmind.com/blog/moorfields-major-milestone/
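As a small stand-in for the "generate training sets from a model" idea (using sklearn's GaussianMixture rather than a GAN, purely for brevity): fit a generative model to real data, then sample as many synthetic training examples as you need from it.

    # Fit a simple generative model and sample a synthetic training set.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)
    real_data = np.concatenate([rng.normal(-2, 0.5, size=(300, 2)),
                                rng.normal(+2, 0.5, size=(300, 2))])

    generator = GaussianMixture(n_components=2, random_state=0).fit(real_data)
    synthetic_X, synthetic_component = generator.sample(1000)   # new training set
    print(synthetic_X.shape, synthetic_component.shape)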

I have no idea how much anyone in the Real World uses
general model inferencers (as per MCMC/Metropolis-Hastings and
probabilistic programming techniques), but these are relatively
transparent in what they output too - we need to map the communities of practice, asap.
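On the transparency point, here's a minimal random-walk Metropolis-Hastings sampler for a toy one-dimensional posterior: the output is just a chain of samples you can inspect, plot, and summarise directly.

    # Minimal random-walk Metropolis-Hastings on a toy 1-D target.
    import numpy as np

    def log_posterior(theta):
        # Toy target: a standard normal "posterior" (stand-in for a real model).
        return -0.5 * theta ** 2

    rng = np.random.default_rng(4)
    theta = 0.0
    samples = []
    for _ in range(10_000):
        proposal = theta + rng.normal(scale=1.0)            # random-walk proposal
        log_accept = log_posterior(proposal) - log_posterior(theta)
        if np.log(rng.uniform()) < log_accept:              # MH accept/reject step
            theta = proposal
        samples.append(theta)

    samples = np.array(samples)
    print("posterior mean ~", samples.mean(), " sd ~", samples.std())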

This all needs documenting in a decent way because it is (increasingly) not magic/fiddle factors/pixie dust....

However, here's a reason to care about running a Butlerian Jihad:

https://anatomyof.ai/
