Artificial Intelligence (AI) and machine learning systems have been shown to be susceptible to biased and discriminatory decision-making. In response, calls to make algorithmic decision-making systems more transparent, explainable, and therefore accountable can be seen in academic literature, policy proposals and laws.
Explanations are viewed as an ideal mechanism to enhance the accountability of algorithmic systems. This talk provides an overview of current methods for explaining decisions and outputs of AI systems.
Much of this current work is akin to scientific modelling rather than the sort of reason-based, discursive explanations we expect from people. These methods thus have limited utility for non-expert individuals affected by the outputs and decisions of algorithmic systems.
A gap thus exists between current methods and the ‘ideal’ explanations described in philosophy, psychology, and the cognitive sciences; this talk argues that AI and ML researchers urgently need to turn towards ‘contrastive’, user-oriented explanations.
Using ‘counterfactual explanations’ may be one way to bridge this gap between the expectations of users and affected individuals on the one hand, and the explanation methods currently available in the machine learning community on the other. This approach bypasses many current barriers to interpretability in ‘black box’ models, strikes a balance between transparency and the rights and freedoms of others (e.g. privacy, trade secrets), and meets and exceeds the legal requirements of the EU General Data Protection Regulation.
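To make the idea concrete, the sketch below illustrates one way a counterfactual explanation can be computed; it is an assumption on my part rather than the method presented in the talk. It trains a stand-in scikit-learn classifier on synthetic data and searches for the smallest change to an input that would have led to the desired outcome, loosely following the general form of a counterfactual objective (a term pushing the prediction towards the desired class plus a distance penalty keeping the counterfactual close to the original). All names (find_counterfactual, lam) are illustrative.

import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in 'black box': a classifier trained on synthetic data.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

def find_counterfactual(model, x, desired_class, lam=10.0):
    """Search for x' close to x for which the model predicts desired_class.

    Objective: lam * (p(desired_class | x') - 1)^2 + ||x' - x||_1,
    trading off reaching the desired outcome against the size of the
    change asked of the individual. In practice lam would be tuned,
    e.g. increased until the prediction actually flips.
    """
    def objective(x_prime):
        p = model.predict_proba(x_prime.reshape(1, -1))[0, desired_class]
        return lam * (p - 1.0) ** 2 + np.abs(x_prime - x).sum()

    # Gradient-free local search: only the model's predictions are queried,
    # never its internals, which is what lets the approach treat the model
    # as a black box.
    return minimize(objective, x0=x.copy(), method="Nelder-Mead").x

# Take an instance currently classified as 0 and ask: what minimal change
# to its features would have produced class 1 instead?
x = X[model.predict(X) == 0][0]
x_cf = find_counterfactual(model, x, desired_class=1)
print("original prediction:", model.predict(x.reshape(1, -1))[0])
print("counterfactual prediction:", model.predict(x_cf.reshape(1, -1))[0])
print("feature changes:", np.round(x_cf - x, 3))

The printed feature changes are the explanation itself: a statement of the form ‘had these features been different by this much, the decision would have been different’, which can be communicated to an affected individual without disclosing the model.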