Master of Science
Deep neural networks are traditionally considered “black-box” models: it is generally difficult to interpret why such a model makes a particular decision for a given test instance. However, as deep learning increasingly becomes the tool of choice for safety-critical and time-critical decisions, such as perception for self-driving cars, the machine learning community has recently taken a strong interest in building interpretation mechanisms for these so-called black-box deep learning models, primarily to build users’ trust in them. Many such mechanisms have been developed to explain the behavior of deep models such as convolutional neural networks (CNNs) and to provide visual interpretations of their classification decisions. However, there is still no consensus in the community on the specific goals and performance metrics for interpretability mechanisms. In this thesis, we review the recent literature to arrive at a formal definition of the “interpretability problem” for CNNs with the help of several axioms. We observe that many recently proposed mechanisms do not adhere to these axioms of interpretability and are therefore not robust in performance. In this context, we propose a framework to test interpretation algorithms under model perturbation and data perturbation. This framework tests the “sensitivity” of the algorithms and helps evaluate “implementation invariance”, both desired characteristics of any interpretability mechanism. We demonstrate our framework on two well-known algorithms, “Saliency Maps” and “Grad-CAM”, and introduce a new interpretability technique, the “Forward-Backward Interpretability algorithm”, which provides a systematic framework for visualizing information flow in deep networks. Finally, we present visualization and interpretability results for an impactful scientific application: microstructure-property mapping in materials science.
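The data-perturbation test described in the abstract can be sketched in miniature. The following is an illustrative example only, not the thesis's actual framework: a tiny two-layer ReLU network (with hypothetical random weights) stands in for a CNN, the gradient of the class score with respect to the input stands in for a saliency map, and a relative-change score measures how much the explanation moves under a small input perturbation (smaller is more robust).

```python
import numpy as np

# Hypothetical two-layer ReLU network standing in for a trained CNN;
# the weights are random because this only illustrates the testing idea.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # hidden layer weights
W2 = rng.normal(size=(8,))     # output layer weights

def score(x):
    """Scalar class score of the toy network."""
    h = np.maximum(W1 @ x, 0.0)    # ReLU hidden layer
    return W2 @ h

def saliency(x):
    """Gradient-based saliency: |d score / dx|, computed analytically.

    For this network, d score / dx = W1^T (W2 * relu'(W1 x)).
    """
    mask = (W1 @ x > 0).astype(float)   # ReLU derivative
    return np.abs(W1.T @ (W2 * mask))

x = rng.normal(size=4)
s_clean = saliency(x)
s_noisy = saliency(x + 0.01 * rng.normal(size=4))  # small data perturbation

# Sensitivity score: relative change of the explanation under the
# perturbation. A robust interpretability method should keep this small.
sensitivity = np.linalg.norm(s_noisy - s_clean) / np.linalg.norm(s_clean)
print(sensitivity)
```

The same recipe extends to model perturbation (adding noise to `W1`/`W2` instead of `x`) and, with automatic differentiation, to real CNNs and attribution methods such as Grad-CAM.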
Apurva Dilip Kokate
Kokate, Apurva Dilip, "A study of interpretability mechanisms for deep networks" (2018). Graduate Theses and Dissertations. 16927.