Campus Units

Computer Science

Document Type

Article

Publication Version

Submitted Manuscript

Publication Date

5-30-2019

Journal or Book Title

arXiv

Abstract

Despite numerous attempts to defend deep-learning-based image classifiers, they remain susceptible to adversarial attacks. This paper proposes a technique to identify susceptible classes: those classes that are more easily subverted. To identify the susceptible classes, we use distance-based measures and apply them to a trained model. Based on the distances among the original classes, we create a mapping between original classes and adversarial classes that significantly reduces the randomness of a model in an adversarial setting. We analyze the high-dimensional geometry among the feature classes and identify the k most susceptible target classes in an adversarial attack. We conduct experiments using the MNIST, Fashion MNIST, and CIFAR-10 (ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques to determine which distance-based measure works best and how the randomness of a model changes with perturbation.
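The abstract does not give implementation details, but one plausible instance of such a distance-based measure is the Euclidean distance between class centroids in a trained model's feature space, with the k nearest classes to a source class taken as its most susceptible adversarial targets. The sketch below illustrates only that idea; the function names and the centroid-based choice of measure are assumptions for illustration, not the authors' exact method.

    import numpy as np

    def class_centroids(features, labels, num_classes):
        """Mean feature vector (centroid) per class.
        features: (N, D) penultimate-layer embeddings from a trained model.
        labels:   (N,) integer class labels.
        """
        return np.stack([features[labels == c].mean(axis=0)
                         for c in range(num_classes)])

    def susceptible_targets(features, labels, num_classes, k=3):
        """For each source class, return the k nearest other classes by
        Euclidean distance between class centroids; under this heuristic,
        closer classes are assumed to be easier adversarial targets."""
        centroids = class_centroids(features, labels, num_classes)
        # Pairwise Euclidean distances between all class centroids.
        diff = centroids[:, None, :] - centroids[None, :, :]
        dist = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(dist, np.inf)  # a class is not its own target
        return np.argsort(dist, axis=1)[:, :k]

    # Toy usage with random embeddings standing in for model features.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(1000, 64))
    labs = rng.integers(0, 10, size=1000)
    print(susceptible_targets(feats, labs, num_classes=10, k=3))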

Comments

This is a preprint made available through arXiv: https://arxiv.org/abs/1905.13284.

Copyright Owner

The Authors

Language

en

File Format

application/pdf
