Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness

Tejas Gokhale, Swaroop Mishra, Man Luo, Bhavdeep Singh Sachdeva, Chitta Baral

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Scopus citations

Abstract

Data modification, whether via additional training datasets, data augmentation, debiasing, or dataset filtering, has been proposed as an effective solution for generalizing to out-of-domain (OOD) inputs in both the natural language processing and computer vision literature. However, the effect of data modification on adversarial robustness remains unclear. In this work, we conduct a comprehensive study of common data modification strategies and evaluate not only their in-domain and OOD performance, but also their adversarial robustness (AR). We also present results on a two-dimensional synthetic dataset to visualize the effect of each method on the training distribution. This work serves as an empirical study towards understanding the relationship between generalizing to unseen domains and defending against adversarial perturbations. Our findings suggest that more data (either via additional datasets or data augmentation) benefits both OOD accuracy and AR. However, data filtering (previously shown to improve OOD accuracy on natural language inference) hurts OOD accuracy on other tasks such as question answering and image classification. We provide insights from our experiments to inform future work in this direction.

Original language: English (US)
Title of host publication: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Findings of ACL 2022
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Publisher: Association for Computational Linguistics (ACL)
Pages: 2705-2718
Number of pages: 14
ISBN (Electronic): 9781955917254
State: Published - 2022
Event: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: May 22 2022 to May 27 2022

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country/Territory: Ireland
City: Dublin
Period: 5/22/22 to 5/27/22

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
