TY - GEN
T1 - Semantically Distributed Robust Optimization for Vision-and-Language Inference
AU - Gokhale, Tejas
AU - Chaudhary, Abhishek
AU - Banerjee, Pratyay
AU - Baral, Chitta
AU - Yang, Yezhou
N1 - Publisher Copyright: © 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
AB - Analysis of vision-and-language models has revealed their brittleness under linguistic phenomena such as paraphrasing, negation, textual entailment, and word substitutions with synonyms or antonyms. While data augmentation techniques have been designed to mitigate these failure modes, methods that can integrate this knowledge into the training pipeline remain under-explored. In this paper, we present SDRO, a model-agnostic method that utilizes a set of linguistic transformations in a distributed robust optimization setting, along with an ensembling technique to leverage these transformations during inference. Experiments on benchmark datasets with images (NLVR2) and video (VIOLIN) demonstrate performance improvements as well as robustness to adversarial attacks. Experiments on binary VQA explore the generalizability of this method to other V&L tasks.
UR - http://www.scopus.com/inward/record.url?scp=85142803958&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142803958&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85142803958
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 1493
EP - 1513
BT - ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Findings of ACL 2022
A2 - Muresan, Smaranda
A2 - Nakov, Preslav
A2 - Villavicencio, Aline
PB - Association for Computational Linguistics (ACL)
T2 - 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Y2 - 22 May 2022 through 27 May 2022
ER -