PatchSwap: A Regularization Technique for Vision Transformers

Sachin Chhabra, Hemanth Venkateswara, Baoxin Li

Research output: Contribution to conference › Paper › peer-review

Abstract

Vision Transformers have recently gained popularity due to their superior performance on visual computing tasks. However, this performance rests on training with huge datasets, and maintaining it on small datasets remains a challenge. Regularization helps alleviate the overfitting that is common when dealing with small datasets, but most existing regularization techniques are designed with ConvNets in mind. Because Vision Transformers process images differently, new regularization techniques crafted for them are needed. In this paper, we propose a regularization technique called PatchSwap, which interchanges patches between two images, producing a new input for regularizing the transformer. Our extensive experiments show that PatchSwap outperforms existing state-of-the-art methods. Further, the simplicity of PatchSwap allows a straightforward extension to a semi-supervised setting with minimal effort.
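The abstract describes the core operation as interchanging patches between two images to form a new training input. A minimal NumPy sketch of that patch-interchange idea is given below; the function name, the random-subset selection, and the use of the swapped fraction as a soft-label mixing weight are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def patchswap(img_a, img_b, patch_size, swap_ratio, rng=None):
    """Sketch of a PatchSwap-style input: replace a random subset of
    img_a's patches with the corresponding patches of img_b.

    Returns the mixed image and the fraction of patches actually
    swapped (which could serve as a label-mixing weight). This is an
    illustrative interpretation of the abstract, not the paper's code.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = img_a.shape
    gh, gw = h // patch_size, w // patch_size   # patch grid dimensions
    n_patches = gh * gw
    n_swap = int(round(swap_ratio * n_patches))

    # Choose which patch positions to take from img_b.
    swap_idx = rng.choice(n_patches, size=n_swap, replace=False)

    mixed = img_a.copy()
    for idx in swap_idx:
        row, col = divmod(int(idx), gw)
        ys, xs = row * patch_size, col * patch_size
        mixed[ys:ys + patch_size, xs:xs + patch_size] = \
            img_b[ys:ys + patch_size, xs:xs + patch_size]
    return mixed, n_swap / n_patches
```

For example, with two 32×32 images, 8×8 patches (a 4×4 grid), and a swap ratio of 0.5, the result keeps 8 patches from the first image and takes 8 from the second.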

Original language: English (US)
State: Published - 2022
Event: 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom
Duration: Nov 21 2022 - Nov 24 2022

Conference

Conference: 33rd British Machine Vision Conference Proceedings, BMVC 2022
Country/Territory: United Kingdom
City: London
Period: 11/21/22 - 11/24/22

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
