Abstract
Vision Transformers have recently gained popularity due to their superior performance on visual computing tasks. However, this performance depends on training with huge datasets, and maintaining it on small datasets remains a challenge. Regularization helps alleviate the overfitting that commonly arises with small datasets. Most existing regularization techniques were designed with ConvNets in mind. As Vision Transformers process images differently, there is a need for new regularization techniques crafted for them. In this paper, we propose a regularization technique called PatchSwap, which interchanges patches between two images, resulting in a new input for regularizing the transformer. Our extensive experiments show that PatchSwap outperforms existing state-of-the-art methods. Further, its simplicity allows a straightforward extension to a semi-supervised setting with minimal effort.
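The abstract describes PatchSwap only at a high level. As a rough illustration, here is a minimal sketch of patch interchange between two images, assuming square non-overlapping patches and a `swap_ratio` hyperparameter controlling how many patches are exchanged; both are assumptions for illustration, not details taken from the paper.

```python
import torch

def patch_swap(img_a, img_b, patch_size=16, swap_ratio=0.5):
    """Sketch of PatchSwap-style patch interchange (details assumed).

    img_a, img_b: tensors of shape (C, H, W), with H and W divisible
    by patch_size. Returns two new images, each with a random subset
    of its patches replaced by the corresponding patches of the other.
    """
    c, h, w = img_a.shape
    gh, gw = h // patch_size, w // patch_size  # patch grid dimensions
    n_patches = gh * gw
    n_swap = int(n_patches * swap_ratio)

    # Randomly choose which patch positions to exchange.
    idx = torch.randperm(n_patches)[:n_swap]

    # View each image as a (C, gh, p, gw, p) grid of patches.
    a = img_a.clone().view(c, gh, patch_size, gw, patch_size)
    b = img_b.clone().view(c, gh, patch_size, gw, patch_size)

    for i in idx.tolist():
        r, col = divmod(i, gw)
        a_patch = a[:, r, :, col, :].clone()
        a[:, r, :, col, :] = b[:, r, :, col, :]
        b[:, r, :, col, :] = a_patch

    return a.reshape(c, h, w), b.reshape(c, h, w)
```

In practice, such a mixing operation would also combine the two images' labels (e.g., in proportion to the swapped area), but the exact labeling scheme used by PatchSwap is not specified in the abstract.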
| Original language | English (US) |
| --- | --- |
| State | Published - 2022 |
| Event | 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom |
| Duration | Nov 21, 2022 → Nov 24, 2022 |
Conference
| Conference | 33rd British Machine Vision Conference Proceedings, BMVC 2022 |
| --- | --- |
| Country/Territory | United Kingdom |
| City | London |
| Period | 11/21/22 → 11/24/22 |
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition