TY - GEN
T1 - XMA
T2 - 59th ACM/IEEE Design Automation Conference, DAC 2022
AU - Zhang, Fan
AU - Yang, Li
AU - Meng, Jian
AU - Seo, Jae Sun
AU - Cao, Yu Kevin
AU - Fan, Deliang
N1 - Funding Information:
This work is supported in part by the National Science Foundation under Grant No.2003749, No.1931871, No. 2144751
Publisher Copyright:
© 2022 ACM.
PY - 2022/7/10
Y1 - 2022/7/10
N2 - ReRAM crossbar array as a high-parallel fast and energy-efficient structure attracts much attention, especially on the acceleration of Deep Neural Network (DNN) inference on one specific task. However, due to the high energy consumption of weight re-programming and the ReRAM cells' low endurance problem, adapting the crossbar array for multiple tasks has not been well explored. In this paper, we propose XMA, a novel crossbar-aware shift-based mask learning method for multiple task adaption in the ReRAM crossbar DNN accelerator for the first time. XMA leverages the popular mask-based learning algorithm's benefit to mitigate catastrophic forgetting and learn a task-specific, crossbar column-wise, and shift-based multi-level mask, rather than the most commonly used element-wise binary mask, for each new task based on a frozen backbone model. With our crossbar-aware design innovation, the required masking operation to adapt for a new task could be implemented in an existing crossbar-based convolution engine with minimal hardware/memory overhead and, more importantly, no need for power-hungry cell re-programming, unlike prior works. The extensive experimental results show that, compared with state-of-the-art multiple task adaption Piggyback method [1], XMA achieves 3.19% higher accuracy on average, while saving 96.6% memory overhead. Moreover, by eliminating cell re-programming, XMA achieves ∼4.3x higher energy efficiency than Piggyback.
AB - ReRAM crossbar array as a high-parallel fast and energy-efficient structure attracts much attention, especially on the acceleration of Deep Neural Network (DNN) inference on one specific task. However, due to the high energy consumption of weight re-programming and the ReRAM cells' low endurance problem, adapting the crossbar array for multiple tasks has not been well explored. In this paper, we propose XMA, a novel crossbar-aware shift-based mask learning method for multiple task adaption in the ReRAM crossbar DNN accelerator for the first time. XMA leverages the popular mask-based learning algorithm's benefit to mitigate catastrophic forgetting and learn a task-specific, crossbar column-wise, and shift-based multi-level mask, rather than the most commonly used element-wise binary mask, for each new task based on a frozen backbone model. With our crossbar-aware design innovation, the required masking operation to adapt for a new task could be implemented in an existing crossbar-based convolution engine with minimal hardware/memory overhead and, more importantly, no need for power-hungry cell re-programming, unlike prior works. The extensive experimental results show that, compared with state-of-the-art multiple task adaption Piggyback method [1], XMA achieves 3.19% higher accuracy on average, while saving 96.6% memory overhead. Moreover, by eliminating cell re-programming, XMA achieves ∼4.3x higher energy efficiency than Piggyback.
KW - in-memory computing
KW - multi-task learning
KW - neural networks
UR - http://www.scopus.com/inward/record.url?scp=85137503426&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137503426&partnerID=8YFLogxK
U2 - 10.1145/3489517.3530458
DO - 10.1145/3489517.3530458
M3 - Conference contribution
AN - SCOPUS:85137503426
T3 - Proceedings - Design Automation Conference
SP - 271
EP - 276
BT - Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 10 July 2022 through 14 July 2022
ER -