TY - JOUR
T1 - On the Accuracy of Analog Neural Network Inference Accelerators [Feature]
AU - Xiao, T. Patrick
AU - Feinberg, Ben
AU - Bennett, Christopher H.
AU - Prabhakar, Venkatraman
AU - Saxena, Prashant
AU - Agrawal, Vineet
AU - Agarwal, Sapan
AU - Marinella, Matthew J.
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform in situ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, it has seldom rigorously evaluated the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the system's resilience to analog non-idealities - cell programming errors, analog-to-digital converter resolution, and array parasitic resistances - uniformly improves when analog quantities in the hardware are made proportional to the numerical values in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by fully storing weights as analog quantities, rather than spreading weight bits across multiple devices, often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
UR - http://www.scopus.com/inward/record.url?scp=85147442832&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147442832&partnerID=8YFLogxK
DO - 10.1109/MCAS.2022.3214409
M3 - Article
AN - SCOPUS:85147442832
SN - 1531-636X
VL - 22
SP - 26
EP - 48
JO - IEEE Circuits and Systems Magazine
JF - IEEE Circuits and Systems Magazine
IS - 4
ER -