TY - GEN
T1 - Simultaneously optimizing weight and quantizer of ternary neural network using truncated Gaussian approximation
AU - He, Zhezhi
AU - Fan, Deliang
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - In recent years, deep convolutional neural networks have achieved great success in many artificial intelligence applications. However, their enormous model size and massive computation cost remain the main obstacles to deploying such powerful algorithms on low-power, resource-limited mobile systems. As a countermeasure, deep neural networks with ternarized weights (i.e., -1, 0, +1) have been widely explored to greatly reduce model size and computational cost with limited accuracy degradation. In this work, we propose a novel ternarized neural network training method that, unlike prior works, simultaneously optimizes both the weights and the quantizer during training. Instead of fixed and uniform weight ternarization, we are the first to incorporate the weight ternarization thresholds into a closed-form representation using a truncated Gaussian approximation, enabling simultaneous optimization of weights and quantizer through back-propagation training. With both the first and last layers ternarized, experiments on the ImageNet classification task show that our ternarized ResNet-18/34/50 incurs only 3.9%/2.52%/2.16% accuracy degradation in comparison to the full-precision counterparts.
AB - In recent years, deep convolutional neural networks have achieved great success in many artificial intelligence applications. However, their enormous model size and massive computation cost remain the main obstacles to deploying such powerful algorithms on low-power, resource-limited mobile systems. As a countermeasure, deep neural networks with ternarized weights (i.e., -1, 0, +1) have been widely explored to greatly reduce model size and computational cost with limited accuracy degradation. In this work, we propose a novel ternarized neural network training method that, unlike prior works, simultaneously optimizes both the weights and the quantizer during training. Instead of fixed and uniform weight ternarization, we are the first to incorporate the weight ternarization thresholds into a closed-form representation using a truncated Gaussian approximation, enabling simultaneous optimization of weights and quantizer through back-propagation training. With both the first and last layers ternarized, experiments on the ImageNet classification task show that our ternarized ResNet-18/34/50 incurs only 3.9%/2.52%/2.16% accuracy degradation in comparison to the full-precision counterparts.
KW - Categorization
KW - Deep Learning
KW - Optimization Methods
KW - Recognition: Detection
KW - Retrieval
UR - http://www.scopus.com/inward/record.url?scp=85072962047&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072962047&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.01170
DO - 10.1109/CVPR.2019.01170
M3 - Conference contribution
AN - SCOPUS:85072962047
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 11430
EP - 11438
BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
PB - IEEE Computer Society
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Y2 - 16 June 2019 through 20 June 2019
ER -