ClosNets: Batchless DNN Training with On-Chip a Priori Sparse Neural Topologies

Mihailo Isakov, Alan Ehret, Michel Kinsy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

The deployment of deep neural network (DNN) models is generally hindered by their training time. DNN training throughput is commonly limited by the fully-connected layers, due to their large size and low data reuse. Large batch sizes are often used to increase weight reuse and mitigate these effects, but increasing the batch size can hurt model accuracy, creating a tradeoff between accuracy and efficiency. We tackle the problem of training DNNs in on-chip memory, allowing us to train models without the use of batching. Pruning and quantizing dense layers can greatly reduce network size, allowing models to fit on chip, but these techniques can only be applied after training. We propose a fully-connected but sparse layer that reduces the memory requirements of DNNs without sacrificing accuracy. We replace each dense weight matrix with a product of sparse matrices with a predetermined topology. This allows us to: (1) train significantly smaller networks without a loss in accuracy, and (2) store weights without having to store connection indices. We therefore achieve significant training speedups due to fast access to on-chip weights, smaller network size, and a reduced amount of computation per epoch.
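For a concrete picture of the layer construction the abstract describes, the following is a minimal NumPy sketch: one dense weight matrix is replaced by the product of two sparse factors whose connectivity is fixed before training, so only the nonzero weights need to be stored and no per-connection indices are kept. The layer sizes, fan-out, and the random fixed mask here are illustrative assumptions; the paper's actual construction uses a Clos-network topology whose parameters are not given in this abstract.

```python
import numpy as np

def fixed_sparse_masks(n_in, n_mid, n_out, fan=4, seed=0):
    """Build two boolean connectivity masks with a predetermined (fixed) topology.

    Each input unit connects to `fan` middle units and each middle unit connects
    to `fan` output units. Because the pattern is decided before training, only
    the nonzero weights need to be stored -- no connection indices.
    (Random masks stand in for the paper's Clos topology, which is an assumption here.)
    """
    rng = np.random.default_rng(seed)
    m1 = np.zeros((n_in, n_mid), dtype=bool)
    m2 = np.zeros((n_mid, n_out), dtype=bool)
    for i in range(n_in):
        m1[i, rng.choice(n_mid, size=fan, replace=False)] = True
    for j in range(n_mid):
        m2[j, rng.choice(n_out, size=fan, replace=False)] = True
    return m1, m2

def sparse_product_layer(x, w1, w2, m1, m2):
    """Forward pass: the dense weight matrix is replaced by (w1*m1) @ (w2*m2)."""
    return (x @ (w1 * m1)) @ (w2 * m2)

# Example: a 512x512 dense layer (262,144 weights) replaced by two masked
# 512x512 factors with fan-out 4 (only 2 * 512 * 4 = 4,096 trainable weights).
n = 512
m1, m2 = fixed_sparse_masks(n, n, n, fan=4)
w1 = np.random.randn(n, n) * m1
w2 = np.random.randn(n, n) * m2
x = np.random.randn(1, n)           # a single example: no batching required
y = sparse_product_layer(x, w1, w2, m1, m2)
print(y.shape)                       # (1, 512)
```

Because the sparsity pattern is known a priori, the masked weights can be laid out contiguously in on-chip memory, which is what enables the batchless training the abstract refers to.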

Original language: English (US)
Title of host publication: Proceedings - 2018 International Conference on Field-Programmable Logic and Applications, FPL 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 55-59
Number of pages: 5
ISBN (Electronic): 9781538685174
DOIs
State: Published - Nov 9 2018
Externally published: Yes
Event: 28th International Conference on Field-Programmable Logic and Applications, FPL 2018 - Dublin, Ireland
Duration: Aug 26 2018 - Aug 30 2018

Publication series

Name: Proceedings - 2018 International Conference on Field-Programmable Logic and Applications, FPL 2018

Other

Other: 28th International Conference on Field-Programmable Logic and Applications, FPL 2018
Country/Territory: Ireland
City: Dublin
Period: 8/26/18 - 8/30/18

Keywords

  • acceleration
  • hardware
  • neural network
  • sparsity

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software
