CAMDNN: Content-Aware Mapping of a Network of Deep Neural Networks on Edge MPSoCs

Soroush Heidari; Mehdi Ghasemi; Young Geun Kim; Carole Jean Wu; Sarma Vrudhula

doi:10.1109/TC.2022.3207137

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Machine Learning (ML) workloads are increasingly deployed at the edge. Enabling efficient inference execution while considering model and system heterogeneity remains challenging, especially for ML tasks built with a network of deep neural networks (DNNs). The challenge is to maximize the utilization of all available resources on the multiprocessor system on a chip (MPSoC) at the same time. This becomes even more complicated because the optimal mapping for the network of DNNs can vary with input batch sizes and scene complexity. In this paper, a holistic hierarchical scheduling framework is presented to optimize the execution time for a network of DNN models on an edge MPSoC at runtime, considering varying input characteristics. The framework consists of a local and a global scheduler. The local scheduler maps individual DNNs in the inference pipeline to the best-performing hardware unit while the global scheduler customizes an Integer Linear Programming (ILP) solution to instantiate DNN remapping. To minimize scheduler runtime overhead, an imitation learning (IL) based scheduler is used that approximates the ILP solutions. The proposed scheduling framework (CAMDNN) was implemented on a Qualcomm Robotic RB5 platform. CAMDNN reduced execution time by up to 32% compared with heterogeneous earliest finish time (HEFT), and by factors of 6.67X, 5.6X and 2.17X compared with the CPU-only, GPU-only and Central Queue schedulers, respectively.
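The difference between mapping each DNN on its own and mapping the whole network jointly can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the DNN names, the latency values, the three processing units, the makespan cost model, and the exhaustive search, which merely stands in for an ILP solver on a toy-sized problem.

```python
from itertools import product

# Hypothetical per-unit latencies in ms; illustrative values, not measurements.
LATENCY_MS = {
    "detector":   {"CPU": 90.0, "GPU": 30.0, "DSP": 45.0},
    "classifier": {"CPU": 40.0, "GPU": 20.0, "DSP": 25.0},
    "tracker":    {"CPU": 35.0, "GPU": 15.0, "DSP": 50.0},
}
UNITS = ("CPU", "GPU", "DSP")


def makespan(mapping):
    """Finish time when DNNs sharing a unit run back to back and units run in parallel."""
    busy = {unit: 0.0 for unit in UNITS}
    for dnn, unit in mapping.items():
        busy[unit] += LATENCY_MS[dnn][unit]
    return max(busy.values())


def local_mapping():
    """Map each DNN independently to the unit on which it runs fastest."""
    return {dnn: min(lat, key=lat.get) for dnn, lat in LATENCY_MS.items()}


def global_mapping():
    """Search every assignment for the smallest makespan (stand-in for an ILP solver)."""
    dnns = list(LATENCY_MS)
    candidates = (dict(zip(dnns, units)) for units in product(UNITS, repeat=len(dnns)))
    return min(candidates, key=makespan)


if __name__ == "__main__":
    for name, mapping in (("local", local_mapping()), ("global", global_mapping())):
        print(name, mapping, makespan(mapping))
```

With these made-up numbers, the per-DNN choice places all three models on the GPU, whereas the joint search spreads them across the units and finishes sooner. This only illustrates why a global remapping step can help; it does not reproduce the paper's ILP formulation or its imitation-learning scheduler.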

Original language	English (US)
Pages (from-to)	3191-3202
Number of pages	12
Journal	IEEE Transactions on Computers
Volume	71
Issue number	12
DOIs	https://doi.org/10.1109/TC.2022.3207137
State	Published - Dec 1 2022

Keywords

DNN serving
IoT
Machine learning
deep neural networks
edge
scheduling

ASJC Scopus subject areas

Software
Theoretical Computer Science
Hardware and Architecture
Computational Theory and Mathematics

Access to Document

10.1109/TC.2022.3207137

Cite this

@article{ec97a65015d4416db127a86aa37d72f0,
  title = "CAMDNN: Content-Aware Mapping of a Network of Deep Neural Networks on Edge MPSoCs",
  abstract = "Machine Learning (ML) workloads are increasingly deployed at the edge. Enabling efficient inference execution while considering model and system heterogeneity remains challenging, especially for ML tasks built with a network of deep neural networks (DNNs). The challenge is to maximize the utilization of all available resources on the multiprocessor system on a chip (MPSoC) at the same time. This becomes even more complicated because the optimal mapping for the network of DNNs can vary with input batch sizes and scene complexity. In this paper, a holistic hierarchical scheduling framework is presented to optimize the execution time for a network of DNN models on an edge MPSoC at runtime, considering varying input characteristics. The framework consists of a local and a global scheduler. The local scheduler maps individual DNNs in the inference pipeline to the best-performing hardware unit while the global scheduler customizes an Integer Linear Programming (ILP) solution to instantiate DNN remapping. To minimize scheduler runtime overhead, an imitation learning (IL) based scheduler is used that approximates the ILP solutions. The proposed scheduling framework (CAMDNN) was implemented on a Qualcomm Robotic RB5 platform. CAMDNN reduced execution time by up to 32% compared with heterogeneous earliest finish time (HEFT), and by factors of 6.67X, 5.6X and 2.17X compared with the CPU-only, GPU-only and Central Queue schedulers, respectively.",
  keywords = "DNN serving, IoT, Machine learning, deep neural networks, edge, scheduling",
  author = "Soroush Heidari and Mehdi Ghasemi and Kim, {Young Geun} and Wu, {Carole Jean} and Sarma Vrudhula",
  note = "Publisher Copyright: {\textcopyright} 1968-2012 IEEE.",
  year = "2022",
  month = dec,
  day = "1",
  doi = "10.1109/TC.2022.3207137",
  language = "English (US)",
  volume = "71",
  pages = "3191--3202",
  journal = "IEEE Transactions on Computers",
  issn = "0018-9340",
  publisher = "IEEE Computer Society",
  number = "12",
}

TY  - JOUR
T1  - CAMDNN: Content-Aware Mapping of a Network of Deep Neural Networks on Edge MPSoCs
AU  - Heidari, Soroush
AU  - Ghasemi, Mehdi
AU  - Kim, Young Geun
AU  - Wu, Carole Jean
AU  - Vrudhula, Sarma
PY  - 2022/12/1
Y1  - 2022/12/1
N2  - Machine Learning (ML) workloads are increasingly deployed at the edge. Enabling efficient inference execution while considering model and system heterogeneity remains challenging, especially for ML tasks built with a network of deep neural networks (DNNs). The challenge is to maximize the utilization of all available resources on the multiprocessor system on a chip (MPSoC) at the same time. This becomes even more complicated because the optimal mapping for the network of DNNs can vary with input batch sizes and scene complexity. In this paper, a holistic hierarchical scheduling framework is presented to optimize the execution time for a network of DNN models on an edge MPSoC at runtime, considering varying input characteristics. The framework consists of a local and a global scheduler. The local scheduler maps individual DNNs in the inference pipeline to the best-performing hardware unit while the global scheduler customizes an Integer Linear Programming (ILP) solution to instantiate DNN remapping. To minimize scheduler runtime overhead, an imitation learning (IL) based scheduler is used that approximates the ILP solutions. The proposed scheduling framework (CAMDNN) was implemented on a Qualcomm Robotic RB5 platform. CAMDNN reduced execution time by up to 32% compared with heterogeneous earliest finish time (HEFT), and by factors of 6.67X, 5.6X and 2.17X compared with the CPU-only, GPU-only and Central Queue schedulers, respectively.
KW  - DNN serving
KW  - IoT
KW  - Machine learning
KW  - deep neural networks
KW  - edge
KW  - scheduling
UR  - http://www.scopus.com/inward/record.url?scp=85139409241&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85139409241&partnerID=8YFLogxK
U2  - 10.1109/TC.2022.3207137
DO  - 10.1109/TC.2022.3207137
M3  - Article
AN  - SCOPUS:85139409241
SN  - 0018-9340
VL  - 71
SP  - 3191
EP  - 3202
JO  - IEEE Transactions on Computers
JF  - IEEE Transactions on Computers
IS  - 12
ER  - 
