TY - JOUR
T1 - A Deep Q-Learning Approach for Dynamic Management of Heterogeneous Processors
AU - Gupta, Ujjwal
AU - Mandal, Sumit K.
AU - Mao, Manqing
AU - Chakrabarti, Chaitali
AU - Ogras, Umit
N1 - Funding Information:
This work was supported by NSF grant CNS-1526562 and Semiconductor Research Corp. task 2721.001.
Publisher Copyright:
© 2002-2011 IEEE.
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Heterogeneous multiprocessor system-on-chips (SoCs) provide a wide range of parameters that can be managed dynamically. For example, one can control the type (big/little), number, and frequency of active cores in state-of-the-art mobile processors at runtime. These runtime choices lead to more than a 10× range in execution time, a 5× range in power consumption, and a 50× range in performance per watt. Therefore, it is crucial to make optimum power management decisions as a function of dynamically varying workloads at runtime. This paper presents a reinforcement learning approach for dynamically controlling the number and frequency of active big and little cores in mobile processors. We propose an efficient deep Q-learning methodology to optimize the performance per watt (PPW). Experiments using the Odroid XU3 mobile platform show that the PPW achieved by the proposed approach is within 1 percent of the optimal value obtained by an oracle.
AB - Heterogeneous multiprocessor system-on-chips (SoCs) provide a wide range of parameters that can be managed dynamically. For example, one can control the type (big/little), number, and frequency of active cores in state-of-the-art mobile processors at runtime. These runtime choices lead to more than a 10× range in execution time, a 5× range in power consumption, and a 50× range in performance per watt. Therefore, it is crucial to make optimum power management decisions as a function of dynamically varying workloads at runtime. This paper presents a reinforcement learning approach for dynamically controlling the number and frequency of active big and little cores in mobile processors. We propose an efficient deep Q-learning methodology to optimize the performance per watt (PPW). Experiments using the Odroid XU3 mobile platform show that the PPW achieved by the proposed approach is within 1 percent of the optimal value obtained by an oracle.
KW - Deep reinforcement learning
KW - Heterogeneous multi-cores
KW - Power management
UR - http://www.scopus.com/inward/record.url?scp=85059963191&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059963191&partnerID=8YFLogxK
U2 - 10.1109/LCA.2019.2892151
DO - 10.1109/LCA.2019.2892151
M3 - Article
AN - SCOPUS:85059963191
SN - 1556-6056
VL - 18
SP - 14
EP - 17
JO - IEEE Computer Architecture Letters
JF - IEEE Computer Architecture Letters
IS - 1
M1 - 8607043
ER -