TY - GEN
T1 - An Optimal Stopping Approach for Iterative Training in Federated Learning
AU - Jiang, Pengfei
AU - Ying, Lei
N1 - Funding Information:
Research supported in part by NSF CNS 1618768, ECCS 1739344, and CNS 2002608.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/3
Y1 - 2020/3
N2 - This paper studies the problem of iterative training in Federated Learning. We consider a system with a single parameter server (PS) and M client devices for training a predictive learning model with data sets distributed across the client devices. The clients communicate with the parameter server over a common wireless channel, so only one device can transmit at a time. The training is an iterative process consisting of multiple rounds. At the beginning of each round (also called an iteration), each client trains the model broadcast by the parameter server with its own data. After finishing training, each client transmits its update to the parameter server when the wireless channel becomes available. The server aggregates the updates to obtain a new model and broadcasts it to all clients to start a new round. We consider adaptive training, where the parameter server decides when to stop the current round and start a new one, and we formulate this decision as an optimal stopping problem. Because the original optimal stopping problem is difficult to solve, we propose a modified optimal stopping problem and develop a low-complexity algorithm that solves the modified problem and also works for the original one. Experiments on a real data set show significant improvements over policies that collect a fixed number of updates in each round.
AB - This paper studies the problem of iterative training in Federated Learning. We consider a system with a single parameter server (PS) and M client devices for training a predictive learning model with data sets distributed across the client devices. The clients communicate with the parameter server over a common wireless channel, so only one device can transmit at a time. The training is an iterative process consisting of multiple rounds. At the beginning of each round (also called an iteration), each client trains the model broadcast by the parameter server with its own data. After finishing training, each client transmits its update to the parameter server when the wireless channel becomes available. The server aggregates the updates to obtain a new model and broadcasts it to all clients to start a new round. We consider adaptive training, where the parameter server decides when to stop the current round and start a new one, and we formulate this decision as an optimal stopping problem. Because the original optimal stopping problem is difficult to solve, we propose a modified optimal stopping problem and develop a low-complexity algorithm that solves the modified problem and also works for the original one. Experiments on a real data set show significant improvements over policies that collect a fixed number of updates in each round.
KW - Distributed Machine Learning
KW - Federated Learning
KW - Optimal Stopping
UR - http://www.scopus.com/inward/record.url?scp=85085247605&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085247605&partnerID=8YFLogxK
U2 - 10.1109/CISS48834.2020.1570616094
DO - 10.1109/CISS48834.2020.1570616094
M3 - Conference contribution
AN - SCOPUS:85085247605
T3 - 2020 54th Annual Conference on Information Sciences and Systems, CISS 2020
BT - 2020 54th Annual Conference on Information Sciences and Systems, CISS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 54th Annual Conference on Information Sciences and Systems, CISS 2020
Y2 - 18 March 2020 through 20 March 2020
ER -
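
For intuition only, the following is a minimal Python sketch of the round structure described in the abstract: a parameter server collects updates arriving one at a time over a shared channel and ends the round with a one-step-lookahead stopping rule. The diminishing-returns utility model, the arrival model, the wait-cost parameter, and all names are assumptions introduced here for illustration; this is not the paper's algorithm.

    # Illustrative sketch only; the utility and arrival models are assumed,
    # and this is NOT the stopping algorithm proposed in the paper.
    import random

    def run_round(num_clients=20, wait_cost_per_sec=0.05, seed=0):
        rng = random.Random(seed)
        # Each client finishes local training and grabs the shared channel at
        # a random time; since only one device can transmit at a time, updates
        # arrive at the PS one by one (sorted arrival times).
        arrivals = sorted(rng.expovariate(1.0) for _ in range(num_clients))

        collected = 0
        clock = 0.0
        for t in arrivals:
            # Assumed diminishing-returns model: the k-th update improves the
            # aggregate by roughly 1/k.
            marginal_gain = 1.0 / (collected + 1)
            # For simplicity the sketch peeks at the true next arrival time;
            # a real PS would use an estimate of the expected wait.
            expected_wait = t - clock
            # One-step lookahead: stop the round if the benefit of the next
            # update no longer covers the cost of waiting for it.
            if marginal_gain < wait_cost_per_sec * expected_wait:
                break
            clock = t
            collected += 1
        return collected, clock

    if __name__ == "__main__":
        n, elapsed = run_round()
        print(f"stopped after {n} updates at t={elapsed:.2f}s")

A fixed-quota baseline, by contrast, would always wait for a preset number of updates regardless of how long the stragglers take, which is the comparison policy mentioned in the abstract.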