TY - JOUR

T1 - Distributed stochastic gradient tracking methods

AU - Pu, Shi

AU - Nedić, Angelia

N1 - Publisher Copyright: © 2020, Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society.

PY - 2021/5

Y1 - 2021/5

N2 - In this paper, we study the problem of distributed multi-agent optimization over a network, where each agent possesses a local cost function that is smooth and strongly convex. The global objective is to find a common solution that minimizes the average of all cost functions. Assuming agents only have access to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method (DSGT) and a gossip-like stochastic gradient tracking method (GSGT). We show that, in expectation, the iterates generated by each agent are attracted to a neighborhood of the optimal solution, where they accumulate exponentially fast (under a constant stepsize choice). Under DSGT, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size n, which is comparable to the performance of a centralized stochastic gradient algorithm. Moreover, we show that when the network is well-connected, GSGT incurs lower communication cost than DSGT while maintaining a similar computational cost. A numerical example further demonstrates the effectiveness of the proposed methods.

AB - In this paper, we study the problem of distributed multi-agent optimization over a network, where each agent possesses a local cost function that is smooth and strongly convex. The global objective is to find a common solution that minimizes the average of all cost functions. Assuming agents only have access to unbiased estimates of the gradients of their local cost functions, we consider a distributed stochastic gradient tracking method (DSGT) and a gossip-like stochastic gradient tracking method (GSGT). We show that, in expectation, the iterates generated by each agent are attracted to a neighborhood of the optimal solution, where they accumulate exponentially fast (under a constant stepsize choice). Under DSGT, the limiting (expected) error bounds on the distance of the iterates from the optimal solution decrease with the network size n, which is comparable to the performance of a centralized stochastic gradient algorithm. Moreover, we show that when the network is well-connected, GSGT incurs lower communication cost than DSGT while maintaining a similar computational cost. A numerical example further demonstrates the effectiveness of the proposed methods.

KW - Communication networks

KW - Convex programming

KW - Distributed optimization

KW - Stochastic optimization

UR - http://www.scopus.com/inward/record.url?scp=85082773201&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85082773201&partnerID=8YFLogxK

U2 - 10.1007/s10107-020-01487-0

DO - 10.1007/s10107-020-01487-0

M3 - Article

AN - SCOPUS:85082773201

SN - 0025-5610

VL - 187

SP - 409

EP - 457

JO - Mathematical Programming

JF - Mathematical Programming

IS - 1-2

ER -