OPTIMIZING (L0, L1)-SMOOTH FUNCTIONS BY GRADIENT METHODS

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We study gradient methods for optimizing (L0, L1)-smooth functions, a class that generalizes Lipschitz-smooth functions and has gained attention for its relevance in machine learning. We provide new insights into the structure of this function class and develop a principled framework for analyzing optimization methods in this setting. While our convergence rate estimates recover existing results for minimizing the gradient norm in nonconvex problems, our approach significantly improves the best-known complexity bounds for convex objectives. Moreover, we show that the gradient method with Polyak stepsizes and the normalized gradient method achieve nearly the same complexity guarantees as methods that rely on explicit knowledge of (L0, L1). Finally, we demonstrate that a carefully designed accelerated gradient method can be applied to (L0, L1)-smooth functions, further improving all previous results.
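For reference, the (L0, L1)-smoothness condition named in the title and abstract is most commonly stated (following Zhang et al., 2020) for a twice-differentiable function f as

\[ \|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x, \]

which recovers standard Lipschitz smoothness with constant L_0 when L_1 = 0; the paper may work with a variant of this assumption. In their standard forms, the two methods highlighted in the abstract as not relying on explicit knowledge of (L0, L1) use the updates

\[ x_{k+1} = x_k - \frac{f(x_k) - f^*}{\|\nabla f(x_k)\|^2}\,\nabla f(x_k) \quad \text{(Polyak stepsize, with } f^* \text{ the optimal value)}, \]

\[ x_{k+1} = x_k - \frac{\gamma_k}{\|\nabla f(x_k)\|}\,\nabla f(x_k) \quad \text{(normalized gradient step with stepsize parameter } \gamma_k\text{)}. \]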

Original language: English (US)
Title of host publication: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 39615-39641
Number of pages: 27
ISBN (Electronic): 9798331320850
State: Published - 2025
Externally published: Yes
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: Apr 24, 2025 - Apr 28, 2025

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 4/24/25 - 4/28/25

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
