Campus Units

Electrical and Computer Engineering, Industrial and Manufacturing Systems Engineering

Document Type

Conference Proceeding

Conference

36th International Conference on Machine Learning

Publication Version

Published Version

Link to Published Version

http://proceedings.mlr.press/v97/lu19a.html

Publication Date

2019

Journal or Book Title

Proceedings of Machine Learning Research

Volume

97

First Page

4134

Last Page

4143

Conference Title

36th International Conference on Machine Learning

Conference Date

June 9-15, 2019

City

Long Beach, CA

Abstract

Alternating gradient descent (A-GD) is a simple but popular algorithm in machine learning, which updates two blocks of variables in an alternating manner using gradient descent steps. In this paper, we consider a smooth unconstrained nonconvex optimization problem and propose a perturbed A-GD (PA-GD) that converges (with high probability) to second-order stationary points (SOSPs) at a global sublinear rate. Existing analyses of A-GD-type algorithms either guarantee convergence only to first-order solutions, or establish convergence to second-order solutions only asymptotically (without rates). To the best of our knowledge, this is the first alternating-type algorithm that takes O(polylog(d)/ϵ²) iterations to achieve an (ϵ, √ϵ)-SOSP with high probability, where polylog(d) denotes a polynomial of the logarithm of the problem dimension d.
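
For readers who want a concrete picture of the scheme the abstract describes, the following is a minimal Python sketch of perturbed alternating gradient descent. The toy objective f(x, y), the fixed perturbation radius, the step size, and the escape_window stopping heuristic are illustrative assumptions, not the paper's PA-GD parameters or its termination rule.

import numpy as np

def pa_gd(grad_x, grad_y, x, y, step=0.01, eps=1e-4,
          radius=1e-2, escape_window=1_000, max_iters=10_000, seed=0):
    """Sketch of perturbed alternating gradient descent: alternate gradient
    steps on two variable blocks, and inject a small random perturbation
    whenever the gradient is nearly zero so the iterates can escape strict
    saddle points (constants here are illustrative, not the paper's)."""
    rng = np.random.default_rng(seed)
    last_perturb = -escape_window  # iteration index of the last perturbation
    for t in range(max_iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        if np.sqrt(np.sum(gx**2) + np.sum(gy**2)) <= eps:
            if t - last_perturb < escape_window:
                # A recent perturbation did not lead us away from this
                # stationary point, so treat it as an approximate SOSP.
                return x, y
            # Small gradient: perturb both blocks to escape a possible saddle
            # (hypothetical fixed-radius schedule).
            x = x + radius * rng.standard_normal(x.shape)
            y = y + radius * rng.standard_normal(y.shape)
            last_perturb = t
            continue
        # Alternating structure: update x first, then update y using the
        # freshly updated x (this is what distinguishes A-GD from joint GD).
        x = x - step * gx
        y = y - step * grad_y(x, y)
    return x, y

# Toy demo (assumed objective, not from the paper):
# f(x, y) = x.T @ y + (||x||^4 + ||y||^4) / 4 has a strict saddle at the
# origin and minimizers with ||x|| = ||y|| = 1 and y = -x.
gx = lambda x, y: y + np.dot(x, x) * x
gy = lambda x, y: x + np.dot(y, y) * y
x_star, y_star = pa_gd(gx, gy, np.zeros(2), np.zeros(2))
print(np.linalg.norm(x_star), np.linalg.norm(y_star))  # both should be close to 1

The key structural point the sketch illustrates is the alternating order of the block updates; the actual PA-GD analysis in the paper uses a more careful perturbation and stopping schedule than the fixed constants assumed here.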

Comments

This proceeding is published as Lu, Songtao, Mingyi Hong, and Zhengdao Wang. "PA-GD: On the convergence of perturbed alternating gradient descent to second-order stationary points for structured nonconvex optimization." Proceedings of Machine Learning Research 97 (2019): 4134-4143. Posted with permission.

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Copyright Owner

The Author(s)

Language

en

File Format

application/pdf
