DS1 spectrogram: Accelerating Single-Pass SGD for Generalized Linear Prediction

Accelerating Single-Pass SGD for Generalized Linear Prediction

March 2, 20262603.01951

Authors

Qian Chen,Shihong Ding,Cong Fang

Abstract

We study generalized linear prediction under a streaming setting, where each iteration uses only one fresh data point for a gradient-level update. While momentum is well-established in deterministic optimization, a fundamental open question is whether it can accelerate such single-pass non-quadratic stochastic optimization.

We propose the first algorithm that successfully incorporates momentum via a novel data-dependent proximal method, achieving dual-momentum acceleration. Our derived excess risk bound decomposes into three components: an improved optimization error, a minimax optimal statistical error, and a higher-order model-misspecification error.

The proof handles mis-specification via a fine-grained stationary analysis of inner updates, while localizing statistical error through a two-phase outer-loop analysis. As a result, we resolve the open problem posed by Jain et al.

[2018a] and demonstrate that momentum acceleration is more effective than variance reduction for generalized linear prediction in the streaming setting.

Resources

Stay in the loop

Get tldr.takara.ai to Your Email, Everyday.

tldr.takara.aiHome·Daily at 6am UTC·© 2026 takara.ai Ltd

Content is sourced from third-party publications.