Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation

May 11, 2026 · 2605.10290

Authors

Alain Durmus, Adrien Hardy, Lucas Morisset

Abstract

This paper analyzes the regularization effect that data augmentation induces on supervised regression methods in the proportional regime, where the number of covariates grows proportionally to the number of samples. We provide a tight characterization of the test error, measured as mean squared error, solely in terms of population quantities of the true data and the first- and second-order statistics of the augmentation scheme.

Our results hold under misspecified feature maps and for any network architecture in which only the last readout layer is trained while the rest of the network is either frozen or randomly initialized. We specialize our results to Gaussian data and show that our asymptotic characterization is tight in this setting.
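
To make the setting concrete, here is a minimal NumPy sketch of random feature regression with a frozen random first layer and a trained ridge readout, evaluated by held-out mean squared error. It is an illustration under assumed choices and not the authors' method or experiments: the ReLU feature map, the additive Gaussian-noise augmentation, the linear teacher, the dimensions, and the function names (`random_features`, `fit_readout`, `augment`, `heldout_mse`) are all hypothetical.

```python
import numpy as np

def random_features(X, W):
    """Frozen first layer: ReLU of a random projection (W is never trained)."""
    return np.maximum(X @ W / np.sqrt(X.shape[1]), 0.0)

def fit_readout(F, y, lam):
    """Ridge regression on the features; only this readout vector is trained."""
    n, p = F.shape
    return np.linalg.solve(F.T @ F + lam * n * np.eye(p), F.T @ y)

def augment(X, y, copies, sigma, rng):
    """Assumed augmentation: replace each sample by `copies` noisy versions
    (additive Gaussian noise on the inputs, labels left unchanged)."""
    X_aug = np.repeat(X, copies, axis=0)
    X_aug = X_aug + sigma * rng.standard_normal(X_aug.shape)
    return X_aug, np.repeat(y, copies)

def heldout_mse(n=200, d=50, p=100, copies=0, sigma=0.5, lam=1e-3, seed=0):
    """Train the readout on n Gaussian samples (optionally augmented) and
    return the mean squared error on fresh Gaussian test data."""
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(d)        # assumed linear teacher
    W = rng.standard_normal((d, p))      # random, frozen first-layer weights

    def sample(m):
        X = rng.standard_normal((m, d))  # Gaussian covariates
        y = X @ beta / np.sqrt(d) + 0.1 * rng.standard_normal(m)
        return X, y

    X_tr, y_tr = sample(n)
    X_te, y_te = sample(2000)
    if copies > 0:
        X_tr, y_tr = augment(X_tr, y_tr, copies, sigma, rng)
    a = fit_readout(random_features(X_tr, W), y_tr, lam)
    pred = random_features(X_te, W) @ a
    return float(np.mean((pred - y_te) ** 2))

if __name__ == "__main__":
    # Compare the held-out error without and with augmentation.
    print(heldout_mse(copies=0), heldout_mse(copies=5))
```

Here `d / n` is held fixed at 0.25 as a finite-size stand-in for the proportional regime. Sweeping `sigma` or `copies` gives an empirical view of how the assumed augmentation changes the test error, which is the quantity the paper characterizes analytically.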
