Advancing Analytic Class-Incremental Learning through Vision-Language Calibration

Abstract

Class-incremental learning (CIL) with pre-trained models (PTMs) faces a critical trade-off between efficient adaptation and long-term stability. While analytic learning enables rapid, recursive closed-form updates, its efficacy is often compromised by accumulated errors and feature incompatibility.

In this paper, we first conduct a systematic study to dissect the failure modes of PTM-based analytic CIL, identifying representation rigidity as the primary bottleneck. Motivated by these insights, we propose VILA, a novel dual-branch framework that advances analytic CIL via a two-level vision-language calibration strategy. Specifically, we coherently fuse plastic, task-adapted features with a frozen, universal semantic anchor at the feature level through geometric calibration, and leverage cross-modal priors at the decision level to rectify prediction bias.

This confluence maintains analytic-learning's extreme efficiency while overcoming its inherent brittleness. Extensive experiments across eight benchmarks demonstrate that VILA consistently yields superior performance, particularly in fine-grained and long-sequence scenarios.

Our framework harmonizes high-fidelity prediction with the simplicity of analytic learning. Our code is available at https://github.com/byzhaoAI/VILA

Advancing Analytic Class-Incremental Learning through Vision-Language Calibration

Authors

Abstract

Resources

Stay in the loop

Pages

Tools

Details