Contemporary Model Compression on Large Language Models Inference

September 3, 2024 (arXiv:2409.01990)

Authors

Yanxuan Yu, Yite Wang, Jing Wu, Zhongwei Wan, Sina Alinejad

Abstract

This paper surveys modern efficient training and inference techniques for foundation models, illustrating them from two perspectives: model design and system design. Both optimize LLM training and inference in different ways to save computational resources, making LLMs more efficient, affordable, and accessible.

The paper list repository is available at https://github.com/NoakLiu/Efficient-Foundation-Models-Survey.
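
As a concrete illustration of the kind of technique the survey covers, below is a minimal sketch of symmetric per-tensor int8 weight quantization, one common compression method for LLM inference. This is not code from the paper; the function names quantize_int8 and dequantize are illustrative.

    import numpy as np

    def quantize_int8(w):
        # Symmetric per-tensor quantization: w ~ scale * q, with q stored in int8.
        scale = max(float(np.abs(w).max()), 1e-12) / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover an approximation of the original float weights.
        return q.astype(np.float32) * scale

    # Example: quantize a small random weight matrix and check the error.
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 4)).astype(np.float32)
    q, scale = quantize_int8(w)
    w_hat = dequantize(q, scale)
    print("max abs error:", float(np.abs(w - w_hat).max()))

Storing weights as int8 with a single float scale cuts memory roughly 4x versus float32; this basic trade-off between footprint and approximation error is what most quantization schemes in the survey refine.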
