DS1 spectrogram: Does language matter for spoken word classification? A multilingual generative meta-learning approach

Does language matter for spoken word classification? A multilingual generative meta-learning approach

2605.13084

Authors

Batsirayi Mupamhi Ziki,Louise Beyers,Ruan van der Merwe

Abstract

Meta-learning has been shown to have better performance than supervised learning for few-shot monolingual spoken word classification. However, the meta-learning approach remains under-explored in multilingual spoken word classification.

In this paper, we apply the Generative Meta-Continual Learning algorithm to spoken word classification. The generative nature of this algorithm makes it viable for use in application, and the meta-learning aspect promotes generalisation, which is crucial in a multilingual setting.

We train monolingual models on English, German, French, and Catalan, a bilingual model on English and German, and a multilingual model on all four languages. We find that although the multilingual model performs best, the differences between model performance is unexpectedly low.

We also find that the hours of unique data seen during training seems to be a stronger performance indicator than the number of languages included in the training data.

Resources

Stay in the loop

Every AI paper that matters, free in your inbox daily.

Details

  • © 2026 takara.ai Ltd
  • Content is sourced from third-party publications.