KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark

arXiv:2410.18277

Authors

Muhammad Muzzamil Luqman, Mickaël Coustaty, Jean-Marc Ogier, Vannkinh Nom, Souhail Bakkali

Abstract

Developing effective scene text detection and recognition models hinges on extensive training data, which can be both laborious and costly to obtain, especially for low-resourced languages. Conventional methods tailored for Latin characters often falter with non-Latin scripts due to challenges like character stacking, diacritics, and variable character widths without clear word boundaries.

In this paper, we introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images: 997 indoor scenes and 547 outdoor scenes. The dataset covers flat text, raised text, poorly illuminated text, distant text, and partially obscured text.

Annotations provide line-level text and polygonal bounding box coordinates for each scene. The benchmark includes baseline models for scene-text detection and recognition tasks, providing a robust starting point for future research endeavors.
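
As a rough illustration of what a line-level annotation with a polygonal region might look like in code, the sketch below pairs a transcription with a list of polygon vertices and derives two quantities commonly needed in detection pipelines: the polygon area and the enclosing axis-aligned box. The `LineAnnotation` class, its field names, and the sample values are assumptions made for illustration only; the on-disk annotation format of the KhmerST repository is not described here and may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class LineAnnotation:
    """One annotated text line (hypothetical layout, not the dataset's actual schema)."""
    text: str              # line-level transcription
    polygon: List[Point]   # polygon vertices in order, as (x, y) pixel coordinates


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(polygon)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def enclosing_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) that contains the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up sample: a rectangular region with a placeholder Khmer string.
example = LineAnnotation(
    text="សួស្តី",
    polygon=[(10.0, 20.0), (110.0, 20.0), (110.0, 60.0), (10.0, 60.0)],
)

area = polygon_area(example.polygon)
box = enclosing_box(example.polygon)
```

Keeping the full polygon rather than only the enclosing box preserves the shape of slanted or curved text lines; the box can always be derived from the polygon when a detector expects rectangles.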

The KhmerST dataset is publicly accessible at https://gitlab.com/vannkinhnom123/khmerst.
