Hierarchical Label-Wise Attention for Imbalanced Multi-Label Thai Text Classification

Suwika Plubin, Bandhita Plubin, Walaithip Bunyatisai, Thanasak Mouktonglang, Manad Khamkong

Abstract

Introduction: Multi-label text classification, in which each instance can belong to several categories, is vital for applications such as sentiment analysis, document classification, and customer-feedback mining. In real-world settings, customer feedback frequently covers several subjects at once, so correctly identifying all pertinent aspects is essential. Such datasets also often exhibit severe class imbalance, leading models to favor majority classes (e.g., Product & Services, Accessibility) while performing poorly on minority classes (e.g., Chatbot, Facility & Supporter). These difficulties are compounded in Thai text, which lacks explicit word boundaries and has complex syntax, making tokenization and contextual comprehension harder. This study, presented in the IIARP International Conference Abstract Proceedings (Vol. 3, No. 3, March 2025), introduces an approach designed to address these challenges.


Objectives: This research presents the Hierarchical Label-Wise Attention Transformer (HiLAT), designed to address imbalanced multi-label classification of reviews from Thai banking customers. We aim to improve overall predictive accuracy and per-category F1-scores, particularly for underrepresented labels, by tailoring the attention mechanisms and loss weighting to the characteristics of Thai text.


Methods: We gathered 24,500 Thai customer reviews (67,870 labeled sentences) from social media and manually categorized them into eight groups: Accessibility, Chatbot, Facility & Supporter, Image, Other, Product & Services, Staff, and Timing. To address class imbalance, we trained with a class-weighted loss function that up-weights minority labels. The HiLAT framework includes two attention layers: (1) sentence-level attention to pinpoint the text segments most pertinent to each label; and (2) label-wise token attention to concentrate on the most significant tokens within those segments; a minimal sketch of both appears below. Pre-trained Thai word vectors were integrated to enrich the semantic representations. We evaluated the model with macro-averaged precision (label retrieval accuracy), macro-averaged recall (completeness), macro-averaged F1-score (overall balance), and Hamming Loss (misclassification rate).
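To make the two attention levels and the weighted loss concrete, the sketch below gives a minimal PyTorch rendering. Everything here is illustrative: the module names, tensor shapes, the inverse-frequency weighting scheme, and the toy label matrix are assumptions rather than the paper's implementation, and the encoder is a stand-in for whichever pre-trained Thai model was used.

```python
# Minimal sketch of label-wise hierarchical attention with a class-weighted loss.
# All names, dimensions, and the weighting scheme are illustrative assumptions.
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    """One learned query per label, attending over a sequence of hidden states."""
    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, hidden, mask=None):
        # hidden: (batch, seq_len, hidden_dim); mask: (batch, seq_len) bool
        scores = torch.einsum("bsh,lh->bls", hidden, self.label_queries)
        if mask is not None:
            # Assumes every sequence has at least one real (unmasked) token.
            scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
        weights = scores.softmax(dim=-1)                       # (batch, labels, seq_len)
        return torch.einsum("bls,bsh->blh", weights, hidden)   # (batch, labels, hidden)

class HiLATSketch(nn.Module):
    """Label-wise token attention inside each sentence segment, then label-wise
    attention across segments, yielding one logit per label."""
    def __init__(self, encoder, hidden_dim: int, num_labels: int):
        super().__init__()
        self.encoder = encoder  # stand-in for a pre-trained Thai transformer
        self.num_labels = num_labels
        self.token_attn = LabelWiseAttention(hidden_dim, num_labels)
        self.sent_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, input_ids, attention_mask):
        # input_ids, attention_mask: (batch, n_sents, seq_len)
        b, n, s = input_ids.shape
        out = self.encoder(input_ids=input_ids.view(b * n, s),
                           attention_mask=attention_mask.view(b * n, s))
        tokens = out.last_hidden_state                         # (b*n, s, h)
        # Token-level attention by label within each sentence segment.
        sents = self.token_attn(tokens, attention_mask.view(b * n, s).bool())
        sents = sents.view(b, n, self.num_labels, -1)          # (b, n, L, h)
        # Sentence-level attention per label across the whole review.
        scores = torch.einsum("bnlh,lh->bln", sents, self.sent_queries)
        weights = scores.softmax(dim=-1)                       # (b, L, n)
        doc = torch.einsum("bln,bnlh->blh", weights, sents)    # (b, L, h)
        return self.classifier(doc).squeeze(-1)                # (b, L) logits

# Class-weighted loss: the abstract says only that minority labels were
# up-weighted; inverse frequency is one common scheme, assumed here.
targets = torch.tensor([[1., 0., 0., 0., 0., 1., 0., 0.],
                        [0., 0., 0., 1., 1., 1., 1., 0.]])     # toy 8-label matrix
pos = targets.sum(dim=0).clamp(min=1.0)                        # positives per label
pos_weight = (targets.shape[0] - pos) / pos                    # rarer label -> larger weight
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
# loss = criterion(model(input_ids, attention_mask), targets)
```

In this sketch, pos_weight scales the positive term of the binary cross-entropy per label, so errors on rare labels such as Chatbot contribute more to the gradient than errors on frequent labels such as Product & Services.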


Results: HiLAT achieved a macro-averaged F1-score of 0.597 and a Hamming Loss of 0.233. It performed best on well-represented labels: Product & Services (F1 = 0.732), Accessibility (0.725), and Staff (0.627). Performance was moderate on the mid-frequency labels Timing (0.599) and Other (0.547), and lower on the low-frequency categories Chatbot (0.396), Facility & Supporter (0.415), and Image (0.490), highlighting room for further improvement.
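The evaluation protocol behind these figures can be reproduced with standard tooling; the snippet below is a minimal sketch on toy indicator matrices, using scikit-learn (the library choice is an assumption, as the abstract does not name one).

```python
# Macro-averaged precision/recall/F1 and Hamming Loss for multi-label output.
# y_true / y_pred are binary indicator matrices of shape (n_samples, 8 labels).
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, hamming_loss

y_true = np.array([[1, 0, 1, 0, 0, 1, 0, 0],
                   [0, 1, 0, 0, 1, 0, 1, 0]])  # toy ground truth
y_pred = np.array([[1, 0, 0, 0, 0, 1, 0, 0],
                   [0, 1, 0, 1, 1, 0, 1, 0]])  # toy predictions

print("macro precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("macro recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("macro F1       :", f1_score(y_true, y_pred, average="macro", zero_division=0))
print("Hamming Loss   :", hamming_loss(y_true, y_pred))
```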


Conclusions: By integrating hierarchical attention and class weighting, HiLAT effectively addresses the dual challenges of multi-label prediction and severe class imbalance in Thai text. While further enhancements, such as advanced resampling or data augmentation, may improve performance on minority categories, the strong macro-averaged metrics support HiLAT's applicability to nuanced feedback analysis in banking and other sectors.
