A Comparative Study of Multilabel Classification Techniques for Analyzing Bug Report Dependencies
Abstract
Bug report dependency analysis involves identifying and examining the interrelations among software bug reports. Dependencies indicate that bugs are interconnected, with one bug blocking the resolution of another, so that one defect must be fixed before another can be resolved. To our knowledge, existing baselines rely on textual-similarity grouping and keyword matching. Unfortunately, these techniques usually fail to capture the complex associations among bug reports, resulting in inefficient debugging and increased maintenance costs. As a result, this study presents an alternative approach that improves bug dependency analysis through a comprehensive comparison of multilabel classification methods. The dataset, obtained from Bugzilla, comprises 4,781 bug reports pertaining to Mozilla Firefox, with each report linked to one to four dependency labels. Three problem-transformation strategies, Binary Relevance, Classifier Chains, and Label Powerset, were compared using base classifiers including Multinomial Naïve Bayes (MNB), K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM). In addition, deep learning architectures such as LSTM and TextCNN, as well as transformer-based models such as BERT and RoBERTa, were evaluated. Although the machine learning multilabel classifiers perform strongly, they struggle with class imbalance, which lowers F1-scores despite high overall accuracy. The experimental results show that BERT outperforms all other models and the baseline, achieving the highest F1-score (0.647) and micro-averaged accuracy (0.9967), underscoring its effectiveness at capturing semantic relationships within bug reports. These findings demonstrate that transformer-based models offer the most effective approach for identifying bug report dependencies, thereby improving bug triaging and supporting automated software maintenance.
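
For readers unfamiliar with the three problem-transformation strategies compared in the study, the sketch below illustrates how they can be instantiated with scikit-learn and scikit-multilearn. It is a minimal illustration only: the toy bug-report texts, the two placeholder dependency labels, and the TF-IDF settings are assumptions for demonstration and do not reflect the authors' actual pipeline or dataset.

```python
# Minimal sketch (not the study's code) of the three multilabel
# problem-transformation strategies: Binary Relevance trains one binary
# classifier per label, Classifier Chains links those classifiers so each
# sees earlier predictions, and Label Powerset treats each observed label
# combination as a single multiclass target.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from skmultilearn.problem_transform import (
    BinaryRelevance, ClassifierChain, LabelPowerset,
)

# Toy bug-report summaries and a binary label matrix
# (rows: reports; columns: hypothetical dependency labels).
texts = [
    "Crash on startup after update",
    "Rendering blocked until cache bug is fixed",
    "Memory leak depends on garbage collector fix",
    "UI freeze when opening preferences",
]
Y = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 0]])

X = TfidfVectorizer().fit_transform(texts)

strategies = {
    "Binary Relevance": BinaryRelevance(classifier=MultinomialNB()),
    "Classifier Chains": ClassifierChain(classifier=MultinomialNB()),
    "Label Powerset": LabelPowerset(classifier=MultinomialNB()),
}
for name, model in strategies.items():
    model.fit(X, Y)
    preds = model.predict(X).toarray()  # predictions are returned sparse
    print(name, preds.tolist())
```

MNB is used here only because it is the simplest of the four base classifiers named in the abstract; swapping in KNN, RF, or SVM requires only changing the `classifier=` argument. The transformer-based models evaluated in the paper (BERT, RoBERTa) follow a different, fine-tuning-based setup not shown here.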