eLDA: Augmenting Topic Modeling with Word Embeddings for Enhanced Coherence and Interpretability
Abstract
Traditional topic modeling methods such as Latent Dirichlet Allocation (LDA) face several challenges, particularly in topic coherence, that is, producing logical and consistent word groups that share a semantic relationship, and in interpretability. In this work, we propose an enhanced version of LDA, called eLDA, which incorporates Word2Vec embeddings (W2Ve) into LDA. The aim is to improve the coherence of individual topics and the overall interpretability of the topics, as measured by established metrics such as the coherence score. To validate the results, we compare the coherence scores of traditional LDA and eLDA. We observe that, compared with traditional LDA, eLDA provides considerably better interpretability, with higher coherence scores, stronger semantic relationships, and improved visualization of topics.
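The abstract does not state which coherence measure is used or how the embeddings enter the model. As a minimal illustration only, one common embedding-based notion of coherence is the mean pairwise cosine similarity among a topic's top words. The sketch below assumes that definition; the function names (`cosine`, `embedding_coherence`) and the two-dimensional toy vectors are hypothetical and stand in for real Word2Vec vectors.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_coherence(top_words, vectors):
    """Mean pairwise cosine similarity among a topic's top words.

    This is an assumed, illustrative measure, not necessarily the
    coherence score used in the paper.
    """
    pairs = list(combinations(top_words, 2))
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Toy 2-d vectors invented for illustration; real Word2Vec vectors
# would typically have 100 or more dimensions.
toy_vectors = {
    "goal": [1.0, 0.0],
    "match": [0.8, 0.6],
    "team": [0.6, 0.8],
    "tax": [0.0, 1.0],
    "bank": [-0.6, 0.8],
}

focused_topic = ["goal", "match", "team"]  # semantically related words
mixed_topic = ["goal", "tax", "bank"]      # words from unrelated themes

focused_score = embedding_coherence(focused_topic, toy_vectors)
mixed_score = embedding_coherence(mixed_topic, toy_vectors)
print(f"focused topic: {focused_score:.3f}")
print(f"mixed topic:   {mixed_score:.3f}")
```

Under this measure, a topic whose top words lie close together in embedding space scores higher than one that mixes unrelated words, which is the kind of comparison the abstract describes between traditional LDA and eLDA.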