Designing an Efficient Framework for Web Content Mining Using Machine Learning

S. Zafar Mehdi Kazmi

doi:10.52783/jisem.v10i45s.9148

Published: May 11, 2025

DOI: https://doi.org/10.52783/jisem.v10i45s.9148

Keywords:

Web Content Mining, Data Mining, Information Retrieval, Natural Language Processing (NLP), Text Mining, Deep Learning

S. Zafar Mehdi Kazmi, Md. Faizan Farooqui

Abstract

As the volume of web data continues to grow, web content mining is becoming more important for organizations and researchers who need to make use of web content that is unstructured and in constant flux. This paper proposes a web content mining framework that automatically addresses critical problems such as dynamic web architectures, heterogeneous content types, and the many formats of unstructured data. The framework combines modern web scraping tools, NLP algorithms, and machine learning frameworks to extract and analyze web data efficiently.

The framework begins with a data acquisition module that combines standard web crawling techniques with API integration to handle both static and dynamic URL sources. A data pre-processing pipeline then cleans and normalizes the data so that it is suitable for further analysis. The information extraction stage extracts text from metadata and applies feature engineering to derive structured insights from raw web content.
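As an illustration of the cleaning and normalization step described above, the following is a minimal sketch using only the Python standard library. The abstract does not specify the actual pre-processing rules, so the choices here (dropping script and style content, Unicode NFKC normalization, whitespace collapsing, lowercasing) and the names `_TextExtractor` and `clean_page` are assumptions made for illustration, not the authors' implementation.

```python
import re
import unicodedata
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script>/<style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        return " ".join(self._parts)


def clean_page(raw_html: str) -> str:
    """Strip markup, then normalize Unicode, whitespace, and case (assumed rules)."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = unicodedata.normalize("NFKC", parser.text())
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()
```

Calling `clean_page` on a fetched page returns a single normalized string that later stages (feature engineering, topic modelling, and so on) could consume. A production pipeline would likely also need boilerplate removal and language detection, which this sketch leaves out.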

The analysis and processing stage combines topic modelling, sentiment analysis, and named entity recognition to produce more insightful and actionable intelligence. The framework supports scalable storage backed by a database.
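The abstract does not name the database or its schema. As a rough sketch of how mined pages might be persisted, the example below uses SQLite purely because it ships with Python; it is a stand-in, not a scalable store, and the table name `pages`, its columns, and the helpers `open_store` and `save_page` are all hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the page store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pages (
               url        TEXT PRIMARY KEY,
               fetched_at TEXT NOT NULL,
               clean_text TEXT NOT NULL
           )"""
    )
    return conn


def save_page(conn, url, fetched_at, clean_text):
    """Insert a page, replacing any earlier copy stored under the same URL."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO pages (url, fetched_at, clean_text) "
            "VALUES (?, ?, ?)",
            (url, fetched_at, clean_text),
        )
```

Keying rows on the URL means a re-crawled page overwrites its previous version, which suits content that changes frequently. A deployment that needs history or horizontal scaling would call for a different schema and database.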

Issue

Vol. 10 No. 45s (2025)

Section

Articles

Journal of Information Systems Engineering and Management
