Medicon Engineering Themes (ISSN: 2834-7218)

Review Article

Volume 9 Issue 2


Enhanced Recommender Systems for Big Data: A Feature Engineering Approach Using Apache Spark

Balaji GN, Parthasarathy Govindarajan* and Balaji N
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
*Corresponding Author: Parthasarathy Govindarajan, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India.

Published: September 29, 2025


Abstract  

In the era of big data, recommender systems play a crucial role in addressing information overload by suggesting relevant items to users based on their preferences. However, traditional recommendation methods face significant challenges in processing vast and complex data, leading to computational inefficiencies and reduced prediction accuracy. This paper introduces a novel recommender system designed specifically for big data environments, leveraging the Apache Spark platform to enhance scalability and address data sparsity issues.

Motivation: The increasing volume and complexity of user data in real-world applications create a need for more robust and scalable recommender systems. Traditional systems often struggle with performance and accuracy when applied to such large datasets. Our goal is to develop a system that can efficiently handle massive data while providing accurate recommendations.

Methods: We propose BDMFE (Big Data Model with Feature Engineering), an approach that optimizes data processing within a distributed computing environment built on Apache Spark. The model addresses both data sparsity and scalability by leveraging Spark's in-memory processing and parallelism.

Results: The proposed system was evaluated on three real-world datasets and outperformed existing recommendation models. In particular, it achieved significantly lower prediction error, as measured by Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).
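
For readers unfamiliar with the two metrics, the following is a minimal sketch of how MAE and RMSE are conventionally computed over pairs of actual and predicted ratings. The function names and the sample ratings are illustrative assumptions, not values or code from the paper; lower values indicate more accurate predictions.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the average squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical ratings, chosen only to illustrate the arithmetic.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]

print(mae(actual_ratings, predicted_ratings))   # 0.625
print(rmse(actual_ratings, predicted_ratings))  # 0.75
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is why the two are commonly reported together.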

Keywords: Machine Learning; Prediction; Feature Engineering; Big Data