A Review of Penalized Regression and Machine Learning Methods in High-Dimensional Data

Document Type: Review

Authors

1 Department of Applied Statistics, Institute of Statistical Studies and Research, Cairo University, Egypt

2 Associate Professor of Statistics, Department of Applied Statistics and Econometrics, FGSSR, Cairo University

3 Deraya University

DOI: 10.21608/esju.2025.368665.1080

Abstract

In recent years, penalized regression and machine learning methods have emerged as powerful tools for statistical modeling, particularly in the analysis of high-dimensional data. Penalized regression methods, such as Ridge, the least absolute shrinkage and selection operator (LASSO), and Elastic Net, mitigate multicollinearity and overfitting through regularization, which enhances model stability, accuracy, and interpretability. Machine learning techniques, including decision trees, random forests, support vector machines, and neural networks, provide strong predictive capabilities across a wide range of applications and research domains. This review systematically examines both families of methods, discussing their theoretical foundations, recent advancements, practical implementations, and computational efficiency. A comparative analysis highlights their strengths, limitations, and performance in different analytical contexts. Emerging hybrid techniques that integrate penalized regression with machine learning are also explored, with attention to their potential to improve model efficiency, scalability, accuracy, and interpretability. The review concludes that combining these approaches offers a robust framework for handling complex, high-dimensional data, and that the combined methods are valuable tools for modern statistical analysis, predictive modeling, and data-driven decision-making.
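
For readers unfamiliar with the three penalized estimators named in the abstract, their standard objective functions can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the review itself: y is the n-vector of responses, X is the n-by-p design matrix, beta is the p-vector of coefficients, lambda >= 0 is the penalty strength, and alpha in [0, 1] is the Elastic Net mixing weight. Scaling conventions (for example, a factor of 1/(2n) on the loss) vary between references and software packages.

```latex
% Assumed notation: y \in \mathbb{R}^n, X \in \mathbb{R}^{n \times p},
% \beta \in \mathbb{R}^p, \lambda \ge 0, \alpha \in [0, 1].

% Ridge: squared L2 penalty; shrinks coefficients toward zero
% without setting them exactly to zero.
\hat{\beta}^{\text{ridge}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda \lVert \beta \rVert_2^2

% LASSO: L1 penalty; can set some coefficients exactly to zero,
% which performs variable selection.
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda \lVert \beta \rVert_1

% Elastic Net: convex combination of the L1 and squared L2 penalties.
\hat{\beta}^{\text{enet}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda \left( \alpha \lVert \beta \rVert_1
    + (1 - \alpha) \lVert \beta \rVert_2^2 \right)
```

Under this parameterization, alpha = 1 recovers the LASSO objective and alpha = 0 recovers the Ridge objective, so Elastic Net interpolates between the two.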

Keywords

Main Subjects