Developing a Taxonomy, Root-Cause Analysis, and Adaptive Remediation Framework for JVM Garbage Collection (GC) Overhead Errors in Apache Spark under High-Scale Commerce Workloads
Keywords:
JVM GC overhead, Apache Spark, adaptive remediation, root-cause analysis, performance optimization, large-scale workloads, dynamic configurations.
Abstract
This study examines JVM garbage collection (GC) overhead errors in Apache Spark under commerce-scale workloads and proposes a taxonomy, a root-cause analysis, and an adaptive remediation framework. The framework uses Python tooling to analyze performance in real time and apply dynamic JVM configuration changes. Experimental results indicate that the proposed framework reduces GC delays and job completion times, improving throughput and system efficiency. The research contributes to the performance optimization of Apache Spark in large-scale settings and provides insights into automating JVM settings to reduce overhead.
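The abstract does not describe the framework's internals, so the following is only a minimal sketch of the general idea: measure how much task time an executor spends in GC and derive configuration changes from it. The `ExecutorSample` fields, the 10% and 30% thresholds, the 1.5x heap growth factor, and the heap cap are all illustrative assumptions, not values from the study. The Spark keys `spark.executor.memory` and `spark.executor.extraJavaOptions` and the HotSpot flag `-XX:+UseG1GC` are real settings, chosen here only as example actions.

```python
from dataclasses import dataclass


@dataclass
class ExecutorSample:
    """One monitoring sample for an executor (field names are hypothetical)."""
    executor_id: str
    task_time_ms: int  # cumulative task time reported for the executor
    gc_time_ms: int    # cumulative JVM GC time reported for the executor
    heap_mb: int       # executor heap currently configured, in MiB


def gc_fraction(sample):
    """Share of task time spent in garbage collection (0.0 if no task time yet)."""
    if sample.task_time_ms <= 0:
        return 0.0
    return sample.gc_time_ms / sample.task_time_ms


def recommend(sample, warn=0.10, critical=0.30, max_heap_mb=65536):
    """Return suggested Spark settings; thresholds are assumed, not from the paper."""
    frac = gc_fraction(sample)
    if frac < warn:
        return {}  # GC share looks healthy, suggest nothing
    # Illustrative first action: request the G1 collector for executors.
    changes = {"spark.executor.extraJavaOptions": "-XX:+UseG1GC"}
    if frac >= critical:
        # Illustrative second action: grow the heap by 50%, up to a cap.
        new_heap = min(int(sample.heap_mb * 1.5), max_heap_mb)
        changes["spark.executor.memory"] = f"{new_heap}m"
    return changes


if __name__ == "__main__":
    sample = ExecutorSample("1", task_time_ms=100_000, gc_time_ms=40_000, heap_mb=8192)
    print(round(gc_fraction(sample), 2), recommend(sample))
```

Executor heap size and JVM options take effect only when executors are launched, so in this sketch the returned settings would apply to the next job submission or to newly started executors rather than to a running JVM.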
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.


