Integrating Serverless Architectures and Kubernetes for Scalable and High-Availability AI Workflows
Keywords:
Artificial Intelligence Workflows, Kubernetes Orchestration, Serverless Computing, Scalability, Cloud-Native Infrastructure, GPU Optimization, Hybrid Cloud Architecture
Abstract
The increasing use of AI across industries makes it difficult to develop workflows that are both scalable and highly available. Because workloads fluctuate, containerized deployments struggle to adapt, which leaves resources inefficiently used and raises operational expenditure. Serverless computing uses an event-driven, pay-as-you-go model that makes it flexible, but it has drawbacks in GPU utilization and cold starts. Kubernetes (K8s), by contrast, is a powerful and resilient orchestrator whose fault tolerance and dynamic scaling help manage complex containerized environments. This paper proposes an integrated framework that uses serverless and K8s architectures, each in its own paradigm, to provide AI workflows that are scalable, available, and efficient. This is accomplished by combining GPU acceleration with serverless event-triggered functions that call K8s for orchestration, which automates data preparation, model training, deployment, and real-time inference. The performance evaluation showed that the serverless architecture achieves greater throughput and cost-effectiveness in real-time inference tasks, while K8s containerization achieves greater GPU utilization during the model-training phase. Taken together, the hybrid system offers a balanced, adaptive, and resilient solution to the demands of modern AI workloads in hybrid cloud environments.
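The division of labor described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stage names, the `Event` type, the `route` function, and the container image name are all hypothetical, and sending every non-training stage to serverless is an assumption made only for illustration. The sketch shows an event-triggered handler that either answers as a serverless function or builds a Kubernetes `batch/v1` Job manifest requesting a GPU through the `nvidia.com/gpu` resource.

```python
from dataclasses import dataclass, field

# Assumption: only model training is treated as GPU-heavy and sent to K8s.
GPU_HEAVY_STAGES = {"model_training"}
# Assumption: the remaining stages from the abstract run as serverless functions.
EVENT_DRIVEN_STAGES = {"data_preparation", "deployment", "real_time_inference"}


@dataclass
class Event:
    """Hypothetical trigger event emitted by a workflow stage."""
    stage: str
    payload: dict = field(default_factory=dict)


def build_training_job(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes batch/v1 Job manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed training pod up to twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "trainer",
                            "image": image,
                            # GPU request via the NVIDIA device plugin resource name
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ],
                }
            },
        },
    }


def route(event: Event) -> dict:
    """Decide where a stage runs: a K8s Job or a serverless function call."""
    if event.stage in GPU_HEAVY_STAGES:
        manifest = build_training_job(
            name=f"train-{event.payload.get('run_id', 'demo')}",
            image=event.payload.get("image", "example/trainer:latest"),  # hypothetical image
        )
        return {"target": "kubernetes", "manifest": manifest}
    if event.stage in EVENT_DRIVEN_STAGES:
        return {"target": "serverless", "function": event.stage, "payload": event.payload}
    raise ValueError(f"unknown workflow stage: {event.stage}")


if __name__ == "__main__":
    print(route(Event("real_time_inference", {"input": [1, 2, 3]}))["target"])
    print(route(Event("model_training", {"run_id": "42"}))["manifest"]["metadata"]["name"])
```

In a real deployment the returned manifest would be submitted to the cluster API and the serverless branch would invoke a function platform; both steps are left out here because the abstract does not specify them.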
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.


