Transitioning from MLOps to LLMOps: Navigating the Unique Challenges of Large Language Models
Abstract
1. Introduction
- What platforms and systems can better support LLMs by building on previous MLOps advances?
- Why is it important to address LLMOps challenges not fully managed by traditional MLOps techniques?
- How do LLMOps improve the accuracy of LLMs?
- Why do traditional ML metrics not fully capture LLM performance?
- Why is it important to use LLMOps to improve the performance and accuracy of LLMs?
- How do MLOps and LLMOps differ in their roles in machine learning engineering, and why is it important to understand these differences when addressing production and deployment challenges? To this end, key aspects such as data management, model development, infrastructure, deployment, system integration, updates and maintenance, versioning, and parallel processing are discussed.
2. Machine Learning Operations (MLOps)
2.1. Why Do We Need MLOps?
2.2. Benefits of MLOps
2.2.1. Automation and Scalability
2.2.2. Continuous Integration and Deployment (CI/CD)
2.2.3. Monitoring and Performance Tracking
2.2.4. Improved Model Quality and Reduced Risk
2.2.5. Improved Collaboration Between Teams
2.2.6. Compliance and Governance
2.2.7. Data Management and Versioning
2.2.8. Simplifies Complex ML Workflows
2.3. Applications of MLOps
- The MLOps methodology can provide effective security standards for ML implementations in complex operational technologies [62].
- With the help of MLOps technologies, industrial settings can improve image recognition accuracy and adapt well to new conditions [80].
- Automating model training and deployment, and integrating these processes into standard CI/CD pipelines, is crucial for addressing the challenges of effectively deploying machine learning models with the MLOps methodology [63].
- The MLOps principles are particularly advantageous for large projects that require continuous deployment and robust automated operations [81].
- Cross-domain MLOps applications in healthcare and finance can use similar approaches to effectively manage changing data streams and concept drift [82].
- A resilience-aware MLOps approach has been proposed for AI-powered healthcare diagnostic tools. Its primary goal is to make systems more resistant to harmful outside forces, such as adversarial attacks and drift [83].
- MLOps can be applied in various settings, particularly those involving microscopic images. For example, the study in [84] investigated the use of MLOps for the analysis of sparse image data and introduced a comprehensive approach that employs fingerprinting to select optimal models and datasets, combines automated model development with continuous deployment and monitoring, and thereby facilitates learning from errors.
- MLOps can support the prediction of lifestyle-related diseases. Through the analysis of massive volumes of diverse healthcare data, it helps predict lifestyle diseases, which in turn supports planning for prevention, diagnosis, and treatment [85].
- MLOps addresses the challenges of model retraining and versioning and ensures that models remain efficient and effective over time. This has driven the integration of MLOps into everyday applications such as smart kitchens, radiology systems, and turbine performance monitoring, mitigating the operational, collaboration, and deployment challenges of building intelligent applications [86]. Table 1 outlines various existing MLOps platforms, highlighting their key features and providing example use cases to give a deeper understanding of the landscape.
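Several of the applications above hinge on versioning and fingerprinting of models and datasets (e.g., the approach in [84]). A minimal sketch of that idea, using only content hashes — the `ModelRegistry` class and its keys are illustrative assumptions, not an API from any of the surveyed works:

```python
import hashlib
import json

def fingerprint(obj) -> str:
    """Deterministic SHA-256 fingerprint of a dataset sample or model config."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

class ModelRegistry:
    """Hypothetical registry keyed by (dataset, config) fingerprints,
    so that retraining with identical inputs resolves to the same entry."""
    def __init__(self):
        self._entries = {}

    def register(self, dataset, config, model_uri: str) -> str:
        key = f"{fingerprint(dataset)}-{fingerprint(config)}"
        self._entries[key] = model_uri
        return key

    def lookup(self, dataset, config):
        return self._entries.get(f"{fingerprint(dataset)}-{fingerprint(config)}")

registry = ModelRegistry()
data = [{"x": 1.0, "y": 0}, {"x": 2.0, "y": 1}]
cfg = {"lr": 0.01, "epochs": 10}
key = registry.register(data, cfg, "models/v1")
print(registry.lookup(data, cfg))  # -> models/v1
```

Because the key is derived from the data and configuration rather than from a timestamp, the same experiment always maps to the same registry entry — a small-scale version of the reproducibility that MLOps versioning practices aim for.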
2.4. Challenges in MLOps
3. Large Language Model Ops (LLMOps)
3.1. What Are LLMOps?
3.2. LLMOps Life Cycle Components
3.3. Why Do We Need LLMOps?
3.4. Best Practices for LLMOps
3.5. Applications of LLMOps
- The document [165] presents the concept of LLM–Computer Interaction (LLMCI), in which LLMs integrate with computer vision to engage with user interfaces. Here, LLMOps facilitate LLMs in comprehending and manipulating UI elements, retrieving information, executing functions, and performing tasks analogous to human interactions. By integrating language comprehension and visual perception capabilities, LLMOps enable more human-like interactions with computers.
- The study in [166] presents a framework for LLMOps, a distinct subset of MLOps tailored to the development, deployment, and maintenance of LLMs within Continuous Integration/Continuous Deployment (CI/CD) pipelines. A case study on text summarization demonstrates that integrating a human feedback loop into the LLM CI/CD pipeline improved output quality and dependability, emphasizing the importance of human input in improving LLM performance.
- The work in [43] enhances the user experience in personalized recommendation systems by refining extensive machine learning models to provide accurate, timely, and relevant recommendations based on each individual user's interests, as well as by integrating prompt engineering. LLMOps customize input prompts to improve recommendation accuracy and user satisfaction.
- The work in [167] presents a reference framework for developing a large language model (LLM) application stack, highlighting common systems, tools, and design methodologies identified across companies and AI startups.
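The human feedback loop described in [166] can be sketched as a deployment gate: outputs scoring below a threshold on automatic checks are queued for human review before the pipeline proceeds. The `HumanFeedbackGate` class and its scores are hypothetical illustrations, not the framework from [166]:

```python
from dataclasses import dataclass, field

@dataclass
class HumanFeedbackGate:
    """Hypothetical deployment gate for an LLM CI/CD pipeline: outputs
    scoring below `threshold` on automatic checks are queued for human
    review before the release can proceed."""
    threshold: float = 0.8
    review_queue: list = field(default_factory=list)

    def evaluate(self, prompt: str, output: str, auto_score: float) -> bool:
        """Auto-approve high-scoring outputs; queue the rest for a human."""
        if auto_score >= self.threshold:
            return True
        self.review_queue.append((prompt, output, auto_score))
        return False

    def release_ready(self, human_approvals: int) -> bool:
        """Deploy only once every queued item has a human approval."""
        return human_approvals >= len(self.review_queue)

gate = HumanFeedbackGate(threshold=0.8)
gate.evaluate("Summarize the report", "Faithful summary ...", 0.92)  # auto-approved
gate.evaluate("Summarize the memo", "Off-topic text ...", 0.41)      # queued
print(gate.release_ready(human_approvals=0))  # -> False
print(gate.release_ready(human_approvals=1))  # -> True
```

The design point is that the automatic score only triages: low-confidence outputs are never silently deployed, which is the quality-and-dependability benefit the case study in [166] attributes to human-in-the-loop CI/CD.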
3.6. LLMOps Platforms for Managing Large Language Models
4. Development and Operations (DevOps)
4.1. What Are DevOps?
5. Difference Between LLMOps, MLOps, and DevOps
6. Open Issues and Future Research Directions
- The integration of LLMs into Continuous Integration/Continuous Deployment (CI/CD) pipelines presents several open challenges and opportunities. Key barriers include computational costs, inaccuracies, error handling, biases, and concerns related to development, deployment, maintenance, and ethics [166]. These issues highlight the need for innovative approaches to seamlessly incorporate LLMs into CI/CD processes, ensuring they are utilized effectively and efficiently. Future research should focus on strategies to enhance the speed, reliability, and consistency of LLM integration while mitigating associated risks and addressing the ethical implications involved.
- The evolving landscape of LLMOps presents a variety of ongoing challenges that require continued exploration. One prominent issue is the potential for LLMs to introduce inaccuracies and biases within the Continuous Integration/Continuous Deployment (CI/CD) process, which creates the need for rigorous oversight to ensure the quality and reliability of software products. Another key challenge lies in the difficulty of capturing and reproducing test scripts across diverse devices, platforms, and applications. Disparities in screen dimensions, input methods, platform functionalities, API inconsistencies, and varying application designs further complicate this issue [286]. Addressing these challenges will require innovative approaches to improve cross-platform compatibility and ensure consistent behavior of LLM-driven systems. Future research should focus on developing strategies to mitigate biases and inaccuracies in LLMs, particularly in CI/CD workflows. Additionally, further investigation is needed into methods for standardizing test script reproduction across heterogeneous environments to enhance the scalability and reliability of LLMOps practices.
- Future research directions in LLMOps focus on advancing the development and reliability of LLMs. Key areas include integrating human feedback loops to improve model outputs, addressing biases and ethical concerns in LLM applications, and enhancing the integration process. Additionally, there is a need to mitigate challenges in natural language understanding and to explore the potential of fine-tuning with domain-specific data to improve performance on specialized tasks. Lastly, further research is needed to formulate best practices for incorporating these models into Continuous Integration/Continuous Deployment (CI/CD) pipelines.
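The error-handling barrier noted above — inaccurate or malformed LLM responses leaking into downstream pipeline stages — is often mitigated with a validate-and-retry wrapper. A minimal sketch under stated assumptions: `flaky_model` is a stub standing in for any LLM call, and the JSON-with-`summary` contract is purely illustrative:

```python
import json

def generate_with_validation(generate, prompt, validate, max_retries=3):
    """Call an LLM (any callable here), validate the raw output, and retry
    on failure instead of passing malformed responses downstream."""
    last_error = None
    for attempt in range(max_retries):
        raw = generate(prompt, attempt)
        try:
            return validate(raw)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
    raise RuntimeError(f"no valid output after {max_retries} attempts: {last_error}")

def require_json_with_summary(raw: str) -> dict:
    """Validator: this hypothetical pipeline stage expects a JSON object
    containing a 'summary' key."""
    obj = json.loads(raw)
    if "summary" not in obj:
        raise ValueError("missing 'summary' field")
    return obj

# Stub model: fails on the first attempt, succeeds on the second.
def flaky_model(prompt: str, attempt: int) -> str:
    return "not json" if attempt == 0 else '{"summary": "ok"}'

print(generate_with_validation(flaky_model, "Summarize ...", require_json_with_summary))
# -> {'summary': 'ok'}
```

Wrappers of this kind address only mechanical failures; the bias and consistency issues raised in [166,286] still require the human oversight and standardization efforts called for above.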
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
LLMs | Large Language Models |
LLMOps | Large Language Model Operations |
MLOps | Machine Learning Operations |
BERT | Bidirectional Encoder Representations from Transformers |
GPT | Generative Pretrained Transformer |
LLaMA | Large Language Model Meta AI |
GenAI | Generative Artificial Intelligence |
AIOps | Artificial Intelligence for IT Operations |
RAGOps | Retrieval-Augmented Generation Operations |
GenAIOps | Generative AI Operations |
T5 | Text-to-Text Transfer Transformer |
AutoML | Automated Machine Learning |
YOLO | You Only Look Once |
RAG | Retrieval-Augmented Generation |
LSTM | Long Short-Term Memory |
DevOps | Development and Operations |
CI/CD | Continuous Integration and Continuous Delivery |
DataOps | Data Operations |
SDLC | Software Development Life Cycle |
CSFs | Critical Success Factors |
MLEs | Machine Learning Engineers |
FMOps | Foundation Model Operations |
HITL | Human-in-the-Loop |
LLMCI | LLM–Computer Interaction |
GCP | Google Cloud Platform |
AWS | Amazon Web Services |
EC2 | Elastic Compute Cloud |
S3 | Simple Storage Service |
NLP | Natural Language Processing |
ETL | Extract, Transform, and Load |
ZeRO | Zero Redundancy Optimizer |
DevSecOps | Development, Security, and Operations |
GPU | Graphics Processing Unit |
TPU | Tensor Processing Unit |
GCS | Google Cloud Storage |
DVC | Data Version Control |
EFS | Elastic File System |
HPC | High-Performance Computing |
GKE | Google Kubernetes Engine |
UI | User Interface |
AI | Artificial Intelligence |
References
- Eboigbe, E.O.; Farayola, O.A.; Olatoye, F.O.; Nnabugwu, O.C.; Daraojimba, C. Business intelligence transformation through AI and Data Analytics. Eng. Sci. Technol. J. 2023, 4, 285–307. [Google Scholar] [CrossRef]
- Ghobakhloo, M.; Fathi, M.; Iranmanesh, M.; Vilkas, M.; Grybauskas, A.; Amran, A. Generative artificial intelligence in manufacturing: Opportunities for actualizing Industry 5.0 sustainability goals. J. Manuf. Technol. Manag. 2024, 35, 94–121. [Google Scholar] [CrossRef]
- Szmurlo, H.; Akhtar, Z. Digital Sentinels and Antagonists: The Dual Nature of Chatbots in Cybersecurity. Information 2024, 15, 443. [Google Scholar] [CrossRef]
- Zhang, P.; Kamel Boulos, M.N. Generative AI in medicine and healthcare: Promises, opportunities and challenges. Future Internet 2023, 15, 286. [Google Scholar] [CrossRef]
- Pahune, S. Large Language Models and Generative AI’s Expanding Role in Healthcare. Available online: https://www.researchgate.net/publication/377217911_Large_Language_Models_and_Generative_AI’s_Expanding_Role_in_Healthcare (accessed on 22 September 2024).
- EY Insights. How Generative AI in Supply Chain Can Drive Value. 2023. Available online: https://www.ey.com/en_us/insights/supply-chain/how-generative-ai-in-supply-chain-can-drive-value (accessed on 22 September 2024).
- Jackson, I.; Ivanov, D.; Dolgui, A.; Namdar, J. Generative artificial intelligence in supply chain and operations management: A capability-based framework for analysis and implementation. Int. J. Prod. Res. 2024, 62, 6120–6145. [Google Scholar] [CrossRef]
- Ebert, C.; Louridas, P. Generative AI for software practitioners. IEEE Softw. 2023, 40, 30–38. [Google Scholar] [CrossRef]
- Akpınar, M.T. Generative Artificial Intelligence Applications Specific to the Air Transport Industry. In Interdisciplinary Studies on Contemporary Research Practices in Engineering in the 21st Century II; Kaygusuz, K., Ed.; Özgür Publications: İstanbul, Turkey, 2023. [Google Scholar] [CrossRef]
- InData Labs. AI Latest Developments. 2023. Available online: https://indatalabs.com/blog/ai-latest-developments (accessed on 17 August 2024).
- Ajiga, D.; Okeleke, P.A.; Folorunsho, S.O.; Ezeigweneme, C. The role of software automation in improving industrial operations and efficiency. Int. J. Eng. Res. Update 2024, 7, 22–35. [Google Scholar] [CrossRef]
- Schwartz, R.; Schwartz, R.; Vassilev, A.; Greene, K.; Perine, L.; Burt, A.; Hall, P. Towards a Standard for Identifying and Managing Bias in Artificial Intelligence; US Department of Commerce, National Institute of Standards and Technology: Gaithersburg, MD, USA, 2022; Volume 3.
- Song, C.; Raghunathan, A. Information leakage in embedding models. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 377–390. [Google Scholar]
- Hitaj, B.; Ateniese, G.; Perez-Cruz, F. Deep models under the GAN: Information leakage from collaborative deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 603–618. [Google Scholar]
- Rigaki, M.; Garcia, S. A survey of privacy attacks in machine learning. ACM Comput. Surv. 2023, 56, 1–34. [Google Scholar] [CrossRef]
- Li, P.; Wang, X.; Huang, K.; Huang, Y.; Li, S.; Iqbal, M. Multi-model running latency optimization in an edge computing paradigm. Sensors 2022, 22, 6097. [Google Scholar] [CrossRef]
- Greco, S.; Vacchetti, B.; Apiletti, D.; Cerquitelli, T. Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time. arXiv 2024, arXiv:2406.17813. [Google Scholar]
- Kreuzberger, D.; Kühl, N.; Hirschl, S. Machine learning operations (mlops): Overview, definition, and architecture. IEEE Access 2023, 11, 31866–31879. [Google Scholar] [CrossRef]
- Symeonidis, G.; Nerantzis, E.; Kazakis, A.; Papakostas, G.A. Mlops-definitions, tools and challenges. In Proceedings of the 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), Virtual, 26–29 January 2022; pp. 453–460. [Google Scholar]
- John Snow Labs. Introduction to Large Language Models (LLMs): An Overview of BERT, GPT, and Other Popular Models. 2024. Available online: https://www.johnsnowlabs.com/introduction-to-large-language-models-llms-an-overview-of-bert-gpt-and-other-popular-models/ (accessed on 14 September 2024).
- Gao, Y.; Baptista-Hon, D.T.; Zhang, K. The inevitable transformation of medicine and research by large language models: The possibilities and pitfalls. MEDCOMM-Future Med. 2023, 2, 1–2. [Google Scholar] [CrossRef]
- Minaee, S.; Mikolov, T.; Nikzad, N.; Chenaghlu, M.; Socher, R.; Amatriain, X.; Gao, J. Large language models: A survey. arXiv 2024, arXiv:2402.06196. [Google Scholar]
- Zhu, Y.; Yuan, H.; Wang, S.; Liu, J.; Liu, W.; Deng, C.; Chen, H.; Dou, Z.; Wen, J.R. Large language models for information retrieval: A survey. arXiv 2023, arXiv:2308.07107. [Google Scholar]
- Lee, J.; Stevens, N.; Han, S.C.; Song, M. A survey of large language models in finance (finllms). arXiv 2024, arXiv:2402.02315. [Google Scholar]
- Yuan, F.; Yuan, S.; Wu, Z.; Li, L. How Multilingual is Multilingual LLM? arXiv 2023, arXiv:2311.09071. [Google Scholar]
- Dada, A.; Bauer, M.; Contreras, A.B.; Koraş, O.A.; Seibold, C.M.; Smith, K.E.; Kleesiek, J. CLUE: A Clinical Language Understanding Evaluation for LLMs. arXiv 2024, arXiv:2404.04067. [Google Scholar]
- Wang, W.; Chen, Z.; Chen, X.; Wu, J.; Zhu, X.; Zeng, G.; Luo, P.; Lu, T.; Zhou, J.; Qiao, Y.; et al. Visionllm: Large language model is also an open-ended decoder for vision-centric tasks. In Proceedings of the 37th Conference on Neural Information Processing Systems, New Orleans, LA, USA, 6–10 December 2024. [Google Scholar]
- Naveed, H.; Khan, A.U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; Mian, A. A comprehensive overview of large language models. arXiv 2023, arXiv:2307.06435. [Google Scholar]
- Pahune, S.; Chandrasekharan, M. Several categories of large language models (llms): A short survey. arXiv 2023, arXiv:2307.10188. [Google Scholar] [CrossRef]
- Hadi, M.U.; Qureshi, R.; Shah, A.; Irfan, M.; Zafar, A.; Shaikh, M.B.; Akhtar, N.; Wu, J.; Mirjalili, S.; Shah, M.; et al. Large language models: A comprehensive survey of its applications, challenges, limitations, and future prospects. Authorea Prepr. 2023. [Google Scholar] [CrossRef]
- Zheng, J.; Qiu, S.; Shi, C.; Ma, Q. Towards Lifelong Learning of Large Language Models: A Survey. arXiv 2024, arXiv:2406.06391. [Google Scholar]
- Huang, K.; Wang, Y.; Goertzel, B.; Li, Y.; Wright, S.; Ponnapalli, J. Generative AI Security; Springer: Berlin/Heidelberg, Germany, 2024. [Google Scholar] [CrossRef]
- Yao, Y.; Duan, J.; Xu, K.; Cai, Y.; Sun, Z.; Zhang, Y. A survey on large language model (llm) security and privacy: The good, the bad, and the ugly. High-Confid. Comput. 2024, 4, 100211. [Google Scholar] [CrossRef]
- Zhao, S.; Tuan, L.A.; Fu, J.; Wen, J.; Luo, W. Exploring Clean Label Backdoor Attacks and Defense in Language Models. IEEE/ACM Trans. Audio Speech Lang. Process. 2024, 32, 3014–3024. [Google Scholar] [CrossRef]
- Yao, J.; Luo, H.; Zhang, X.L. Interpretable spectrum transformation attacks to speaker recognition. arXiv 2023, arXiv:2302.10686. [Google Scholar] [CrossRef]
- Tong, C.; Zheng, X.; Li, J.; Ma, X.; Gao, L.; Xiang, Y. Query-Efficient Black-Box Adversarial Attacks on Automatic Speech Recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 3981–3992. [Google Scholar] [CrossRef]
- Alber, D.A.; Yang, Z.; Alyakin, A.; Yang, E.; Rai, S.; Valliani, A.A.; Zhang, J.; Rosenbaum, G.R.; Amend-Thomas, A.K.; Kurland, D.B.; et al. Medical large language models are vulnerable to data-poisoning attacks. Nat. Med. 2025, 1–9. [Google Scholar] [CrossRef]
- Chu, J.; Liu, Y.; Yang, Z.; Shen, X.; Backes, M.; Zhang, Y. Comprehensive assessment of jailbreak attacks against llms. arXiv 2024, arXiv:2402.05668. [Google Scholar]
- Abdali, S.; He, J.; Barberan, C.; Anarfi, R. Can LLMs be Fooled? Investigating Vulnerabilities in LLMs. arXiv 2024, arXiv:2407.20529. [Google Scholar]
- Signoz. LLMOps: What Is LLMOps and How Is It Different from MLOps? 2023. Available online: https://signoz.io/guides/llmops/ (accessed on 18 August 2024).
- Shan, R.; Shan, T. Enterprise LLMOps: Advancing Large Language Models Operations Practice. In Proceedings of the 2024 IEEE Cloud Summit, Washington, DC, USA, 27–28 June 2024; pp. 143–148. [Google Scholar]
- Kamath, U.; Keenan, K.; Somers, G.; Sorenson, S. LLMs in Production. In Large Language Models: A Deep Dive: Bridging Theory and Practice; Springer: Berlin/Heidelberg, Germany, 2024; pp. 315–373. [Google Scholar]
- Chenxi, S.; Liang, P.; Wu, Y.; Zhan, T.; Jin, Z. Maximizing user experience with LLMOps-driven personalized recommendation systems. Appl. Comput. Eng. 2024, 64, 100–106. [Google Scholar] [CrossRef]
- AIMultiple Research. LLM Security: Ensuring Safe and Secure Use of Large Language Models. 2023. Available online: https://research.aimultiple.com/llm-security/ (accessed on 18 August 2024).
- Najafabadi, F.A.; Bogner, J.; Gerostathopoulos, I.; Lago, P. An Analysis of MLOps Architectures: A Systematic Mapping Study. arXiv 2024, arXiv:2406.19847. [Google Scholar]
- Faubel, L.; Schmid, K. MLOps: A Multiple Case Study in Industry 4.0. arXiv 2024, arXiv:2407.09107. [Google Scholar]
- di Laurea, I.S. Mlops-Standardizing the Machine Learning Workflow. Ph.D. Thesis, University of Bologna, Bologna, Italy, 2021. [Google Scholar]
- Xu, R. A Design Pattern for Deploying Machine Learning Models to Production. 2020. Available online: https://scholarworks.calstate.edu/downloads/1v53k296v (accessed on 29 October 2024).
- Eken, B.; Pallewatta, S.; Tran, N.K.; Tosun, A.; Babar, M.A. A Multivocal Review of MLOps Practices, Challenges and Open Issues. arXiv 2024, arXiv:2406.09737. [Google Scholar]
- Testi, M.; Ballabio, M.; Frontoni, E.; Iannello, G.; Moccia, S.; Soda, P.; Vessio, G. MLOps: A taxonomy and a methodology. IEEE Access 2022, 10, 63606–63618. [Google Scholar] [CrossRef]
- Jana, A.D. The MLOps Approach to Model Deployment: A Road Map to Seamless Scalability. J. Artif. Intell. Cloud Comput. 2022, 1, 1–4. [Google Scholar] [CrossRef]
- Matsui, B.M.A.; Goya, D.H. MLOps: Five Steps to Guide its Effective Implementation. In Proceedings of the 1st International Conference on AI Engineering: Software Engineering for AI, Pittsburgh, PA, USA, 16–17 May 2022. [Google Scholar] [CrossRef]
- Tabassam, A. MLOps: A Step Forward to Enterprise Machine Learning. arXiv 2023. [Google Scholar] [CrossRef]
- Wazir, S.; Kashyap, G.S.; Saxena, P. Mlops: A review. arXiv 2023, arXiv:2308.10908. [Google Scholar]
- Barring, N. AIOps vs MLOps vs LLMOps: Choosing the Right AI Operations Strategy. 2024. Available online: https://www.nscale.com/blog/aiops-vs-mlops-vs-llmops-choosing-the-right-ai-operations-strategy (accessed on 28 December 2024).
- Prabhune, S.; Berndt, D.J. Deploying Large Language Models with Retrieval Augmented Generation. arXiv 2024. [Google Scholar] [CrossRef]
- NVIDIA. Mastering LLM Techniques with LLMOps. 2023. Available online: https://developer.nvidia.com/blog/mastering-llm-techniques-llmops/ (accessed on 26 December 2024).
- Gupta, P.; Bagchi, A. MLOps: Machine Learning Operations. In Essentials of Python for Artificial Intelligence and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2024; pp. 489–518. [Google Scholar]
- Databricks. MLOps Glossary. 2024. Available online: https://www.databricks.com/solutions/machine-learning (accessed on 22 September 2024).
- Nogare, D.; Silveira, I.F. Experimentation, deployment and monitoring Machine Learning models: Approaches for applying MLOps. arXiv 2024, arXiv:2408.11112. [Google Scholar]
- Databricks. MLOps. 2023. Available online: https://www.databricks.com/glossary/mlops (accessed on 18 August 2024).
- Ahmad, T.; Adnan, M.; Rafi, S.; Akbar, M.A.; Anwar, A. MLOps-Enabled Security Strategies for Next-Generation Operational Technologies. In Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, Salerno, Italy, 18–21 June 2024; pp. 662–667. [Google Scholar]
- Liang, P.; Song, B.; Zhan, X.; Chen, Z.; Yuan, J. Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning. arXiv 2024, arXiv:2405.09819. [Google Scholar]
- Subramanya, R.; Sierla, S.; Vyatkin, V. From DevOps to MLOps: Overview and application to electricity market forecasting. Appl. Sci. 2022, 12, 9851. [Google Scholar] [CrossRef]
- Banerjee, A.; Chen, C.C.; Hung, C.C.; Huang, X.; Wang, Y.; Chevesaran, R. Challenges and Experiences with {MLOps} for Performance Diagnostics in {Hybrid-Cloud} Enterprise Software Deployments. In Proceedings of the 2020 USENIX Conference on Operational Machine Learning (OpML 20), Santa Clara, CA, USA, 28 July–7 August 2020. [Google Scholar]
- Neptune.ai. MLOps: A Comprehensive Guide to Machine Learning Operations. 2023. Available online: https://neptune.ai/blog/mlops (accessed on 18 August 2024).
- Mehmood, Y.; Sabahat, N.; Ijaz, M.A. MLOps critical success factors—A systematic literature review. VFAST Trans. Softw. Eng. 2024, 12, 183–209. [Google Scholar] [CrossRef]
- Joshi, A. MLOps Mastery: Streamlining Machine Learning Lifecycle Management. Int. J. Sci. Res. 2024. [Google Scholar] [CrossRef]
- Kabbay, H.S. Streamlining AI Application: MLOps Best Practices and Platform Automation Illustrated through an Advanced RAG based Chatbot. In Proceedings of the 2024 2nd International Conference on Sustainable Computing and Smart Systems (ICSCSS), Coimbatore, India, 10–12 July 2024. [Google Scholar] [CrossRef]
- Bodor, A.; Hnida, M.; Daoudi, N. From Development to Deployment: An Approach to MLOps Monitoring for Machine Learning Model Operationalization. In Proceedings of the 2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA), Casablanca, Morocco, 22–23 November 2023. [Google Scholar] [CrossRef]
- Godwin, R.C.; Melvin, R.L. Toward efficient data science: A comprehensive MLOps template for collaborative code development and automation. SoftwareX 2024, 26, 101723. [Google Scholar] [CrossRef]
- Tatineni, S.; Boppana, V.R. AI-Powered DevOps and MLOps Frameworks: Enhancing Collaboration, Automation, and Scalability in Machine Learning Pipelines. J. Artif. Intell. Res. Appl. 2021, 1, 58–88. [Google Scholar]
- Sothilingam, R.; Pant, V.; Eric, S. Using i* to Analyze Collaboration Challenges in MLOps Project Teams. In Proceedings of the iStar, Hyderabad, India, 17 October 2022; pp. 1–6. [Google Scholar]
- Torres, A.P.G.; Sawhney, N. Role of Regulatory Sandboxes and MLOps for AI-Enabled Public Sector Services. Rev. Socionetwork Strateg. 2023, 17, 297–318. [Google Scholar] [CrossRef]
- Pulicharla, M.R. Data Versioning and Its Impact on Machine Learning Models. J. Sci. Technol. 2024, 5, 22–37. [Google Scholar] [CrossRef]
- Makinen, S.; Skogstrom, H.; Laaksonen, E.; Mikkonen, T. Who Needs MLOps: What Data Scientists Seek to Accomplish and How Can MLOps Help? In Proceedings of the 2021 IEEE/ACM 1st Workshop on AI Engineering—Software Engineering for AI (WAIN), Madrid, Spain, 30–31 May 2021. [Google Scholar] [CrossRef]
- Narayanappa, A.K.; Amrit, C. An Analysis of the Barriers Preventing the Implementation of MLOps. In Proceedings of the IFIP Advances in Information and Communication Technology, Denton, TX, USA, 2–3 November 2023. [Google Scholar] [CrossRef]
- Prasanna, G. Optimizing the Future: Unveiling the Significance of MLOps in Streamlining the Machine Learning Lifecycle. Int. J. Sci. Res. Eng. Technol. 2024, 4, 5–8. [Google Scholar] [CrossRef]
- Gallinucci, E. MLOps—Standardizing the Machine Learning Workflow. 2023. Available online: https://amslaurea.unibo.it/id/eprint/23645/1/tesi_enrico_salvucci.pdf (accessed on 22 December 2024).
- Varga, P.; Kovári, Á.; Herkules, M.; Hegedus, C. MLOps in CPS–a use-case for image recognition in changing industrial settings. In Proceedings of the NOMS 2024—2024 IEEE Network Operations and Management Symposium, Seoul, Republic of Korea, 6–10 May 2024; pp. 1–4. [Google Scholar]
- Araujo, G.; Kalinowski, M.; Endler, M.; Calefato, F. Professional Insights into Benefits and Limitations of Implementing MLOps Principles. arXiv 2024, arXiv:2403.13115. [Google Scholar]
- Tembhekar, P.; Malaiyappan, J.N.A.; Shanmugam, L. Cross-Domain Applications of MLOps: From Healthcare to Finance. J. Knowl. Learn. Sci. Technol. 2023, 2, 581–598. [Google Scholar] [CrossRef]
- Moskalenko, V.; Kharchenko, V. Resilience-aware MLOps for AI-based medical diagnostic system. Front. Public Health 2024, 12, 1342937. [Google Scholar] [CrossRef] [PubMed]
- Sitcheu, A.Y.; Friederich, N.; Baeuerle, S.; Neumann, O.; Reischl, M.; Mikut, R. MLOps for Scarce Image Data: A Use Case in Microscopic Image Analysis. In Proceedings of the Proceedings-33. Workshop Computational Intelligence, Berlin, Germany, 23–24 November 2023; KIT Scientific Publishing: Karlsruhe, Germany, 2023; Volume 23, p. 169. [Google Scholar]
- Reddy, M.; Dattaprakash, B.; Kammath, S.; Kn, S.; Manokaran, S.; Be, R. Application of MLOps in Prediction of Lifestyle Diseases. ECS Trans. 2022, 107, 1191. [Google Scholar] [CrossRef]
- Vartak, M. From ml models to intelligent applications: The rise of mlops. Proc. Vldb Endow. 2021, 14, 3419. [Google Scholar] [CrossRef]
- Posoldova, A. Machine learning pipelines: From research to production. IEEE Potentials 2020, 39, 38–42. [Google Scholar] [CrossRef]
- Silva, L.C.; Zagatti, F.R.; Sette, B.S.; dos Santos Silva, L.N.; Lucrédio, D.; Silva, D.F.; de Medeiros Caseli, H. Benchmarking machine learning solutions in production. In Proceedings of the 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 14–17 December 2020; pp. 626–633. [Google Scholar]
- Karlaš, B.; Interlandi, M.; Renggli, C.; Wu, W.; Zhang, C.; Mukunthu Iyappan Babu, D.; Edwards, J.; Lauren, C.; Xu, A.; Weimer, M. Building continuous integration services for machine learning. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, CA, USA, 6–10 July 2020; pp. 2407–2415. [Google Scholar]
- Nigenda, D.; Karnin, Z.; Zafar, M.B.; Ramesha, R.; Tan, A.; Donini, M.; Kenthapadi, K. Amazon sagemaker model monitor: A system for real-time insights into deployed machine learning models. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 3671–3681. [Google Scholar]
- Karakus, C.; Huilgol, R.; Wu, F.; Subramanian, A.; Daniel, C.; Cavdar, D.; Xu, T.; Chen, H.; Rahnama, A.; Quintela, L. Amazon sagemaker model parallelism: A general and flexible framework for large model training. arXiv 2021, arXiv:2111.05972. [Google Scholar]
- Das, P.; Ivkin, N.; Bansal, T.; Rouesnel, L.; Gautier, P.; Karnin, Z.; Dirac, L.; Ramakrishnan, L.; Perunicic, A.; Shcherbatyi, I.; et al. Amazon SageMaker Autopilot: A white box AutoML solution at scale. In Proceedings of the Fourth International Workshop on Data Management for End-to-End Machine Learning, Portland, OR, USA, 14 June 2020; pp. 1–7. [Google Scholar]
- Choi, W.; Choi, T.; Heo, S. A Comparative Study of Automated Machine Learning Platforms for Exercise Anthropometry-Based Typology Analysis: Performance Evaluation of AWS SageMaker, GCP VertexAI, and MS Azure. Bioengineering 2023, 10, 891. [Google Scholar] [CrossRef] [PubMed]
- Bagai, R. Comparative Analysis of AWS Model Deployment Services. arXiv 2024, arXiv:2405.08175. [Google Scholar]
- Pala, S.K. Databricks Analytics: Empowering Data Processing, Machine Learning and Real-Time Analytics. Mach. Learn. 2021, 10, 76–82. [Google Scholar]
- L’Esteve, R. Databricks. In The Azure Data Lakehouse Toolkit; Apress: Berkeley, CA, USA, 2022. [Google Scholar] [CrossRef]
- Althati, C.; Tomar, M.; Malaiyappan, J.N.A. Scalable Machine Learning Solutions for Heterogeneous Data in Distributed Data Platform. J. Artif. Intell. Gen. Sci. 2024, 4, 299–309. [Google Scholar] [CrossRef]
- Ruan, W.; Chen, Y.; Forouraghi, B. On Development of Data Science and Machine Learning Applications in Databricks. In Proceedings of the World Congress on Services, San Diego, CA, USA, 25–30 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 78–91. [Google Scholar]
- Zaharia, M. Lessons from large-scale software as a service at databricks. In Proceedings of the ACM Symposium on Cloud Computing, Santa Cruz, CA, USA, 20–23 November 2019; p. 101. [Google Scholar]
- El Moutaouakal, W.; Baïna, K. Comparative Experimentation of MLOps Power on Microsoft Azure, Amazon Web Services, and Google Cloud Platform. In Proceedings of the 2023 IEEE 6th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech), Marrakesh, Morocco, 21–23 November 2023; pp. 1–8. [Google Scholar]
- De Rosa, P.; Bromberg, Y.D.; Felber, P.; Mvondo, D.; Schiavoni, V. On the Cost of Model-Serving Frameworks: An Experimental Evaluation. In Proceedings of the 2024 IEEE International Conference on Cloud Engineering (IC2E), Paphos, Cyprus, 24–27 September 2024; pp. 221–232. [Google Scholar]
- PyTorch Team. PyTorch. Available online: https://pytorch.org/ (accessed on 6 October 2024).
- Hao, Y.; Zhao, X.; Bao, B.; Berard, D.; Constable, W.; Aziz, A.; Liu, X. Torchbench: Benchmarking pytorch with high api surface coverage. arXiv 2023, arXiv:2304.14226. [Google Scholar]
- Mishra, P. Distributed PyTorch Modelling, Model Optimization, and Deployment. In PyTorch Recipes: A Problem-Solution Approach to Build, Train and Deploy Neural Network Models; Springer: Berlin/Heidelberg, Germany, 2022; pp. 187–212. [Google Scholar]
- Jain, S.; Kumar, P. Cost Effective Generic Machine Learning Operation: A Case Study. In Proceedings of the 2023 International Conference on Data Science and Network Security (ICDSNS), Tiptur, India, 28–29 July 2023; pp. 1–6. [Google Scholar]
- MLflow Documentation. MLflow Tutorials and Examples. Available online: https://mlflow.org/docs/latest/tutorials-and-examples/index.html (accessed on 6 October 2024).
- Hsu, C.C.; Chen, P.H.; Wu, I.Z. End-to-End Automation of ML Model Lifecycle Management using Machine Learning Operations Platforms. In Proceedings of the 2024 International Conference on Consumer Electronics-Taiwan (ICCE-Taiwan), Kaohsiung, Taiwan, 16–18 July 2024; pp. 209–210. [Google Scholar]
- Vijayan, N.E. Building Scalable MLOps: Optimizing Machine Learning Deployment and Operations. Indian Sci. J. Res. Eng. Manag. 2024, 8, 1–9. [Google Scholar] [CrossRef]
- Heydari, M.; Rezvani, Z. Challenges and Experiences of Iranian Developers with MLOps at Enterprise. In Proceedings of the 2023 7th Iranian Conference on Advances in Enterprise Architecture (ICAEA), Tehran, Iran, 15–16 November 2023. [Google Scholar] [CrossRef]
- Zhou, Y.; Yu, Y.; Ding, B. Towards mlops: A case study of ml pipeline platform. In Proceedings of the 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE), Beijing, China, 22–25 October 2020; pp. 494–500. [Google Scholar]
- Yuan, D.Y.; Wildish, T. Bioinformatics application with kubeflow for batch processing in clouds. In Proceedings of the International Conference on High Performance Computing, Pune, India, 16–19 December 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 355–367. [Google Scholar]
- Tagliabue, J.; Bowne-Anderson, H.; Tuulos, V.; Goyal, S.; Cledat, R.; Berg, D. Reasonable scale machine learning with open-source metaflow. arXiv 2023, arXiv:2303.11761. [Google Scholar]
- Metaflow Documentation. What Is Metaflow? Available online: https://docs.metaflow.org/introduction/what-is-metaflow (accessed on 6 October 2024).
- InfoQ. Netflix Introduces Metaflow: Simplifying Human-Centric AI Development. 2024. Available online: https://www.infoq.com/news/2024/03/netflix-metaflow/ (accessed on 6 October 2024).
- Gliozzo, A.; Biran, O.; Patwardhan, S.; McKeown, K. Semantic technologies in IBM Watson. In Proceedings of the Fourth Workshop on Teaching NLP and CL, Sofia, Bulgaria, 9 August 2013; pp. 85–92. [Google Scholar]
- Packowski, S.; Lakhana, A. Using IBM watson cloud services to build natural language processing solutions to leverage chat tools. In Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering, Markham, ON, Canada, 6–8 November 2017; pp. 211–218. [Google Scholar]
- Cloudera. Cloudera Delivers Open Standards-Based MLOps, Empowering Enterprises to Industrialize AI. 2020. Available online: https://www.cloudera.com/about/news-and-blogs/press-releases/2020-05-06-cloudera-delivers-open-standards-based-mlops-empowering-enterprises-to-industrialize-ai.html (accessed on 6 October 2024).
- Valova, I. Research and Analysis of the Execution of Different Types of SQL Queries with Impala in Cloudera in Education. In Proceedings of the 2024 XXXIII International Scientific Conference Electronics (ET), Sozopol, Bulgaria, 17–19 September 2024; pp. 1–5. [Google Scholar]
- Martins, R.R. Automation of Machine Learning Pipelines for Anomaly Detection Challenges. Ph.D. Thesis, Universidade do Minho, Braga, Portugal, 2023. [Google Scholar]
- Contino. Apache Airflow: The Hands-On Guide. Available online: https://www.contino.io/insights/apache-airflow (accessed on 6 October 2024).
- Yasmin, J.; Wang, J.A.; Tian, Y.; Adams, B. An empirical study of developers’ challenges in implementing Workflows as Code: A case study on Apache Airflow. J. Syst. Softw. 2025, 219, 112248. [Google Scholar] [CrossRef]
- Tian, L.; Sedona, R.; Mozaffari, A.; Kreshpa, E.; Paris, C.; Riedel, M.; Schultz, M.G.; Cavallaro, G. End-to-End Process Orchestration of Earth Observation Data Workflows with Apache Airflow on High Performance Computing. In Proceedings of the IGARSS 2023—2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 16–21 July 2023; pp. 711–714. [Google Scholar]
- Abdelkader, H.; Abdelrazek, M.; Schneider, J.G.; Rani, P.; Vasa, R. Robustness Attributes to Safeguard Machine Learning Models in Production. In Proceedings of the 2023 IEEE Engineering Informatics, Melbourne, Australia, 22–23 November 2023; pp. 1–9. [Google Scholar]
- Amershi, S.; Begel, A.; Bird, C.; DeLine, R.; Gall, H.; Kamar, E.; Nagappan, N.; Nushi, B.; Zimmermann, T. Software engineering for machine learning: A case study. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Montreal, QC, Canada, 27 May 2019; pp. 291–300. [Google Scholar]
- Polyzotis, N.; Roy, S.; Whang, S.E.; Zinkevich, M. Data management challenges in production machine learning. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 1723–1726. [Google Scholar]
- Haertel, C.; Staegemann, D.; Daase, C.; Pohl, M.; Nahhas, A.; Turowski, K. MLOps in Data Science Projects: A Review. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 2396–2404. [Google Scholar]
- de Arcaya, J.D.; de Armentia, J.L.; Miñón, R.; Ojanguren, I.L.; Torre-Bastida, A.I. Large Language Model Operations (LLMOps): Definition, Challenges, and Lifecycle Management. In Proceedings of the 2024 9th International Conference on Smart and Sustainable Technologies (SpliTech), Bol and Split, Croatia, 25–28 June 2024. [Google Scholar] [CrossRef]
- Weights & Biases. Understanding LLMOps: Large Language Model Operations. Available online: https://wandb.ai/site/articles/understanding-llmops-large-language-model-operations/ (accessed on 7 September 2024).
- Sinha, M.; Menon, S.; Sagar, R. LLMOps: Definitions, Framework and Best Practices. In Proceedings of the 2024 International Conference on Electrical, Computer and Energy Technologies (ICECET), Sydney, Australia, 25–27 July 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Huang, K.; Manral, V.; Wang, W. From LLMOps to DevSecOps for GenAI; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar] [CrossRef]
- Databricks. The Big Book of MLOps—2nd Edition. 2023. Available online: https://www.databricks.com/sites/default/files/2024-06/2023-10-EB-Big-Book-of-MLOps-2nd-Edition.pdf (accessed on 5 October 2024).
- Kolesnikov, D.; Kozlova, A.; Alexandrov, A.; Kalmykov, N.; Treshkov, P.; LeBaron, T.W.; Medvedev, O. Applying ChatGPT to writing scientific articles on the use of telemedicine: Opportunities and limitations. Artif. Intell. Health 2024, 1, 53–63. [Google Scholar] [CrossRef]
- Korada, L. GitHub Copilot: The Disrupting AI Companion Transforming the Developer Role and Application Lifecycle Management. J. Artif. Intell. Cloud Comput. 2024, 3, 1–4. [Google Scholar] [CrossRef]
- Wermelinger, M. Using GitHub Copilot to Solve Simple Programming Problems; Association for Computing Machinery: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
- Osawa, K. Integrating Automated Written Corrective Feedback into E-Portfolios for second language Writing: Notion and Notion AI. RELC J. 2023, 55. [Google Scholar] [CrossRef]
- Jasper AI. Jasper AI. 2024. Available online: https://www.jasper.ai/ (accessed on 28 September 2024).
- Chang, T.A.; Bergen, B.K. Language model behavior: A comprehensive survey. Comput. Linguist. 2024, 50, 293–350. [Google Scholar] [CrossRef]
- Duan, J.; Zhang, S.; Wang, Z.; Jiang, L.; Qu, W.; Hu, Q.; Wang, G.; Weng, Q.; Yan, H.; Zhang, X.; et al. Efficient Training of Large Language Models on Distributed Infrastructures: A Survey. arXiv 2024. [Google Scholar] [CrossRef]
- Lee, Y.S.; Lee, J.K. A Study on Technological Perception Analysis of LLMs through Big Data Analysis of News Articles. J. Korea Multimed. Soc. 2024, 27, 287–298. [Google Scholar] [CrossRef]
- Mailach, A.; Simon, S.; Dorn, J.; Siegmund, N. Practitioners’ Discussions on Building LLM-based Applications for Production. arXiv 2024, arXiv:2411.08574. [Google Scholar]
- Park, C.; Jiang, J.; Wang, S.; Paul, S.; Tang, J. LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs. arXiv 2024. [Google Scholar] [CrossRef]
- Abdelnabi, S.; Fay, A.; Cherubin, G.; Salem, A.; Fritz, M.; Paverd, A. Are you still on track!? Catching LLM Task Drift with Activations. arXiv 2024. [Google Scholar] [CrossRef]
- Echterhoff, J.; Faghri, F.; Vemulapalli, R.; Hu, T.Y.; Li, C.; Tuzel, O.; Pouransari, H. MUSCLE: A Model Update Strategy for Compatible LLM Evolution. arXiv 2024. [Google Scholar] [CrossRef]
- Bhan, L. Mastering LLMOps: Best Practices for Managing and Deploying Large Language Models. 2023. Available online: https://lekha-bhan88.medium.com/mastering-llmops-best-practices-for-managing-and-deploying-large-language-models-c8ca0da648d9 (accessed on 26 December 2024).
- Fernandez, R.C.; Elmore, A.J.; Franklin, M.J.; Krishnan, S.; Tan, C. How large language models will disrupt data management. Proc. VLDB Endow. 2023, 16, 3302–3309. [Google Scholar] [CrossRef]
- Snell, C.; Lee, J.; Xu, K.; Kumar, A. Scaling llm test-time compute optimally can be more effective than scaling model parameters. arXiv 2024, arXiv:2408.03314. [Google Scholar]
- Leiker, D.; Finnigan, S.; Gyllen, A.R.; Cukurova, M. Prototyping the use of Large Language Models (LLMs) for adult learning content creation at scale. arXiv 2023, arXiv:2306.01815. [Google Scholar]
- Hassani, S. Enhancing Legal Compliance and Regulation Analysis with Large Language Models. arXiv 2024, arXiv:2404.17522. [Google Scholar]
- Soni, V. Large language models for enhancing customer lifecycle management. J. Empir. Soc. Sci. Stud. 2023, 7, 67–89. [Google Scholar]
- Preuss, N.; Alshehri, A.S.; You, F. Large Language Models for Life Cycle Assessments: Opportunities, Challenges, and Risks. J. Clean. Prod. 2024, 466, 142824. [Google Scholar] [CrossRef]
- Yosifova, V. Application of Open-source Large Language Model (LLM) for Simulation of a Vulnerable IoT System and Cybersecurity Best Practices Assistance. Preprints 2024. [Google Scholar] [CrossRef]
- Xu, J.; Stokes, J.W.; McDonald, G.; Bai, X.; Marshall, D.; Wang, S.; Swaminathan, A.; Li, Z. AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks. arXiv 2024. [Google Scholar] [CrossRef]
- Vinayak, E.S.; Anbuthiruvarangan, M.K.; Chakradhar, K.; P, A. Enhancing Cybersecurity Through AI-Driven Threat Detection: A Transfer Learning Approach. Int. J. Multidiscip. Res. 2024, 6. [Google Scholar] [CrossRef]
- Hassanin, M.; Keshk, M.; Salim, S.; Alsubaie, M.; Sharma, D. PLLM-CS: Pre-trained Large Language Model (LLM) for Cyber Threat Detection in Satellite Networks. arXiv 2024. [Google Scholar] [CrossRef]
- Boi, B.; Esposito, C.; Lee, S. Smart Contract Vulnerability Detection: The Role of Large Language Model (LLM). ACM SIGAPP Appl. Comput. Rev. 2024, 24, 19–29. [Google Scholar] [CrossRef]
- Gebreab, S.A.; Salah, K.; Jayaraman, R.; ur Rehman, M.H.; Ellaham, S. LLM-Based Framework for Administrative Task Automation in Healthcare. In Proceedings of the 2024 12th International Symposium on Digital Forensics and Security (ISDFS), San Antonio, TX, USA, 29–30 April 2024; pp. 1–7. [Google Scholar]
- Pashangpour, S.; Nejat, G. The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare. Robotics 2024, 13, 112. [Google Scholar] [CrossRef]
- Pears, M.; Konstantinidis, S. The Impact of Aligning Artificial Intelligence Large Language Models With Bloom’s Taxonomy in Healthcare Education. Adv. Bus. Inf. Syst. Anal. Book Ser. 2024. [Google Scholar] [CrossRef]
- Jiang, X.; Yan, L.; Vavekanand, R.; Hu, M. Large Language Models in Healthcare Current Development and Future Directions. Preprints 2024. [Google Scholar] [CrossRef]
- Liu, J.; Wang, C.; Liu, S. Applications of Large Language Models in Clinical Practice: Path, Challenges, and Future Perspectives. OSF Prepr. 2024. [Google Scholar] [CrossRef]
- Tustumi, F.; Andreollo, N.A.; de Aguilar-Nascimento, J.E. Future of the language models in healthcare: The role of chatgpt. ABCD 2023, 36, e1727. [Google Scholar] [CrossRef]
- Wade, E.C.; Stirman, S.W.; Ungar, L.H.; Boland, C.L.; Schwartz, H.A.; Yaden, D.B.; Sedoc, J.; DeRubeis, R.J.; Willer, R.; Eichstaedt, J.C. Large language models could change the future of behavioral healthcare: A proposal for responsible development and evaluation. npj Ment. Health Res. 2024, 3, 12. [Google Scholar] [CrossRef]
- Sharaf, S.; Anoop, V. An Analysis on Large Language Models in Healthcare: A Case Study of BioBERT. arXiv 2023. [Google Scholar] [CrossRef]
- Xia, X.; Jin, Z.; Aiello, M.; Zhang, D.; Liang, G.; Hu, X. Software Service Engineering in the Era of Large Language Models. In Proceedings of the 2024 IEEE International Conference on Software Services Engineering (SSE), IEEE Computer Society, Shenzhen, China, 7–13 July 2024; p. xxiii. [Google Scholar]
- Barham, H.; Fasha, M. Towards LLMCI—Multimodal AI for LLM-Vision UI Operation. Preprint 2024. [Google Scholar] [CrossRef]
- Chen, T. Challenges and Opportunities in Integrating LLMs into Continuous Integration/Continuous Deployment (CI/CD) Pipelines. In Proceedings of the 2024 5th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT), Nanjing, China, 29–31 March 2024. [Google Scholar] [CrossRef]
- Kulkarni, A.; Shivananda, A.; Kulkarni, A.; Gudivada, D. LLMs for Enterprise and LLMOps. In Applied Generative AI for Beginners: Practical Knowledge on Diffusion Models, ChatGPT, and Other LLMs; Springer: Berlin/Heidelberg, Germany, 2023; pp. 117–154. [Google Scholar]
- Schillaci, Z. On-Site Deployment of LLMs. In Large Language Models in Cybersecurity: Threats, Exposure and Mitigation; Springer Nature Switzerland: Cham, Switzerland, 2024; pp. 205–211. [Google Scholar]
- Asmitha, M.; Danda, A.; Bysani, H.; Singh, R.P.; Kanchan, S. Automation of Text Summarization Using Hugging Face NLP. In Proceedings of the 2024 5th International Conference for Emerging Technology (INCET), Belgaum, India, 24–26 May 2024; pp. 1–7. [Google Scholar]
- Wang, J.; Chen, N.; Sun, Q.; Huang, W.; Wang, C.; Gao, M. HugNLP: A Unified and Comprehensive Library for Natural Language Processing. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 5111–5116. [Google Scholar]
- Jacob, T.P.; Bizotto, B.L.S.; Sathiyanarayanan, M. Constructing the ChatGPT for PDF Files with Langchain–AI. In Proceedings of the 2024 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 24–26 April 2024; pp. 835–839. [Google Scholar]
- Micheal, A.A.; Prasanth, A.; Aswin, T.; Krisha, B. Advancing Educational Accessibility: The LangChain LLM Chatbot’s Impact on Multimedia Syllabus-Based Learning. Preprint 2024. [Google Scholar] [CrossRef]
- Rahman, M.A. A Survey on Security and Privacy of Multimodal LLMs-Connected Healthcare Perspective. In Proceedings of the 2023 IEEE Globecom Workshops (GC Wkshps), Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 1807–1812. [Google Scholar]
- Topsakal, O.; Akinci, T.C. Creating large language model applications utilizing langchain: A primer on developing llm apps fast. In Proceedings of the International Conference on Applied Engineering and Natural Sciences, Konya, Turkey, 10–12 July 2023; Volume 1, pp. 1050–1056. [Google Scholar]
- Huh, J.; Park, H.J.; Ye, J.C. Breast ultrasound report generation using LangChain. arXiv 2023, arXiv:2312.03013. [Google Scholar]
- Burgan, C.; Kowalski, J.; Liao, W. Developing a Retrieval Augmented Generation (RAG) Chatbot App Using Adaptive Large Language Models (LLM) and LangChain Framework. Proc. West Va. Acad. Sci. 2024, 96. [Google Scholar] [CrossRef]
- Ananthajothi, K.; David, J.; Kavin, A. Cardiovascular Disease Prediction Using Langchain. In Proceedings of the 2024 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI), Chennai, India, 9–10 May 2024; pp. 1–6. [Google Scholar]
- Asyrofi, R.; Dewi, M.R.; Lutfhi, M.I.; Wibowo, P. Systematic Literature Review Langchain Proposed. In Proceedings of the 2023 International Electronics Symposium (IES), Denpasar, Indonesia, 8–10 August 2023; pp. 533–537. [Google Scholar]
- Jacobs, S.A.; Tanaka, M.; Zhang, C.; Zhang, M.; Song, L.; Rajbhandari, S.; He, Y. Deepspeed ulysses: System optimizations for enabling training of extreme long sequence transformer models. arXiv 2023, arXiv:2309.14509. [Google Scholar]
- Holmes, C.; Tanaka, M.; Wyatt, M.; Awan, A.A.; Rasley, J.; Rajbhandari, S.; Aminabadi, R.Y.; Qin, H.; Bakhtiari, A.; Kurilenko, L.; et al. Deepspeed-fastgen: High-throughput text generation for llms via mii and deepspeed-inference. arXiv 2024, arXiv:2401.08671. [Google Scholar]
- Hanindhito, B.; Patel, B.; John, L.K. Bandwidth Characterization of DeepSpeed on Distributed Large Language Model Training. In Proceedings of the 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Indianapolis, IN, USA, 5–7 May 2024; pp. 241–256. [Google Scholar]
- Google Cloud. Dialogflow CX Documentation. Available online: https://cloud.google.com/dialogflow/cx/docs (accessed on 7 September 2024).
- Barus, S.P.; Surijati, E. Chatbot with dialogflow for FAQ services in Matana university library. Int. J. Inform. Comput. 2022, 3, 51–62. [Google Scholar] [CrossRef]
- Dall’Acqua, A.; Tamburini, F. Implementing a pragmatically adequate chatbot in DialogFlow CX. In Proceedings of the Ceur Workshop Proceedings, CEUR-WS, Naples, Italy, 4–5 October 2021; Volume 3033, pp. 1–8. [Google Scholar]
- Sung, M.C. Development of Prospective Teachers’ Adaptive Expertise in AI Chatbots: Comparative Analyses of Google Dialogflow ES and CX. Multimed.-Assist. Lang. Learn. 2022, 25, 132–151. [Google Scholar]
- Microsoft. Azure OpenAI Service. Available online: https://azure.microsoft.com/en-us/products/ai-services/openai-service (accessed on 7 September 2024).
- Bisson, S.; Branscombe, M.; Hoder, C.; Raman, A. Azure AI Services at Scale for Cloud, Mobile, and Edge; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2022. [Google Scholar]
- LaNeve, J.; Naik, K. Building and Deploying LLM Applications with Apache Airflow. 2023. Available online: https://airflowsummit.org/slides/2023/Building_and_deploying_LLM_applications_with_Apache_Airflow___Airflow_Summit_2023.pdf (accessed on 7 September 2024).
- Dureja, P. Harnessing Apache Airflow Operators for Enhanced Workflow Automation. J. Artif. Intell. Cloud Comput. 2023, 2, 1–3. [Google Scholar] [CrossRef]
- Amazon Web Services (AWS). Deploy Large Language Models on AWS Inferentia2 Using Large Model Inference Containers. Available online: https://aws.amazon.com/blogs/machine-learning/deploy-large-language-models-on-aws-inferentia2-using-large-model-inference-containers/ (accessed on 7 September 2024).
- Fregly, C.; Barth, A.; Eigenbrode, S. Generative AI on AWS: Building Context-Aware Multimodal Reasoning Applications; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2023. [Google Scholar]
- Katti, J.; Agarwal, J.; Bharata, S.; Shinde, S.; Mane, S.; Biradar, V. University admission prediction using Google Vertex AI. In Proceedings of the 2022 First International Conference on Artificial Intelligence Trends and Pattern Recognition (ICAITPR), Hyderabad, India, 10–12 March 2022; pp. 1–5. [Google Scholar]
- Wang, H.; Yang, J.; Liang, G.; Lee, Y.; Cao, Z. Analyzing the Usability, Performance, and Cost-Efficiency of Deploying ML Models on BigQuery ML and Vertex AI in Google Cloud. In Proceedings of the 2024 8th International Conference on Cloud and Big Data Computing, Oxford, UK, 15–17 August 2024; pp. 15–25. [Google Scholar]
- Jabbari, R.; bin Ali, N.; Petersen, K.; Tanveer, B. What is DevOps? A systematic mapping study on definitions and practices. In Proceedings of the Scientific Workshop Proceedings of XP2016, Scotland, UK, 24 May 2016; pp. 1–11. [Google Scholar]
- Kadaskar, H.R. Unleashing the Power of Devops in Software Development. Int. J. Sci. Res. Mod. Sci. Technol. 2024, 3, 1–7. [Google Scholar]
- Azad, N.; Hyrynsalmi, S. Multivocal Literature Review on DevOps Critical Success Factors. In Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering (EASE ’24), Salerno, Italy, 18–21 June 2024. [Google Scholar] [CrossRef]
- Karunarathne, M.; Wijayanayake, W.J.I.; Prasadika, A.P.K.J. DevOps Adoption in Software Development Organizations: A Systematic Literature Review. In Proceedings of the 2024 4th International Conference on Advanced Research in Computing (ICARC), Belihuloya, Sri Lanka, 21–24 February 2024; pp. 282–287. [Google Scholar] [CrossRef]
- GeeksforGeeks. DevOps Lifecycle. 2023. Available online: https://www.geeksforgeeks.org/devops-lifecycle/ (accessed on 12 October 2024).
- Kusumadewi, R.; Adrian, R. Performance Analysis of Devops Practice Implementation of CI/CD Using Jenkins. J. Comput. Sci. Inf. Technol. 2023, 15, 90–95. [Google Scholar] [CrossRef]
- Moutsatsos, I.; Hossain, I.; Agarinis, C.; Harbinski, F.; Abraham, Y.; Dobler, L.; Zhang, X.; Wilson, C.D.; Jenkins, J.L.; Holway, N.; et al. Jenkins-CI, an Open-Source Continuous Integration System, as a Scientific Data and Image-Processing Platform. J. Biomol. Screen. 2017, 22, 238–249. [Google Scholar] [CrossRef]
- Rai, P.; Madhurima; Dhir, S.; Madhulika; Garg, A. A prologue of JENKINS with comparative scrutiny of various software integration tools. In Proceedings of the 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 11–13 March 2015; pp. 201–205. [Google Scholar]
- Kanchana, A.; Chandrashekar Murthy, B.N. Automated Development and Testing of ECUs in Automotive Industry with Jenkins. In Proceedings of the 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Online, 2–4 July 2020; pp. 1–5. [Google Scholar]
- Arefeen, M.S.; Schiller, M. Continuous integration using GitLab. Undergrad. Res. Nat. Clin. Sci. Technol. J. 2019, 3, 1–6. [Google Scholar] [CrossRef]
- Sharif, M.; Janto, S.; Lueckemeyer, G. Coaas: Continuous integration and delivery framework for hpc using gitlab-runner. In Proceedings of the 2020 4th International Conference on Big Data and Internet of Things, Fuzhou, China, 12–14 June 2020; pp. 54–58. [Google Scholar]
- Reddy, A.K.; Alluri, V.R.R.; Thota, S.; Ravi, C.S.; Bonam, V.S.M. DevSecOps: Integrating Security into the DevOps Pipeline for Cloud-Native Applications. J. Artif. Intell. Res. Appl. 2021, 1, 89–114. [Google Scholar]
- Gallaba, K.; Lamothe, M.; McIntosh, S. Lessons from eight years of operational data from a continuous integration service: An exploratory case study of circleci. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 25–27 May 2022; pp. 1330–1342. [Google Scholar]
- Hung, P.D.; Giang, D.T. Continuous Integration for Android Application Development and Training. In Proceedings of the 3rd International Conference on Education and Multimedia Technology (ICEMT ’19), Nagoya, Japan, 22–25 July 2019. [Google Scholar] [CrossRef]
- Sochat, V. Containershare: Open Source Registry to build, test, deploy with CircleCI. J. Open Source Softw. 2018, 3, 878. [Google Scholar] [CrossRef]
- Krief, M. Learning DevOps: The Complete Guide to Accelerate Collaboration with Jenkins, Kubernetes, Terraform and Azure DevOps; Packt Publishing Ltd.: Birmingham, UK, 2019. [Google Scholar]
- Jackson, S. Setting Up Azure DevOps. In Accelerating Unity Through Automation: Power Up Your Unity Workflow by Offloading Intensive Tasks; Springer: Berlin/Heidelberg, Germany, 2023; pp. 193–243. [Google Scholar]
- Borra, P. Maximizing Efficiency and Collaboration with Microsoft Azure DevOps. Planning 2024, 4, 556–562. [Google Scholar]
- Mustyala, A. CI/CD Pipelines in Kubernetes: Accelerating Software Development and Deployment. Eph-Int. J. Sci. Eng. 2022, 8, 1–11. [Google Scholar]
- Shevchuk, R.; Karpinski, M.; Kasianchuk, M.; Yakymenko, I.; Melnyk, A.; Tykhyi, R. Software for Improve the Security of Kubernetes-based CI/CD Pipeline. In Proceedings of the 2023 13th International Conference on Advanced Computer Information Technologies (ACIT), Wrocław, Poland, 21–23 September 2023; pp. 420–425. [Google Scholar]
- Schmeling, B.; Dargatz, M. Kubernetes-Native Pipelines. In Kubernetes Native Development: Develop, Build, Deploy, and Run Applications on Kubernetes; Springer: Berlin/Heidelberg, Germany, 2022; pp. 243–302. [Google Scholar]
- Benedetti, P.; Coviello, G.; Rao, K.; Chakradhar, S. Scale Up while Scaling Out Microservices in Video Analytics Pipelines. In Proceedings of the 2023 IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), Singapore, 18–21 December 2023; pp. 584–591. [Google Scholar]
- Bahaweres, R.B.; Zulfikar, A.; Hermadi, I.; Suroso, A.I.; Arkeman, Y. Docker and Kubernetes Pipeline for DevOps Software Defect Prediction with MLOps Approach. In Proceedings of the 2022 2nd International Seminar on Machine Learning, Optimization, and Data Science (ISMODE), Jakarta, Indonesia, 22–23 December 2022; pp. 248–253. [Google Scholar]
- Atkinson, B.; Edwards, D. Generic Pipelines Using Docker: The DevOps Guide to Building Reusable, Platform Agnostic CI/CD Frameworks; Apress: Berkeley, CA, USA, 2018. [Google Scholar]
- Fernández González, D.; Rodríguez Lera, F.J.; Esteban, G.; Fernández Llamas, C. Secdocker: Hardening the continuous integration workflow: Wrapping the container layer. SN Comput. Sci. 2022, 3, 1–13. [Google Scholar] [CrossRef]
- Arapidis, C. Sonar Code Quality Testing Essentials; Packt Publishing: Birmingham, UK, 2012. [Google Scholar]
- Andrade, M.J. White-Box Testing Automation With SonarQube: Continuous Integration, Code Review, Security, and Vendor Branches. In Code Generation, Analysis Tools, and Testing for Quality; IGI Global: Hershey, PA, USA, 2019; pp. 64–88. [Google Scholar]
- Artefact Engineering and Data Science. Why You Need LLMOps. 2024. Available online: https://medium.com/artefact-engineering-and-data-science/why-you-need-llmops-48c0925827de (accessed on 25 December 2024).
- Puget Systems Team. LLM Inference Consumer GPU Performance. 2023. Available online: https://www.pugetsystems.com/labs/articles/llm-inference-consumer-gpu-performance/ (accessed on 21 December 2024).
- Matics Analytics Team. LLMOps: Scaling LLM Deployment with 100% Throughput Improvement. 2023. Available online: https://www.maticsanalytics.com/post/llmops-scaling-llm-deployment-with-~100-throughput-improvement (accessed on 6 October 2024).
- Jeong, C. A study on the implementation of generative ai services using an enterprise data-based llm application architecture. arXiv 2023, arXiv:2309.01105. [Google Scholar] [CrossRef]
- Kim, T.; Wang, Y.; Chaturvedi, V.; Gupta, L.; Kim, S.; Kwon, Y.; Ha, S. LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs. arXiv 2024, arXiv:2404.10933. [Google Scholar]
- DataCamp Team. TPU vs GPU: What’s the Difference for AI? 2023. Available online: https://www.datacamp.com/blog/tpu-vs-gpu-ai (accessed on 6 October 2024).
- Incubity by Ambilio Team. GPU vs TPU for LLM Training: A Comprehensive Analysis. 2024. Available online: https://incubity.ambilio.com/gpu-vs-tpu-for-llm-training-a-comprehensive-analysis/ (accessed on 6 October 2024).
- Google Cloud. Google Cloud TPU. Available online: https://cloud.google.com/tpu?hl=en (accessed on 6 October 2024).
- Wu, Y.E.; Wu, H.I.; Chin, K.C.; Yang, Y.C.; Tsay, R.S. Accelerate Large Language Model Inference on Edge TPU with OpenVX framework. In Proceedings of the 2024 IEEE 6th International Conference on AI Circuits and Systems (AICAS), Abu Dhabi, United Arab Emirates, 22–25 April 2024; pp. 502–506. [Google Scholar]
- Brakel, F.; Odyurt, U.; Varbanescu, A.L. Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies. arXiv 2024, arXiv:2403.03699. [Google Scholar]
- Carrión, D.S.; Prohaska, V. Exploration of TPUs for AI Applications. arXiv 2023, arXiv:2309.08918. [Google Scholar]
- Run:ai Team. Parallelism Strategies for Distributed Training. 2023. Available online: https://www.run.ai/blog/parallelism-strategies-for-distributed-training (accessed on 6 October 2024).
- Kim, S.; Moon, S.; Tabrizi, R.; Lee, N.; Mahoney, M.W.; Keutzer, K.; Gholami, A. An LLM compiler for parallel function calling. arXiv 2023, arXiv:2312.04511. [Google Scholar]
- Singh, S.; Karatzas, A.; Fore, M.; Anagnostopoulos, I.; Stamoulis, D. An LLM-Tool Compiler for Fused Parallel Function Calling. arXiv 2024, arXiv:2405.17438. [Google Scholar]
- Low, Y.; Gonzalez, J.E.; Kyrola, A.; Bickson, D.; Guestrin, C.E.; Hellerstein, J. Graphlab: A new framework for parallel machine learning. arXiv 2014, arXiv:1408.2041. [Google Scholar]
- Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, 3–10 March 2021; pp. 610–623. [Google Scholar]
- Ni, C.; Wu, J.; Wang, H.; Lu, W.; Zhang, C. Enhancing cloud-based large language model processing with elasticsearch and transformer models. In Proceedings of the International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024), Guangzhou, China, 1–3 March 2024; Volume 13180, pp. 1648–1654. [Google Scholar]
- Tamkin, A.; Brundage, M.; Clark, J.; Ganguli, D. Understanding the capabilities, limitations, and societal impact of large language models. arXiv 2021, arXiv:2102.02503. [Google Scholar]
- Luu, H.; Pumperla, M.; Zhang, Z. The Future of MLOps. In MLOps with Ray: Best Practices and Strategies for Adopting Machine Learning Operations; Springer: Berlin/Heidelberg, Germany, 2024; pp. 305–327. [Google Scholar]
- Liu, Y.; Xu, Y.; Song, R. Transforming User Experience (UX) through Artificial Intelligence (AI) in interactive media design. Eng. Sci. Technol. J. 2024, 5, 2273–2283. [Google Scholar] [CrossRef]
- Jin, H.; Zhang, Y.; Meng, D.; Wang, J.; Tan, J. A comprehensive survey on process-oriented automatic text summarization with exploration of llm-based methods. arXiv 2024, arXiv:2403.02901. [Google Scholar]
- Alammar, J.; Grootendorst, M. Hands-On Large Language Models; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2024. [Google Scholar]
- Jones, B.; Dixon, G. Boosting Textual Understanding in LLMs with Context-Aware Flexible Length Tokenization. OSF Preprints. Available online: https://osf.io/preprints/osf/9gnjt (accessed on 24 December 2024).
- Mumuni, A.; Mumuni, F. Automated data processing and feature engineering for deep learning and big data applications: A survey. J. Inf. Intell. 2024, in press. [Google Scholar] [CrossRef]
- Digital Alpha Team. LLMOps Unveiled: Your Step-by-Step Guide to Building Production-Ready LLM Applications. 2023. Available online: https://www.digital-alpha.com/llmops-unveiled-your-step-by-step-guide-to-building-production-ready-llm-applications/ (accessed on 6 October 2024).
- Chen, Z.; Cao, L.; Madden, S.; Fan, J.; Tang, N.; Gu, Z.; Shang, Z.; Liu, C.; Cafarella, M.; Kraska, T. Seed: Simple, efficient, and effective data management via large language models. arXiv 2023, arXiv:2310.00749. [Google Scholar]
- Pansara, R.R. NoSQL Databases and Master Data Management: Revolutionizing Data Storage and Retrieval. Int. Numer. J. Mach. Learn. Robot. 2020, 4, 1–11. [Google Scholar]
- Gupta, I.; Singh, A.K.; Lee, C.N.; Buyya, R. Secure data storage and sharing techniques for data protection in cloud environments: A systematic review, analysis, and future directions. IEEE Access 2022, 10, 71247–71277. [Google Scholar] [CrossRef]
- Nambiar, A.; Mundra, D. An overview of data warehouse and data lake in modern enterprise data management. Big Data Cogn. Comput. 2022, 6, 132. [Google Scholar] [CrossRef]
- Kehrer, K.; Kaiser, C. Machine Learning Upgrade: A Data Scientist’s Guide to MLOps, LLMs, and ML Infrastructure; John Wiley & Sons: Hoboken, NJ, USA, 2024. [Google Scholar]
- Wang, X.; Kim, H.; Rahman, S.; Mitra, K.; Miao, Z. Human-LLM collaborative annotation through effective verification of LLM labels. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024; pp. 1–21. [Google Scholar]
- Kim, H.; Mitra, K.; Chen, R.L.; Rahman, S.; Zhang, D. Meganno+: A human-llm collaborative annotation system. arXiv 2024, arXiv:2402.18050. [Google Scholar]
- Tan, Z.; Beigi, A.; Wang, S.; Guo, R.; Bhattacharjee, A.; Jiang, B.; Karami, M.; Li, J.; Cheng, L.; Liu, H. Large language models for data annotation: A survey. arXiv 2024, arXiv:2402.13446. [Google Scholar]
- Sutharsan, M. Smart analysis of automated and semi-automated approaches to data annotation for machine learning. ICTACT J. Data Sci. Mach. Learn. 2023, 4, 457–460. [Google Scholar]
- Pustejovsky, J.; Stubbs, A. Natural Language Annotation for Machine Learning: A Guide to Corpus-Building for Applications; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2012. [Google Scholar]
- Van Atteveldt, W.; Van der Velden, M.A.; Boukes, M. The validity of sentiment analysis: Comparing manual annotation, crowd-coding, dictionary approaches, and machine learning algorithms. Commun. Methods Meas. 2021, 15, 121–140. [Google Scholar] [CrossRef]
- Pathania, N. Learning Continuous Integration with Jenkins; Packt Publishing Ltd.: Birmingham, UK, 2016. [Google Scholar]
- Zhao, Z.; Chen, Y.; Bangash, A.A.; Adams, B.; Hassan, A.E. An empirical study of challenges in machine learning asset management. Empir. Softw. Eng. 2024, 29, 98. [Google Scholar] [CrossRef]
- Safri, H.; Papadimitriou, G.; Deelman, E. Dynamic Tracking, MLOps, and Workflow Integration: Enabling Transparent Reproducibility in Machine Learning. In Proceedings of the 2024 IEEE 20th International Conference on e-Science (e-Science), Osaka, Japan, 16–20 September 2024; pp. 1–10. [Google Scholar]
- Semmelrock, H.; Ross-Hellauer, T.; Kopeinik, S.; Theiler, D.; Haberl, A.; Thalmann, S.; Kowald, D. Reproducibility in Machine Learning-based Research: Overview, Barriers and Drivers. arXiv 2024, arXiv:2406.14325. [Google Scholar]
- Wu, Z. Sky Computing with Intercloud Brokers. Ph.D. Thesis, University of California, Berkeley, CA, USA, 2024. [Google Scholar]
- Sikeridis, D.; Papapanagiotou, I.; Rimal, B.P.; Devetsikiotis, M. A Comparative taxonomy and survey of public cloud infrastructure vendors. arXiv 2017, arXiv:1710.01476. [Google Scholar]
- Estrin, E. Cloud Security Handbook: Find Out How to Effectively Secure Cloud Environments Using AWS, Azure, and GCP; Packt Publishing Ltd.: Birmingham, UK, 2022. [Google Scholar]
- Esas, O. Design Patterns and Anti-Patterns in Microservices Architecture: A Classification Proposal and Study on Open Source Projects. Master’s Thesis, Politecnico di Milano, Milano, Italy, 2020. Available online: https://www.politesi.polimi.it/handle/10589/186745 (accessed on 24 December 2024).
- Huang, Y.; Wan, L.J.; Ye, H.; Jha, M.; Wang, J.; Li, Y.; Zhang, X.; Chen, D. New Solutions on LLM Acceleration, Optimization, and Application. arXiv 2024, arXiv:2406.10903. [Google Scholar]
- Isaev, M.; McDonald, N.; Vuduc, R. Scaling infrastructure to support multi-trillion parameter LLM training. In Proceedings of the Architecture and System Support for Transformer Models (ASSYST@ISCA 2023), Orlando, FL, USA, 17 June 2023. [Google Scholar]
- Srivatsa, K.G. Leveraging Large Language Models for Generating Infrastructure as Code: Open and Closed Source Models and Approaches. Ph.D. Thesis, International Institute of Information Technology Hyderabad, Hyderabad, India, 2024. [Google Scholar]
- David, R.B. Kubernetes Auto-Scaling: YoYo Attack Vulnerability and Mitigation. Master’s Thesis, Reichman University, Herzliya, Israel, 2021. [Google Scholar]
- Ekanayaka, E.M.I.M.; Thathsarani, J.K.K.H.; Karunanayaka, D.S.; Kuruwitaarachchi, N.; Skandakumar, N. Enhancing DevOps Infrastructure for Efficient Management of Microservice Applications. In Proceedings of the 2023 IEEE International Conference on e-Business Engineering (ICEBE), Sydney, Australia, 4–6 November 2023; pp. 63–68. [Google Scholar]
- Mohajeri, M.A. Leveraging Large Language Model for Enhanced Business Analytics on AWS. Master’s Thesis, Centria University of Applied Sciences, Kokkola, Finland, 2024. Available online: https://www.theseus.fi/handle/10024/859982 (accessed on 12 December 2024).
- Li, H.; Wang, S.X.; Shang, F.; Niu, K.; Song, R. Applications of large language models in cloud computing: An empirical study using real-world data. Int. J. Innov. Res. Comput. Sci. Technol. 2024, 12, 59–69. [Google Scholar] [CrossRef]
- Zhou, X.; Zhao, X.; Li, G. LLM-Enhanced Data Management. arXiv 2024, arXiv:2402.02643. [Google Scholar]
- Wang, Z.; Zhong, W.; Wang, Y.; Zhu, Q.; Mi, F.; Wang, B.; Shang, L.; Jiang, X.; Liu, Q. Data management for large language models: A survey. arXiv 2023, arXiv:2312.01700. [Google Scholar]
- Webber, E.; Olgiati, A. Pretrain Vision and Large Language Models in Python: End-to-End Techniques for Building and Deploying Foundation Models on AWS; Packt Publishing Ltd.: Birmingham, UK, 2023. [Google Scholar]
- Capizzi, A.; Distefano, S.; Mazzara, M. From devops to devdataops: Data management in devops processes. In Proceedings of the Software Engineering Aspects of Continuous Development and New Paradigms of Software Production and Deployment: Second International Workshop, DEVOPS 2019, Château de Villebrumier, France, 6–8 May 2019; Revised Selected Papers 2. Springer: Berlin/Heidelberg, Germany, 2020; pp. 52–62. [Google Scholar]
- Chawla, H.; Khattar, P.; Chawla, H.; Khattar, P. Building Blocks of Data Analytics. In Data Lake Analytics on Microsoft Azure; Apress: Berkeley, CA, USA, 2020; pp. 11–25. [Google Scholar]
- Chan, Y.C.; Pu, G.; Shanker, A.; Suresh, P.; Jenks, P.; Heyer, J.; Denton, S. Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs. arXiv 2024, arXiv:2409.19759. [Google Scholar]
- Liu, Y.; Zhang, H.; Miao, Y.; Le, V.H.; Li, Z. OptLLM: Optimal Assignment of Queries to Large Language Models. arXiv 2024, arXiv:2405.15130. [Google Scholar]
- Kaswan, S.; Goyal, P.; Khirasaria, V.; Yugal, L.; Amita, E. Current Trends and Challenges of Cloud Computing and Emerging Technological Utilization in Technical Education. In Proceedings of the 2024 5th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 7–9 August 2024; pp. 629–634. [Google Scholar]
- Patel, P.; Choukse, E.; Zhang, C.; Goiri, Í.; Warrier, B.; Mahalingam, N.; Bianchini, R. Characterizing Power Management Opportunities for LLMs in the Cloud. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, La Jolla, CA, USA, 27 April–1 May 2024; Volume 3, pp. 207–222. [Google Scholar]
- Thati, B.; Shyam, K.M.; Sindhura, S.; Pulletikurthy, D.; Chowdary, N.S. Continuous Deployment in Action: Developing a Cloud-Based Image Matching Game. Int. J. Innov. Technol. Interdiscip. Sci. 2024, 7, 68–79. [Google Scholar]
- Rangnau, T.; Buijtenen, R.V.; Fransen, F.; Turkmen, F. Continuous security testing: A case study on integrating dynamic security testing tools in ci/cd pipelines. In Proceedings of the 2020 IEEE 24th International Enterprise Distributed Object Computing Conference (EDOC), Eindhoven, The Netherlands, 5–8 October 2020; pp. 145–154. [Google Scholar]
- Choudhary, S. Kubernetes-Based Architecture For an On-premises Machine Learning Platform. Master’s Thesis, Aalto University, Espoo, Finland, 2021. [Google Scholar]
- Wilkins, G.; Keshav, S.; Mortier, R. Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads. In Proceedings of the 15th ACM International Conference on Future and Sustainable Energy Systems, Singapore, 4–7 June 2024; pp. 506–513. [Google Scholar]
- Mienye, I.D.; Swart, T.G.; Obaido, G. XtremeLLMs: Towards Extremely Large Language Models. Preprints 2024. Available online: https://www.preprints.org/manuscript/202408.1483/v1 (accessed on 14 December 2024).
- Yu, S.; Fang, C.; Ling, Y.; Wu, C.; Chen, Z. LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities. arXiv 2023. [Google Scholar] [CrossRef]
Platforms | Key Features | Use Case | Focus | References |
---|---|---|---|---|
AWS Sagemaker (Python SDK version 2.94.0) | Fully managed infrastructure, workflows, and tools for building, training, and deploying machine learning (ML) models for any use case. | Suitable for enterprise-level ML applications and workflows. | Scalability, Integration with AWS ecosystem (S3, EC2, and other services). | [87,88,89,90,91,92,93,94] |
Databricks (Version 10.4) | A comprehensive analytics platform that offers collaborative, real-time notebooks, scalable data processing, and integrated machine learning workflows. | Well-suited for the analysis of large-scale datasets (structured, unstructured) and the collaborative construction of machine learning models (building, training, and deployment). | Integration with Apache Spark; collaborative notebooks support multiple languages (Python, R, Scala, SQL). | [95,96,97,98,99] |
Azure AutoML (Azure ML SDK version 1.38.0) | Automated model building and tuning, support for many machine learning tasks, and seamless interaction with Azure services; cost-effective. | Ideal for individuals who need to quickly create, train, and use machine learning models, as well as time series forecasting. | Support for a variety of data types and models, integration with Azure ML, time series specialization, automation and accessibility. | [100] |
TensorFlow (Version 2.10.0) | Comprehensive machine learning framework, distributed training (TensorFlow Distribution Strategies), model serving (TensorFlow Serving), and edge deployment (TensorFlow Lite). | Machine learning development from beginning to end, scalable training, production deployment, mobile and edge inference technology. | Management of machine learning pipelines, end-to-end development and training of machine learning models, serving models at scale, enables fast training and cross-platform compatibility (mobile devices, cloud environments, etc.). | [101] |
PyTorch (Version 1.13.0) | High-level application programming interfaces (TorchVision, TorchText, etc.), model serving (TorchServe), and integration with Kubernetes and cloud platforms. | Research and experimentation, flexible model construction, and production deployment; provides a dynamic approach to building and training neural networks. | Deployment of models at scale, efficient deployment, scalable training (Distributed Data Parallel). | [102,103,104] |
MLFlow (Version 2.0.0) | MLflow is a popular MLOps platform with full machine learning model lifecycle management tools. It simplifies ML project management with experiment tracking, model versioning, and reproducibility. | MLflow is adaptable, addressing various machine learning scenarios in model selection, deployment, model performance monitoring, model versioning, and management. | Cross-Platform Integration: MLflow interacts with various machine learning frameworks, Scalability and Flexibility, Cloud-native. | [105,106,107,108] |
Kubeflow (Version 1.4.0) | Open-source MLOps platform Kubeflow runs scalable and portable machine learning workloads on Kubernetes. It manages model construction, training, deployment, and monitoring for ML. | Kubeflow facilitates the automation and scaling of recommendation models for e-commerce and media platforms, fraud detection models, and bioinformatics applications. | Cloud-Native and Kubernetes Integration, Scalability, Automation, and Reproducibility. | [105,107,109,110,111] |
Metaflow (Version 2.5.0) | Human-centric MLOps platform Metaflow simplifies data science project development and management. Users focus on code while it handles scaling, versioning, and monitoring, simplifying model development. | Metaflow is a user-friendly library that assists scientists and engineers in constructing and overseeing practical data science projects; Netflix runs numerous data science projects on Metaflow. | Diverse integrations for Metaflow, Parallelization and Resource Optimizations, Collaboration, and Transparency. | [112,113,114] |
IBM Watson Studio (Version 3.0.1) | IBM Watson Studio enables analysts, data scientists, and developers to collaboratively build, train, and deploy machine learning models. | IBM Watson Studio is a comprehensive platform that merges many tools and technologies to streamline the development, deployment, and maintenance of machine learning models, and it facilitates NLP and AI projects. | IBM Watson Cloud services are used to develop natural language processing solutions, and it enables the creation of advanced chat tools, as well as support for open-source tools. | [115,116] |
Cloudera (Version 7.1.4) | Cloudera manages machine learning models in production at scale, model monitoring, ETL capabilities, and governance tools. | Comprehensive control of the entire machine learning lifecycle; facilitates visibility throughout the whole machine learning lifecycle. | Cloudera provides open standards-based MLOps enabling enterprises to industrialize AI, big data processing and analytics solutions, and enterprise data management. | [117,118] |
Apache Airflow (Version 2.3.0) | Apache Airflow is a widely utilized tool for orchestrating intricate, multistage data pipelines and workflows across many sectors, especially in data engineering, machine learning, and ETL operations. | Apache Airflow has been utilized for the orchestration of ML operations and the scheduling of automated model training, as well as batch processing. | Automation of machine learning pipelines for anomaly detection challenges, data migration and integration, and task scheduling and monitoring. | [105,119,120,121,122] |
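Several of the platforms in the table above (notably MLflow and Metaflow) center on experiment tracking: every training run's parameters and metrics are recorded so results remain reproducible and comparable. To illustrate the idea only, the core mechanism can be sketched as a file-based tracker in a few lines of Python; the helper names `log_run` and `best_run` are hypothetical and do not correspond to any platform's actual API:

```python
import json
import time
import uuid
from pathlib import Path


def log_run(store: Path, params: dict, metrics: dict) -> str:
    """Record one training run as an immutable JSON file and return its id."""
    run_id = uuid.uuid4().hex[:8]
    store.mkdir(parents=True, exist_ok=True)
    record = {
        "run_id": run_id,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    }
    (store / f"{run_id}.json").write_text(json.dumps(record, indent=2))
    return run_id


def best_run(store: Path, metric: str) -> dict:
    """Return the logged run with the highest value of `metric`."""
    runs = [json.loads(p.read_text()) for p in store.glob("*.json")]
    return max(runs, key=lambda r: r["metrics"][metric])
```

Real platforms add a UI, artifact storage, and model registries on top of this split/record/query pattern, but the design choice is the same: runs are append-only records, so comparisons across experiments never depend on mutable state.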
Platforms | Key Features | Use Case | Focus | References |
---|---|---|---|---|
Hugging Face Transformers (Version 4.30.0) | It comes with a model hub, API support, and pretrained models for a number of NLP tasks. It is also easy to integrate with PyTorch and TensorFlow. | Ideal for quickly building and adapting natural language processing (NLP) models, multimodal applications, and text summarization. | Fine-tuning for NLP use cases, model customization, scalability, and performance. | [130,168,169,170] |
LangChain (Version 0.0.150) | Multiple LLMs can be used together, including OpenAI, Hugging Face, and others. Interactive agent and prompt management are supported. | Well suited for building complex LLM applications with dynamic prompt management, agent-based workflows, and dynamic content generation. | Automation of complex workflows, the creation of chatbots, and managing context for LLMs; natural language understanding, text generation, and text classification. | [171,172,173,174,175,176,177,178] |
DeepSpeed (Version 0.8.0) | ZeRO optimization, mixed precision training, gradient accumulation, and distributed training are some of the features that are included in optimized training for big models. | Ideally suited for the efficient and effective training of large-scale deep learning models with limited resources. | Suitable for the training of large-scale deep learning models, scalability for large models, simplified distributed training. | [179,180,181] |
Google Cloud Dialogflow CX (Version 2.0.1) | Visual flow builder, multi-turn conversations, enhanced natural language understanding, support for omnichannel use, context management. | Ideally suited for the development of complex conversational agents and chatbots that can handle multiple turns across multiple platforms. | Virtual assistants, customer care bots (optimize the customer experience), seamless integration with the Google Cloud platform. | [182,183,184,185] |
Azure Bot Service with OpenAI (Version 1.0.0) | Integrated bot framework, support for OpenAI models (such as GPT series), powerful artificial intelligence capabilities, bot orchestration. | Outstanding for the development, deployment, and management of intelligent conversational agents through the utilization of OpenAI’s large language models (LLMs). | Automating and scaling customer interactions, built-in Azure services for scalability, integration with a variety of communication channels. | [186,187] |
Apache Airflow (with LLMOps) (Version 2.3.0) | Orchestration of workflows, dynamic scheduling of tasks, interaction with machine learning pipelines, and support for distributed execution. | A suitable solution for the management and automation of large language model (LLM) pipelines that contain complicated dependencies, as well as for serving and deploying LLMs. | Automation of complex workflows, integration with various ML tools, scalability for large models. | [188,189] |
AWS Inferentia (Neuron SDK version 1.7.0) | Designed for deep learning models, high throughput, low latency, cost-efficiency, and multimodel support. | Designed to facilitate the deployment of LLMs and other deep learning models at reduced costs while maintaining high-performance inference efficiency and speech recognition assistance. | Providing real-time inference for LLMs, scaling at a cost-effective rate, and accelerating deep learning models; integration with AWS ecosystem. | [190,191] |
Google Cloud Vertex AI (Version 1.12.1) | Machine learning platform that covers the entire process, including AutoML, managed machine learning pipelines, individualized model training, model monitoring, and tracking. | This solution is perfect for constructing, training, deploying, and administering machine learning models and LLMs at a distributed scale, as well as NLP tasks (text classifications, chatbots, etc.). | End-to-end machine learning platforms, scalability and flexibility, model management and monitoring. | [192,193] |
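Two recurring concerns in the LLMOps tools above are prompt management (LangChain) and validating inputs before an expensive model call. The pattern can be illustrated with Python's standard library alone; this `PromptTemplate` class is a hypothetical stand-in, not the LangChain API, whose actual classes and methods differ:

```python
from string import Template


class PromptTemplate:
    """Minimal illustrative prompt template with required-variable checking."""

    def __init__(self, template: str):
        self.template = Template(template)

    def format(self, **kwargs) -> str:
        # Template.substitute raises KeyError if a variable is missing,
        # catching template/input mismatches before a costly LLM call.
        return self.template.substitute(**kwargs)


summarize = PromptTemplate(
    "Summarize the following $doc_type in $n bullet points:\n$text"
)
prompt = summarize.format(doc_type="incident report", n=3, text="...")
```

Production prompt managers add versioning and per-prompt evaluation on top, but the same fail-fast substitution step is what keeps malformed requests from ever reaching the model endpoint.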
Platforms | Key Features | Use Case | Focus | References |
---|---|---|---|---|
Jenkins (Version 2.319.3) | Jenkins is a free and open-source automation tool designed to streamline Continuous Integration and Continuous Delivery (CI/CD) processes, and it features support for development, testing, and deployment. | Automated development and testing, multienvironment deployment, customized workflows for larger enterprises, infrastructure automation. | Flexibility allows users to construct customized pipelines, scalability, and distributing tasks; integration with other DevOps tools. | [199,200,201,202] |
GitLab CI/CD (Version 15.5.0) | GitLab covers full DevOps lifecycle, including source control to deployment. Complete DevSecOps capabilities. | GitLab manages Continuous Integration (CI) operations, multicloud deployment, and automated testing. | Ideal for enterprises utilizing GitLab for source control and those requiring DevSecOps; flexibility and customization. | [203,204,205] |
CircleCI (Version 2.1.0) | CircleCI is a cloud-based Continuous Integration (CI) service that streamlines software development by automating the build process. | CircleCI can be integrated with Google Firebase Test Lab to test Android apps on multiple devices and configurations using cloud infrastructure. | Effective for small- to mid-sized projects with fast, scalable CI/CD; infrastructure optimization. | [206,207,208] |
Azure DevOps (Version 2.0.0) | Azure DevOps evolved from Microsoft Team Foundation Server into a comprehensive environment, with Azure Pipelines at its core, that Microsoft itself uses extensively for software development. | Azure DevOps comprises various components, such as Azure Boards, Repos, Pipelines, Artifacts, and Test Plans, making software creation faster and easier. | Microsoft builds the majority of its software with Azure Pipelines, which allows the company to take advantage of its stability, resilience, agility, and collaboration. | [209,210,211] |
Kubernetes (Version 1.31.5) | A Kubernetes pipeline indicates the automated processes within a CI/CD framework that utilizes Kubernetes for the deployment, scaling, and management of containerized applications. | It focuses on the development cycle, which includes writing code, compiling it, testing it, and fixing bugs, all while Kubernetes runs apps in containers. | Optimal for the orchestration, scaling, and automation of containers in distributed environments. | [212,213,214,215] |
Docker (Version 20.10.9) | Docker DevOps pipeline enhances automation and environment replication by integrating containerization into Continuous Integration (CI) workflows. | Utilizing Docker enables teams to establish uniform environments, optimize application delivery, and promote swift iterations. | Ideal for using containers to create, execute, and share isolated application environments; simplifies application deployment. | [216,217,218] |
SonarQube (Version 9.6.1) | SonarQube provides a solution that automates white-box testing and security assessments within a continuous integration (CI) pipeline. | SonarQube enables developers to effectively oversee vendor branches and guarantee that both their proprietary code and third-party components adhere to security and testing best practices. | Optimal for maintaining code quality, identifying defects, and guaranteeing adherence to coding standards. | [219,220] |
Aspect | LLMOps | MLOps | DevOps | References |
---|---|---|---|---|
Hardware Requirements | ||||
GPUs (Graphics Processing Units) | Due to the size and complexity of LLMs like GPT-4, GPUs are essential for LLMOps performance. | GPUs are commonly utilized in MLOps to accelerate deep learning model training. MLOps workflows can scale across GPUs, depending on model size and complexity. | DevOps primarily focuses on CPU-based tasks, with GPUs rarely used except when deploying ML models. | [222,223,224,225] |
TPUs (Tensor Processing Units) | TPUs can handle large parallel processing better than GPUs, making them ideal for LLMOps large models. TPUs are ideal for training LLMs at scale when speed and efficiency are crucial. | TPUs are utilized in MLOps to deploy TensorFlow-based models. They can replace GPUs in MLOps pipelines, because they are optimized for neural network matrix multiplications. | Although TPUs are not widely used in DevOps, they can still be utilized in high-performance circumstances that need quicker processing (deploying ML models with higher computational loads). | [138,226,227,228,229,230,231] |
Parallel Processing | LLMs have billions of parameters; therefore, LLMOps require considerable parallel processing (critical for large model training). | MLOps use parallel processing to speed up model training, especially in distributed environments. Parallelism varies by model size, with many smaller ML models not requiring parallel processing. | In the context of DevOps, parallel processing is typically utilized for the purpose of managing and automating a variety of tasks, such as testing, system monitoring, and parallel deployment pipelines. | [230,232,233,234,235] |
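The parallel-processing row above describes the same split/compute/combine pattern regardless of scale: partition the work, process shards concurrently, then merge partial results. A minimal sketch using only the standard library (threads stand in here for the GPUs, TPUs, or worker processes a real training framework would shard across):

```python
from concurrent.futures import ThreadPoolExecutor


def shard_work(shard):
    # Stand-in for per-shard computation, e.g., a forward/backward pass
    # over one slice of a training batch on one device.
    return sum(x * x for x in shard)


def parallel_reduce(data, n_workers=4):
    """Split `data` into shards, process the shards concurrently, and
    combine the partial results -- the split/compute/combine structure
    that data parallelism applies across GPUs or TPUs."""
    shards = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(shard_work, shards))
    return sum(partials)
```

In LLM training the "combine" step is a gradient all-reduce across devices rather than a simple sum on one machine, which is exactly where the interconnect and scaling demands discussed in the table come from.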
Aspect | LLMOps | MLOps | DevOps | References |
---|---|---|---|---|
Data Management and Handling Large Datasets | ||||
Data Volume | Handles vast datasets, frequently consisting of petabytes of textual information, to train LLMs. Highly heterogeneous, often multimodal (e.g., large-scale corpora of text, videos, etc.). | Capable of processing moderate to large datasets, adapting to the size requirements of the ML model and infrastructure. Heterogeneous data (images, text, etc.). | Manages various data volumes, focusing on operations, logs, and configuration data, with cloud storage for large-scale applications (homogeneous data). | [236,237,238] |
Data Preprocessing | Encompasses specific methodologies for processing textual data, including tokenization, contextual embeddings, and managing extensive text corpora. | Includes data cleaning, normalization, feature engineering, and transformation for diverse ML tasks (ETL tools, Pandas, Spark, etc.). | Includes system configuration and operational data preprocessing, such as log collection, normalization, and parsing for monitoring and troubleshooting. | [43,239,240,241,242,243,244] |
Data Storage and Management | Due to the immense volume of text data, the utilization of modern distributed storage systems and high-throughput solutions is necessary. | Utilizes many storage technologies, such as cloud storage, databases, and distributed file systems, to achieve scalability. | Uses cloud-based storage and databases to efficiently manage and back up operational data (e.g., AWS S3, GCS, etc.). | [145,245,246,247,248,249] |
Data Labeling and Annotation | Usually involves the process of annotating huge amounts of text data (multimodal data annotation), typically using semi-supervised techniques or pretrained models for labeling. | Supervised learning relies on many procedures, such as manual labeling, automated annotation tools, and crowdsourcing, to ensure accurate data labeling. | Labeling operational metrics and logs can enhance analysis and monitoring, although data labeling and annotation are rare. | [250,251,252,253,254,255,256,257] |
Data Versioning and Lineage | Utilizes complex versioning techniques to manage substantial amounts of training data, maintaining traceability in extensive NLP operations. | Utilizes methods such as DVC (Data Version Control) to monitor modifications in datasets and guarantee reproducibility. | Uses Git for configuration management and code version control (not directly associated with data versioning). | [250,258,259,260] |
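The data-versioning row contrasts Git (code) with tools such as DVC (data). The idea underlying DVC-style versioning is content addressing: a dataset's version id is a hash of its bytes, so identical data always maps to the same id and any modification is immediately detectable. A minimal sketch of that idea (the function name `dataset_version` is illustrative, not DVC's API):

```python
import hashlib
from pathlib import Path


def dataset_version(path: Path) -> str:
    """Content-addressed version id for a dataset file: identical content
    always hashes to the same id, so any edit changes the version."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]
```

Tools like DVC then store these hashes in small Git-tracked pointer files while the data itself lives in remote storage, which is what makes petabyte-scale datasets versionable without putting them in Git.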
Aspect | LLMOps | MLOps | DevOps | References |
---|---|---|---|---|
Cloud Platforms and Services | ||||
Primary Cloud Platforms | LLMs require specific services (training, fine-tuning, etc.) and infrastructure to accommodate the extensive dataset and computational requirements. Example: AWS offers high-performance computing (HPC) services such as P3 instances and EFS (Elastic File System). Azure offers custom inferencing solutions and integration with Azure OpenAI. | Cloud providers offer a wide range of services that are well suited for various machine learning tasks, such as training models, deploying them, and monitoring their performance. Example: Amazon Web Services (AWS) provides a wide range of MLOps technologies, such as SageMaker and EFS (Elastic File System). Google Cloud Platform (GCP) encompasses the AI Platform and BigQuery. Microsoft Azure offers Azure Machine Learning and Databricks. | When it comes to managing infrastructure, automation, and continuous integration and delivery pipelines, DevOps largely makes use of Amazon Web Services (EC2, S3), Google Cloud Platform (GCE, Kubernetes Engine), and Microsoft Azure (App Service, Azure Pipelines). | [138,261,262,263,264] |
Scalability | Requires highly scalable solutions to handle the substantial computing requirements and extensive data of LLMs. Example: highly scalable infrastructure; distributed training, the process of training machine learning models over several TPUs or GPUs, such as Azure dedicated computing clusters. | Provides adaptable scaling solutions to handle diverse workloads and model specifications. Example: MLOps solutions offer adaptable scaling capabilities to accommodate a wide range of workloads. Auto-scaling services include AWS EC2 auto-scaling and Google Kubernetes Engine (GKE). | Scalable infrastructure for application deployment, automated workflows, and continuous integration and continuous delivery pipelines. It is common practice to employ auto-scaling for cloud resources, e.g., Amazon EC2 auto-scaling and Kubernetes. | [265,266,267,268,269] |
Data Management and Storage | Essential for efficient LLM training; makes use of state-of-the-art storage technologies built for high throughput and massive text datasets. Example: tailored storage solutions, particularly for LLMs. Amazon Web Services: S3 for large datasets and FSx for Lustre. GCP offers BigQuery and Cloud Storage. | Data lakes, databases, and storage solutions that support a wide range of data formats. Amazon Web Services (AWS) offers key services such as S3 and Redshift. Google Cloud Platform (GCP) offers several powerful data storage and processing services, including BigQuery for data analysis. Azure offers several storage options, including Data Lake Storage, Cosmos DB, and Blob Storage. | Cloud storage such as Amazon Web Services S3, plus version control systems such as Git, which are essential to configuration management since they allow for the tracking of changes. DevOps tools for data management include database systems such as PostgreSQL, MongoDB, and MySQL. | [270,271,272,273,274,275,276] |
Cost Management | Intense computing demands lead to increased costs; methods for cost management include using reserved instances and specialized hardware. Example: high costs, because LLMs use a lot of resources; cost optimization via Reserved Instances and Preemptible VMs; dedicated devices like AWS Inferentia and Azure AI Accelerators are used for inference cost management. | Contains resources for monitoring and improving the efficiency of machine learning project budgets. Example: cost-effectiveness tools for monitoring and lowering costs. AWS: AWS Budgets and Cost Explorer. GCP: cost-management tools and reports. Azure: cost tracking and billing. | The management of expenses by DevOps teams is accomplished through the utilization of auto-scaling and pay-as-you-go cloud models. Tools such as AWS Cost Explorer, Microsoft Azure Cost Management, and Google Cloud's cost tools are frequently utilized for the purpose of monitoring and controlling expenses. | [277,278,279,280] |
Service Offerings | Requires utilizing powerful computing systems for training, utilizing dedicated hardware for making predictions, and providing customized fine-tuning services. Example: High-Performance Computing (HPC) is used for training large-scale LLM models, such as Google Cloud Platform's Tensor Processing Units (TPUs) and Amazon Web Services P3 instances. Model fine-tuning can be done via APIs such as Azure OpenAI and Hugging Face on AWS. | This focuses on tools and services that facilitate the automated development, deployment, and monitoring of models. Example: AutoML services refer to the automated process of developing and tuning models, such as AWS SageMaker Autopilot and GCP AutoML. Managed Kubernetes services, such as AWS EKS and GCP GKE, are used for deploying applications in a scalable manner. Model monitoring uses Azure Monitor and AWS CloudWatch to track the performance and behavior of deployed models. | Continuous Integration and Continuous Delivery (CI/CD) pipelines are the focus of a wide range of services provided by DevOps initiatives; examples include GitLab CI/CD, Jenkins, and AWS CodePipeline. In addition, monitoring tools such as Prometheus, Grafana, AWS CloudWatch, and Azure Monitor are essential components for monitoring the performance of the system. | [53,138,166,281,282,283,284,285] |
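The cost-management row notes that LLM serving costs are driven by token volume, with separate input and output pricing on most hosted APIs. A back-of-envelope estimator makes the trade-off concrete; the prices below are placeholder values for illustration, not real vendor quotes:

```python
def estimate_inference_cost(n_requests, avg_in_tokens, avg_out_tokens,
                            price_in_per_1k, price_out_per_1k):
    """Rough monthly serving-cost estimate for a hosted LLM endpoint.

    Assumes the common per-1k-token pricing scheme with distinct input
    and output rates; all prices here are hypothetical placeholders.
    """
    per_request = (avg_in_tokens / 1000 * price_in_per_1k
                   + avg_out_tokens / 1000 * price_out_per_1k)
    return round(n_requests * per_request, 2)


# 1000 requests, 500 input / 200 output tokens each, at placeholder
# rates of $0.01 / $0.03 per 1k tokens:
monthly = estimate_inference_cost(1000, 500, 200, 0.01, 0.03)
```

Because output tokens typically cost several times more than input tokens, techniques like prompt compression and capped response lengths often matter as much for cost as the choice of instance type.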
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Pahune, S.; Akhtar, Z. Transitioning from MLOps to LLMOps: Navigating the Unique Challenges of Large Language Models. Information 2025, 16, 87. https://doi.org/10.3390/info16020087