Energy-Efficient Cloud Infrastructure Design For Large Language Model Training And Inference

Authors

  • Gopi Kathiresan

DOI:

https://doi.org/10.64252/qwxkxm32

Keywords:

Cloud, Inference, Energy, LLM, Infrastructure

Abstract

The rapid development of Large Language Models (LLMs) has placed an unprecedented burden on cloud computing in terms of energy requirements, operating costs, and environmental impact. This paper presents an overall architectural design that targets energy efficiency at every level of LLM training and inference workloads. Taking a multi-pronged approach, the design leverages energy-efficient accelerators (e.g., NVIDIA H100 GPUs, TPUs), new cooling solutions (liquid and direct-to-chip), and software-level optimizations such as quantization, pruning, and knowledge distillation.
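To make one of the software-level techniques concrete, the sketch below illustrates absmax (absolute-maximum) post-training int8 weight quantization, a common form of the quantization the abstract refers to. It is a minimal NumPy illustration under assumed conditions, not the paper's actual pipeline; the helper names (quantize_int8, dequantize) and the example weight matrix are hypothetical.

import numpy as np

def quantize_int8(weights: np.ndarray):
    # Absmax post-training quantization: map float32 weights to int8.
    # Illustrative sketch only; not the pipeline described in the paper.
    scale = np.abs(weights).max() / 127.0  # one scale factor per tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximate float32 tensor for computation.
    return q.astype(np.float32) * scale

# Hypothetical example: a small random "weight matrix".
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32 (1 byte vs. 4 bytes per weight),
# which cuts memory traffic, a major contributor to inference energy use.
print(f"float32 size: {w.nbytes / 1e6:.1f} MB, int8 size: {q.nbytes / 1e6:.1f} MB")
print(f"max reconstruction error: {np.abs(w - dequantize(q, scale)).max():.4f}")

The energy argument behind this technique is that moving bytes between memory and compute units dominates inference power draw, so shrinking weights by 4x reduces both memory footprint and the energy per token served.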

Published

2025-06-22

Section

Articles

How to Cite

Kathiresan, G. (2025). Energy-Efficient Cloud Infrastructure Design for Large Language Model Training and Inference. International Journal of Environmental Sciences, 1126-1133. https://doi.org/10.64252/qwxkxm32