
Untangling GPU Power Consumption: Job-Level Inference in Cloud Shared Settings

  • École de technologie supérieure
  • Universite Claude Bernard Lyon 1
  • University of Rennes 1

Research output: Chapter in book, report, or conference proceedings › Contribution to a conference-related collective work › Peer-reviewed

Abstract

As the demand for AI-driven workloads increases, the energy consumption of Graphics Processing Unit (GPU) devices has come under intense scrutiny, particularly in hyperscale data centers where large numbers of accelerators are centralized and leased to diverse clients. In the context of cloud hyperscalers, GPU power monitoring presents several challenges that vary depending on the product offered. The monitoring capabilities of physical devices may be limited or even absent for some products. However, given the substantial energy demands of GPUs, power monitoring is essential for both cloud providers and clients. Operators require tools to manage power distribution effectively, such as balancing workloads across Power Distribution Units (PDUs), while clients need visibility into power usage to optimize their workloads for energy efficiency. To address these challenges, we propose methods for estimating the energy consumption of jobs running on GPU devices in cloud environments, spanning from shared and managed offerings like ML-as-a-Service (MLaaS) to less managed products such as Infrastructure-as-a-Service (IaaS). Our models demonstrate the benefits of sharing GPUs for small AI workloads, as well as the current sub-optimal utilization of GPUs in cloud hyperscalers, based on insights from an IaaS GPU cluster.

Original language: English
Title of host publication: EUROSYS 2026 - Proceedings of the 2026 European Conference on Computer Systems
Publisher: Association for Computing Machinery, Inc
Pages: 624-640
Number of pages: 17
ISBN (Electronic): 9798400722127
State: Published - 26 Apr 2026
Event: 2026 European Conference on Computer Systems, EUROSYS 2026 - Edinburgh, United Kingdom
Duration: 27 Apr 2026 to 30 Apr 2026

Publication series

Name: EUROSYS 2026 - Proceedings of the 2026 European Conference on Computer Systems

Conference

Conference: 2026 European Conference on Computer Systems, EUROSYS 2026
Country/Territory: United Kingdom
City: Edinburgh
Period: 27/04/26 to 30/04/26
