☁️ When we pay for AI cloud compute, what are we really paying for? 💲

Community Article · Published October 28, 2025
We often talk about the financial, energy, and environmental costs of AI interchangeably, or at least in the same breath, but how do they actually relate to each other? To start answering this question, we ran an analysis of the hourly cost of GPU instances across different cloud providers, and of how this cost compares to other characteristics such as energy, memory, and GPU purchase price. We find a strong correlation between the energy requirements, purchase costs, and rental prices of cloud instances for most commercial GPUs, and follow up with a discussion of the market dynamics at large and their importance in the context of AI's rapid growth.

We also provide an Interactive Space with the project's data and visualizations!

Introduction

Access to cloud compute, especially to GPUs, has become a core part of training and deploying AI models. While a handful of the biggest tech companies pour billions of dollars into building out compute clusters with tens of thousands of GPUs, most other companies, research labs, and individual researchers rely on renting compute instances from providers such as Amazon Web Services and Google Cloud Platform to train and deploy their models.

Renting a compute instance with a dedicated GPU can cost anywhere from 50 cents to hundreds of dollars per hour, depending on the kind of instance you’re renting. But what are customers really paying for, and what does it depend on? Is it the up-front cost of the GPUs themselves (the “CapEx”, or capital expenditure), the operating expenses (“OpEx”) incurred by the provider, or something else entirely? And is optimizing to reduce the financial cost of these systems synonymous with managing their energy and environmental impacts? The answers to these questions can shed light on the long- and short-term trends taking place in the field of AI, and on the relationships between upfront costs and operational costs for different compute providers.

In our analysis, we examine 90 AI-centric compute instances across 5 compute providers, look at their cost per hour as a function of different characteristics of the instances, and try to disentangle the relationship between energy and cost in the context of AI compute. The initial guiding question for our analysis is the extent to which energy (represented by TDP) is correlated with the cost of compute for users (represented by the dollar cost per hour).

Screenshot of the data visualization space for this analysis.

Methodology

To carry out our analysis, we gathered data from 5 major cloud compute providers – Microsoft Azure, Amazon Web Services, Google Cloud Platform, Scaleway Cloud, and OVH Cloud – about the price and nature of their AI-specific compute offerings (i.e. all instances that have GPUs). For each instance, we looked at its characteristics: the type and number of GPUs and CPUs it contains, as well as its quantity of memory and its storage capacity.

For each CPU and GPU model, we looked up its TDP (Thermal Design Power), i.e. its power consumption under the maximum theoretical load, which is an indicator of the operating expenses required to power it. For GPUs specifically, we also looked at the Manufacturer's Suggested Retail Price (MSRP)[1], i.e. how much that particular GPU model cost at the time of its launch, as an indicator of the capital expenditure required for the compute provider to buy the GPUs in the first place.
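To make the data format concrete, here is a minimal sketch of the kind of per-instance record this methodology produces. The field names and example values below are illustrative assumptions rather than the exact schema of our dataset; the actual data is available in the Space.

```python
# Illustrative sketch of a per-instance record; field names and the example
# values are assumptions for illustration, not the actual dataset schema.
from dataclasses import dataclass

@dataclass
class GPUInstance:
    provider: str              # e.g. "AWS", "Azure", "GCP", "Scaleway", "OVH"
    instance_name: str
    gpu_model: str
    gpu_count: int
    gpu_tdp_watts: int         # TDP of a single GPU, from manufacturer specs
    gpu_msrp_usd: float        # launch MSRP of a single GPU (third-party sources)
    cpu_tdp_watts: int
    memory_gb: int
    storage_gb: int
    price_per_hour_usd: float  # on-demand hourly rental price

# Example record (values are approximate and for illustration only)
example = GPUInstance(
    provider="AWS",
    instance_name="p3.2xlarge",
    gpu_model="V100",
    gpu_count=1,
    gpu_tdp_watts=300,
    gpu_msrp_usd=10_000,
    cpu_tdp_watts=150,
    memory_gb=61,
    storage_gb=0,
    price_per_hour_usd=3.06,
)
```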

Results

We look at the data gathered from two different angles:

  1. To what extent the up-front cost of the GPUs (the “CapEx”) is transferred to downstream users, and
  2. To what extent the operating expenses of powering the compute instances (the “OpEx”) are transferred to downstream users.

CapEx and Compute Cost

Plotting the up-front cost of GPUs against their dollar cost per hour, we can see a strong correlation between the two: as the initial cost of the GPU rises, so does its cost per hour (note that both variables are on a log scale). This makes sense, given that companies have to recoup their initial investment (the CapEx) through the money they charge customers.

Plot: GPU purchase price (MSRP, USD) vs. instance cost per hour (USD), log-log scale, colored by GPU model.

We find a correlation coefficient of 0.9 between the two factors. This could be explained, for example, by companies being primarily motivated to recoup their purchase cost rapidly, given the high price of GPUs and the speed at which GPU models are updated. Looking at different GPU models (indicated in different colors), we can see that hourly price rises proportionally with the number of GPUs: an instance with 4 V100 GPUs will cost roughly twice as much per hour as one with 2 GPUs, which follows the interpretation we proposed above.
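For readers who want to compute this kind of correlation on their own data, the figure is simply a Pearson coefficient over log-transformed values, matching the log-log scale of the plot. Below is a minimal sketch, assuming the instances sit in a pandas DataFrame; the column names and toy values are illustrative, not our actual dataset.

```python
# Minimal sketch: Pearson correlation on log-transformed columns, matching
# the log-log scale used in the plots. Column names are assumptions.
import numpy as np
import pandas as pd

def log_log_correlation(df: pd.DataFrame, x_col: str, y_col: str) -> float:
    """Pearson correlation between two columns after a log10 transform."""
    x = np.log10(df[x_col])
    y = np.log10(df[y_col])
    return float(np.corrcoef(x, y)[0, 1])

# Toy usage with made-up values (not our dataset)
df = pd.DataFrame({
    "gpu_msrp_usd": [10_000, 20_000, 40_000, 80_000],
    "price_per_hour_usd": [2.5, 4.8, 10.2, 21.0],
})
print(log_log_correlation(df, "gpu_msrp_usd", "price_per_hour_usd"))
```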

Energy and Compute Cost

The second question we asked ourselves was whether the operating expenses (i.e. the energy required) for GPU cloud compute instances were equally correlated with the hourly cost charged by compute providers. As can be seen in the plot below, there is also a positive correlation between the two factors across different compute providers, with both axes on a logarithmic scale:

Plot: instance TDP (W) vs. instance cost per hour (USD), log-log scale, across compute providers.

The correlation coefficient for these two factors is lower (0.68), meaning that energy consumption is less strongly correlated with hourly cost than the initial investment we looked at previously. We posit that this is due to the relatively cheap price of the energy required to operate these instances, compared to the up-front cost of the GPUs, on top of the initial investment made by compute providers. Energy is more of an indicator of long-term profitability, as the operating cost is constant (or even, in some cases, rising) over time, as more and more data centers are built in a small number of regions, putting strain on local energy grids.

We have created an interactive Space that lets you explore our results (including the plots shown above) and generate your own plots to find new trends!


Discussion

The increasingly prohibitive cost of AI compute (with the latest generation of GPUs costing tens of thousands of dollars) creates a skewed market dynamic. It resembles the dynamics of the housing market: as property costs rise, fewer and fewer individuals can afford to buy property, and most are forced to rent, paying monthly rent to landlords without building up any long-term investment of their own or the profit it can bring. We see a similar dynamic currently playing out in the AI compute space, with only a handful of companies able to afford the up-front investment in GPUs and the vast majority forced to rent instances by the hour. In fact, a recent article by SemiAnalysis on GPU cloud economics reached a similar conclusion, stating that “capital is the only real barrier to entry, not physical infrastructure”.

Homing in on a few specific GPUs to illustrate this point: the NVIDIA Tesla V100 typically costs around $10,000, while its average rental cost is between $2 and $3 per hour, which means that, assuming 100% utilization, it would take between 4 and 7 months to recoup the up-front investment. For an older GPU like the P100 (which was launched in 2016), it would take less than 4 months (assuming current compute costs); for the most recent generation of GPUs such as the B200, whose retail price can exceed $500,000 per node of 8 GPUs, it would take over 6 months to amortize despite an hourly cost exceeding $100. Multiplied by the thousands of GPUs that are bought and rented out by cloud providers (and discounted by the fact that they are not used all of the time), this means that companies are under constant pressure to recover their costs as quickly as possible, since new generations of GPUs are launched regularly.
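These break-even figures come down to dividing the purchase price by the hourly rental revenue. Here is a minimal sketch of that arithmetic, assuming 100% utilization and ignoring energy, staffing, and other operating costs; the prices are the approximate figures cited above.

```python
# Back-of-the-envelope amortization, assuming 100% utilization and ignoring
# energy and other operating costs. Prices are approximate figures from the text.
HOURS_PER_MONTH = 730  # average number of hours in a month

def months_to_recoup(purchase_price_usd: float, hourly_rate_usd: float) -> float:
    """Months of continuous rental needed to cover the up-front purchase price."""
    return purchase_price_usd / (hourly_rate_usd * HOURS_PER_MONTH)

print(months_to_recoup(10_000, 2.0))     # V100 at $2/hour: ~6.8 months
print(months_to_recoup(10_000, 3.0))     # V100 at $3/hour: ~4.6 months
print(months_to_recoup(500_000, 100.0))  # B200 node (8 GPUs) at $100/hour: ~6.8 months
```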

In terms of the cost of energy, a compute instance with a single V100 GPU (and an accompanying CPU) would require around 2,825 kWh of energy per year. In a region such as Virginia, which is home to 35% of all hyperscale data centers worldwide, based on the latest cost of energy, that would add up to $423 per year; for an instance with 8 V100 GPUs, it would cost $3,660. Large hyperscale data centers accommodate tens of thousands of servers, meaning that their energy bills add up to millions of dollars per year. However, we would need more transparency to better understand how general data center operating costs (including salaries, space, maintenance, permits, etc.) and upfront cost amortization relate to energy costs, as well as how much of a margin exists on compute rental if utilization is well optimized.
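The energy estimate follows from multiplying an average power draw by hours of operation and an electricity rate. The sketch below assumes continuous operation and a flat rate of $0.15/kWh, a placeholder chosen so that the output roughly matches the figures above; actual data center electricity rates vary.

```python
# Rough annual electricity cost for hardware running continuously at a fixed
# average power draw. The $0.15/kWh rate is a placeholder assumption.
HOURS_PER_YEAR = 8760

def annual_energy_cost(avg_power_watts: float, usd_per_kwh: float = 0.15) -> float:
    """Yearly electricity cost in USD for a constant average power draw."""
    kwh_per_year = avg_power_watts / 1000 * HOURS_PER_YEAR
    return kwh_per_year * usd_per_kwh

# Single V100 instance, ~322 W average (GPU plus a share of the CPU; a
# placeholder value that yields ~2,825 kWh/year): roughly $420 per year
print(annual_energy_cost(322))

# Instance with 8 V100s plus CPUs (~2,800 W placeholder): a few thousand $/year
print(annual_energy_cost(8 * 300 + 400))
```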

Finally, neither TDP nor MSRP is a perfect indicator of operating and capital costs, given the additional overhead in both infrastructure and energy expenditure that is added on top during data center operation. For instance, a recent analysis by Epoch AI of the total training cost of AI models estimated that energy was a marginal part of the total cost of AI training and experimentation (less than 6% for all 4 frontier AI models analyzed), and a recent analysis by Dwarkesh Patel and Romeo Dean estimated that power generation represents roughly 7% of a data center's cost. This helps contextualize the numbers we have obtained within a larger expenditure dynamic that also includes human and overhead costs, as well as the costs associated with building out and maintaining data centers.

Conclusion

In our analysis, we take a deep dive into the dynamics of capital expenditures and operating expenses involved in buying and running the AI-specific compute instances offered by 5 different cloud providers. We find that both types of costs contribute strongly to the dollar cost per hour paid by consumers, with the up-front cost of the GPUs themselves (the MSRP) somewhat more strongly correlated with the hourly dollar cost of instances than operating expenses (measured via their theoretical maximum power draw, or TDP).

Taken together with the broader context of the growing cost of training and deploying AI systems and their rising energy consumption, these findings help shed light on the market pressures at play in both the short and the long term. Given the cost of datacenter-grade GPUs and the emphasis put on them, it is also increasingly difficult for individuals to own their own compute resources. However, this dynamic could still be shifted in numerous ways: from developing consumer GPUs that are significantly more accessible and allow individuals to deploy their own models, to putting an emphasis on smaller, task-specific models, to continuing to favor open-source models instead of proprietary, black-box models and APIs.

Acknowledgments

We want to thank Boris Gamazaychikov for his valuable feedback and suggestions.

[1] MSRP was often gathered from third-party sources and not from NVIDIA directly – for a full list of sources, see the data presented in our Space.
