How Will AI Pricing Change As Inference Costs Drop to Almost Zero?

This article discusses key insights from a recent YouTube video by Ajit Ghuman on Monetizely's channel, titled "AI Like Running Water: Mary Meeker AI Trends Report." In the video, Ghuman analyzes Mary Meeker's return to publishing her iconic trends report, this time with a focus on AI, and examines the implications for pricing and monetization strategies as inference costs plummet.

Mary Meeker's Internet Trends Report has been a Silicon Valley institution since 1995. After Meeker paused publication in 2019, with the hiatus extended through the pandemic, the report has returned with a significant focus on AI trends. Ghuman examines it specifically through the lens of pricing and monetization strategy.

As he notes in the video: "Given there are so many good concise summaries of what is happening in the world of pricing and monetization, I thought we should definitely look at this report from the standpoint of pricing and monetization."

The Dramatic Fall of AI Inference Costs

One of the most significant trends identified is the rapid decline in AI inference costs per token. This is crucial because inference costs directly impact the cost of goods sold (COGS) for AI products, which fundamentally shapes monetization strategies.

Ghuman highlights that "Nvidia's 2024 Blackwell GPU uses 105,000 times less energy to generate tokens than its 2014 Kepler predecessor. AI inference costs are 99.7% lower over two years per Stanford AI."

This decline is happening at an unprecedented rate. According to the Mary Meeker report, AI costs are falling faster than historical price declines for other technologies:

"This chart is funny because it talks about what I'm saying that, hey, AI could be as cheap as electricity or water. And it's saying that the price decline, the relative cost is faster than electricity. It is faster than even computer memory."
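To make the cited figure concrete, the sketch below converts the "99.7% lower over two years" decline into an equivalent compounded annual rate and projects it forward. The starting price of $10 per million tokens and the three-year horizon are illustrative assumptions, not figures from the report.

```python
# Sketch: implied annual rate of decline from "inference costs are
# 99.7% lower over two years" (Stanford AI Index figure cited in the
# video). The $10/1M-token starting price is hypothetical.

def annual_decline(total_decline: float, years: int) -> float:
    """Convert a total fractional decline over `years` into an
    equivalent compounded annual decline."""
    remaining = 1.0 - total_decline           # fraction of cost left
    annual_remaining = remaining ** (1 / years)
    return 1.0 - annual_remaining

rate = annual_decline(0.997, 2)
print(f"Implied annual decline: {rate:.1%}")  # ~94.5% per year

# Project a hypothetical $10 per 1M tokens forward at that rate.
cost = 10.0
for year in range(1, 4):
    cost *= (1 - rate)
    print(f"Year {year}: ${cost:.4f} per 1M tokens")
```

A 99.7% drop over two years compounds to roughly a 94.5% drop per year, which is why pricing set in 2023 can be so badly misaligned with 2025 costs.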

Implications for AI Companies and Their Pricing Models

This rapid decline in costs creates both opportunities and challenges for AI companies, particularly those that established their pricing models in 2023 when costs were significantly higher.

Ghuman observes: "Because there has been such a drastic reduction in prices, a lot of companies that started in the year 2023, one of the ones I just recently made a video about was 11 Labs. They have not actually changed their pricing from 2023."

This creates a precarious situation for companies that don't adapt their pricing strategies. As Ghuman warns: "The companies that were disruptive in 2023, are going to get disrupted today in 2025, right? There is no other way."

The Convergence of Open Source and Closed Source Models

Another key trend is the performance convergence between open source (free) and closed source (paid) AI models. This convergence, combined with falling inference costs, means that basic AI capabilities are becoming commoditized.

"If open source models, which are essentially free, are converging with closed source models and inference costs are reducing for AI, what does that mean? That the only way for Google or OpenAI to build product is vertical integration," Ghuman explains.

He further emphasizes: "AI will not be the key differentiator. AI will just be par for the course."

OpenAI's Financial Strategy and User Growth

The report examines OpenAI's financial position, showing that by the end of 2024 the company had expenses of $5 billion against revenue of $3.5 billion. However, Ghuman contextualizes this apparent loss:

"This is not that extreme as it sounds because OpenAI has onboarded users faster than Google onboarded users… they are serving a heck of a lot more users for that 3.5 billion and they haven't even begun to monetize their things well because they are penetrating the market at the moment."

The user growth numbers are indeed impressive. According to the data presented in the video, ChatGPT has grown to "close to 800 million weekly active users now. That's insane, right?"

This massive user acquisition makes strategic sense: "How expensive is $5 billion to become the backbone of AI? Not that expensive."

The Future of AI Pricing Models

As inference costs approach zero, traditional usage-based pricing models for application-layer AI products will need to evolve. Ghuman predicts a significant shift:

"When the inference cost goes to zero, you don't necessarily have to continue pricing an application layer AI product based on usage based pricing. It is the foundation layer AI product that will continue to be metered consumption based like PG&E becomes like running water. On top of that, you now start to have subscription offerings."

This shift is already happening with major players: "Google's recently come out with VO3. They've recently revamped their AI subscription offerings. OpenAI is revamping subscription offerings. Soon, for consumers, you're going to have subscription offerings, you're not going to have metered offerings because the cost will continue to go to zero."
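The economics behind this shift can be sketched in a few lines: once the marginal inference cost per request approaches zero, a flat subscription fee decouples revenue from usage and captures nearly all of the fee as margin. Every price, volume, and cost figure below is hypothetical, chosen only to show the direction of the effect.

```python
# Sketch of the pricing shift described above: as marginal inference
# cost per request goes to zero, subscription margin converges to
# fee * subscribers regardless of usage — the "running water" model.
# All numbers are hypothetical.

def metered_margin(requests: int, price_per_req: float,
                   cost_per_req: float) -> float:
    """Margin under usage-based pricing: scales with volume."""
    return requests * (price_per_req - cost_per_req)

def subscription_margin(subscribers: int, monthly_fee: float,
                        requests_per_sub: int,
                        cost_per_req: float) -> float:
    """Margin under a flat fee: usage only matters via serving cost."""
    return subscribers * (monthly_fee - requests_per_sub * cost_per_req)

# With meaningful inference cost, metered pricing must track usage.
print(metered_margin(1_000_000, 0.01, 0.008))   # thin per-request margin

# As cost per request falls toward zero, the flat fee dominates.
for cost in (0.008, 0.001, 0.0):
    m = subscription_margin(10_000, 20.0, 2_000, cost)
    print(f"cost/request ${cost}: subscription margin ${m:,.0f}")
```

The asymmetry is the point: the foundation layer still meters consumption (like a utility), while the application layer, facing near-zero marginal cost, is pushed toward flat subscriptions.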

The Coming Deflation in AI Services

The inevitable result of these trends is significant price deflation for AI-powered services. Companies building purely generative products will need to integrate with existing software stacks or risk becoming obsolete as AI capabilities become essentially free.

Ghuman provides a dramatic example of potential price compression: "It could be that a product marketer that costs $250,000 today costs you $2,000 next year. It could be that dramatic."

Moving Toward New Pricing Models

As AI becomes more capable across multiple domains, traditional usage-based pricing becomes impractical. When "one service does 14 use cases for you, you're not going to be able to pay for 14 use cases. You will have to pay in some sort of subscription based pricing."

The industry will need to develop new pricing models that reflect the value delivered when "AI does all of the work." This might involve outcome-based pricing approaches that consider what percentage of a human worker's job an AI can perform.
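One hypothetical way to formalize outcome-based pricing along these lines is to anchor the price to the cost of the human work displaced, scaled by the share of the role the AI can actually cover. The function name, the 80% coverage figure, and the 10% value share below are all illustrative assumptions, not figures from the report or the video.

```python
# Hypothetical sketch of outcome-based pricing: charge a share of the
# value created, where "value" is the fully loaded cost of the human
# work the AI displaces. All parameters are illustrative.

def outcome_price(human_annual_cost: float,
                  job_coverage: float,   # fraction of the role AI performs
                  value_share: float) -> float:
    """Price as a share of the human cost displaced by the AI."""
    return human_annual_cost * job_coverage * value_share

# e.g. a $250k role, 80% automatable, priced at 10% of value displaced
print(f"${outcome_price(250_000, 0.8, 0.10):,.0f} per year")  # $20,000
```

Even at a modest value share, this kind of model reprices work at a small fraction of the human equivalent, consistent with the deflation Ghuman describes.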

Conclusion: AI Like Running Water

The insights from Mary Meeker's AI Trends Report, as analyzed by Ajit Ghuman, point to a future where AI capabilities become as ubiquitous and inexpensive as utilities like electricity or water. This fundamental shift will require AI companies to rethink their monetization strategies, moving away from usage-based pricing at the application layer toward subscription models and potentially outcome-based pricing.

As Ghuman concludes: "We live in a brave new world" where the economics of AI are being rapidly transformed by plummeting inference costs and the convergence of open and closed source models. Companies that don't adapt their pricing strategies to this new reality risk being disrupted by competitors who embrace the deflationary nature of AI technologies.