I think those are going to be run until they die. The capex is too high relative to opex to retire them after a few years. They'll keep serving current-gen LLMs for as long as they keep running.
It won't make sense to run them after two years. The vendors will be limited on datacenter space, power and cooling, and there will be new hardware available that will run the same models at a fraction of the power.
A100 -> H100 was >3x tokens per joule, H100 -> B200 >10x. There is still significant low-hanging fruit available in architectural efficiency, and the vendors are chasing it.
This is the big risk for AI companies that I feel is not being sufficiently priced in. Almost none of the investments they are making are durable; the depreciation schedules for everything but the real estate should be less than 24 months. Until the hardware is stable enough that you only get double-digit % improvements per generation, it should almost be counted as opex.
The annual operating cost of these is <10% of the purchase price. Even if the B200 is 10x more efficient in practice, you can still operate the H100 at a profit.
As it stands there's way more demand than supply. The new GPUs are going to run frontier models while the older ones serve smaller ones.
That said, some of these are running in tents hooked up to mobile turbines. I can see some of those going away, but generally I think you'll see them used until they start to fail in 5-10 years.
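To put rough numbers on the operating-cost argument above, here is a minimal sketch. Everything in it is an illustrative assumption rather than real pricing: both cards are normalized to the same purchase price of 1.0, opex is set at the 10% upper bound, "10x more efficient" is treated as 10x output per card at the same opex, and the 2-year amortization for the new card is hypothetical. The function and variable names are made up for this example.

```python
def cost_per_unit_output(annual_opex, annual_capex, relative_throughput):
    """Annual cost divided by annual output (output in arbitrary units)."""
    return (annual_opex + annual_capex) / relative_throughput


# All figures are illustrative assumptions, normalized to a purchase price of 1.0.
PRICE = 1.0
OPEX_FRACTION = 0.10        # upper bound from the comment: opex < 10% of price per year
NEW_SPEEDUP = 10.0          # "10x more efficient", treated here as 10x output per card
NEW_AMORTIZATION_YEARS = 2  # hypothetical depreciation schedule for the new card

# Old card: the purchase price is already sunk, so only opex counts going forward.
old_marginal = cost_per_unit_output(OPEX_FRACTION * PRICE, 0.0, 1.0)

# New card: has to pay back its purchase price as well as its opex.
new_all_in = cost_per_unit_output(
    OPEX_FRACTION * PRICE, PRICE / NEW_AMORTIZATION_YEARS, NEW_SPEEDUP
)


def old_card_profitable(market_price_per_unit):
    """The old card is worth running while the price exceeds its marginal cost."""
    return market_price_per_unit > old_marginal
```

Under these assumptions the new card is cheaper per unit of output, but the old card still clears its own marginal cost whenever the going price per unit is above roughly 0.10. Whether it keeps running then depends on whether demand keeps the price above that floor, and on whether power and datacenter space are the binding constraint.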
They can also be used for things other than running the main frontier model.
E.g. grok isn't truly multi-modal, it has a callable tool that is a separate VLM it invokes on image URLs or files (for a long time it was grok-1.5v, but I think they have upgraded now, it was pretty bad).
And then you have the small summarizer models for the CoT/thought traces, the guidable summarizer models for the standard browse tools, etc.
That's not the issue. The issue is how tall these grills are and to a lesser extent curb weight. You don't need either to have third row seating or the ability to haul things.
I think every business wants to bill on value not usage. That's where the real money is made. If a diagnosis is worth $100 and takes $1 worth of tokens you want to bill as close to $100 as you can. Right now they're billing $1 and barely making money.
Healthcare market is completely distorted. Price isn't linked to value because the person that uses a service is usually not the person directly paying for it. Worse, the price usually isn't known upfront, so no one is making a rational decision based on "value".
I think dead on arrival is too extreme but the niche is certainly hard to see. The "just works" crowd will buy consoles and the "max performance" crowd won't be happy with this value. The niche is something like "willing to tolerate some headaches but not so much as to build my own PC". That exists but seems small.
Feels like they should have gone cheap. Undercut the Switch and be the cheapest way to play games on your TV. We're pretty far past performance equalling more entertainment. A 150-200 box to play indie side scrollers is a niche that exists.
I don't buy consoles because I have been a PC gamer for over a decade. But you know, I'm a parent now. And I want couch gaming with my family. That's the use case for me and my family. I got my child a Steam Deck (I have one too). A Steam Deck is a terrible idea for an elementary schooler. It has to stay docked now because she's already broken parts of it.
But even docked, it's a winner for her and all her friends. She's converting more parents over via her friends. The well-off ones are just buying from Valve like me. The less well-off are using whatever PC is around, with mixed results. I'll see how it goes as the kids get older, but I think there's a bigger case than you think, and I think it's mostly years-long PC gamers who want a more communal experience, be it with partners, kids, or friends.
Eh this is true but in practice "my RAM stick went bad" is 0% of our downtime. It's almost always that we screwed up something in the app layer. And my company of ~100 does have 5 engineers on call 24/7. It's baked into our salary to do that.
There's a huge middle ground between Manhattan and homesteading on a $10k property. Here in Atlanta you can buy a 4/3 house for $270k or a 2/1 house for $160k. The rust belt and sunbelt are full of places with cheap housing.
Perhaps it's Sea Island which is a luxury enclave with homes in the millions. They seem to define cities as incorporated & unincorporated cities, towns, communities, etc.