I'd like bigger GPUs. A trillion-parameter model at 16 bits needs 2,000 GB+ just for inference, and more for training. There are all kinds of ways to spread a model across multiple GPUs, quantize it to fewer bits, and so on, but it's a lot easier to just shove the whole model onto one GPU.
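That 2,000 GB figure is just parameters times bytes per parameter. A quick sketch of the arithmetic (weights only, ignoring KV cache and activations, which only make it worse):

```python
# Rough weight-memory estimate for dense-model inference.
# Assumes weights only; KV cache and activations add more on top.
def weight_memory_gb(params: float, bits_per_param: int) -> float:
    return params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

print(weight_memory_gb(1e12, 16))  # 1T params at 16 bits -> 2000.0 GB
print(weight_memory_gb(1e12, 4))   # quantized to 4 bits  -> 500.0 GB
```

Even aggressive 4-bit quantization leaves a 1T-parameter model far beyond any single card shipping today.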
We'll likely see more efficiency from bigger GPUs and hopefully more availability as a result.
My question on the very slow growth of available memory: are there technical reasons they cannot trivially build a card with 100GB of RAM (even with lower performance) or has it been a business decision to milk the market for every penny?
High-speed I/O pins cost a lot, and GDDR generally has 32 data pins per chip with no way to attach multiple chips to the same pins. With 2 GB chips, a 256-bit bus means 8 chips and 16 GB, which is hard to exceed by much on that tech. The high end is 384 bits and 24 GB.
There is also a clamshell mode that attaches only 16 data pins to each GDDR chip, so with some extra effort you could probably double that to 48 GB, or at least 32 GB. Maybe that's a valid niche, or maybe there isn't enough demand.
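The bus-width arithmetic above can be sketched out directly. This assumes 2 GB (16 Gb) GDDR chips, which is a typical density, not something stated in the thread:

```python
# Capacity implied by bus width, assuming 2 GB (16 Gb) GDDR chips.
# pins_per_chip = 32 is the normal mode; 16 is clamshell mode.
def gddr_capacity_gb(bus_width_bits: int, pins_per_chip: int, chip_gb: int = 2) -> int:
    chips = bus_width_bits // pins_per_chip
    return chips * chip_gb

print(gddr_capacity_gb(256, 32))  # 8 chips  -> 16 GB
print(gddr_capacity_gb(384, 32))  # 12 chips -> 24 GB
print(gddr_capacity_gb(384, 16))  # clamshell: 24 chips -> 48 GB
```

So even with clamshell mode maxed out, a GDDR card tops out around 48 GB without denser chips or a wider (more expensive) bus.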
The alternative is HBM, which can stack dies for much larger capacities, but it's a lot more expensive.
I don't disagree with Dylan, but I'm more than willing to bet that the only reason Nvidia's cards (and that's who we're talking about; CUDA is a hell of a moat) are RAM-starved is that they haven't felt the pressure to do otherwise. AMD has an institutional aversion to good software. Intel isn't even an also-ran yet.
Apple and their unified memory architecture may be the prod needed to get larger amounts of RAM onto single-card solutions. We'll see.