A first‑principles teardown of Google’s Hypercomputer—chips, power, networking, memory, and models—and what actually has to be deleted, rebuilt, and scavenged to make the 6‑month doubling curve real.
It’s super exciting that we’re at this point. And to think we might see an actual doubling within 6 months is crazy. So where will we be in just a few years?
The SPAD concept is the most interesting part of this breakdown. Separating prefill and decode workloads at the hardware level makes so much sense when you think about how different the computational profiles are. Most companies treat inference as one monolithic problem, but Google splitting readers and writers into distinct physical architectures could be a huge efficiency win.
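To make the "different computational profiles" point concrete, here is a minimal toy sketch of disaggregated prefill/decode serving. It is only an illustration of the general idea under my own assumptions, not Google's SPAD design: the class names (`PrefillWorker`, `DecodeWorker`, `KVCache`) and the single-worker routing are hypothetical stand-ins.

```python
# Toy sketch of disaggregated prefill/decode serving. Prefill is compute-bound
# (one big pass over the whole prompt); decode is memory-bandwidth-bound (one
# token at a time against a growing KV cache), so routing them to separate
# worker pools lets each pool be sized for its own bottleneck.
from dataclasses import dataclass, field

@dataclass
class KVCache:
    # Stand-in for the per-request key/value tensors produced by prefill.
    tokens: list = field(default_factory=list)

class PrefillWorker:
    """Compute-heavy pass: ingest the full prompt once, emit a KV cache."""
    def run(self, prompt: str) -> KVCache:
        return KVCache(tokens=prompt.split())

class DecodeWorker:
    """Bandwidth-heavy pass: repeatedly read the KV cache to emit tokens."""
    def run(self, kv: KVCache, max_new_tokens: int) -> list:
        out = []
        for i in range(max_new_tokens):
            # Each step touches the whole cache (the memory-bound part).
            _ = len(kv.tokens) + len(out)
            out.append(f"<tok{i}>")
        return out

def serve(prompt: str, prefill_pool: list, decode_pool: list) -> list:
    # In a real disaggregated system the KV cache would be shipped between
    # physically separate machines; here it is just handed over in-process.
    kv = prefill_pool[0].run(prompt)
    return decode_pool[0].run(kv, max_new_tokens=4)

if __name__ == "__main__":
    print(serve("explain the hypercomputer", [PrefillWorker()], [DecodeWorker()]))
```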
I'm just excited to see it happen
same, i’m all for it 🔥
“Build Petabyte Shelves—CXL/photonic memory fabrics—that take KV and context off local HBM and turn them into pooled assets.”
So hyped haha. Loved the practical approach here. No sci-fi, just how they’re really planning to do it!
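For anyone wondering what "pooled assets" could mean in software terms, here is a hypothetical two-tier cache sketch: hot KV blocks stay in scarce local HBM and cold ones spill to a big shared tier standing in for a CXL/photonic fabric. The name `PooledKVStore` and the LRU spill policy are my own illustrative assumptions, not anything from the article.

```python
# Hypothetical two-tier KV store: limited fast local HBM plus a large pooled
# remote tier. Least-recently-used blocks spill out of HBM into the pool and
# are pulled back on demand.
from collections import OrderedDict

class PooledKVStore:
    def __init__(self, hbm_capacity_blocks: int):
        self.hbm = OrderedDict()   # fast local tier (limited capacity)
        self.fabric = {}           # big pooled tier (remote, higher latency)
        self.capacity = hbm_capacity_blocks

    def put(self, block_id: str, kv_block: bytes) -> None:
        # Hot blocks live in HBM; the least-recently-used block spills out.
        self.hbm[block_id] = kv_block
        self.hbm.move_to_end(block_id)
        if len(self.hbm) > self.capacity:
            victim, data = self.hbm.popitem(last=False)
            self.fabric[victim] = data   # off local HBM, into the pool

    def get(self, block_id: str) -> bytes:
        if block_id in self.hbm:
            self.hbm.move_to_end(block_id)
            return self.hbm[block_id]
        # Miss: fetch the block back from the pooled tier on demand.
        data = self.fabric.pop(block_id)
        self.put(block_id, data)
        return data
```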