5 Comments
Destiny S. Harris's avatar

I'm just excited to see it happen

ToxSec's avatar

same, i’m all for it 🔥

ToxSec's avatar

“Build Petabyte Shelves—CXL/photonic memory fabrics—that take KV and context off local HBM and turn them into pooled assets.”

so hyped haha. loved the practical approach here. no sci-fi, just how they are really planning to do it!

ToxSec's avatar

it’s super exciting that we are at this point. and to think that we might see an actual doubling within 6 months is crazy. so where will we be in just a few years?

Neural Foundry's avatar

The SPAD concept is the most interesting part of this breakdown. Separating prefill and decode workloads at the hardware level makes so much sense when you think about how different the computational profiles are. Most companies treat inference as one monolithic problem, but Google splitting readers and writers into distinct physical architectures could be a huge efficiency win.
