Not that we should be at all surprised, but Nvidia leans pretty hard on AI to speed up how it plans and designs its next generation of GPUs
For more than 16 years, Nvidia's annual GTC event has been packed full of talks and presentations about all the things you can do with a GPU beyond rendering 3D graphics. This year's conference was no different, of course, but hidden amongst all the talk was an insight into how Nvidia goes about designing its chips, and AI is naturally a big part of it all.
It was Bill Dally, Nvidia's chief scientist, who offered this behind-the-scenes glimpse while chatting with his counterpart at Google, Jeff Dean, on the topic of 'advancing to AI's next frontier'. The first bit that caught my attention (thanks to Bearly AI on X) was how Nvidia uses an AI agent when it switches to a new process node.
"Every time we have a new semiconductor process, we have to port our standard cell library to it. It's about 2,500 - 3,000 cells. And that used to take a team of eight people about 10 months, so 80 person months," begins Dally.
"Then we developed a program based on reinforcement learning called NVCell—I think we're up to NVCell 2 or 3 now—and it's overnight on one GPU, and the results are actually better than the human designs… in measures of goodness, of size of the cells, power dissipation, delay. It matches or exceeds the human."
A cell library is basically a big set of pre-designed blueprints for logic gates, other electronic components, and the interconnections between them. Instead of recreating everything from scratch when you want to lay out the texture units in a new GPU, for example, you let software pull everything it needs from this library.
However, with each new process node that a wafer manufacturer such as TSMC comes up with, those plans need to be altered, because the physical sizes and layouts of the logic gates change. NVCell itself has been around for a few years and Nvidia has talked about it before, so this was hardly a revelation, but I can't recall hearing specific details on the number of cells in its library until now.
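To make that a bit more concrete, here's a hypothetical sketch of what one entry in a standard cell library might track. The cell names and numbers below are invented for illustration, not Nvidia's actual format, but the figures of merit are the same ones Dally mentions: size, power, and delay.

```python
# Illustrative sketch of a standard cell library (invented data, not
# Nvidia's real format). NVCell's job on a new process node is to
# produce layouts that minimise these figures of merit for each cell.

from dataclasses import dataclass

@dataclass
class StandardCell:
    name: str          # e.g. a 2-input NAND gate
    area_um2: float    # layout area on this process node
    power_uw: float    # dynamic power at nominal activity
    delay_ps: float    # worst-case propagation delay

# A tiny slice of a library; the full one holds roughly 2,500-3,000 cells.
library = {
    "NAND2": StandardCell("NAND2", 0.12, 0.45, 11.0),
    "NOR2":  StandardCell("NOR2",  0.14, 0.50, 13.5),
    "INV":   StandardCell("INV",   0.06, 0.20,  6.0),
}

# Laying out a block then becomes a matter of pulling cells from the
# library rather than drawing transistors from scratch.
def total_area(cell_counts: dict[str, int]) -> float:
    return sum(library[c].area_um2 * n for c, n in cell_counts.items())

print(round(total_area({"NAND2": 4, "INV": 2}), 2))  # 0.6
```

Porting the library to a new node means regenerating every one of those layouts under the new node's design rules, which is the 80-person-month job NVCell collapses into an overnight run.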
That's not the only thing AI is speeding up. "We have the F model. It's basically an executable model of the GPU where we tape out the design. We have all the geometry done that goes out to TSMC to make chips. And we'd like to collapse that space. And what turns out, what the real long pole in that space is, is design verification. And so we're particularly looking at how we can use AI to prove the designs work more quickly."
AI agents are also being used to explore different ways of doing things with chip design. "We have this program called PrefixRL, which applies reinforcement learning to a really age-old problem in computer design, which is where to put the lookahead stages in a carry-lookahead chain."
My electronics knowledge is a bit rusty these days, but Dally is talking about a method of speeding up the fundamental operation behind adding two digital numbers together. He continues: "This is a problem that's been studied since the 1950s, and this RL program goes at it like it was an Atari video game. It's not trying to make the fastest adder, it's trying to make, actually, [an] adder that barely meets timing and is as small and low power as possible."
"It comes up with totally bizarre designs that no human would ever come up with, but they're actually, you know, 20 or 30% better than the human designs."
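For the curious, the recurrence Dally is describing can be sketched in a few lines of Python. This is an illustrative toy, not anything the RL program produces: each bit pair yields a generate signal and a propagate signal, and the carry into the next bit follows c[i+1] = g[i] OR (p[i] AND c[i]). The loop below evaluates that recurrence serially; a hardware lookahead chain flattens it into parallel prefix stages, and deciding where those stages go is the problem being optimised.

```python
# Toy carry-lookahead addition: generate (g) and propagate (p) signals
# per bit, then the carry recurrence c[i+1] = g[i] | (p[i] & c[i]).
# Real hardware computes these carries with a log-depth prefix network
# rather than the serial loop used here for clarity.

def carry_lookahead_add(a: int, b: int, width: int = 8) -> int:
    """Add two `width`-bit numbers via generate/propagate carries."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate

    carries = [0]  # no carry into bit 0
    for i in range(width):
        carries.append(g[i] | (p[i] & carries[i]))

    result = 0
    for i in range(width):
        result |= (p[i] ^ carries[i]) << i  # sum bit s_i = p_i XOR c_i
    return result & ((1 << width) - 1)

print(carry_lookahead_add(100, 55))   # 155
print(carry_lookahead_add(200, 100))  # 44 (300 wraps in 8 bits)
```

The design space the RL agent explores is how to group those carry computations into lookahead stages so the adder just meets its timing budget while staying as small and low-power as possible.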
Not everything is about exploration or speed when it comes to using AI at Nvidia, though. "What we did is we took a, you know, a generic LLM [ChipNeMo and BugNeMo], and then we fine-tuned it by feeding it all of the design documents proprietary to Nvidia," Dally explained.
"So this is stuff that you can't get outside the company. It's all the RTL hardware design document, all of the RTL for every GPU ever designed at Nvidia, all of the architecture specs… And now you have this LLM that's actually very smart about GPU design." Ah, so a kind of GPUgpt, but one that's genuinely useful? It would seem so, according to Dally.
"One of the biggest gains of this is when you have a junior designer. It turns out the senior designers spend an enormous amount of their time explaining to the junior designer simple things, like, you know, how does a texture unit work? Now they don't have to ask the senior designers that. They can ask ChipNeMo that, and [it] will explain to them in great detail how the texture unit works."
Hmm, let's hope Nvidia has a robust system for checking if NeMo is hallucinating and telling its junior designers that a texture unit is used to improve how well a polygon feels in your mouth or something daft like that, otherwise the RTX 6090 could well end up only reaching its full potential in the kitchen.