Many of us are now familiar with the infamous video of NVIDIA CEO Jensen Huang bringing the tastiest of tech buns out of his gigantic oven during his GTC keynote presentation earlier in May. If you're not, well, here it is:
But what exactly had he been cooking in there all this time? In Jensen's words, "The world's largest graphics card," otherwise known as their brand-new DGX-A100.
But what exactly is in this beast and what is it used for?
In short, this is datacentre tech, and it's designed to power deep learning, a form of artificial intelligence that analyses and learns from vast amounts of data.
Under all those gold-plated heatsinks lie eight of NVIDIA's A100 GPUs, boasting the all-new 7nm Ampere architecture we've all been waiting to see. These have all been paired up and geared to go using NVIDIA's signature NVLinks (twelve per GPU, to be precise), alongside six NVSwitches and eight Mellanox ConnectX-6 NICs. Driving it all are dual 64-core AMD "Rome" EPYC 7742 CPUs, making for a whopping 128 cores and 256 threads. There's also 1TB of RAM and 15TB of NVMe storage, and the whole thing runs on PCIe 4.0.
This is revolutionary for the datacentre sector, especially for those focused on deep learning.
One particular point of interest for the new DGX-A100 is its ability to shuttle data back and forth at incredibly high speeds. Its six NVSwitches provide up to 4.8TB/s of bi-directional bandwidth (data sent and received combined), which in practice means the system could feasibly transfer 426 hours of HD video footage in just one second. Yes. You read that right. 426 hours. One second...
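That headline figure checks out with some back-of-the-envelope maths. Here's a quick sketch, assuming a typical HD stream of roughly 25Mbit/s (the bitrate is my assumption, not a figure NVIDIA has published):

```python
# Sanity-check the "426 hours of HD video per second" claim.
BANDWIDTH_BYTES_PER_S = 4.8e12   # 4.8TB/s bi-directional NVSwitch bandwidth
HD_BITRATE_BITS_PER_S = 25e6     # ~25Mbit/s for an HD video stream (assumed)

# Seconds of video that fit through the fabric in one second of transfer.
video_seconds = BANDWIDTH_BYTES_PER_S / (HD_BITRATE_BITS_PER_S / 8)
video_hours = video_seconds / 3600
print(int(video_hours))  # 426
```

At that assumed bitrate, the numbers land almost exactly on NVIDIA's quoted 426 hours.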
On top of this already impressive feat, each GPU can also be instanced up to seven times, meaning the system could theoretically manage up to 56 completely different tasks simultaneously. It can even serve as multiple VMs with dynamic resource allocation (a feature known as Multi-Instance GPU, or MIG), allowing entire teams of data scientists to collaborate using the same A100 GPU, all whilst retaining the same level of power as a single V100 GPU for each individual user.
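The instancing maths above is simple enough to sketch out (the constant names are mine, purely for illustration):

```python
# How many isolated workloads a single DGX-A100 could host via MIG.
GPUS_PER_SYSTEM = 8           # eight A100s per DGX-A100
MAX_INSTANCES_PER_GPU = 7     # MIG allows up to seven instances per A100

max_isolated_workloads = GPUS_PER_SYSTEM * MAX_INSTANCES_PER_GPU
print(max_isolated_workloads)  # 56
```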
Equally, because the entire system is capable of performing analytics, training, and inference, it can adapt dynamically to the ever-changing demands of deep learning.
What does any of this mean in real-world application?
Put simply, NVIDIA's previous AI datacentre solution was the DGX-1, which featured their V100 GPUs. The new A100 GPUs, however, put them to shame. As mentioned above, a DGX-A100 can be instanced up to 56 times in total, and each of those individual instances is comparable in power to an entire V100. Here's what that actually looks like for the datacentres of today versus the datacentres of tomorrow:
In essence, what NVIDIA has done here is monumental. They have created a datacentre solution which is ten times more powerful at a tenth of the cost, with a twentieth of the power consumption and a twenty-fifth of the footprint.
In a way, they have created possibly the world's most powerful hyperconverged AI/GPU datacentre, and covered it in gold plating.
And if that isn't a flex, I don't know what is...