  • Grace Blackwell Superchip Overview

    The Grace Blackwell Superchip combines NVIDIA's Grace and Blackwell architectures in a single package built around two "Blackwell GPUs". Its memory subsystem uses HBM3E running at 8Gbps per pin over a 2x2x4096-bit bus, giving a memory bandwidth of 2x8TB/sec and a total of 384GB of VRAM (2x2x96GB). Dense tensor throughput halves at each step from FP4 to TF32: up to 20 PFLOPS for FP4, 10 PFLOPS for FP8 (10 POPS for INT8), 5 PFLOPS for FP16, and 2.5 PFLOPS for TF32. FP64 dense tensor throughput is 90 TFLOPS. For connectivity, the superchip provides 2x NVLink 5 (1800GB/sec) and 2x PCIe 6.0 (256GB/sec). The GPUs total 416 billion transistors (2x2x104B) and are fabricated on TSMC's 4NP process. Thermal design power (TDP) is 2700W.
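
    The memory bandwidth figure follows from the memory clock and bus width. Below is a minimal Python sketch of that arithmetic. It assumes "2x2x4096-bit" means two GPUs, each with two dies, each die with a 4096-bit HBM3E interface. That reading, and every name in the code, is an assumption made for illustration, not something the specification states.

```python
def memory_bandwidth_gb_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) x bus width (bits) / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


# Figures taken from the specification.
PIN_RATE_GBPS = 8          # 8Gbps HBM3E
DIE_BUS_WIDTH_BITS = 4096  # last factor of "2x2x4096-bit"

# Assumed reading of the "2x2x" prefix (hypothetical names).
DIES_PER_GPU = 2
GPUS_PER_SUPERCHIP = 2

per_die = memory_bandwidth_gb_per_s(PIN_RATE_GBPS, DIE_BUS_WIDTH_BITS)
per_gpu = per_die * DIES_PER_GPU
per_superchip = per_gpu * GPUS_PER_SUPERCHIP

# Convert to decimal terabytes (1 TB = 1000 GB) for comparison with the quoted value.
print(f"per die:       {per_die / 1000:.3f} TB/s")
print(f"per GPU:       {per_gpu / 1000:.3f} TB/s")
print(f"per superchip: {per_superchip / 1000:.3f} TB/s")
```

    Under these assumptions each GPU comes out at 8.192 TB/sec, which matches the quoted 2x8TB/sec once rounded.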

Attribute               Specification
Type                    Grace Blackwell Superchip
Memory Clock            8Gbps HBM3E
Memory Bus Width        2x2x4096-bit
Memory Bandwidth        2x8TB/sec
VRAM                    384GB (2x2x96GB)
FP4 Dense Tensor        20 PFLOPS
INT8/FP8 Dense Tensor   10 P(FL)OPS
FP16 Dense Tensor       5 PFLOPS
TF32 Dense Tensor       2.5 PFLOPS
FP64 Dense Tensor       90 TFLOPS
Interconnects           2x NVLink 5 (1800GB/sec) + 2x PCIe 6.0 (256GB/sec)
GPU                     2x “Blackwell GPU”
GPU Transistor Count    416B (2x2x104B)
TDP                     2700W
Manufacturing Process   TSMC 4NP
Interface               Superchip
Architecture            Grace + Blackwell
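
The totals in the table can be cross-checked against their own per-unit breakdowns. The Python sketch below is illustrative only. It recomputes the VRAM and transistor totals from the "2x2x" factors and checks the ratio between adjacent dense tensor precisions. The helper name and the dictionary layout are hypothetical. The factor-of-two pattern is an observation drawn from the quoted figures, not a rule the specification states.

```python
def total(per_unit: float, *multipliers: int) -> float:
    """Multiply a per-unit figure by each replication factor in turn."""
    result = per_unit
    for m in multipliers:
        result *= m
    return result


# Totals rebuilt from the breakdowns quoted in the table.
vram_gb = total(96, 2, 2)        # "384GB (2x2x96GB)"
transistors_b = total(104, 2, 2) # "416B (2x2x104B)"

# Dense tensor throughput in PFLOPS, ordered from lowest to highest precision.
dense_tensor_pflops = {
    "FP4": 20.0,
    "INT8/FP8": 10.0,
    "FP16": 5.0,
    "TF32": 2.5,
}

# Ratio between each precision and the next one in the table.
values = list(dense_tensor_pflops.values())
ratios = [a / b for a, b in zip(values, values[1:])]

print(f"VRAM total:        {vram_gb} GB")
print(f"Transistor total:  {transistors_b} B")
print(f"Precision ratios:  {ratios}")
```

Both totals agree with the table, and each step from FP4 to TF32 is a factor of two. FP64 at 90 TFLOPS sits outside that pattern and is left out of the ratio check.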



