Type: Grace Blackwell Superchip

Memory Clock: 8Gbps HBM3E

Memory Bus Width: 2x2x4096-bit

Memory Bandwidth: 2x8TB/sec

VRAM: 384GB (2x2x96GB)

FP4 Dense Tensor: 20 PFLOPS

INT8/FP8 Dense Tensor: 10 P(FL)OPS

FP16 Dense Tensor: 5 PFLOPS

TF32 Dense Tensor: 2.5 PFLOPS

FP64 Dense Tensor: 90 TFLOPS

Interconnects: 2x NVLink 5 (1800GB/sec) + 2x PCIe 6.0 (256GB/sec)

GPU: 2x “Blackwell GPU”

GPU Transistor Count: 416B (2x2x104B)

TDP: 2700W

Manufacturing Process: TSMC 4NP

Interface: Superchip

Architecture: Grace + Blackwell
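
The memory and transistor figures above are written as products (2x2x4096-bit, 2x8TB/sec, 2x2x96GB, 2x2x104B), and they can be checked against each other with simple arithmetic. The sketch below is illustrative only: it assumes the first factor is the two GPUs and the second is two dies per GPU, and it uses the usual peak-bandwidth relation (per-pin data rate times bus width, divided by 8 bits per byte). The constant and function names are hypothetical.

```python
# Figures copied from the spec list. How the "2x2x" factors are grouped is an
# assumption: 2 GPUs per superchip, 2 dies per GPU.
GPUS = 2
DIES_PER_GPU = 2
BUS_BITS_PER_DIE = 4096        # "2x2x4096-bit"
PIN_RATE_GBPS = 8              # "8Gbps HBM3E", gigabits per second per pin
VRAM_GB_PER_DIE = 96           # "2x2x96GB"
TRANSISTORS_B_PER_DIE = 104    # "2x2x104B", in billions


def peak_bandwidth_gb_s(pin_rate_gbps: float, bus_bits: int) -> float:
    """Peak memory bandwidth in GB/s: pin rate (Gbit/s) * bus width / 8."""
    return pin_rate_gbps * bus_bits / 8


bus_bits_per_gpu = DIES_PER_GPU * BUS_BITS_PER_DIE
bandwidth_per_gpu = peak_bandwidth_gb_s(PIN_RATE_GBPS, bus_bits_per_gpu)
total_bus_bits = GPUS * bus_bits_per_gpu
total_vram_gb = GPUS * DIES_PER_GPU * VRAM_GB_PER_DIE
total_transistors_b = GPUS * DIES_PER_GPU * TRANSISTORS_B_PER_DIE

print(f"Bus width per GPU:  {bus_bits_per_gpu} bits")
print(f"Bandwidth per GPU:  {bandwidth_per_gpu} GB/s (listed as 8TB/sec)")
print(f"Total bus width:    {total_bus_bits} bits")
print(f"Total VRAM:         {total_vram_gb} GB")
print(f"Total transistors:  {total_transistors_b}B")
```

Under these assumptions, 8Gbps over an 8192-bit bus gives 8192 GB/s per GPU, which matches the listed 8TB/sec per GPU, and the VRAM and transistor products match the listed 384GB and 416B totals.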

  • Grace Blackwell Superchip Overview

    The Grace Blackwell Superchip combines the Grace and Blackwell architectures in a single module built around two "Blackwell GPUs". Its HBM3E memory runs at 8Gbps over a 2x2x4096-bit bus, giving 2x8TB/sec of memory bandwidth across 384GB (2x2x96GB) of VRAM.

    Dense tensor throughput depends on precision: up to 20 PFLOPS for FP4, 10 P(FL)OPS for INT8/FP8, 5 PFLOPS for FP16, 2.5 PFLOPS for TF32, and 90 TFLOPS for FP64. For connectivity, the superchip provides two NVLink 5 interfaces (1800GB/sec) and two PCIe 6.0 connections (256GB/sec).

    The GPUs contain 416 billion transistors in total (2x2x104B) and are fabricated on TSMC's 4NP process. The thermal design power (TDP) is 2700W, which reflects the module's high power and cooling requirements.
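
The dense tensor figures quoted above follow a regular pattern, which a short calculation makes visible. The sketch below is a rough illustration, not vendor data: it converts the listed figures to TFLOPS (1 PFLOPS = 1000 TFLOPS), compares each precision with the next one in the list, and derives a per-GPU figure under the assumption that the listed totals split evenly across the two GPUs. The table and function names are hypothetical.

```python
# Dense tensor throughput from the spec list, in TFLOPS (1 PFLOPS = 1000 TFLOPS).
# INT8 is an integer format, so that entry is strictly TOPS; it is kept in the
# same table only for comparison.
DENSE_TENSOR_TFLOPS = {
    "FP4": 20_000,
    "INT8/FP8": 10_000,
    "FP16": 5_000,
    "TF32": 2_500,
    "FP64": 90,
}
GPUS = 2  # "2x Blackwell GPU"


def step_ratios(table: dict[str, float]) -> dict[str, float]:
    """Ratio of each entry to the next entry in the table's order."""
    names = list(table)
    return {f"{a} / {b}": table[a] / table[b] for a, b in zip(names, names[1:])}


def per_gpu(table: dict[str, float], gpus: int) -> dict[str, float]:
    """Per-GPU throughput, ASSUMING the listed totals split evenly across GPUs."""
    return {name: total / gpus for name, total in table.items()}


ratios = step_ratios(DENSE_TENSOR_TFLOPS)
per_gpu_tflops = per_gpu(DENSE_TENSOR_TFLOPS, GPUS)

for pair, ratio in ratios.items():
    print(f"{pair}: {ratio:.1f}x")
for name, tflops in per_gpu_tflops.items():
    print(f"{name}: {tflops:g} TFLOPS per GPU (assumed even split)")
```

From FP4 down to TF32, each step in the list halves the throughput, while the drop from TF32 to FP64 is much larger (2500 / 90, roughly 28x).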



