NVIDIA H100 GPU

The NVIDIA H100 Tensor Core GPU delivers an order-of-magnitude leap for large-scale AI and HPC, with performance, scalability, and security for every data center. With the NVIDIA NVLink Switch System enabling direct communication between up to 256 GPUs, H100 accelerates exascale workloads, and its dedicated Transformer Engine targets trillion-parameter language models. For smaller jobs, H100 can be partitioned into right-sized Multi-Instance GPU (MIG) partitions. With Hopper Confidential Computing, this scalable compute power can secure sensitive applications on shared data center infrastructure. The included NVIDIA AI Enterprise software suite reduces development time and simplifies the deployment of AI workloads, making H100 the most powerful end-to-end AI and HPC data center platform.

  • The World's Most Advanced Chip
  • Transformer Engine: Supercharging AI, Delivering Up to 30X Higher Performance
  • NVLink Switch System
  • Second-generation Multi-Instance GPU (MIG): 7X More Secure Tenants
  • NVIDIA Confidential Computing
  • New DPX Instructions: Solving Exponential Problems with Accelerated Dynamic-Programming

Deep Learning Training

Performance and Scalability

The era of exascale AI has arrived, with trillion-parameter models now required to take on next-generation performance challenges such as conversational AI and deep recommender systems.

Confidential Computing

Secure Data and AI Models in Use

New Confidential Computing capabilities make the GPU secure end-to-end without sacrificing performance, making it ideal for ISV solution protection and federated learning applications.

Deep Learning Inference

Performance and Scalability

AI today solves a wide array of business challenges using an equally wide array of neural networks. A great AI inference accelerator therefore has to deliver both the highest performance and the versatility to accelerate these networks in any location, from data center to edge, where customers choose to deploy them.

High-Performance Computing

Faster Double-Precision Tensor Cores

The importance of HPC has never been clearer than since the onset of the global pandemic. Using supercomputers, scientists have used molecular simulations to recreate how COVID infects human respiratory cells and have developed vaccines at unprecedented speed.

Data Analytics

Faster Double-Precision Tensor Cores

Data analytics often consumes the majority of the time in AI application development. Scale-out solutions built on commodity CPU-only servers get bogged down by a lack of scalable computing performance, because large datasets are scattered across multiple servers.

Optimizing Compute Utilization

Mainstream to Multi-node Jobs

IT managers seek to maximize both peak and average utilization of their compute resources. They often reconfigure compute dynamically to right-size resources for the workload in use.
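As a rough illustration of the two utilization figures mentioned above, the following sketch computes peak and average utilization from a series of sampled percentages. The function name `utilization_summary` and the sample values are hypothetical; how the samples are collected is outside the scope of this page.

```python
def utilization_summary(samples):
    """Return (peak, average) for a list of utilization samples in percent.

    `samples` is a hypothetical list of readings taken at regular intervals.
    """
    if not samples:
        raise ValueError("at least one utilization sample is required")
    peak = max(samples)                     # highest observed utilization
    average = sum(samples) / len(samples)   # mean utilization over the window
    return peak, average


if __name__ == "__main__":
    # Illustrative readings only, not measured data.
    peak, average = utilization_summary([20, 80, 50, 50])
    print(f"peak={peak}%, average={average}%")
```

A large gap between the peak and the average is one sign that a resource may be oversized for the workload most of the time.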

Choose the right Data Center GPU

Solution Category

  • DL Training & Data Analytics
  • DL Inference
  • HPC/AI
  • Omniverse / Render Farms
  • Virtual Workstation
  • Virtual Desktop (VDI)
  • Mainstream Acceleration
  • Far Edge Acceleration
  • AI-on-5G

GPU Solution for Compute

  • H100*: PCIe, SXM, CNX
  • A100: PCIe, SXM, A100X
  • A30: PCIe, A30X

GPU Solution for Graphics/Compute

  • L40
  • A40
  • A10
  • A16

GPU Solution for Small Form Factor Compute/Graphics

  • A2
  • T4

[Comparison table: each GPU is rated per solution category using the icons in the key below; the individual cell ratings are not reproduced here.]

Price-performance comparison within each solution category (Compute, Graphics and Compute, SFF Compute and Graphics) and workload column.

  • Best
  • Better
  • Good
  • PCIe: PCIe Form Factor
  • SXM: SXM Form Factor
  • CNX: H100 + ConnectX-7 Converged Accelerator
  • A100X/A30X: A100 or A30 + BlueField-2 Converged Accelerator

Specifications:

H100 CNX

  • GPU Memory: 80GB HBM2e
  • Memory Bandwidth: > 2.0TB/s
  • MIG Instances: 7 instances @ 10GB each; 3 instances @ 20GB each; 2 instances @ 40GB each
  • Interconnect: PCIe Gen5 128GB/s
  • NVLink Bridge: Two-way
  • Networking: 1x 400Gb/s, 2x 200Gb/s ports, Ethernet or InfiniBand
  • Form Factor: Dual-slot, full-height, full-length (FHFL)
  • Max Power: 350W
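The MIG layouts listed in the H100 CNX specifications can be used to right-size a partition for a job. The sketch below is only an illustration under stated assumptions: it treats the three listed layouts as the available choices and picks the one that yields the most instances while still fitting a job's memory requirement. The names `MIG_LAYOUTS` and `pick_mig_layout` are hypothetical and are not part of any NVIDIA tool or API.

```python
# MIG layouts as listed in the H100 CNX specifications:
# (number of instances, memory per instance in GB)
MIG_LAYOUTS = [(7, 10), (3, 20), (2, 40)]


def pick_mig_layout(job_memory_gb):
    """Pick the listed layout with the most instances that still fits the job.

    Returns an (instances, gb_per_instance) tuple, or None when the job needs
    more memory than the largest listed partition provides.
    """
    fitting = [layout for layout in MIG_LAYOUTS if layout[1] >= job_memory_gb]
    if not fitting:
        return None
    # More instances means more tenants can share the same GPU.
    return max(fitting, key=lambda layout: layout[0])


if __name__ == "__main__":
    for need_gb in (8, 16, 32, 64):
        print(need_gb, "GB ->", pick_mig_layout(need_gb))
```

A job that does not fit any listed partition (the `None` case) would need the full GPU rather than a MIG instance.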

Specifications:

NVIDIA L40S

  • GPU Architecture: NVIDIA Ada Lovelace Architecture
  • GPU Memory: 48GB GDDR6 with ECC
  • Memory Bandwidth: 864GB/s
  • Interconnect Interface: PCIe Gen4 x16, 64GB/s bidirectional
  • CUDA™ Cores: 18,176
  • RT Cores: 142
  • Tensor Cores: 568
  • RT Core Performance: 212 TFLOPS
  • FP32: 91.6 TFLOPS
  • TF32 Tensor Core: 366 TFLOPS
  • BFLOAT16 Tensor Core: 366 | 733 TFLOPS
  • FP16 Tensor Core: 366 | 733 TFLOPS
  • FP8 Tensor Core: 733 | 1,466 TFLOPS
  • Peak INT8 Tensor: 733 | 1,466 TOPS
  • Peak INT4 Tensor: 733 | 1,466 TOPS
  • Form Factor: 4.4" (H) x 10.5" (L), dual slot
  • Display Ports: 4 x DisplayPort 1.4a
  • Max Power Consumption: 350W
  • Power Connector: 16-pin
  • Thermal: Passive
  • Virtual GPU (vGPU) Software Support: Yes
  • NVENC | NVDEC: 3x | 3x (includes AV1 encode and decode)
  • Secure Boot with Root of Trust: Yes
  • NEBS Ready: Level 3
  • MIG Support: No

Every Deep Learning Framework

  • MXNet
  • PyTorch
  • Spark
  • TensorFlow