Name: NVIDIA Tesla M40
Brand: NVIDIA
SKU: TCSM40M-PB

NVIDIA Tesla M40

Up to 7 Teraflops of single-precision performance with NVIDIA GPU Boost™.

24 GB of GDDR5 memory for training large deep learning models.

Server-qualified to deliver maximum uptime in the data center.

Power your data center with the world’s fastest deep learning training accelerator.

Deep learning is redefining what’s possible, from image recognition and natural language processing to neural machine translation and image classification. From early-stage startups to large web service providers, deep learning has become a fundamental building block in delivering amazing solutions for end users. Deep learning models typically take days to weeks to train, forcing scientists to compromise between accuracy and time to deployment. The NVIDIA Tesla M40 GPU accelerator, based on the ultra-efficient NVIDIA Maxwell™ architecture, is designed to deliver the highest single-precision performance. Together with its high memory density, this makes the Tesla M40 the world’s fastest accelerator for deep learning training. Running Caffe and Torch on the Tesla M40 trains the same model within hours, versus days on CPU-based compute systems.

TESLA M40 FEATURES THE LARGEST MEMORY CAPACITY PER GPU

Researchers and developers are building bigger, more sophisticated neural networks to increase detection and prediction accuracy. Training these bigger networks demands more GPU memory, and the M40 is purpose-built to handle these workloads. The resulting gains in accuracy benefit a variety of applications:

More accurate speech recognition
More accurate identification of objects in images, such as street signs and pedestrians
Deeper understanding of video and natural language content
Better detection of anomalies in medical images, improving medical diagnosis

DEEP LEARNING ECOSYSTEM BUILT FOR TESLA PLATFORM

The Tesla M40 accelerator provides a powerful foundation for customers to leverage best-in-class software and solutions for deep learning. NVIDIA cuDNN, DIGITS™, and various deep learning frameworks are optimized for the NVIDIA Maxwell™ architecture and the Tesla M40 to power the next generation of machine learning applications.

TESLA M40 SPECIFICATIONS

GPU Architecture	NVIDIA Maxwell
NVIDIA CUDA® Cores	3072
Single-Precision Performance	7 Teraflops with NVIDIA GPU Boost
Double-Precision Performance	0.2 Teraflops
GPU Memory	24 GB GDDR5
Memory Bandwidth	288 GB/s
System Interface	PCI Express 3.0 x16
Max Power Consumption	250 W
Thermal Solution	Passive
Form Factor	4.4" H × 10.5" L, Dual Slot, Full Height
Compute APIs	CUDA, DirectCompute, OpenCL™, OpenACC
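
For readers who want to relate the figures in the specification table to one another, the following is a minimal Python sketch of some back-of-the-envelope arithmetic. It is not from NVIDIA. It assumes that each CUDA core performs one fused multiply-add (counted as 2 floating-point operations) per clock cycle and that 1 GB equals 10^9 bytes; neither assumption is stated in the table. The function names are hypothetical and chosen only for illustration.

```python
# Figures copied from the specification table above.
CUDA_CORES = 3072
SP_TFLOPS = 7.0        # single precision, with NVIDIA GPU Boost
DP_TFLOPS = 0.2        # double precision
MEMORY_GB = 24
BANDWIDTH_GB_S = 288
MAX_POWER_W = 250

# Assumption (not in the table): one fused multiply-add per core per
# clock cycle, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CYCLE = 2


def implied_clock_ghz(tflops, cores, flops_per_cycle=FLOPS_PER_CORE_PER_CYCLE):
    """Clock rate (GHz) that would be needed to reach `tflops`."""
    return tflops * 1e12 / (cores * flops_per_cycle) / 1e9


def gflops_per_watt(tflops, watts):
    """Peak throughput per watt at maximum power consumption."""
    return tflops * 1000 / watts


def flops_per_byte(tflops, bandwidth_gb_s):
    """Peak floating-point operations available per byte read from memory."""
    return tflops * 1e12 / (bandwidth_gb_s * 1e9)


def max_fp32_values(memory_gb, bytes_per_value=4):
    """Upper bound on 32-bit values that fit in memory.

    Assumes 1 GB = 10**9 bytes and ignores all other memory use
    (activations, gradients, framework overhead).
    """
    return int(memory_gb * 1e9 // bytes_per_value)


if __name__ == "__main__":
    print(f"Implied boost clock: {implied_clock_ghz(SP_TFLOPS, CUDA_CORES):.3f} GHz")
    print(f"Efficiency:          {gflops_per_watt(SP_TFLOPS, MAX_POWER_W):.1f} GFLOPS/W")
    print(f"SP:DP ratio:         {SP_TFLOPS / DP_TFLOPS:.0f}:1")
    print(f"FLOPs per byte:      {flops_per_byte(SP_TFLOPS, BANDWIDTH_GB_S):.1f}")
    print(f"Max FP32 values:     {max_fp32_values(MEMORY_GB):,}")
```

Under these assumptions, the 7 Teraflops figure corresponds to a clock of roughly 1.14 GHz and about 28 GFLOPS per watt at the 250 W maximum. These are peak, derived numbers; real training throughput depends on the workload and framework.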