GPU-Accelerated Applications

Accelerated computing has revolutionized a broad range of industries, with over five hundred applications optimized for GPUs to help you accelerate your work.


Meet CUDA, a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). Developers use CUDA to dramatically speed up computing applications by harnessing the power of GPUs.

In a GPU-accelerated application, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance, while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. With CUDA, developers program in popular languages such as C, C++, Fortran, Python and MATLAB and express parallelism through extensions in the form of a few basic keywords.
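
As a rough illustration of this split between CPU and GPU work, the sketch below adds two arrays in CUDA C++. The names (vectorAdd, CHECK), the array size, and the launch configuration are choices made for this example rather than anything prescribed by CUDA, and unified memory is used only to keep the sketch short. The CPU runs the sequential setup and the result check, the __global__ keyword marks the function that runs on the GPU, and the <<<...>>> syntax launches it across many threads.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper (not part of CUDA): abort if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err));                    \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// __global__ marks a kernel: it runs on the GPU and is launched from the CPU.
__global__ void vectorAdd(const float* a, const float* b, float* out, int n)
{
    // Each thread computes one element, selected by its global index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main()
{
    const int n = 1 << 20;                 // example size, chosen arbitrarily
    const size_t bytes = n * sizeof(float);

    // Unified memory is accessible from both the CPU and the GPU.
    float *a = nullptr, *b = nullptr, *out = nullptr;
    CHECK(cudaMallocManaged(&a, bytes));
    CHECK(cudaMallocManaged(&b, bytes));
    CHECK(cudaMallocManaged(&out, bytes));

    // Sequential part of the workload: runs on the CPU.
    for (int i = 0; i < n; ++i) {
        a[i] = 1.0f;
        b[i] = 2.0f;
    }

    // Parallel part: launch enough 256-thread blocks to cover all n elements.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vectorAdd<<<blocks, threadsPerBlock>>>(a, b, out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());        // wait for the GPU to finish

    // Back on the CPU: verify that every element equals 1.0 + 2.0.
    float maxError = 0.0f;
    for (int i = 0; i < n; ++i) {
        maxError = std::fmax(maxError, std::fabs(out[i] - 3.0f));
    }
    std::printf("max error: %f\n", maxError);

    CHECK(cudaFree(a));
    CHECK(cudaFree(b));
    CHECK(cudaFree(out));
    return maxError == 0.0f ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Assuming the CUDA Toolkit is installed and the file is saved as vector_add.cu (a name picked for this example), it can be compiled with nvcc vector_add.cu -o vector_add. Apart from the __global__ qualifier, the built-in thread indices, and the launch syntax, the program is ordinary C++.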


The NVIDIA Deep Learning GPU Training System (DIGITS) puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation and object detection tasks.

DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best-performing model from the results browser for deployment. DIGITS is completely interactive, so data scientists can focus on designing and training networks rather than programming and debugging.


The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration. It allows them to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe2, MATLAB, Microsoft Cognitive Toolkit, TensorFlow, Theano, and PyTorch.