GPGPU Definition

A General-Purpose Graphics Processing Unit (GPGPU) is a graphics processing unit (GPU) that is programmed for purposes beyond graphics processing, such as performing computations typically conducted by a Central Processing Unit (CPU).

GPGPU example from Nvidia
Image from Nvidia


What is GPGPU?

GPGPU, also known as GPGPU computing, refers to the increasingly commonplace, modern trend of using GPUs for non-specialized computations in addition to their traditional purpose of computation for computer graphics. Incorporating GPUs for general purposes enhances CPU architecture by accelerating portions of an application while the rest continues to run on the CPU, ultimately creating an overall faster, high-performance application by combining CPU and GPU processing power. 

Harnessing the power of GPUs for general purposes can be accomplished via parallel computing platforms, which allow software developers and engineers to access graphics cards and write programs that enable GPUs to manage any task that can be parallelized.


Essentially all modern GPUs are GPGPUs. A GPU is a programmable processor on which thousands of processing cores run simultaneously in massive parallelism, where each core is focused on making efficient calculations, facilitating real-time processing and analysis of enormous datasets. While GPUs were originally designed primarily for the purpose of rendering images, GPGPUs can now be programmed to direct that processing power toward addressing scientific computing needs as well.

If a graphics card is compatible with any particular framework that provides access to general purpose computation, it is a GPGPU. The primary difference is that where GPU computing is a hardware component, GPGPU is fundamentally a software concept in which specialized programming and equipment designs facilitate massive parallel processing of non-specialized calculations.

What is GPGPU Acceleration?

GPGPU acceleration refers to a method of accelerated computing in which compute-intensive portions of an application are assigned to the GPU and general-purpose computing is relegated to the CPU, providing a supercomputing level of parallelism. While highly complex calculations are computed in the GPU, sequential calculations can be performed in parallel in the CPU. 

Frameworks for GPGPU accelerated computing can be created by any language that allows the code running on the CPU to poll a GPU shader for return values. GPGPU acceleration creates an overall faster application by migrating data into graphical form for the GPU to process and analyze, leading to insightful GPU-Accelerated Analytics.

How to Use GPGPU

Writing GPU enabled applications requires a parallel computing platform and application programming interface (API) that allows software developers and software engineers to build algorithms to modify their application and map compute-intensive kernels to the GPU. GPGPU supports several types of memory in a memory hierarchy for designers to optimize their programs. GPGPU memory is used for transferring data between device and host -- shared memory is an efficient way for threads in the same block to share their runtime and data. A GPU Database uses GPU computation power to analyze massive amounts of information and return results in milliseconds.

GPGPU-Sim, developed at the University of British Columbia, provides a detailed simulation model of a contemporary GPU running CUDA and/or OpenCL workloads. Some open-source GPGPU benchmarks containing CUDA codes include: Rodinia benchmarks, SHOC, Tensor module in Eigen 3.0 open-source C++ template library for linear algebra, and SAXPY benchmark. Metal GPGPU, an Apple Inc. API, is a low-level graphics programming API for iOS and macOS but it can also be used for general-purpose compute on these devices.

NVIDIA revolutionized the GPGPU and accelerated computing in 2007 with its creation of the Compute Unified Device Architecture (CUDA), the de facto computing language for image processing and algorithms.


The CUDA platform is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels. Designed to work with programming languages such as C, C++, and Fortran, CUDA is an accessible platform, requiring no advanced skills in graphics programming, and available to software developers through CUDA-accelerated libraries and compiler directives. CUDA-capable devices are typically connected with a host CPU and the host CPUs are used for data transmission and kernel invocation for CUDA devices.

The CUDA model for GPGPU accelerates a wide variety of applications, including GPGPU AI, computational science, image processing, numerical analytics, and deep learning. The CUDA Toolkit includes GPU-accelerated libraries, a compiler, programming guides, API references, and the CUDA runtime.

Does HEAVY.AI Offer a GPGPU Solution?

The HEAVY.AI platform provides an ecosystem of tools and solutions that enable users to harness the massive parallelism of GPUs. HEAVY.AIDB, the foundation of the HEAVY.AI platform, is designed to exploit efficient inter-GPU communication infrastructure such as NVIDIA NVLink when available, harnessing the power of GPUs and CPUs and returning SQL query results for enormous datasets in milliseconds.