RAPIDS is a suite of open-source software libraries for executing end-to-end data science and analytics pipelines entirely on graphics processing units (GPUs). By keeping data loading, preparation, model training, and inference on the GPU, RAPIDS shortens each iteration and makes workflows more productive.
What Is RAPIDS?
RAPIDS is a suite of open-source software libraries whose computations run on GPUs. Because the libraries share a common columnar data format in GPU memory, data can pass between dataframe operations and machine learning algorithms without serialization costs. RAPIDS also supports multi-GPU deployments, so end-to-end data science pipelines can run on large datasets.
How Does RAPIDS Work?
RAPIDS uses GPU-accelerated machine learning to make entire data science and analytics workflows run faster. A GPU-optimized core dataframe serves as the foundation for building databases and machine learning applications.
RAPIDS is designed to look and feel like Python and offers a collection of libraries for running a data science pipeline completely through GPUs.
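To illustrate the Python look and feel, here is a minimal sketch that uses cuDF, the RAPIDS dataframe library, when it can be imported and falls back to pandas otherwise. The helper name `load_dataframe_module`, the column names, and the toy values are assumptions made for this illustration; they are not part of RAPIDS.

```python
import importlib


def load_dataframe_module():
    """Return cudf if it imports cleanly, else pandas, else None."""
    for name in ("cudf", "pandas"):
        try:
            return importlib.import_module(name)
        except Exception:  # package missing, or cudf installed without a usable GPU
            continue
    return None


def mean_fare_by_city(xdf):
    """Filter and aggregate with calls that cuDF and pandas both provide."""
    df = xdf.DataFrame(
        {"city": ["A", "A", "B", "B"], "fare": [10.0, 20.0, 30.0, -1.0]}
    )
    valid = df[df["fare"] > 0]                    # boolean-mask filter
    means = valid.groupby("city")["fare"].mean()  # group-by aggregation
    if hasattr(means, "to_pandas"):               # cuDF result: copy back to host memory
        means = means.to_pandas()
    return means.to_dict()


xdf = load_dataframe_module()
result = mean_fare_by_city(xdf) if xdf is not None else None
print(xdf.__name__ if xdf is not None else "no dataframe library found", result)
```

The filtering and group-by lines are the same for either library; only the import differs, which is the sense in which the GPU dataframe feels like ordinary Python data science code.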
RAPIDS grew out of the GPU Open Analytics Initiative (GoAI), which was formed in 2017 with partners in the machine learning community and used a GPU Dataframe based on the Apache Arrow columnar memory platform. NVIDIA launched RAPIDS in 2018 to build on that work. The goal was to accelerate end-to-end data science and analytics pipelines on GPUs.
RAPIDS includes a Dataframe API, which integrates with machine learning algorithms.
How Does RAPIDS Improve Data Science and Analytics Pipelines?
RAPIDS accelerates the data science pipeline, including data loading, ETL, model training, and inference, to allow for more interactive and exploratory workflows.
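The four stages named above can be sketched in a few lines. This is an illustrative example, not a reference implementation: it assumes cuDF and cuML (the RAPIDS dataframe and machine learning libraries) for the GPU path and falls back to pandas and scikit-learn when they are unavailable. The function names `load_stack` and `run_pipeline` and the toy data are hypothetical.

```python
def load_stack():
    """Pick a matching dataframe + model pair: GPU (cuDF/cuML) or CPU (pandas/scikit-learn)."""
    try:
        import cudf as xdf
        from cuml.linear_model import LinearRegression
        return "gpu", xdf, LinearRegression
    except Exception:  # RAPIDS not installed, or no usable GPU
        pass
    try:
        import pandas as xdf
        from sklearn.linear_model import LinearRegression
        return "cpu", xdf, LinearRegression
    except Exception:
        return None, None, None


def run_pipeline(xdf, LinearRegression):
    # 1. Data loading: a small in-memory table stands in for a file read.
    df = xdf.DataFrame(
        {"x": [0.0, 1.0, 2.0, 3.0, None], "y": [1.0, 3.0, 5.0, 7.0, 9.0]}
    )
    # 2. ETL: drop the row with a missing feature value.
    df = df.dropna()
    # 3. Model training: fit directly on dataframe columns.
    model = LinearRegression()
    model.fit(df[["x"]], df["y"])
    # 4. Inference: predict for an unseen input.
    pred = model.predict(xdf.DataFrame({"x": [4.0]}))
    if hasattr(pred, "to_numpy"):  # cuDF Series result: copy to a host array
        pred = pred.to_numpy()
    return float(pred[0])


device, xdf, LinearRegression = load_stack()
prediction = run_pipeline(xdf, LinearRegression) if device else None
print(device, prediction)
```

Because the toy data follows y = 2x + 1 exactly, the prediction for x = 4 should be approximately 9.0 on either path. On the GPU path the table stays in GPU memory from loading through training, which is where the pipeline-level speedup comes from.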
Benefits of RAPIDS include:
- Integration — Accelerate a Python data science toolchain with minimal code changes.
- Scale — Seamless scaling on any GPU, including multi-GPU deployments and multi-node clusters.
- Accuracy — Allows for faster model deployment and iterations to increase machine learning model accuracy.
- Speed — Faster training time to improve data science productivity.
- Open source — Customizable open-source software supported by NVIDIA and built on Apache Arrow.
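As one illustration of the integration benefit listed above, recent cuDF releases ship a `cudf.pandas` accelerator mode that lets existing pandas code run on the GPU where possible. The sketch below assumes such a release; the function names `enable_gpu_pandas` and `total_by_key` and the toy data are hypothetical, and the code runs as plain pandas when cuDF is not installed.

```python
def enable_gpu_pandas():
    """Try to switch on the cudf.pandas accelerator; return True on success."""
    try:
        import cudf.pandas
        cudf.pandas.install()  # should be called before pandas is first imported
        return True
    except Exception:  # cuDF missing, too old, or no usable GPU
        return False


accelerated = enable_gpu_pandas()

try:
    import pandas as pd  # unchanged user code starts here
except ImportError:
    pd = None


def total_by_key(pd_module):
    """Ordinary pandas code; nothing in it refers to the GPU."""
    df = pd_module.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})
    return df.groupby("key")["value"].sum().to_dict()


totals = total_by_key(pd) if pd is not None else None
print("accelerated:", accelerated, totals)
```

The same effect is available without touching the script at all by launching it as `python -m cudf.pandas script.py`, or in a Jupyter notebook with `%load_ext cudf.pandas`.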
How Does RAPIDS Use GPUs?
RAPIDS includes GPU-accelerated machine learning libraries that provide GPU implementations of common machine learning algorithms.
RAPIDS also includes graph analytics libraries that seamlessly integrate into a data science pipeline.
Native GPU in-memory visualization libraries are in the works. RAPIDS plans to include data visualization libraries based on Apache Arrow for visualization with very large datasets.
Does HEAVY.AI Offer RAPIDS?
Yes. HEAVY.AI partners with NVIDIA on RAPIDS. As a founding member of the GoAI consortium, we contributed to the earliest efforts to use Apache Arrow as a standard for efficient zero-copy data interchange, through a GPU-based dataframe, between different tools in the GPU data science pipeline. RAPIDS delivers additional libraries that build on and extend this foundation. We’re planning deeper integration with this toolset and ecosystem.