Columnar Database

Columnar Database Definition

A columnar database stores data by column rather than by row, as traditional databases do. Columnar databases are designed to read data more efficiently and return query results faster.

Columnar Database example showing the difference between row-store and column-store.
Image from Dr. Michael K Hernandez


What Is A Columnar Database?

The columnar database architecture has been called the future of business intelligence (BI) because it supports the fast analytical queries that enterprises depend on for business decisions. For analytical workloads, a columnar database is faster and more efficient than a traditional row-oriented database because it stores data by column rather than by row.

Columnar databases are used in data warehouses, where businesses send massive amounts of data from multiple sources for BI analysis. Column-oriented databases have faster query performance for this kind of work because the values of each column are stored together, which reduces seek time.

Columnar Database Versus Row Based Database

Traditional databases are row-oriented databases that store data by row. The fields of each record are stored sequentially in one long row, for example “Customer 1: name, address, date of birth, etc.” All the information for Customer 2 then appears in a new row.

In a columnar database, the names of all customers are stored together in a “name” column, all the addresses in an “address” column, and so on.
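The two layouts can be illustrated with a minimal Python sketch. The sample records and the helper names to_columns and to_rows are hypothetical, and the sketch assumes every record has the same fields. It only models how the values are grouped, not how any real database lays them out on disk.

```python
# Hypothetical sample records; the field names are illustrative only.
rows = [
    {"name": "Ada", "address": "1 Main St", "date_of_birth": "1990-01-01"},
    {"name": "Ben", "address": "2 Oak Ave", "date_of_birth": "1985-06-15"},
]

def to_columns(records):
    """Pivot a list of records (row layout) into one list per field (column layout)."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns

def to_rows(columns):
    """Rebuild the records by taking the i-th value from every column."""
    fields = list(columns)
    return [dict(zip(fields, values)) for values in zip(*columns.values())]

columns = to_columns(rows)
print(columns["name"])     # ['Ada', 'Ben']
print(columns["address"])  # ['1 Main St', '2 Oak Ave']
```

Both layouts hold the same information; to_rows(columns) returns the original records. The difference is which values sit next to each other.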

Columnar Database Versus Relational Database

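A traditional relational database is well suited to transactional applications because it stores each record as a row, so a complete record can be read or written in one place. Many columnar databases are also relational; the contrast here is with row-oriented storage.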

A columnar database is preferred for analytical applications because it allows for fast retrieval of columns of data. Many columnar databases are designed for data warehousing and big data processing, and scale out across distributed clusters of low-cost hardware to increase throughput.

Columnar Database Advantages Versus Disadvantages

Columnar databases are built for speed. Because data is stored by column, a query can skip the columns it does not need and read only the ones it references. This makes aggregation queries especially fast.
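A toy cost model in Python shows why skipping irrelevant columns helps aggregation. The order data and the function names are hypothetical, and the model makes a simplifying assumption: a row store fetches each whole record to read one field, while a column store reads only the requested column. Real engines are more sophisticated, so treat the counts as an illustration rather than a benchmark.

```python
def sum_row_store(rows, field):
    """Sum one field in a row layout, counting every value that is fetched."""
    total, values_read = 0, 0
    for record in rows:
        values_read += len(record)  # assumption: the whole record is fetched
        total += record[field]
    return total, values_read

def sum_column_store(columns, field):
    """Sum one field in a column layout; only that column is read."""
    column = columns[field]
    return sum(column), len(column)

# The same hypothetical orders in both layouts.
orders_rows = [
    {"order_id": 1, "customer": "Ada", "region": "EU", "amount": 30},
    {"order_id": 2, "customer": "Ben", "region": "US", "amount": 45},
    {"order_id": 3, "customer": "Cy", "region": "EU", "amount": 25},
]
orders_columns = {
    "order_id": [1, 2, 3],
    "customer": ["Ada", "Ben", "Cy"],
    "region": ["EU", "US", "EU"],
    "amount": [30, 45, 25],
}

print(sum_row_store(orders_rows, "amount"))        # (100, 12)
print(sum_column_store(orders_columns, "amount"))  # (100, 3)
```

Both calls return the same total, but under this model the row layout passes over 12 values and the column layout only 3. The gap widens as the table gains more columns.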

However, a columnar layout is not ideal when you need to view many fields from each row. Traditional row databases tend to be better for queries that fetch complete records for a specific user or a small number of rows.

Columnar databases can also take more time to write new data because each column of a new record has to be written separately.
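The write and row-lookup costs can be sketched in Python as follows. The functions append_record and fetch_record and the customer data are hypothetical, and the column store is modeled as one Python list per field. The point is only that a single new record, or a single record lookup, touches every column.

```python
def append_record(columns, record):
    """Append one record to a column layout; returns how many columns were written."""
    if set(record) != set(columns):
        raise ValueError("record fields must match the existing columns")
    for field, value in record.items():
        columns[field].append(value)  # one separate write per column
    return len(columns)

def fetch_record(columns, index):
    """Reassemble one record by reading position `index` from every column."""
    return {field: values[index] for field, values in columns.items()}

# Hypothetical customer table with a single existing record.
customers = {
    "name": ["Ada"],
    "address": ["1 Main St"],
    "date_of_birth": ["1990-01-01"],
}

written = append_record(
    customers,
    {"name": "Ben", "address": "2 Oak Ave", "date_of_birth": "1985-06-15"},
)
print(written)                     # 3
print(fetch_record(customers, 1))
```

In a row layout the same insert would be one contiguous write, which is one reason row stores tend to suit transactional workloads.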

Columnar databases excel at:

  • Queries that involve only a few columns
  • Aggregation queries against vast amounts of data
  • Column-wise compression
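Column-wise compression works well because the values within a single column tend to be similar or repeated. Run-length encoding is one common technique, sketched below in Python. The names rle_encode and rle_decode and the sample column are hypothetical, and the sketch does not describe how any particular product compresses its data.

```python
from itertools import groupby

def rle_encode(column):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]

# Hypothetical "region" column with repeated adjacent values.
region = ["EU", "EU", "EU", "US", "US", "EU"]

runs = rle_encode(region)
print(runs)  # [('EU', 3), ('US', 2), ('EU', 1)]
```

Six values are stored as three pairs, and rle_decode(runs) restores the original column. The saving grows when a column is sorted or has few distinct values; a row layout interleaves unrelated fields, so such runs rarely form.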

Traditional databases are better for:

  • Incremental data loading
  • Online Transaction Processing (OLTP) usage
  • Queries against only a few rows

Why Use A Columnar Database?

Columnar databases overcome the limitations of traditional row-oriented relational databases for analytical workloads, which is why they are often described as the future of business intelligence.

A columnar database reads only the columns a query needs, which increases the speed of a query even in a database containing millions of records.

Traditional relational databases still offer a complete source of data. But columnar database architecture offers easier analysis of the overall data. The way data is organized in columnar databases allows for faster results and more efficient analysis.

Does HEAVY.AI Offer A Columnar Database?

Yes. HEAVY.AI offers an open source columnar database management system called HeavyDB. It is built to run on graphics processing units (GPUs) and can be used to explore multi-billion-row datasets in milliseconds.