Data Visualization - A Complete Introduction
What is Data Visualization?
Data visualization is a graphical representation of data. It presents data as an image or graphic to make it easier to identify patterns and understand difficult concepts. Technology allows users to interact with the data by changing the parameters to see more detail and create new insights.
The Importance of Data Visualization
Why is data visualization important? Data visualization is an effective way to universally share complex concepts that may otherwise be difficult to convey. For example:
- Visuals are more effective than text
- Charts, graphs and photos convey information more quickly than a large spreadsheet or densely written report.
- Visual metaphors are a universal language
- People of any spoken or written language can communicate with visual metaphors.
- Visualized data increases knowledge
- Data has value only when it is processed, analyzed and remembered. The visualization of data makes it even more valuable because it is easier to consume as information, which in turn becomes knowledge.
How Does Data Visualization Work?
If you’ve ever looked at a pie, bar or line chart, you’ve seen data visualization examples. Without this visual way to see data and abstract its patterns, the data itself could be incomprehensible — just a myriad of numbers.
If seeing is believing, understanding how data visualization works starts with how our eyes and brain inform us. In everyday life, we’re able to interpret our surroundings instantaneously just by looking around. Our brain constructs metaphors to help us understand concepts. Likewise, data visualization uses metaphors. Consider how a pie chart shows the relationship of a piece to the whole “pie.” A line chart describes continuity and a bar chart shows us categories.
The greatest power of data visualization comes from its ability to illustrate things we normally couldn’t see — like a bird’s-eye view. Data visualization helps us see patterns and order in an otherwise chaotic-looking array of digits.
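To make the point about patterns hiding in an array of digits concrete, here is a minimal sketch that draws a list of numbers as a one-line text chart. The `sparkline` function and the monthly figures are invented for illustration and are not taken from any particular product.

```python
# Unicode block characters, ordered from the lowest bar to the highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each number to a block character scaled between min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series: draw every point at the lowest level.
        return BLOCKS[0] * len(values)
    last = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * last)] for v in values)


# Hypothetical monthly order counts, made up for this example.
monthly_orders = [120, 135, 128, 160, 210, 260, 255, 190, 150, 140, 180, 240]

print(monthly_orders)             # the raw digits
print(sparkline(monthly_orders))  # the same data as a shape
```

Printed side by side, the raw list has to be read number by number, while the row of blocks shows the mid-year peak and the late-year rebound at a glance.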
How Does Visualizing Data Improve Decision Making?
Analyze lots of data at one time
- Data visualization lets decision-makers view vast amounts of data with a single look. Managers using data visualization were 28 percent more likely to find useful information, according to an Aberdeen Group survey.
- Data visualization lets decision-makers see patterns that would be hard to find across numbers in the columns and rows of a massive spreadsheet.
Find correlations between business operations
- By making sense out of visual patterns, data visualization allows decision-makers to find correlations between day-to-day tasks and long-term outcomes when it comes to business performance. Data visualization also helps compare two topics side by side.
Identify upcoming trends
- Decision-makers can use data visualization to proactively strategize for the future. For example, it becomes easier to identify upcoming trends when data visualization lets companies immediately see changes in customer preferences.
- Data visualization can make it easier for teams to collaborate. A visual representation saves the time of having to explain to colleagues and supervisors what one found in their big data analysis.
Who is a Data Visualization User?
There are three kinds of users of data visualization: companies, individuals and academics. Companies use it to analyze performance and adjust their internal processes. Individuals use data visualization to turn data into information and communicate insights to others. Academics use data visualization to assist in the pursuit of knowledge. Even if an academic’s visualizations aren’t seen by the public, they guide research for information that they ultimately publish.
In companies, users of data visualization include:
- A CEO can track all company metrics with data visualization tools instead of having to check in with every team. A CFO can see all financial metrics in a single dashboard. Other executives can use data visualizations as KPI dashboards in their meetings.
- Sales managers can track the revenue and deals closed by every employee. They can also use data visualization to help motivate a team to hit its quota.
- Marketers can monitor the open rates and impressions of email and advertising campaigns. They can also use data visualization tools to more easily analyze customer and prospect interactions.
Customer service teams
- Call centers can visualize the number of calls taken along with average wait and call times. Help desks can also better track tickets resolved.
Human resources teams
- Recruiters can track applications. HR can visualize employee engagement.
What is No-Code Data Visualization?
Data analysis is no longer the exclusive domain of data scientists. It has been democratized and is now used by CEOs, sales representatives, marketing strategists and policymakers. Everyone knows that data analytics can provide actionable, fact-driven insights. Squeezing those insights from huge, heterogeneous, ever-growing datasets is another matter.
Extracting useful visualizations from modern datasets still requires the efforts of technical specialists, data experts, and computer programmers. Before the data is presented in a visual form, it runs through a ‘data pipeline’, a system of infrastructure components, tools, scripts and programs. These data pipelines form the backend of business intelligence (BI) tools and systems. The frontend of the system is the user interface, where data is presented to users as a visual dashboard. The mismatch between backend and frontend results in user experiences that are usually slow, non-interactive, and often inhibit spontaneous exploration and discovery.
No-code data visualization eliminates this inefficiency, enabling non-technical users to run scenarios without competing for scarce data science skills. No-code analytics enable business users, who may not know how to write code or even use a command-line interface, to construct rich, interactive visualizations across large, often fused datasets. Business users may not know specifically what they are looking for; insights can emerge simply through ‘free-swim’, formulating and testing hypotheses in real time. No-code data visualization helps all users in an organization easily tap into the wealth of insights waiting to be found in the data.
What are Some Data Visualization Techniques?
The basic data visualization techniques include:
Relationship (scatter plot)
- Shows the connection and impact between elements such as life expectancy and GDP per capita.
Timeframe (line graph)
- Shows how something changes over time.
Composition (pie chart)
- Shows the parts of a single unit.
Comparisons (bar chart)
- Compares two or more values.
- Allows users to drill into data, explore and find more detail in a geospatial context.
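As a rough illustration of what two of these basic techniques compute, the sketch below derives the numbers behind a composition (pie) chart and draws a comparison (bar) chart as text. The function names and revenue figures are hypothetical, and the code assumes every value is positive.

```python
def shares(parts):
    """Composition: each part as a percentage of the whole (pie chart data)."""
    total = sum(parts.values())
    return {name: 100 * value / total for name, value in parts.items()}


def text_bars(values, width=30):
    """Comparison: one row per category, bar length proportional to value."""
    longest = max(values.values())
    label_width = max(len(name) for name in values)
    rows = []
    for name, value in values.items():
        bar = "#" * round(width * value / longest)
        rows.append(f"{name.ljust(label_width)} {bar} {value}")
    return "\n".join(rows)


# Hypothetical revenue by region, invented for this example.
revenue = {"North": 40, "South": 25, "East": 20, "West": 15}

for region, pct in shares(revenue).items():
    print(f"{region}: {pct:.1f}% of total")

print(text_bars(revenue))
```

The same dictionary feeds both views. The difference lies in the question asked: the pie view answers "how much of the whole is this part?", while the bar view answers "which category is larger, and by how much?"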
HEAVY.AI has developed a number of advanced data visualization techniques that include:
- Compare different datasets in the same dashboard, without having to join tables. Visually compare dozens of data sources in the same dashboard.
Multilayer Geo Charts
- Map metrics in distinct layers. Users can adjust layer opacity, reorder layers, or hide them, as they interact with many metrics overlaid on the same chart.
- Instantly display pointmaps, heatmaps, and choropleths alongside supporting non-geographic charts, graphs and tables. Visualizations refresh immediately in the context of any chosen location.
- Visually interact with any chart, which instantly filters all others, making hidden relationships obvious. Cross-filter instantly, even across billions of rows of data.
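The cross-filtering idea in the last item can be sketched in a few lines. This is a simplified, in-memory illustration and not HEAVY.AI's implementation: the `CrossFilter` class, its methods and the flight records are all hypothetical. It also assumes one common convention, that a chart ignores its own selection so the user can still see the alternatives they could click next.

```python
from collections import Counter

# Hypothetical flight records, invented for this example.
FLIGHTS = [
    {"airline": "A", "origin": "SFO", "delayed": True},
    {"airline": "A", "origin": "JFK", "delayed": False},
    {"airline": "B", "origin": "SFO", "delayed": False},
    {"airline": "B", "origin": "SFO", "delayed": True},
    {"airline": "B", "origin": "JFK", "delayed": True},
    {"airline": "C", "origin": "JFK", "delayed": False},
]


class CrossFilter:
    """Shared selection state that every chart on a dashboard reads from."""

    def __init__(self, rows):
        self.rows = rows
        self.filters = {}  # dimension name -> selected value

    def select(self, dimension, value):
        """Record a click on one chart, e.g. the 'SFO' bar of the origin chart."""
        self.filters[dimension] = value

    def clear(self, dimension):
        """Remove the selection on one chart."""
        self.filters.pop(dimension, None)

    def counts(self, dimension):
        """Row counts for one chart, applying every filter except its own."""
        active = {d: v for d, v in self.filters.items() if d != dimension}
        return Counter(
            row[dimension]
            for row in self.rows
            if all(row[d] == v for d, v in active.items())
        )


dashboard = CrossFilter(FLIGHTS)
dashboard.select("origin", "SFO")    # the user clicks SFO on the origin chart
print(dashboard.counts("airline"))   # the airline chart now reflects SFO only
print(dashboard.counts("delayed"))   # so does the delay chart
print(dashboard.counts("origin"))    # the origin chart still shows all origins
```

A Python loop like this rescans every row for each chart, which is fine for a handful of records but is not how billions of rows are filtered interactively; the sketch shows only the interaction model.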
Data Visualization Examples:
- Schedules of Famous Creators: Uses color bars to show the daily routines of famous creative people from Pablo Picasso to Ben Franklin, comparing the hours they slept, ate and worked.
- The Year in News: Analyzes 184 million Twitter mentions to graphically show the top stories Americans were talking about throughout the year.
- Age Pyramid: Combines animation with data visualization to show the growth and decline of generations over the decades.
- Evolution of Rock Music: Shows 100 years of the evolution of rock music on one graph.
- Shipping Traffic: Follow ships through US coastal waters using a distributed cluster of the OmniSciDB database to visualize over 11 billion rows of geospatial data.
- US Census: Uncover insights about the American people and workforce on 38M+ rows of 2000-2015 US Census Bureau American Community Survey (ACS) Public Use Microdata Sample data with 400+ columns.
- NBA Shots: Scan through the last 13 seasons (7,588,492 plays) of the NBA. Look at player and team shot history, field goal and 3-point percentages, and much more.
- US Airline Flights: View flight delays and activity from almost three decades, and see which airlines got you there on time.
- US Political Donations: See which US politicians have gotten the most money from whom, across 25 years of political contributions data.
What is Big Data Visualization?
Big data and data visualization rely heavily on each other to uncover meaningful insights about the trends, correlations and patterns that exist within big data, a practice otherwise known as big data analytics. Big data visualization tools like HeavyImmerse enable analysts and data scientists to easily visualize and interact with massive datasets so that analytics insights can be uncovered in a variety of graphical formats.
How is Machine Learning used in Data Visualization?
Data visualization can better depict and explain algorithms common in machine learning — such as the decision tree and the neural network. This allows users to see a neural network in action in ways that a written or verbal description could never capture.
Machine learning is considered hyper-intensive programming, which can be difficult to execute and comprehend. Data viz helps the human eye and mind better appreciate and fathom how a machine learning algorithm works.
What is a Data Visualization Engineer?
A data visualization engineer works collaboratively with data scientists, businesses and other software engineers to create dynamic data visualizations to help clients make more informed decisions based on a variety of data.
A Brief History of Data Visualization
The history of data visualization goes back to prehistoric rock carvings called petroglyphs. Using pictures to tell a story about data is as old as human existence. Data visualization became a profession for map makers in the 1600s. The pie chart first appeared in the early 1800s. Charles Minard took data visualization to a new level in 1869 when he created a groundbreaking statistical graphic that mapped Napoleon’s 1812-1813 invasion and retreat from Russia. The chart simultaneously shows geography, time, temperature, number of troops and direction of the army to illustrate how and why the campaign ended in disaster.
Minard’s graphic map sparked a “golden age” of data visualization techniques that lasted until the early 1900s. The first half of the 20th century didn’t see much data-driven graphic innovation. Statistical models became the rage, using exact numbers without the use of visuals.
But the revolution in computers in the 1960s led to new ideas about how to display data. The rapid growth of computing power and data collection from the 1970s onward took Minard’s hand-drawn concepts past limits that were imaginable in his time — like visualizing the entire human genome. Massive amounts of data can now be shown in milliseconds as data visualization software brings together data scientists and artists to change how we look at the world.
In 2011, Todd Mostak helped solve the problem of quickly visualizing vast amounts of data. Mostak was studying the role of social media in the Arab Spring for his thesis at Harvard. The research required looking at millions of tweets and tying them to a specific location. Each time Mostak ran a query on the giant data set, it would take many hours to get a result. Often, he had to let it run overnight. This problem caused Mostak to wonder: “What if I created my own query engine that uses the massive parallelism of GPU cards to accelerate the queries and visualize the results?” Mostak went on to create software at MIT’s Computer Science and Artificial Intelligence Laboratory, which eventually became the OmniSci analytics platform.
Mostak initially called his company MapD Technologies, then OmniSci, and now HEAVY.AI. The company focuses on pushing the limits of scale and speed in big data analytics to let users visually explore data at the speed of thought.
The next innovation wave in data visualization techniques will combine the objectives and methods of machine learning, interactive visualization and business intelligence.
The concept of business intelligence was popularized in a 1958 article by IBM computer scientist Hans Peter Luhn. He described a “business intelligence system” that was “an automatic system…developed to disseminate information to the various sections of any industrial, scientific, or government organization.”
Luhn’s business intelligence (BI) concept became more feasible with the growth of computing power, driven by Moore’s Law on the density of integrated circuits and the software developed to harness that power. This led to BI that could not only produce data and reports but also organize and visualize them.
Today’s BI dashboards, however, are very different from HEAVY.AI technology. While a BI dashboard can query and visualize tens of millions of rows of data, HEAVY.AI's real-time data visualizations are able to do the same with billions of records, with far faster performance.
Michael Friendly, of York University in Toronto, wrote a paper called “A Brief History of Data Visualization” that outlined the greatest developments in data visualization examples and techniques of the past 30 years:
- Development of highly interactive statistical computing systems.
- New paradigms of direct manipulation for visual data analysis like linking, brushing, selection and focusing.
- New methods for visualizing high-dimensional data like scatterplot matrix, parallel coordinates plot and spreadplots.
- Invention (or re-invention) of graphical techniques for discrete and categorical data.
- Application of visualization methods to an ever-expanding array of substantive problems and data structures.
- Substantially increased attention to the cognitive and perceptual aspects of data display.
Friendly also noted the most important advances in theoretical and technological infrastructure that allowed for the creation of new visualization methods:
- Large-scale statistical and graphics software engineering, both commercial (SAS) and non-commercial (Lisp-Stat, the R project), which open-source standards often leverage for information presentation and interaction (Java, Tcl/Tk).
- Extensions of classical linear statistical modeling to ever wider domains (generalized linear models, mixed models and models for spatial/geographical data).
- Vastly increased computer processing speed and capacity, allowing computationally intensive methods (bootstrap methods and Bayesian MCMC analysis).
- Access to massive data problems (measured in terabytes) and real-time streaming data (e.g., millions of records per second).
What is Data Visualization Software?
Data visualization software creates the dashboards that allow for easy interpretation of data, trends and key performance indicators (KPIs). With data visualization software at their disposal, users can build visuals such as charts, graphs and maps that track and measure metrics.
Data visualization software benefits:
- Turn complex data into easily understandable graphics
- Track, visualize and compare related metrics
- Monitor KPIs
What is Scientific Data Visualization?
Scientific data visualization, also known as scientific visualization, is a subset of data visualization that refers to the process of representing raw, scientific data as images, providing an external aid to improve scientists’ interpretations of large data sets and to gain insights that may be overlooked by statistical methods alone.
Explore Interactive Visualization via Demo
Get hands-on with HEAVY.AI's Interactive Data Visualization Demo and imagine the critical insights your team could uncover using your own data.