Veda Shankar
Feb 5, 2019

Improving Connected Car Testing with Vehicle Telematics Analysis


The United States Department of Transportation (USDOT), along with many partners from industry and academia, is researching the ability of connected vehicles to generate and communicate different types of messages over cellular and dedicated short-range communications (DSRC) infrastructure. These tests are conducted on an operational roadway, where the vehicles report information about their driving environment. In connected car testing, a vehicle On Board Unit (OBU) communicates with a RoadSide Unit (RSU) and cell phone towers to transmit telematics information periodically. The RSU operates on the 5.9 GHz DSRC band, which is compatible with vehicle systems and provides the very low latency required for high-speed events such as crash avoidance. RSUs are edge computing devices that typically have a single processor and limited system memory. As connected car traffic increases, it is important to understand the load on each RSU, that is, the number of vehicles communicating with it simultaneously. This load informs important decisions such as how many RSUs to deploy in a smart city infrastructure and where to place them.

In this blog, we will explain how we analyzed USDOT full scale vehicle testing data using the OmniSci visual analytics platform. The browser-based dashboard allows us to interactively query and visualize the vehicle and RSU locations on a Pointmap and the vehicle routes on a Linemap, based on any of the parameters reported by the vehicle OBU. OmniSci enables users to ingest Latitude/Longitude information as geometric points and routes as geometric linestrings. We were able to run SQL queries using geospatial functions to get answers on:

  • What are the maximum and minimum distances between the vehicle OBU and the RSU; and
  • How many of the routes are within a certain distance from the RSU?

About the Dataset

The dataset is publicly available from the USDOT website and is based on the field testing conducted in Fairfax County, Virginia, which utilized a fleet of 10 vehicles and a finalized test matrix for the Advanced Messaging Concept Development (AMCD) project. Throughout field testing, researchers captured data to better understand capabilities and aid in standards and design activities within the AMCD. The document provides the following information:

The file Advanced_Messaging_Concept_Development__Probe_Vehicle_Data.csv is available on the website and contains 370K PVD messages. Each PVD message contains status information about a vehicle traveling along the test route. The messages are exchanged with RSUs if the mode of transmission is DSRC; otherwise the transmission is over the cellular network. Here are all the fields in a PVD message:

Using an ETL script written in Python, we read the above tables into a dataframe and made the following changes:

  • Added a column communicationType. Based on the modeTransmission integer field (RSU id or 999999 for cellular) it records the communication type as a text with value of RSU or CELL.
  • Added a column timeTransmission. Based on the field time Received, which is the number of milliseconds since the epoch, it records the time of transmission in an OmniSci-compatible TIMESTAMP format.
  • The latitude and longitude of the vehicle location are FLOAT values, and we used the OmniSci import feature to automatically convert them into a GEOMETRIC POINT with an SRID (Spatial Reference Identifier) of 4326. Converting the location information to geometric primitives allows us to run SQL queries using geospatial functions. Because the coordinates in the CSV file are ordered (latitude, longitude), the opposite of what the importer expects by default, we loaded the data using the WITH option lonlat='false':
COPY probeVehicle FROM '/tmp/probeVehicle.csv' WITH (lonlat='false');
  • Added a column communicationPoint of type GEOMETRIC POINT. When the communication type is RSU, we take the Latitude/Longitude of the RSU from the RSU information table. As most of the communication happens over CELL, we assumed a cell tower location in Tysons Corner for the Lat/Lon coordinates of those records instead of leaving the column NULL.
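
The column derivations above can be sketched in plain Python. This is a minimal illustration, not the ETL script used for the post: the helper names are hypothetical, the field name timeReceived is an assumed spelling of the received-time field, and the 'YYYY-MM-DD HH:MM:SS' UTC layout is assumed to be an acceptable TIMESTAMP input.

```python
from datetime import datetime, timezone

# modeTransmission value that the post gives for cellular transmission.
CELL_SENTINEL = 999999


def communication_type(mode_transmission: int) -> str:
    """Return 'CELL' for the cellular sentinel, otherwise 'RSU'."""
    return "CELL" if mode_transmission == CELL_SENTINEL else "RSU"


def to_timestamp(epoch_ms: int) -> str:
    """Format milliseconds since the Unix epoch as 'YYYY-MM-DD HH:MM:SS' (UTC)."""
    seconds = epoch_ms // 1000  # drop the millisecond part
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def to_wkt_point(lat: float, lon: float) -> str:
    """Build a WKT point; WKT lists x (longitude) before y (latitude)."""
    return f"POINT({lon} {lat})"


def enrich(row: dict) -> dict:
    """Return a copy of a PVD row with the two derived columns added.

    Assumes the row has 'modeTransmission' and 'timeReceived' keys
    (the latter is a hypothetical spelling of the received-time field).
    """
    out = dict(row)
    out["communicationType"] = communication_type(row["modeTransmission"])
    out["timeTransmission"] = to_timestamp(row["timeReceived"])
    return out
```

Keeping each derivation in its own small function makes the rules easy to check in isolation before they are applied to the full dataframe.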

Visual Data Exploration with OmniSci Immerse

We chose the OmniSci Cloud platform for developing our application which is the fastest way to start using the product. You can sign up for a cloud account, and get immediate access to an instance of OmniSci running on GPU in a public cloud. Using the OmniSci Immerse user interface, we created a dashboard that allows us to quickly analyze the different variables from the test dataset so that we can see some patterns emerge. The dashboard consists of the following charts:

  • A multi-layered pointmap that has the geometric coordinates of the RSUs overlaid with a linemap containing the geometric LINESTRINGs representing the routes taken by the vehicles.
  • A line chart with the time of test along the x-axis and the number of records colored by the mode of communication (RSU and/or CELL) along the y-axis.
  • A pie chart that categorizes based on the mode of communication (RSU or CELL).
  • A bubble chart that shows a breakdown of the number of records corresponding to individual RSU devices and CELL tower.
  • Bar charts that show the number of records by vehicle ID and by test conducted.
  • A histogram that shows the vehicle speed distribution during the test.
  • A heatmap that shows how many times each vehicle communicated with an RSU device or the CELL tower.

From the dashboard we can see that during these tests the vehicles mostly communicated with the cell tower. To find out more about the RSU communication, we filtered out the cell tower by clicking on RSU in the pie chart, and immediately saw the distribution of the load among the RSU devices in the bubble chart. The time chart also shows that most of the communication with the RSUs happens past midnight on Nov 16th, when the vehicles are on the Leesburg Pike route in test #4.

SQL Queries using Geospatial Functions

As we have created the table using geospatial primitives for the vehicle location, RSUs and routes, we can exercise the SQL geospatial functions to get additional insights from the test dataset.

The SQL function ST_Distance returns the shortest planar distance between geometries. By default, geo data is stored as GEOMETRY in lon/lat coordinates using the geodetic coordinate system EPSG 4326, so ST_Distance on these columns returns the shortest planar distance in degrees. To get the distance in meters, you have to cast the geometries to geographies or project them into Web Mercator. Here are the queries to find the minimum and maximum distance in meters between the vehicle OBU and the RSU:

SELECT MIN(ST_Distance(CAST(OBU_location AS GEOGRAPHY), CAST(communicationPoint AS GEOGRAPHY))) AS distance,
       OBUid,
       modeTransmission
FROM probeVehicle
WHERE probeVehicle.communicationType = 'RSU'
GROUP BY OBUid, modeTransmission
ORDER BY distance ASC;

SELECT MAX(ST_Distance(CAST(OBU_location AS GEOGRAPHY), CAST(communicationPoint AS GEOGRAPHY))) AS distance,
       OBUid,
       modeTransmission
FROM probeVehicle
WHERE probeVehicle.communicationType = 'RSU'
GROUP BY OBUid, modeTransmission
ORDER BY distance DESC;
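
To see why the cast matters, the difference between a planar distance in degrees and a distance in meters can be sketched in Python. This is only an illustration under a spherical-Earth assumption (the haversine formula with a mean radius of 6,371,000 m); it does not claim to reproduce how ST_Distance computes geography distances, and the function names are hypothetical.

```python
import math

# Mean Earth radius in meters; a spherical approximation chosen for this sketch.
EARTH_RADIUS_M = 6_371_000.0


def planar_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance in the lon/lat plane; the result is in degrees."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

For two points one degree of latitude apart, planar_degrees returns 1.0 while haversine_m returns roughly 111 km, which is why a threshold such as 500 only makes sense once the distance is expressed in meters.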

  • In the following SQL query we found all the routes (LINESTRINGS) that are within a certain distance, say 500 meters, from the communicating RSU and saved the geometric data corresponding to the RSU and the route in a new table.
CREATE TABLE RSU_OBU_routes AS
SELECT TestNo, OBUid, modeTransmission, communicationPoint, OBUpath
FROM probeVehicle
WHERE probeVehicle.communicationType = 'RSU'
  AND ST_Distance(CAST(OBU_location AS GEOGRAPHY), CAST(communicationPoint AS GEOGRAPHY)) < 500.0;

  • We used OmniSci Immerse to create an overlay point and line map from the newly created table to visualize the results.
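
The same kind of distance filter can be sketched outside the database to estimate RSU load, here as the set of distinct OBUs seen within the threshold of each RSU. This is a hedged sketch, not part of the original analysis: the record layout (keys obu_lonlat and rsu_lonlat holding (lon, lat) tuples) and the function names are hypothetical, and the distance uses a spherical-Earth haversine approximation.

```python
import math
from collections import defaultdict


def distance_m(lon1, lat1, lon2, lat2, radius_m=6_371_000.0):
    """Great-circle (haversine) distance in meters on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def vehicles_near_rsu(records, max_distance_m=500.0):
    """Map each RSU id to the set of OBU ids seen within max_distance_m of it.

    Each record is assumed to be a dict with keys 'communicationType',
    'modeTransmission' (the RSU id), 'OBUid', 'obu_lonlat' and 'rsu_lonlat'.
    """
    near = defaultdict(set)
    for rec in records:
        if rec["communicationType"] != "RSU":
            continue  # cellular messages have no RSU to measure against
        d = distance_m(*rec["obu_lonlat"], *rec["rsu_lonlat"])
        if d < max_distance_m:
            near[rec["modeTransmission"]].add(rec["OBUid"])
    return dict(near)
```

The size of each returned set gives a rough per-RSU vehicle count, which is the quantity that capacity planning for RSU deployment depends on.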

Use Cases for Intelligent Highway Design

Visualizing the dataset with OmniSci supports several aspects of intelligent highway system design:

  • Shows the fixed-infrastructure loading of RSUs (RoadSide Units) as geospatially linked time series.
  • Enables estimates for capacity planning.
  • Helps provision for congestion-free handoff between RSUs (DSRC RoadSide Units) and OBUs (On Board Units).
  • Enables new services such as a usage-based tax on participating vehicles.
  • Big data analysis with OmniSci enables fast and efficient generation of labeled datasets for Machine Learning and AI-driven intelligent highway system design.
  • Geospatial visualization of link densities allows for detailed traffic analysis, and the heat maps allow for quick hot spot detection.

Veda Shankar

Veda Shankar is a Developer Advocate at HEAVY.AI working actively to assist the user community to take advantage of HEAVY.AI's open source analytics platform. He is a customer oriented IT specialist with a unique combination of experience in product development, marketing and sales engineering. Prior to HEAVY.AI, Veda worked on various open source software defined data center products at Red Hat.