Analyze OpenStreetMap Data with OSMnx and OmniSci Free
NOTE: OMNISCI IS NOW HEAVY.AI
Earlier this month, our team shared 4 Simple Ways to Map Vehicle Location Data with OmniSci Free.
In that post, we constructed an Enriched Streets map from the geospatial analysis results of OmniSci's recently enhanced spatial relationship functions.
We're following up to show you how we access and load OpenStreetMap (OSM) data into OmniSci Free with Python, synthesize streets with vehicle collisions, and deliver meaningful spatial insights.
In this post, you'll learn how to:
- Access OSM street network data via OSMnx, a Python package used to retrieve, model, analyze, and visualize street networks from OpenStreetMap.
- Load OSM data into OmniSci with pymapd, a Python DB API compliant interface for OmniSci.
- Construct buffer geometries around OSM street network segments and geospatially enrich them with ten years' worth of collision data using SQL.
Access OSM Street Network Data via OSMnx
Acquiring street network data can be a costly affair. Either you or your organization is willing to pay premiums for proprietary datasets, or you're paying with the time it takes to track down data curated by a governing body or a local municipality.
On the other hand, OpenStreetMap is freely available and obtainable in a variety of ways. You can download extracts from sites like Geofabrik or use tools like osm2pgsql to load data directly into PostGIS-enabled databases.
In this example, we use OSMnx and OmniSci's Data Science Foundation to access an analysis-ready cut of Los Angeles street network data. OmniSci provides deep integration with JupyterLab. Users can access JupyterLab by clicking an icon in Immerse or sending SQL queries within the SQL Editor directly to a notebook.
To begin, build a new JupyterLab connection and import the required packages.
JupyterLab's connection to the database is instantaneous on launch.
Alternatively, a user could connect to an OmniSci instance using pymapd or ibis in a local JupyterLab or Jupyter Notebook environment.
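For the local route, a connection might be sketched as follows. The host, credentials, and database name are placeholders (assumptions, not values from this post), so swap in the settings for your own instance.

```python
# Connection settings for an OmniSci instance.
# Every value here is a placeholder (an assumption), not taken from the post.
CONNECTION_PARAMS = {
    "user": "admin",
    "password": "HyperInteractive",
    "host": "localhost",
    "port": 6274,
    "dbname": "omnisci",
    "protocol": "binary",
}


def connect_to_omnisci(params=CONNECTION_PARAMS):
    """Open a pymapd connection from keyword arguments."""
    # Imported lazily so the settings can be inspected without pymapd installed.
    import pymapd

    return pymapd.connect(**params)


# Usage, once pymapd is installed and the server is reachable:
#   con = connect_to_omnisci()
#   print(con.get_tables())
```

Keeping the settings in one dictionary makes it easy to point the same notebook at a different instance.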
Next, use a short snippet of code to load the street network data for the area of interest. OSMnx offers the ability to access street networks by providing one of the following:
- bounding box
- latitude/longitude plus a distance
- address plus a distance
- street network boundary polygon
- place name or list of place names
Apply the place name and drive network type options to isolate the Los Angeles metro's drivable streets.
OSMnx's plot_graph function is a great way to preview the streets once they've loaded.
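As a rough sketch of these two steps, the snippet below maps each option in the list above to its OSMnx constructor and loads the drivable network by place name. The exact place query string is an assumption; the post only says the Los Angeles metro was used.

```python
# OSMnx constructor for each way of describing the area of interest.
GRAPH_CONSTRUCTORS = {
    "bounding box": "graph_from_bbox",
    "latitude/longitude plus a distance": "graph_from_point",
    "address plus a distance": "graph_from_address",
    "street network boundary polygon": "graph_from_polygon",
    "place name or list of place names": "graph_from_place",
}

# Assumed place query; adjust it to the area you want to analyze.
PLACE = "Los Angeles, California, USA"
NETWORK_TYPE = "drive"  # drivable streets only


def load_street_network(place=PLACE, network_type=NETWORK_TYPE):
    """Download the street network as a NetworkX MultiDiGraph."""
    import osmnx as ox  # lazy import: needs osmnx and network access

    return ox.graph_from_place(place, network_type=network_type)


def preview(graph):
    """Draw the network with OSMnx's built-in plotting helper."""
    import osmnx as ox

    return ox.plot_graph(graph)


# Usage:
#   G = load_street_network()
#   preview(G)
```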
pymapd loads data into a database table from a pandas DataFrame, but OSMnx's graph module retrieves spatial network data and models it as a NetworkX MultiDiGraph. To bridge the gap, convert the graph nodes and edges to GeoPandas GeoDataFrames using the package's graph utility functions and drop the unwanted columns.
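One way this conversion could look is sketched below. Only the edges are converted, since the streets are what get loaded, and the list of columns to keep is hypothetical because the post does not name them. OSM attributes such as name or highway sometimes arrive as lists, so the helper flattens them into plain strings.

```python
def flatten_value(value):
    """Join list-valued OSM attributes into one comma-separated string."""
    if isinstance(value, list):
        return ", ".join(str(v) for v in value)
    return value


# Hypothetical subset of edge attributes to keep; the post does not list them.
KEEP_COLUMNS = ["osmid", "name", "highway", "length", "geometry"]


def edges_to_geodataframe(graph, keep=KEEP_COLUMNS):
    """Convert graph edges to a GeoDataFrame with only the selected columns."""
    import osmnx as ox  # lazy import: requires osmnx

    edges = ox.graph_to_gdfs(graph, nodes=False, edges=True)
    edges = edges.reset_index()  # move the (u, v, key) index into columns
    edges = edges[[c for c in keep if c in edges.columns]].copy()
    for col in edges.columns:
        if col != "geometry":
            edges[col] = edges[col].map(flatten_value)
    return edges


# Usage:
#   streets = edges_to_geodataframe(G)
```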
Check the coordinate reference system (CRS) to ensure it's in an acceptable format, such as EPSG:4326, before ingesting the streets into OmniSci.
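A small guard along these lines can make the check explicit. It is a sketch that relies on the standard GeoPandas `crs` attribute and `to_crs` method, and it reprojects only when the data is not already in EPSG:4326.

```python
TARGET_EPSG = 4326  # WGS 84 longitude/latitude


def ensure_wgs84(gdf, target_epsg=TARGET_EPSG):
    """Return gdf in the target CRS, reprojecting only when necessary."""
    if gdf.crs is None:
        raise ValueError("GeoDataFrame has no CRS; set one before loading")
    if gdf.crs.to_epsg() == target_epsg:
        return gdf
    return gdf.to_crs(epsg=target_epsg)


# Usage:
#   streets = ensure_wgs84(streets)
#   print(streets.crs)
```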
If everything looks good, it's go-time!
Create a table to match the GeoDataFrame using Immerse's SQL Editor.
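The table definition might look something like the statement below, kept here as a Python string so it can be pasted into the SQL Editor or run through an open pymapd connection. The column names and types are hypothetical; match them to the columns and column order of your own GeoDataFrame.

```python
TABLE_NAME = "la_streets"

# Column names and types are hypothetical; align them with your GeoDataFrame.
CREATE_LA_STREETS = f"""
CREATE TABLE IF NOT EXISTS {TABLE_NAME} (
    osmid TEXT ENCODING DICT(32),
    street_name TEXT ENCODING DICT(32),
    highway TEXT ENCODING DICT(32),
    length_m DOUBLE,
    geom GEOMETRY(LINESTRING, 4326)
)
""".strip()


def create_streets_table(con, ddl=CREATE_LA_STREETS):
    """Run the DDL through an open pymapd connection instead of the SQL Editor."""
    con.execute(ddl)


# Usage:
#   create_streets_table(con)
```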
Finally, use pymapd's load_table_columnar function to load the streets into the la_streets table.
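A minimal wrapper for the load step could look like this. It assumes the table already exists and that the DataFrame columns are in the same order as the table's columns. How the geometry column needs to be represented (Shapely objects or WKT strings) is not shown in this post and may depend on your pymapd version, so treat that part as something to verify.

```python
def load_streets(con, frame, table_name="la_streets"):
    """Append a DataFrame to an existing table with pymapd's columnar loader."""
    # preserve_index=False keeps the DataFrame index out of the table.
    con.load_table_columnar(table_name, frame, preserve_index=False)
    return len(frame)


# Usage:
#   n_rows = load_streets(con, streets)
#   print(f"Loaded {n_rows} street segments")
```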
Construct Street Buffers and Geospatially Enrich Them with Ten Years of Collision Data
Apply geometry and spatial relationship functions to buffer the Los Angeles street network data and spatially enrich those buffers to better understand the following statistics per road segment:
- number of individuals who perished
- number of injuries
- average party count of collisions
- number of severe injuries
- number of collisions, deaths, and injuries with;
- number of collisions that involved alcohol
A geometry buffer creates a buffer polygon around the input geometry at a specified distance in meters. In this exercise, we buffer each street segment by 10 meters, providing a catchment area for the collisions and supporting effective visualization post-analysis.
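To make the catchment idea concrete: a collision falls inside a segment's 10-meter buffer exactly when its distance to the segment is at most 10 meters. The plain-Python sketch below illustrates that test on projected coordinates measured in meters. It is an illustration of the concept, not the SQL used in this analysis.

```python
import math


def distance_to_segment(point, start, end):
    """Shortest planar distance from a point to the segment start-end."""
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: start equals end
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment's line, then clamp to the endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_buffer(point, start, end, radius_m=10.0):
    """True when the point lies within radius_m of the segment."""
    return distance_to_segment(point, start, end) <= radius_m
```

For a 100-meter segment running from (0, 0) to (100, 0), a collision at (50, 8) is captured while one at (50, 12) is not.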
Spatial relationships depend on geometry locations and their topological or distance relationship with one another. For instance, one may want to generate summary statistics for points that intersect with a set of polygons.
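As a toy illustration of that kind of summary, the plain-Python sketch below tests which polygon each point falls in using ray casting, then counts the points and sums a value per polygon. The database does this work with its own spatial functions; the polygon names and values here are made up.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices.

    Points exactly on the boundary may land on either side.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def summarize(points, polygons):
    """Count points and sum their values per containing polygon."""
    summary = {name: {"count": 0, "total": 0} for name in polygons}
    for x, y, value in points:
        for name, ring in polygons.items():
            if point_in_polygon((x, y), ring):
                summary[name]["count"] += 1
                summary[name]["total"] += value
    return summary


# Made-up data: two square polygons and four (x, y, value) points.
polygons = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b": [(3, 0), (5, 0), (5, 2), (3, 2)],
}
points = [(1, 1, 2), (1.5, 0.5, 1), (4, 1, 3), (10, 10, 7)]
result = summarize(points, polygons)
```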
Measure the relationship between the Los Angeles street buffers and ten years of California Statewide Integrated Traffic Records System (SWITRS) collision data by employing the ST_CONTAINS function.
ST_CONTAINS returns true if the first stated geometry object contains the second object. Aggregate a mixture of collision metrics to the street segment buffers and measure the relationship to complete the enrichment.
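The enrichment query could take roughly the shape sketched below, built as a Python string so it can be sent through pymapd or pasted into the SQL Editor. All table and column names are hypothetical stand-ins for the buffered streets and the SWITRS collision points. Note that an inner join like this one drops street segments with no collisions.

```python
def build_enrichment_query(
    buffers="la_street_buffers",     # hypothetical table of buffered streets
    collisions="switrs_collisions",  # hypothetical table of collision points
):
    """Return SQL that aggregates collision metrics per street buffer."""
    return f"""
SELECT
    b.osmid,
    COUNT(*) AS collisions,
    SUM(c.killed_victims) AS deaths,
    SUM(c.injured_victims) AS injuries,
    SUM(c.severe_injury_count) AS severe_injuries,
    AVG(c.party_count) AS avg_party_count,
    SUM(CASE WHEN c.alcohol_involved = 1 THEN 1 ELSE 0 END) AS alcohol_collisions
FROM {buffers} AS b
JOIN {collisions} AS c
    ON ST_CONTAINS(b.buffer_geom, c.collision_point)
GROUP BY b.osmid
""".strip()


# Usage:
#   query = build_enrichment_query()
#   con.execute(query)
```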
Bonus: Summarize 550 million+ Vehicle Locations by Census Block Groups
If you've stayed with us this far, we're throwing in a bonus analysis. Let's quickly run through the following:
- Gather and ingest Los Angeles, California Census Block Groups with Immerse.
- Summarize 550 million+ vehicle locations by census block groups with SQL.
Drag and drop a Los Angeles metro Census Block Groups shapefile into the Immerse Data Manager's Data Importer using the import data from a local file option.
Once the block groups are loaded, perform a similar analysis to the Los Angeles streets using spatial relationship functions.
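A sketch of that summary follows. The table names, the point column, and the block group identifier are hypothetical, and the geometry column name assumes the default given to imported geo files, so check them against your own schema.

```python
# Hypothetical table and column names; adjust them to your schema.
BLOCK_GROUP_SUMMARY = """
SELECT
    bg.geoid,
    COUNT(*) AS location_count
FROM la_block_groups AS bg
JOIN vehicle_locations AS v
    ON ST_CONTAINS(bg.omnisci_geo, v.location_point)
GROUP BY bg.geoid
""".strip()


def run_summary(con, query=BLOCK_GROUP_SUMMARY):
    """Execute the query on an open pymapd connection and fetch all rows."""
    return con.execute(query).fetchall()


# Usage:
#   rows = run_summary(con)
```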
This bonus analysis produces the Summarized by Census Block Group map shown in the 4 Simple Ways to Map Vehicle Location Data with OmniSci Free post.
The bonus analysis highlights the performance of OmniSci's recently enhanced spatial relationship functions against the scale of high-volume vehicle location information.