“Just-in-Time” Spatial Data Integration (or, Bridging Data Silos with Visualization)
If you work with spatial data, odds are good that it’s spread across multiple applications or data stores in a way that isn’t easy to integrate. Often, there is clear value in breaking down barriers between these “silos” (individually useful but isolated mapping solutions) as this may enable new insights or capabilities, or reduce the cost of maintaining redundant data.
I explored two approaches in a previous post, but a recent webcast (with Directions Media, Iowa DOT, and Google) opened my eyes to a new possibility: when synchronizing or centralizing data is too costly, instead use lightweight visualization tools that can pull data from multiple sources on the fly.
In my earlier post, I proposed two high-level strategies: (Option A) store all your data in one place (e.g., a database) and make all your applications work with that, or (Option B) leave existing applications with their separate data stores, but then implement processes to push their data to a central repository (e.g., a data warehouse) and publish that. Spatial Data Infrastructures are good large-scale examples of Option B, where local governments gather and manage local data and then publish it to a national or multinational repository in a common data model.
I found two interesting aspects to the webcast. First, Iowa DOT is aggressively pursuing Option A. They store the bulk of their data in an Oracle Spatial database and have adapted a wide variety of GIS and CAD applications to work with it. Other applications access the data directly via SQL or indirectly through various web services. In spite of this, due to legacy applications and partnerships with other states, they still have data silos.
Second – and this is where it gets really interesting – they presented an approach where multiple data sources are fused together in the visualization stage. The idea is that it is easier to query multiple data sources (e.g., using web services) in the visualization layer than to keep composite or centralized data stores in sync. Merging data this late in the game likely doesn’t make sense in all cases (e.g., if complex analysis or transformation is required), but the basic advantage is clear: users get an integrated display of data in one system where they would previously have consulted two.
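To make the idea concrete, here is a minimal Python sketch of fusing data at display time. It is not what Iowa DOT presented; every name in it (`fuse_layers`, `fetch_central_db`, `fetch_partner_service`, the sample points) is hypothetical, and the two fetch functions are in-memory stand-ins for what would really be a SQL query or a web service call. It also assumes each source returns GeoJSON point features in the same coordinate system.

```python
from typing import Callable, Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def point(x: float, y: float, **props) -> dict:
    """Build a GeoJSON Point feature (helper for the sample data)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": props,
    }


def in_bbox(feature: dict, bbox: BBox) -> bool:
    """True if a Point feature falls inside the current map view."""
    x, y = feature["geometry"]["coordinates"][:2]
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def fuse_layers(sources: Dict[str, Callable[[], dict]], bbox: BBox) -> dict:
    """Query every source when the map is drawn and merge the results.

    Nothing is copied into a central store; each feature is only
    tagged with the name of the silo it came from.
    """
    fused: List[dict] = []
    for name, fetch in sources.items():
        collection = fetch()  # assumption: returns a FeatureCollection
        for feature in collection["features"]:
            if in_bbox(feature, bbox):
                props = dict(feature.get("properties") or {})
                props["source"] = name
                fused.append({**feature, "properties": props})
    return {"type": "FeatureCollection", "features": fused}


# Hypothetical stand-ins for two silos.
def fetch_central_db() -> dict:
    return {
        "type": "FeatureCollection",
        "features": [
            point(-93.6, 41.6, kind="bridge"),
            point(-80.0, 40.0, kind="bridge"),  # outside the view below
        ],
    }


def fetch_partner_service() -> dict:
    return {
        "type": "FeatureCollection",
        "features": [point(-91.5, 42.0, kind="road_closure")],
    }


SOURCES = {
    "central_db": fetch_central_db,
    "partner_service": fetch_partner_service,
}

view = fuse_layers(SOURCES, bbox=(-97.0, 40.0, -90.0, 44.0))
for feature in view["features"]:
    print(feature["properties"]["source"], feature["properties"]["kind"])
```

The point of the design is that the silos stay where they are and there is no synchronization job to maintain. The cost is that every map refresh depends on all sources being reachable, and anything beyond simple filtering (reprojection, schema mapping, analysis) would have to happen in this layer too.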
Do you have spatial data stored in multiple systems or applications? Do you see value in combining this data for visualization or analysis? If so, how are you planning to proceed? Or, if you’ve already fought these battles, do you have any insights to share?
I’m starting to get a sense of the wide diversity of approaches to breaking down barriers between you and your data. Want to hear some of ours? Tune in to the free live stream of our FME 2011 World Tour event this Friday (March 4) or register to attend in person in one of 25+ cities worldwide.
Good post, Paul. I have one example of pushing data out of silos with processes. The INSPIRE directive has been the main push for national geoportals in Europe. In Finland, the National Land Survey has built a geoportal that bridges data silos from governmental agencies into one web map service. It uses OGC standards such as WMS and WFS. Citizens or workers using the geoportal can then view and query spatial data coming from separate data providers. At the moment it supports raster data, and vector data will follow soon. In the future a WPS interface will be added, so users will even be able to do spatial analysis through the geoportal using data from the governmental agencies. Overall, web mapping standards are what bridge the data silos here.
Hi Lassi,
Great feedback. Thanks for sharing this example of freeing data from silos using web services like WMS and WFS. I’m interested in the connection with INSPIRE. Does the same geoportal (that provides data to citizens and workers) also act as the link to INSPIRE? If so, do these web services handle the transformation to INSPIRE’s common layers and data model on the fly?
Hi Paul,
Thank you. Those web services don’t handle the transformation to INSPIRE’s common layers and data model on the fly. As far as I know, the most common approach is to do the transformation before publishing the data with WMS and WFS. Many organizations use a separate publishing database for this, for example PostGIS served through GeoServer. On-the-fly transformation is ideal, but if it takes load off the service, a publishing database is a very good deal.
Hi Lassi,
This trade-off makes a good deal of sense; thanks for the update.