Geographic databases and associated geospatial information systems were created to hold complex information about the world and answer the questions about location that are often the first questions analysts ask of data.
Standard databases can hold basic values like a single number or a snippet of text in tables, but the standard routines for searching or selecting information in a query don’t work well with real-world data that occupies two or three dimensions. Geographic databases and geospatial information systems specialize in this style of data processing.
What is a geographic database?
Geographic databases can store the more complex elements needed to describe the world and the roads or buildings built upon it. The basic data element is the point, a combination of longitude, latitude, and sometimes altitude. Points can be joined together into lines and polygons that might represent roads, political boundaries, or regions of a map. These polygons can be combined or subtracted with set operations like union and difference to build complex representations.
Capable geographic databases can compute complex functions like determining the distance between two points or whether one point lies inside a polygon. Some can adjust the answers to account for the curvature of the earth.
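The two functions mentioned above can be sketched in a few lines of Python. This is only an illustration, not how any particular database implements them: it assumes a spherical Earth with a mean radius of 6371.0088 km for the distance (the haversine formula) and a flat plane for the point-in-polygon test (ray casting). The function names are made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean radius; the real Earth is not a perfect sphere


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_polygon(x, y, polygon):
    """Ray-casting test on a flat plane; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square))
print(round(haversine_km(0, 0, 0, 1), 1))
```

A production system would usually rely on an ellipsoidal model and a spatial index rather than these direct formulas.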
Some common uses include:
- Finding the closest entry to a point, a feature that can help find the nearest restaurant.
- Identifying the right demographic unit where a house might sit, a common task for computing real estate taxes.
- Computing the distance along a path or a road.
- Geocoding to convert a street address (or other political coordinates) into latitude and longitude.
- Reverse geocoding to find the best address or other coordinates for a latitude and longitude.
- Organizing hierarchical zones for a region like breaking up the country into census blocks and tracts.
- Supporting scientific investigation based upon location, a crucial feature for answering many economic questions about how geography affects work and health.
- Generating basic maps as well as maps with data overlays to visualize research.
- Tracking the performance of groups like sales teams when the groups are defined by geography.
- Looking for geographic correlations through visualizations.
- Kickstarting fraud detection, especially in areas where fraudsters practice in parallel.
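Two of the uses listed above, finding the closest entry and measuring distance along a path, can be sketched with a simple linear scan. The example below is a rough illustration under stated assumptions: it uses the equirectangular approximation, which is cheap and reasonable only over short distances, and the place names and coordinates are invented. A real geographic database would use a spatial index instead of scanning every row.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def approx_km(a, b):
    """Equirectangular approximation between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)  # shrink longitude by latitude
    y = lat2 - lat1
    return EARTH_RADIUS_KM * math.hypot(x, y)


def nearest(query, places):
    """Return the (name, (lat, lon)) entry closest to query; O(n) scan."""
    return min(places, key=lambda place: approx_km(query, place[1]))


def path_length_km(points):
    """Sum the segment lengths along an ordered list of (lat, lon) points."""
    return sum(approx_km(p, q) for p, q in zip(points, points[1:]))


# Hypothetical restaurants, for illustration only.
restaurants = [
    ("Diner A", (40.00, -75.00)),
    ("Diner B", (40.10, -75.00)),
    ("Diner C", (41.00, -75.00)),
]
print(nearest((40.08, -75.00), restaurants)[0])
```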
Geographic databases have become increasingly essential in enterprises. Many geographic databases are built as a set of extra features that can be installed in a more traditional database, extending the standard data formats to handle geometry. For example, SQL becomes GEOSQL; XML becomes GML; JSON becomes GeoJSON, and so on. These collections of features take on the form of a GIS (for geographical information system). Many popular tools like ArcGIS rely upon the support of a geographically aware database.
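To make the GeoJSON case concrete, here is a small sketch of how a point and a polygon look in that format, built with Python's standard json module. The feature names and coordinates are invented for illustration. Note that GeoJSON lists longitude before latitude, and a polygon ring repeats its first position at the end.

```python
import json

# Hypothetical sample features; names and coordinates are illustrative only.
cafe = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-122.4194, 37.7749]},  # [lon, lat]
    "properties": {"name": "Example cafe"},
}
district = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        # One exterior ring; the last position repeats the first to close it.
        "coordinates": [[
            [-122.43, 37.77], [-122.41, 37.77], [-122.41, 37.78],
            [-122.43, 37.78], [-122.43, 37.77],
        ]],
    },
    "properties": {"name": "Example district"},
}
collection = {"type": "FeatureCollection", "features": [cafe, district]}


def ring_is_closed(ring):
    """A valid linear ring has at least four positions and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


encoded = json.dumps(collection)  # plain JSON text, ready to store or send
decoded = json.loads(encoded)
print(ring_is_closed(decoded["features"][1]["geometry"]["coordinates"][0]))
```

Because the result is ordinary JSON, any tool that reads JSON can carry it, while geographically aware tools can interpret the geometry.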
How do established databases approach the area?
All of the major databases have added the ability to store spatial data, either in the main database or through an extension that integrates with it.
One of the most popular options is PostGIS, an extension that adds geographic data types and functions to PostgreSQL. The U.S. Census Bureau, for example, distributes large files with the boundaries for all of the census tracts in a format that’s simple to load into PostGIS. The data is often used with the census results to answer questions about the population. Restaurant chains, for example, might analyze potential locations by counting how many people live nearby.
Oracle’s Spatial Database includes extra features for tracking locations in two or three dimensions. It might be used for geographic data, but it can also work with arbitrary three-dimensional collections of points, the kind that might be generated when analyzing a scene for planning the movement of a robot or an autonomous vehicle. It is regularly used to track networks like collections of roads or fiber optic cables that make up the infrastructure of a region.
IBM’s Db2 can be extended to offer the standard GIS features. The company also offers higher level tools like the Maximo Spatial Asset Manager that will track the locations and configurations of infrastructure elements. The database will store the location, and an additional layer will create maps or other visualizations.
Microsoft’s SQL Server can store two types of spatial data: the so-called geometry type for flat, planar coordinate systems and the geography type for round-earth coordinates such as latitude and longitude. The elements can be built out of simpler points or lines or more complex curved sections. The company has also added a set of geographic data formats and indexing to its cloud-based Azure Cosmos DB NoSQL database. It is intended to simplify geographic analysis of your data set for tasks such as computing store performance by location.
Noted for a strong lineage in geographic data processing, ESRI, the creator of ArcGIS, is also expanding to offer cloud services that will first store geographic information and then display it in any of the various formats the company pioneered. ESRI, traditionally a big supplier to government agencies, has developed sophisticated tools for rendering geographic data in a way that’s useful to fire departments, city planners, health departments, and others who want to visualize how a variety of data looks on a map.
What about the upstarts?
There is a rich collection of open source databases devoted to curating geographic information. Some focus on roads and buildings, the information normally found on maps, while others track information that might be shown in colored layers on top of the maps like the population count or the percentage of residents who smoke cigarettes.
GeoServer, for instance, is an open source server written in Java that publishes geographic data from many standard sources. The software is tightly integrated with a collection of mapping modules and extensions that ease the production of maps from the data.
MapServer is another open source tool that’s optimized for displaying geographic data through web interfaces. It offers cross-platform support and focuses more on the easy display of geographic information.
The OpenStreetMap project tracks the location of roads, buildings, and street-level infrastructure with an interface that allows anyone to edit these like a wiki. The data set, which is shared widely, is the foundation for many mapping tools and policy analysis packages. The OSM data is easily imported into GIS-compatible databases for further analysis.
Many of the GIS databases are integrated into larger products that can be used to create maps or geographically driven policy documents. SimplyAnalytics, PolicyMap, ZeeMaps, and SocialExplorer, for instance, integrate datasets and make it simpler to pick and choose the right information to display so users can visualize the connection between data values and locations.
Is there anything it can’t do?
Geographic extensions to databases make it simpler to search for and apply basic measurement algorithms to points, lines, and polygons that may be on a flat surface or a curved one like the Earth’s. They can find closest points, intersect polygons, and measure distances or areas. These extra features extend the querying capabilities of the databases for these special data types because regular search queries don’t work with two- or three-dimensional data.
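As an example of the kind of basic measurement described here, the area of a polygon on a flat surface can be computed with the shoelace formula. This is a minimal sketch: it assumes planar coordinates and a simple (non-self-intersecting) polygon, and the function name is chosen for this example. Areas on a curved surface like the Earth’s need a different calculation.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given as (x, y) vertices."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    # The sign depends on vertex order (clockwise or not), so take abs.
    return abs(total) / 2


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (0, 3)]
print(polygon_area(unit_square), polygon_area(triangle))
```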
More complicated functions are generally left to tools that are built on top of the databases. Routing algorithms used by navigation systems, for instance, tend to be separate. Tools like MapDust that help find inconsistencies or errors in the geographic data are also appearing as separate tools.