Instructor: Brian Klinkenberg

Office: Room 209
Office hours: Tue/Thu 12:30-1:30

TA: Alejandro Cervantes

Office hours: Mon and Tue, 10-11, in Rm 115.

Lab Help: Jose Aparicio

Office: Room 240D

Computer Lab: Rm 115

Error and Uncertainty

Error, uncertainty, accuracy and precision are fundamental concepts in GIScience, yet the issues associated with them are often overlooked. This becomes very apparent when you discuss GISystems with typical GIS users. The idea that uncertainty present in the data could influence analysis outcomes--a result of error and uncertainty in the source data, and of error introduced as those data are processed--is often mentioned only in passing in official reports. The quality of the decisions that can be made from any set of data depends to a large extent on the quality of the information used to make them. This situation is not unique to GIScience, of course, but it is highlighted when spatial data are used, analyses conducted, and spatial output created. Since it is relatively easy to combine data from a wide range of sources, GIS analyses and presentations (e.g., a map) often bring to light the poor quality of the information that people have been using for years.

For a general overview of error, accuracy and precision, the notes in the Geographer's Craft are worth reviewing; there you can also find a discussion of managing error in a GIS environment. The new NCGIA Core Curriculum II has a unit (#187) on managing uncertainty in GIS by G. Hunter that is worth reviewing, and the unit by G. Heuvelink on uncertainty propagation in GIS is also relevant to this topic. Wiley (a textbook publisher) has made available selected chapters from the 1991 "Big Book" of GIS--chapter 12, by N.R. Chrisman, on the error component in spatial data, is worth reviewing for those who want a thorough overview of the subject. Here you'll find a simple discussion of numeric precision and accuracy. The entire set of BC Standards for Developing Digital Data is also available; it describes both attribute and spatial data standards as well as metadata standards.
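To make the accuracy measures in these readings concrete, the short sketch below computes bias (the systematic component of error) and RMSE (the overall magnitude of error) from a handful of check points. The coordinate values are invented for illustration only.

```python
import math

# Hypothetical check-point data: surveyed "true" eastings versus the
# eastings recorded in a GIS layer (metres). These numbers are made up.
true_x     = [500.0, 620.0, 710.0, 830.0, 940.0]
measured_x = [501.2, 618.9, 711.5, 831.0, 938.6]

# Signed errors at each check point.
errors = [m - t for m, t in zip(measured_x, true_x)]

# Bias: the mean signed error (a systematic offset, if any).
bias = sum(errors) / len(errors)

# RMSE: root mean square error, the standard summary of overall accuracy.
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
```

Note that a data set can have near-zero bias (positive and negative errors cancel) while still having a large RMSE, which is why both measures are reported.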

In order to know whether data are appropriate for an analysis, it is important to examine their metadata (data about data). Using metadata you can reduce the error and uncertainty in your analyses, since you'll know the source of the data, what modifications have been made to them, and so on. An article that describes the top ten metadata errors is worth reviewing since, by explaining what the common errors are, it shows what the purpose of metadata is. The Saint Louis University GIS Portal has a good discussion of metadata.
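As a small illustration of why metadata matters, the sketch below checks a layer's metadata record for a set of required fields before the layer is used in an analysis. The field names and the record itself are hypothetical (loosely modelled on common geospatial metadata elements, not an official FGDC or ISO schema).

```python
# A minimal, hypothetical metadata record for a soils layer.
# Field names and values are invented for illustration.
layer_metadata = {
    "title": "Soils, South Coast region",
    "source": "BC Ministry of Environment",
    "date_collected": "1994-07",
    "scale": "1:50000",
    "datum": "NAD83",
    "lineage": ["digitized from 1:50000 paper maps", "edge-matched 1996"],
}

# Fields we insist on before using a layer in an analysis (our own choice).
REQUIRED = {"title", "source", "date_collected", "scale", "datum", "lineage"}

def missing_fields(meta):
    """Return the set of required metadata fields the record lacks."""
    return REQUIRED - meta.keys()
```

A record that fails this check (for example, one with no lineage entry) flags exactly the situation described above: you cannot judge the data's fitness for use because you don't know where they came from or how they were modified.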

A very good overview of uncertainty was provided by a professor at Cornell (a PDF version of the notes can be found here; these notes describe both data and rule uncertainty). Here you'll find an example of rule uncertainty in DEM generation (taken from my lecture notes on interpolation), and here is an extract from a thesis, Digital Elevation Model (DEM) Uncertainty: Evaluation and Effect on Topographic Parameters, by Suzanne Perlitsh Wechsler. A simulation (an MPG file that will be downloaded to your PC) of a shortest path down a DEM, in which uncertainty in the elevations is modelled using a Monte Carlo approach (originally produced by C. Ehlschlaeger), makes it clear how uncertainty can affect a result. On this page you'll find a good overview of a complex Monte Carlo analysis used to determine fecal coliform levels.
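The Monte Carlo idea behind that simulation can be sketched in a few lines of Python: perturb the DEM many times with random elevation error, trace a steepest-descent path on each realization, and see how often the path ends up in different places. This is a deliberately simplified illustration--the noise here is uncorrelated from cell to cell, whereas Ehlschlaeger's approach models spatially autocorrelated error fields--and the DEM values and RMSE are invented.

```python
import random

# A tiny synthetic 5x5 DEM (elevations in metres) -- illustrative values only.
DEM = [
    [50, 48, 47, 46, 45],
    [49, 47, 45, 44, 43],
    [48, 46, 44, 42, 41],
    [47, 45, 43, 41, 40],
    [46, 44, 42, 40, 38],
]
RMSE = 1.5  # assumed vertical RMSE of the DEM, in metres (hypothetical)

def steepest_descent(dem, start=(0, 0)):
    """Follow the lowest neighbour downhill until a local minimum is reached."""
    rows, cols = len(dem), len(dem[0])
    r, c = start
    path = [start]
    while True:
        neighbours = [(r + dr, c + dc)
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if (dr, dc) != (0, 0)
                      and 0 <= r + dr < rows and 0 <= c + dc < cols]
        nr, nc = min(neighbours, key=lambda p: dem[p[0]][p[1]])
        if dem[nr][nc] >= dem[r][c]:
            return path  # no lower neighbour: local minimum
        r, c = nr, nc
        path.append((r, c))

def perturb(dem, rmse):
    """One Monte Carlo realization: add N(0, rmse) noise to every cell.
    (Uncorrelated noise -- a simplification of autocorrelated error fields.)"""
    return [[z + random.gauss(0, rmse) for z in row] for row in dem]

random.seed(42)
endpoints = {}
for _ in range(500):
    end = steepest_descent(perturb(DEM, RMSE))[-1]
    endpoints[end] = endpoints.get(end, 0) + 1
```

On the unperturbed DEM the path always ends at the lowest corner; across the 500 perturbed realizations the endpoint counts in `endpoints` show how elevation uncertainty translates into uncertainty in the derived path.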

When overlaying two layers obtained from different agencies you often end up with a situation such as this, which leads to slivers. The question is--are the differences meaningful, or simply the result of uncertainties associated with data collection (possibly collected at different dates, at different scales) and/or generalization (mandate issues)? Which set of polygon boundaries should you accept as 'real', and which should you reject? Or should you take the average of the two sets of polygons? Most often, answers to these questions are left to the person doing the work, since no formal guidelines are available.
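One simple, ad hoc way to flag slivers after an overlay is to compute a compactness ("thinness") ratio for each output polygon: slivers are long and narrow, so their ratio is near zero, while genuine polygons score much higher. The sketch below uses the shoelace formula; the threshold and coordinates are invented for illustration, and in practice the cutoff would be tuned to the data.

```python
import math

def area_perimeter(coords):
    """Shoelace area and perimeter of a simple polygon given as a ring
    of (x, y) vertices (the closing edge is added automatically)."""
    a = p = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        a += x1 * y2 - x2 * y1
        p += math.hypot(x2 - x1, y2 - y1)
    return abs(a) / 2.0, p

def thinness(coords):
    """Compactness ratio 4*pi*A / P**2: 1.0 for a circle, near 0 for a sliver."""
    a, p = area_perimeter(coords)
    return 4 * math.pi * a / (p * p)

def is_sliver(coords, threshold=0.1):
    """Flag a polygon as a probable sliver (threshold is arbitrary)."""
    return thinness(coords) < threshold

square = [(0, 0), (10, 0), (10, 10), (0, 10)]      # a compact polygon
sliver = [(0, 0), (100, 0.2), (100, 0.0)]          # a long, thin triangle
```

Flagged polygons can then be inspected, merged into a neighbour, or eliminated--but, as noted above, deciding which boundary was 'real' in the first place remains a judgment call.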

Learning objectives

  • Understand the concept of uncertainty and how it can arise from our attempts to represent reality;
  • Be aware of how uncertainty is introduced as we conceptualize, measure, represent and analyze spatial data;
  • Understand how and why scale affects measurements and analyses;
  • Understand the importance of metadata.

Text: Chapter 6: Uncertainty [Overheads: 1 per page; 3 per page]

Key words: error, uncertainty, accuracy, RMSE, bias, metadata, sliver polygons, NMAS, attribute accuracy [confusion matrix], sampling considerations, a classification of uncertainty (figure from overheads)