Terrains are represented as digital elevation models (DEMs). Typically, a
DEM represents the spatial distribution of elevations in a terrain, but in
general it can represent any of a number of attributes. Much of the
terrain data encountered in GIS applications is obtained from remote
sensing devices and is in *raster* or *grid* form, in which the
coordinates of the data correspond to a uniform grid, and the elevations
(or other data) are given for each cell in the grid. In addition to grids,
the two other ways of representing DEMs are *triangulated irregular
networks* (TINs) and *contour lines*. Among the terrain
models, grids are the most commonly used because of their simplicity
and because data is readily available in this form. On the other hand,
TINs often use less space than grid-based DEMs.

A grid-based DEM is a 2D-array of elevations sampled uniformly from
the terrain.We refer to each value in the grid as a *cell*.
The *size* of a cell in the grid is the resolution at which the
grid was sampled, and its *perimeter* is four times its size.
The *neighbors* of a grid cell *p* are the 8 (or fewer, if
the cell is on the edge of the grid) cells around *p*. The
gradient of the terrain towards each neighbor is denoted as
*tan beta*, and can be computed as the ratio between the
height difference of the cells and the horizontal distance between
them. A *downslope neighbor* is a neighbor with strictly lower
elevation. The *steepest downslope neighbor* is the downslope
neighbor with the largest gradient.

An *index* of the terrain is a function computed for each cell of the
terrain. Some basic indices in terrain analysis are the *flow
directions* and *flow accumulation*. The *flow
directions* of a cell correspond to the directions in which water
would flow if poured at that cell onto the terrain. Flow accumulation
depends on the flow directions, which need to be known prior to
computing flow accumulation.

Once flow direction and flow accumulation have been computed, many
other indices of the terrain can be estimated based on them, including
watersheds, drainage network and topographic convergence index. The
*drainage network* consists of all grid cells for which the
flow accumulation is higher than a certain threshold, and the *
topographic convergence index* (TCI), which quantifies the
liklihood of saturation, is defined as the logarithm of the ratio of
the flow accumulation to the local slope at a cell.

The *flow directions* of a cell correspond to the
directions in which water would flow if poured at that cell onto the
terrain. The *flow routing* problem can be defined as the
problem of assigning flow directions to all cells in the DEM such
that: (1) flow directions do not induce any cycles; (2) every cell has
a flow path off the edge of the terrain. Although different flow
models may define flow direction differently, the flow directions of a
cell are always towards neighbor cells which are downslope or at the
same height. In other words, water cannot go uphill.

The *single-flow direction* (SFD) model defines the flow
direction to be the direction of the steepest downslope neighbor. The
*multi-flow direction* (MFD) model allows multiple flow
directions for a cell, most commonly to all downslope neighbors. The
single flow approach has the disadvantage that it discretizes the flow
into only one 8 possible directions. Multiple flow routing has the
disadvantage that the flow from a cell is dispersed to all neighbors
of lower elevation, resulting in a more diffuse flow of water.

Flow direction to steepest downslope neighbor (SFD). |
Flow direction to all downslope neighbors (MFD). |
---|

A cell with downslope neighbors may determine its flow
direction(s) by simply examining adjacent cells. A cell without any
strictly downslope neighbors is *flat*. It can not determine
its flow direction just by looking at adjacent cells; a more
complicated procedure is required. A cell is called *flat* if:
(1) it has height less than or equal to *all* its neighbors; or
(2) it has a neighbor of same height which satisfies (1). A *flat
area* is a maximal set of adjacent flat cells. Note that
one-cell-wide ridges are not flat areas, but one-cell-wide channels
are. The flat area has a *spill-point* if it contains a cell
which has a downslope neighbor.

The intuition is that flow should be routed such that water flows
towards the spill-points of the flat area. If the flat area has no
spill-points then we call it a *sink*. For instance, the bottom
of a bowl is a sink. Sinks are dead-ends for flow because it cannot be
routed further. The most commonly used approach is to preprocess the
terrain in order to ``fill'' all the sinks such that each cell has a
valid flow path off the border of the terrain. Thus sinks become
lakes. This step is part of flow routing process and is refered to as
*flooding*.

The *flow accumulation* of a terrain is an index which
estimates the surface runoff for each cell in the terrain. The flow
accumulation of a cell is defined as the total area of the grid cells
which flow through that cell per unit width of contour. To compute
the flow accumulation we assume that every grid cell initially has one
unit of flow (water) and that the flow, initial as well as incoming,
at a grid cell is distributed according to the precomputed flow
directions. The flow accumulation can then be computed as the total
amount of water flowing through each grid cell of the terrain
multiplied by some constant factor.

Kaweah Basin DEM. | Kaweah Basin Flow Accumulation. |
---|

The availability of geospatial data for larger areas at higher resolutions exposes scalability problems with existing GIS algorithms. When processing such large amounts of data (bigger than main memory of the machine being used) the Input/Output communication (or I/O) between fast internal memory and slow external storage such as disks, rather than internal computation time, becomes the bottleneck in the computation. Unfortunately, most GIS algorithms, including the ones in commercial GIS packages, are designed to minimize internal computation time and consequently they often do not scale to large datasets.

<laura@cs.duke.edu> Last modified: Wed Apr 18 22:48:03 2001 by laura.