NOTES
This unit is very long and deals with more advanced
algorithms. Depending on the abilities and interests of your
students, you may want to omit the third and fourth algorithms
included or consider providing this as extra handouts.
Advanced students may be pleased to have the opportunity to
examine the more subtle, complex nature of these advanced
algorithms. The later section on indexes does not depend on
material covered in the earlier sections.
UNIT 37 - QUADTREE ALGORITHMS AND SPATIAL INDEXES
A. INTRODUCTION
- the previous unit defined the basic idea of a quadtree
- this unit examines how quadtrees are used in several
simple processes, including:
- measurement of area
- overlay
- finding adjacent leafs
- measuring the area of contiguous patches
- in addition, this unit will look at how quadtrees can be
used to provide indexes for faster access to vector-coded
objects
- finally, alternative forms of spatial indexing will be
reviewed
Definition
- to traverse a quadtree:
- begin by moving down the leftmost branch to the
first leaf
- after processing each leaf in this branch, move back
up to the previous branching point, and turn right
- this will either lead down to another leaf, or back
to a previous branching point
diagram
overhead - First map
- several of the following examples use this simple
raster and its associated quadtree
B. AREA ALGORITHM
Procedure
- to measure the area of A on the map:
- traverse the tree and add those leafs coded A,
weighted by the area at the level of the leaf
Example
- in the example quadtree, elements at level 0 have area
16, at level 1 - area 4, at level 2 - area 1
C. OVERLAY ALGORITHM
overhead - Second map
- note: this overhead can be physically overlayed on
First map
Procedure
- to overlay the two maps:
- traverse the trees simultaneously, following all
branches which exist in either tree
- where one tree lacks branches (has a leaf where the
other tree has branches), assign the value of the
associated leaf to each of the branches
- e.g. node 3 is branched on map 1, not on map 2
- the leafs derived from this node (30, 31, 32
and 33) have values B, B, A and B on map 1, all
2 on map 2
- the new tree has the attributes of both of the maps, e.g.
A1, B2
Result
overhead - First map + Second map
D. ADJACENCY ALGORITHM
Problem
Definition
- here adjacent means sharing a common edge, not just a
common point
diagram
Two cases
- leaf codes are:
1. same length (same size blocks, e.g. 01 and 02)
or
2. one is longer than the other (different size
blocks, e.g. 03 and 2)
- solving this problem requires the use of:
1. conversion from base 4 to binary and back
- base 4 because of the "rule of 4" used in
constructing quadtrees
2. bit interleaving
3. a new concept called Tesseral Arithmetic
Tesseral Arithmetic
- tesseral arithmetic is an alternate arithmetic useful for
working with the peculiarities of quadtree addressing
- to add binary numbers normally, a "carry" works to the
position to the left
- e.g. adding 1 to 0001 gives 0010
- this is the same as decimal arithmetic except that
carries occur when the total reaches 2 instead of 10
- in tesseral arithmetic, a "carry" works two positions to
the left
- e.g. adding 1 to 0001 gives 0100
- the reverse happens on subtraction
- 1000 less 1 is 0010 not 0111, as the subtraction
affects only the alternate bits
- in other words, if we number the bits from the left
starting at 1
- adding or subtracting 1 affects only the even-
numbered bits
- adding or subtracting 2 (binary 10) affects only the
odd-numbered bits
Determining Adjacency
handout - Determining adjacency
1. same size blocks:
2. different size blocks:
- taking the longer of the two codes:
- convert it from base 4 to binary
- tesseral-add and -subtract 01 and 10 to create
four new codes
- reject any cases where subtracting was not
possible (a "negative" code would have
resulted, or a "carry" would have been
necessary to the left of the leftmost digit)
- discard the excess rightmost digits in the
resulting transformed longer codes
- convert back to base 4 to get the leaf
- the two blocks are adjacent if any of the
transformed and truncated codes are equal to the
shorter code
- example: Are 02 and 2 adjacent?
- example: Are 033 and 2 adjacent?
- example: Find leafs adjacent to 03 in the first map
above
- method: find the codes of adjacent blocks of the same
size, then work down the tree to find the appropriate
leaf
Length of common boundary
- the length of common boundary between the two blocks is
determined by the level of the longer code
- can use this to construct an algorithm to determine
the perimeter of a patch
- e.g. the length of the A/B boundary in the
first example map
diagram
E. AREA OF A CONTIGUOUS PATCH ALGORITHM
Problem
Method
- handout - Area of a contiguous patch (2 pages)
- create a list of leafs, with their associated codes, by
traversing the tree
- allow space for a "pointer" for each leaf, and give
it an initial value of 0 (see handout)
Algorithm
- for each leaf i:
- find all adjacent leafs j with equal or shorter
length codes (4 maximum)
- if the adjacent leaf j has the same value, determine
which of i and j has the higher (larger value)
position in the list, and set its pointer to the
lower position
- (note: if a pointer has already been changed, it may
be changed again or left, the result is the same)
- this produces the final pointer list
Results
1. the number of contiguous patches will be equal to
the number of zeros
- in the example, two pointers are zero, indicating
two contiguous patches
2. the value of each patch can be obtained by looking
up the values of leafs with 0 pointers
- in the example, leafs 00 and 01 have 0 pointers
- these have the values A and B respectively
3. to find the area of each patch, select one of the
zeros and sum its area plus the areas of any leafs which
point to it directly or indirectly
- the component leafs of each patch can be found by
starting at with a leaf at the end (or beginning) of
the list and following the pointers until a 0 is
found
- e.g. leaf at position 10 (code 33) points to 8,
which points to 7, which points to 5, which
points to 2, which has a zero pointer
- therefore, leaf position 10 (code 33) is part
of the same patch as leaf 2 (code 01) and has
the value B
- the areas can be found by summing the leaf areas
F. QUADTREE INDEXES
Indexing using quadtrees
- indexes are used in vector systems to get fast access to
the objects in a particular area of a map
- very useful in searching for potentially overlapping
or intersecting objects
- therefore, they are an essential part of a polygon
overlay operation
- in Unit 34, looked at the usefulness of a simple sort of
objects on one axis (e.g. x) in the moving band operation
for intersection calculations
- now will look at methods which can be thought of as
sorting on both axes simultaneously
- these use 2D coding systems and a simple one dimension
sort
Setting up the index
overhead - Quadtree indexes
- steps are:
1. for each object (point, line, area) in the database,
find the smallest quadtree leaf which encloses the object
- some large objects will have to be classified as
NULL, as they span more than one of the four leafs
in the first branching (0, 1, 2 and 3)
- other smaller objects may be enclosed within a small
leaf, e.g. 031
2. sort or index the objects by the enclosing quadtree
leafs
Using the index
- to find all objects which might intersect an area, line
or point of interest
- find the quadtree leaf enclosing the object of
interest
- starting at this point follow up the quadtree
through all branching points that contain the
original cell and down the quadtree to all branching
points and leafs below the cell
- example: the area of interest is enclosed in leaf 31 of
the original example quadtree (overhead - First map)
- the objects which may intersect the area of interest
are those in leaf 31 and all leafs above it
- thus, these are 3 and the null leaf
- objects in other (remote) leafs cannot
intersect the area of interest, so need not be
checked
- example: the area of interest is enclosed in leaf 0
- the objects which may intersect the area are in leaf
0, the null leaf and all leafs below 0 - 00, 01, 02,
03
- there may be other leafs below these as well such as
010, 011, 012, 013, etc
Generalizations
- quadtree indexing is most effective for small objects,
particularly points
- large objects tend to require large enclosing leafs
even though they may not fill much of the space
(i.e. highway corridors)
- these objects will always need to be checked
for intersection
- it may pay to subdivide objects so that the pieces
fall entirely within smaller leafs
- indexing in this way is intuitively more efficient than
indexing by x or y alone since the quadtree index is
effectively two-dimensional
- the divisions at each branching need not be equal in size
- it may pay to have some blocks of smaller area and
some of larger area, rather than four equal squares
at each branching
- however, for general efficiency the blocks should be
rectangular
G. R-TREE INDEXES
- R-tree indexes are a response to the problem of indexing
large areas
- R stands for "range", a concept similar to MER
Method
overhead - R-tree
- find two, possibly overlapping, rectangles (aligned with
x and y axes) such that:
- as many objects as possible are wholly within one or
the other rectangle
- there are roughly equal numbers of objects wholly
enclosed in each rectangle
- the overlap between the rectangles is minimum
- indexing is determined by the rectangle in which the
object is contained
- objects which are wholly within a rectangle are
associated with that rectangle
- objects which are not wholly within either of the
two rectangles are associated with the undivided map
- apply the procedure recursively, finding two new smaller
rectangles within each existing rectangle
- this creates a tree structure similar to the quadtree
- every object is associated with some node in the
tree
- to find the objects which might intersect a given area of
interest:
- find the smallest rectangle used in the indexing
procedure which wholly encloses the area of interest
- the objects are those associated with this rectangle
and all nodes above and below it in the tree
Problem
- although benchmark tests have shown that R-trees are
generally more efficient than quadtrees and simple 1-D
sorts, they are computationally intensive to construct
REFERENCES
Buchmann, A., O. Gunther, T.R. Smith and Y.-F. Wang. Design
and Implementation of Large Spatial Databases, Unit Notes
in Computer Science 409, Springer Verlag, Berlin.
Contains a collection of papers on spatial data indexing.
Guttman, A, 1984. "R-trees: A dynamic index structure for
spatial searching," ACM SIGMOD, pp. 47-57.
Mark, D.M., and J.P. Lauzon, 1984. "Linear quadtrees for
Geographic Information Systems," Proceedings,
International Symposium on Spatial Data Handling, Zurich,
2:412-430.
Noronha, V., 1988. "A survey of Hierarchical Partitioning
Methods for Vector Images," Proceedings, Third
International Symposium on Spatial Data Handling, Sydney,
Australia, pp. 185-199.
Oosterom, P. van, 1990. "A modified binary space partitioning
tree for geographic information systems," International
Journal of Geographical Information Systems 4(2):133-46.
The two Samet books listed as references for Unit 36 contain
useful discussions of quadtree algorithms.
DISCUSSION AND EXAM QUESTIONS
1. Compare the formal methods of indexing (quadtree, R-tree,
1-D sort) with informal methods in common use (e.g.
continents, nation-states, major civil divisions, ZIP codes,
etc.).
2. How would you design a study to compare the effectiveness
of different indexing schemes? What data would you use?
What measures would you compare?
3. Current vector-based systems use a wide variety of
indexing schemes. Why is there no consensus as to the best?
What methods are best for what purposes and area of
application?
4. Devise a means of measuring the Manhattan distance
between two quadtree blocks (assume the codes have the same
length).
Back to Geography 370 Home Page
Back to Geography 470 Home Page
Back
to GIS & Cartography Course Information Home Page
Please send comments regarding content to: Brian
Klinkenberg
Please send comments regarding web-site problems to: The
Techmaster
Last Updated: August 30, 1997.