123
even a logarithmic time search for each deletion found. In the worst case, many intervals
may have to be examined without even necessarily advancing in the sequence.
Figure 6.4 shows one pattern that causes inefficiency in enumerating deletions. Changes
1, 2, and 3 all overlap in such a way that each must be examined individually, requiring that
3 deletions be found and tested, even though the entire region U1is in fact deleted.
1
234
U1 U2
F F F F iii ig g g gu u u ur r r re e e e 6666....4444:::: OOOOvvvveeeerrrrllllaaaappppppppiiiinnnngggg ddddeeeelllleeeettttiiiioooonnnnssss aaaannnndddd uuuunnnniiiioooonnnn rrrreeeeggggiiiioooonnnnssss
In practice, the cost of checking multiple deletions is probably not too bad, but an alter-
native approach is possible, which makes deletion and enumeration of deleted intervals fast,
at the expense of speed for (the much more infrequent) undo operation. The basic idea is to
retain a full interval tree of all active deletions in a version, but to augment this with an
indexed list of disjoint unionsegments, likeU1andU2. Since these segments are disjoint,
logarithmic update and search time can be guaranteed by indexing them by starting point,
without any need for fancier search structures. New deletions can be added in logarithmic
time by finding the unique segment, if any, that overlaps the starting point of the new seg-
ment as well as the one that overlaps its ending point. If these segments exist, then they
are deleted in logarithmic time, and a new segment representing the union of all three is
inserted in logarithmic time.
When a deletion is undone, it will shorten or split a union segment. The section of the
disjoint deletion list covered by the deletion being undone will become invalid because of
the unknown number of overlapping deletions in the interval it covered. The index could be
rebuilt immediately, or incrementally. Execution time for rebuilding the from scratch is O(n
log n)for an active version withndeletions.
even a logarithmic time search for each deletion found. In the worst case, many intervals
may have to be examined without even necessarily advancing in the sequence.
Figure 6.4 shows one pattern that causes inefficiency in enumerating deletions. Changes
1, 2, and 3 all overlap in such a way that each must be examined individually, requiring that
3 deletions be found and tested, even though the entire region U1is in fact deleted.
1
234
U1 U2
F F F F iii ig g g gu u u ur r r re e e e 6666....4444:::: OOOOvvvveeeerrrrllllaaaappppppppiiiinnnngggg ddddeeeelllleeeettttiiiioooonnnnssss aaaannnndddd uuuunnnniiiioooonnnn rrrreeeeggggiiiioooonnnnssss
In practice, the cost of checking multiple deletions is probably not too bad, but an alter-
native approach is possible, which makes deletion and enumeration of deleted intervals fast,
at the expense of speed for (the much more infrequent) undo operation. The basic idea is to
retain a full interval tree of all active deletions in a version, but to augment this with an
indexed list of disjoint unionsegments, likeU1andU2. Since these segments are disjoint,
logarithmic update and search time can be guaranteed by indexing them by starting point,
without any need for fancier search structures. New deletions can be added in logarithmic
time by finding the unique segment, if any, that overlaps the starting point of the new seg-
ment as well as the one that overlaps its ending point. If these segments exist, then they
are deleted in logarithmic time, and a new segment representing the union of all three is
inserted in logarithmic time.
When a deletion is undone, it will shorten or split a union segment. The section of the
disjoint deletion list covered by the deletion being undone will become invalid because of
the unknown number of overlapping deletions in the interval it covered. The index could be
rebuilt immediately, or incrementally. Execution time for rebuilding the from scratch is O(n
log n)for an active version withndeletions.