Tuesday, August 9, 2011

Red Black Trees: Bottom-Up or Top-Down?

Conventional red-black trees are bottom-up: nodes are inserted/deleted at the bottom and coloring properties are corrected up. Last year I found a tutorial describing a top-down approach claiming it is superior in terms of simplicity, ease of understanding, code size, and theoretical efficiency. Unfortunately these claims are unsubstantiated. Being curious I investigated further. I implemented an iterative bottom-up red-black tree (with parent pointers) and compared it with the top-down approach.

Since complexity increases with code size, I used sloccount to measure the code size. For the top-down approach, the Total Physical Source Lines of Code (SLOC) = 131. For bottom-up, SLOC = 189. So far the evidence does favor a top-down approach for being less complex.

But what about efficiency? (Speed is King!) The tutorial claims that a top-down approach makes a single pass of the tree and so is theoretically more efficient.

I wrote a simple testing program that generates and inserts (pseudo)random numbers as keys in a red-black tree and then removes them. The program takes wall time measurements with gettimeofday before and after both blocks of insertion and removal. Using wall time is crude but serves for this demonstration. By dividing the total time of insertion (removal) by the number of keys I compute the average cost per insertion (removal). I ran the testing program 10 times and averaged the per operation costs for top-down and bottom-up for an increasing tree size from between 100 and one million nodes. The following two images show the results (smaller is better):

The top-down approach has a worse average time for both inserting and removing nodes. I did not bother with statistical rigor, but the gaps were larger than a few standard deviations of the measured data.

Although my experiments were far from complete I opted to use the more established, more familiar, and (apparently) more efficient bottom-up approach in an implementation of red-black trees that is now part of the RTEMS kernel.

If I had to guess why the top-down approach performs worse I would say it is pessimistic in how many tree rotations it does. The bottom-up approach seems to make about as many tree rotations as there are nodes. (I believe the number of rotations is bounded to 2 for insert and 3 for remove, but on average it appears to be 1/2 rotation per insert/remove.) The top-down approach makes about 2-4 times as many (albeit simpler) rotations, and I have no idea if the bounds apply.

5 comments:

  1. I have a counter example. In college, I compared my implementation of the eternally confuzzled top-down approach with std:map and found the top-down implementation easily 2x faster.

    This makes me wonder if it was faster just because my code was much simpler.

    Sadly, I don't have the code or the report anymore.

    ReplyDelete
  2. It would be nice to see the comparison. Perhaps I can re-run this test with std:map which should not take much effort. I would guess the STL does a worse job of memory management, and comparison is a function call (operator<).

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Hi!
    Very interesting... Did you use explicit left and right pointers to the children, or just one "chidren" array with two items? I mean, Julienne Walker uses this trick of using "dir = root->data < data" and "root->link[dir] = ...", instead of "if ( data < root->data )", "root->left ..." and "root->right ...". Because if you used one approach for bottom-up and the other one for top-bottom that could cause some difference because of branch prediction. By change, have you published somewhere the bottom-up approach of yours?

    ReplyDelete
  5. hi Tuom, I have not posted code, and have lost the original source, but a colleague of mine has done some similar benchmarking, you may be interested to see it at https://github.com/sebhub/rb-bench

    I believe my original code used the array-based approach for both. I tried to keep the implementations similar as possible.

    ReplyDelete