Recently I was working with binary search trees (BSTs) and happened to observe anomalous performance between two implementations of a BST (splay tree). One implementation was taking much longer than the other. After some investigation I attributed the variance to two factors: comparison functors and use of left-right BST symmetry. Cursory examination indicated that using a functor instead of direct comparison of keys was increasing execution time around 10-20% for a particular workload. Left-right symmetry was causing about 30-40% increases in execution time!
Comparison functors are important because an implementation of a BST cannot know ahead of time how to compare generic nodes, and—for languages that lack decent templates such as C—a functor is a fact of life. But left-right symmetry is a choice. Some arguments for left-right symmetry are that it reduces the source lines of code (SLOC) and increases correctness. I think the performance problems of using left-right symmetry comes from indexing in an array on top of a pointer dereference, and doing so multiple times. By separating the left and right cases explicitly the programmer specifies the pointer precisely.
Execution time increases for code that exploits left-right symmetry at least for splay trees. How much does the use of left-right symmetry affect red-black tree performance?
Using a search benchmark that I coded from a published description, I evaluated two versions of red-black trees: one that uses left-right symmetry to compress code paths and one that explicitly represents the two sides. (The left-right symmetrical code is the version currently in use in RTEMS.) The benchmark inserts search keys generated randomly until the tree reaches a specific size, and then issues a mix of update and search operations in sets of 100. For this work I used sizes between 16 and 4096 in powers of 2, and about 1000 update/search operations, with the mix varying between pure search and pure updates. Otherwise the benchmark is as described in the paper linked above.
The following charts show representative results from running these benchmarks in Simics/GEMS. The "rbtree" results use explicit left and right child pointers and split the cases; "rbtree sym" compresses the symmetrical cases together and uses an array of two child pointers.
Cycles to insert elements into a red-black tree until reaching the max size. |
Cycles when searching in a red-black tree at certain sizes. 1000 searches at each point. |
Cycles when updating and searching. 500 searches, 500 extractions, and 500 insertions at each point. Note that an extraction uses a search to find the node to remove. |
The percent increase in execution time when using left-right symmetry of insertion is less than that of search: not good for a structure whose purpose is efficient searching!
Ben Pfaff's libavl uses left-right symmetry (and a table-based approach instead of parent-pointers), and comparing the performance of libavl using explicit cases for handling left and right might be interesting. Using his systems benchmarks to evaluate the RTEMS red-black tree implementation would be nice, although of limited value since the benchmarks target general-purpose (*nix) systems. The Linux (lib/rbtree.c) and FreeBSD (sys/sys/tree.h) kernels both have red-black tree implementations with explicit left and right pointers.
I have not evaluated whether the symmetrical case reduces code size (generated machine instructions in the binary). How much is SLOC reduced, and does that translate to reduced program size? If code size goes down due to symmetry then, for an embedded system, the choice of which implementation to use becomes more complicated. But if code size is the same (or increases!) then quantitative reasoning tells us not to use left-right symmetry.