Friday, May 25, 2012

Red-Black Trees Redux: Left-Right Symmetry

I previously compared top-down versus bottom-up red-black trees and concluded that bottom-up is faster and therefore preferred.

Recently I was working with binary search trees (BSTs) and happened to observe anomalous performance between two implementations of a BST (splay tree). One implementation was taking much longer than the other. After some investigation I attributed the variance to two factors: comparison functors and use of left-right BST symmetry. Cursory examination indicated that using a functor instead of direct comparison of keys was increasing execution time around 10-20% for a particular workload. Left-right symmetry was causing about 30-40% increases in execution time!

Comparison functors are important because an implementation of a BST cannot know ahead of time how to compare generic nodes, and—for languages that lack decent templates such as C—a functor is a fact of life. But left-right symmetry is a choice. Some arguments for left-right symmetry are that it reduces the source lines of code (SLOC) and increases correctness. I think the performance problems of using left-right symmetry comes from indexing in an array on top of a pointer dereference, and doing so multiple times. By separating the left and right cases explicitly the programmer specifies the pointer precisely.

Execution time increases for code that exploits left-right symmetry at least for splay trees. How much does the use of left-right symmetry affect red-black tree performance?

Using a search benchmark that I coded from a published description, I evaluated two versions of red-black trees: one that uses left-right symmetry to compress code paths and one that explicitly represents the two sides. (The left-right symmetrical code is the version currently in use in RTEMS.) The benchmark inserts search keys generated randomly until the tree reaches a specific size, and then issues a mix of update and search operations in sets of 100. For this work I used sizes between 16 and 4096 in powers of 2, and about 1000 update/search operations, with the mix varying between pure search and pure updates. Otherwise the benchmark is as described in the paper linked above.

The following charts show representative results from running these benchmarks in Simics/GEMS. The "rbtree" results use explicit left and right child pointers and split the cases; "rbtree sym" compresses the symmetrical cases together and uses an array of two child pointers.

Cycles to insert elements into a red-black tree until reaching the max size.

Cycles when searching in a red-black tree at certain sizes. 1000 searches at each point.

Cycles when updating and searching. 500 searches, 500 extractions, and 500 insertions at each point. Note that an extraction uses a search to find the node to remove.

The percent increase in execution time when using left-right symmetry of insertion is less than that of search: not good for a structure whose purpose is efficient searching!

Ben Pfaff's libavl uses left-right symmetry (and a table-based approach instead of parent-pointers), and comparing the performance of libavl using explicit cases for handling left and right might be interesting. Using his systems benchmarks to evaluate the RTEMS red-black tree implementation would be nice, although of limited value since the benchmarks target general-purpose (*nix) systems. The Linux (lib/rbtree.c) and FreeBSD (sys/sys/tree.h) kernels both have red-black tree implementations with explicit left and right pointers.

I have not evaluated whether the symmetrical case reduces code size (generated machine instructions in the binary). How much is SLOC reduced, and does that translate to reduced program size? If code size goes down due to symmetry then, for an embedded system, the choice of which implementation to use becomes more complicated. But if code size is the same (or increases!) then quantitative reasoning tells us not to use left-right symmetry.

Friday, May 18, 2012

Bike to work day

Today was bike to work day and I normally commute on Friday anyhow so I participated albeit without registering. There were noticeably more people out, particularly in small groups, although not many more than on a nice weekend day: fewer pedestrians though. I wore my new kit, which consists of the bib shorts and jersey of the 2008 Finnish champion (from team Francaise Des Jeux), blue/white helmet, blue gloves, white shoes, and my blue/white bike.

A skinny-tire caught my wheel shortly after I hooked up to the W&OD trail and we paced each other most of the way into DC. It was nice although for the first couple miles I didn't keep his wheel because he was a bit more brazen with rolling through stops. We had a pretty nice pace for the latter half of my ride. He was commuting from Ashburn, which is a sight farther than my own ride.

The ride home was a little rough, but the worst of it came near the end. Shortly after I got back onto the streets from the trail I was passed by a black BMW and the passenger yelled, "Get off the road %$#hole!" I have read that this is rather common, and so is having things thrown at you or being buzzed. The latter two constitute assault with a deadly weapon (vehicle) and can be brought to court.

Since I was only verbally harassed I decided to give a little back. So I chased the car down to the next stop light, pulled alongside, and yelled, "I have as much a right to the road as you!" The passenger, a young man, pretended not to hear because the window was up; the young girl driving seemed either embarrassed or amused, or maybe both. I went off to another lane to resume my ride. The same car passed me again a little later. Some people suck.