A Computer Scientist's Quest: January 2013

Thursday, January 24, 2013

Profiling C++ applications: class composition

In the latter half of last year, my research went down an OOP path especially in the direction of C++; both the hardware containers project and my thesis touch elements of OOP concepts. With the hardware containers we were interested in how C++ developers use inheritance, composition, and encapsulation. For my thesis work, I was interested in how applications use the STL; I wrote an appendix in my thesis describing this aspect. For this post, my focus is on the hardware containers work, which is in my paper pipeline waiting for submission.

I found 21 free and commercial (oxymoron?) open-source C++ programs for this investigation: dimacs-sq, Dijkstra's algorithm; Opal and GEM5, processor simulators; Geant4, physics particle simulator; FlightGear, flight simulator; Wesnoth, video game; OpenCV, computer vision library; Boost, library for C++; MySQL, database server; LibreOffice, office productivity suite; Doxygen, documentation generation; Haiku, OS based on BeOS; ReactOS, OS based on Windows NT; Chromium, web browser; povray, soplex, dealII, namd, xalancbmk, astar, and omnetpp, C++ benchmarks from SPEC CPU 2006.

Using synthetic microbenchmarks, Eugen and I determined that the important C++ code behavior attributes that affect performance when using hardware containers are the ratio of private access specifiers to total number of class variables, depth of inheritance, number of objects a class includes by value, and number of public accessor methods that expose private variables. Nicely enough, all of these can be determined by examining source code: no need to run the C++ programs!

I used two tools to analyze the 21 programs: Understand by SciTools, and CCCC. The former is commercial and comes with an evaluation license, which was enough for me to get the data I wanted; the latter is open-source, which was good enough for me to modify to get data I wanted. I used Understand! to measure the ratio of private:class variables and the depth of inheritance, but the tool did not provide enough data for the objects included by value or for accessor methods. Ultimately I did not get any data for the latter, but I was able to hack CCCC to measure how many objects a class includes by value.

For the composition of classes I used my modifications to CCCC to count how many classes each class includes by value. On average, classes do not include very many objects of other classes by value (perhaps by reference), although large outliers exist. The chart shows the average number in orange, with one standard deviation as an error bar; the maximum among all the classes is in blue.

I used both CCCC and Understand to calculate the inheritance tree depth of these C++ programs. Using both tools showed some difference in the tool quality, since for a metric as simple as inheritance depth the two tools should get identical numbers. The following chart shows the maximum depth of the inheritance tree reported by each of the two tools with CCCC in blue and Understand in orange.

Although the tools occasionally agreed, vastly different numbers were obtained for lots of the programs. The maximum inheritance depth is less than 10 for almost all of these programs, with small (near 1) averages and standard deviations. Thus most classes are decoupled with small inheritance and composition factors, but some strong outliers do exist. For hardware containers, we can treat the outliers as a special case and make safe (enough) assumptions about an object's composition.

Tuesday, January 22, 2013

My new job

As of last week I have started my new full-time job as a Postdoctoral Research Scientist at GWU. I'm working with my thesis co-adviser, and my job is an expanded version of what I was doing before graduating. In addition to continuing my research, I will also be overseeing/assisting the research of some graduate students and writing grant proposals.

The appointment is for the next 18 months or so, and I'm excited about carrying forward the momentum I have in my research.

Looking back to 2012, forward to 2013

Of the many post ideas back-logged in my head, I wanted to recap the past year and perhaps look forward to the new year before making it out of January. For me, 2012 was a busy year in which I

Became a dad.
Defended my Ph.D. dissertation.
Dropped 20 pounds to 18% body fat, BMI 23.
Bicycled 2300 miles, ran another 160 or so.
Published 2 papers and a book chapter.
Continued RTEMS involvement as an admin and mentor for GSOC and GCI.
Went on a Caribbean cruise.
Bought my first car, a minivan.
Repaired/Improved my house: foundation crack sealed, water line repaired, water line replaced, water heater replaced, roof repairs, HVAC heat pump and blower replaced, ventilation ducts cleaned, and kitchen cabinets refaced.

For 2013 some of my goals, resolutions, and expectations are to

Turn 30. Done!
Get a job. Done, more on that soon.
Publish more.
Get more research funding.
Read more.
Learn a new (non-programming) language, probably ASL or suomi.
Continue RTEMS involvement.
Resume regular exercise. I cut way back after my daughter was born.
Finish a century ride, solo or supported. Current best: 88 miles.
Finish a half-marathon, solo or organized. Current best: 8 miles.
Take up swimming.
Go on a family vacation.
Build a "front porch."
Construct a new cat tower.
Get rid of stuff.
Be a better person.