Tuesday, March 30, 2010

Running your code!

One of the challenges of writing code as part of a large project is being able to actually test your code as you develop it. I've been working on some relatively small thread scheduling projects in RTEMS over the past couple of weeks, and my most recent work has just been really depressing. I wrote ~300 lines of code, but haven't been able to test any of it. I still haven't, but I'm really close now -- my test case compiles, and the system runs without crashing, but I have no idea if it is doing what I want it to. It has taken me probably 40+ hours of coding to get to this state. That is a very frustrating feeling, to code for that long without knowing if what you are writing will work.

So what can be done? One way to validate algorithms is to implement them in isolation. This works great, unless you need another 10K lines of code to actually exercise your algorithm. In that case, you want some way to emulate the rest of the system and create a test bench for your code. For my work, that would be an environment for implementing scheduling algorithms for RTEMS that provides all of the interfaces to the scheduler. A similar project is LinSched, an infrastructure for trying out Linux scheduler algorithms. Building such an environment is one of the goals of the GSoC project that I am proposing, and I will post more on this later. :)
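To make the test bench idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: fake_thread, pick_next_fn, pick_highest_priority, and run_bench are hypothetical names, not RTEMS or LinSched interfaces. The point is only that the algorithm under test sees a tiny stand-in for the rest of the system, and the bench records the dispatch order so it can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Fake thread: only the fields the scheduler under test reads (hypothetical). */
struct fake_thread {
    int id;
    unsigned priority;  /* lower value means more urgent (an assumption) */
    int ready;          /* nonzero while the thread still wants to run */
};

/* Hypothetical scheduler interface: choose the next thread to dispatch. */
typedef struct fake_thread *(*pick_next_fn)(struct fake_thread *threads, size_t n);

/* Algorithm under test: the ready thread with the lowest priority value wins. */
static struct fake_thread *pick_highest_priority(struct fake_thread *threads, size_t n)
{
    struct fake_thread *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!threads[i].ready)
            continue;
        if (best == NULL || threads[i].priority < best->priority)
            best = &threads[i];
    }
    return best;
}

/* Test bench: dispatch each ready thread once and record the order of ids.
 * Returns the number of threads dispatched. */
static size_t run_bench(pick_next_fn pick, struct fake_thread *threads,
                        size_t n, int *order)
{
    size_t count = 0;
    struct fake_thread *t;

    assert(pick != NULL && threads != NULL && order != NULL);
    while (count < n && (t = pick(threads, n)) != NULL) {
        order[count++] = t->id;
        t->ready = 0;  /* pretend the thread ran and finished */
    }
    return count;
}
```

A test case would fill an array of fake_thread, call run_bench with the algorithm, and compare the recorded order against the expected one. That checks what the scheduler decided rather than just that the run finished.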

For now, I'm going to try to validate that my code actually works, and doesn't just "run to completion."

Tuesday, March 23, 2010

Extensible Data Structures in C

A lot of systems programming code is written in C, primarily because C exposes explicit memory addresses, but for other reasons too. However, C has pretty poor language support for many of the helpful constructs in object-oriented languages that improve code re-use and readability, such as generics/templates, polymorphism, and inheritance. So systems programmers have developed a few design patterns to emulate these constructs in C.

Lately I've been looking at how to design data structures in C that can provide flexible storage for data elements that can be used by different consumers. In particular, I'm looking at how a thread control block can provide storage for different implementations of scheduling systems.

There are three features of C that appear useful for this purpose: void pointers, unions, and typedefs.

A union works well when you know ahead of time exactly what type of data the different consumers will use, and you are willing to place multiple specifications within the data structure. The advantages of a union are that it is easy to read and efficient, providing multiplexed storage space to different data structures. Some disadvantages are that the programmer has to know which member of the union to access, and that the union always occupies at least as much space as its largest member, no matter which member is in use.
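As a rough sketch of the union approach, assume a thread control block that must serve either a fixed-priority scheduler or an earliest-deadline-first scheduler. All of the names below (sched_kind, prio_info, edf_info, tcb, tcb_sort_key) are made up for illustration and are not taken from RTEMS. A tag field records which member of the union is valid, which is one common way to handle the "programmer has to know" problem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scheduler kinds; not taken from RTEMS. */
enum sched_kind { SCHED_KIND_PRIORITY, SCHED_KIND_EDF };

/* Per-thread data for a fixed-priority scheduler. */
struct prio_info {
    uint8_t priority;
};

/* Per-thread data for an earliest-deadline-first scheduler. */
struct edf_info {
    uint32_t deadline;
    uint32_t period;
};

/* Hypothetical thread control block with multiplexed scheduler storage. */
struct tcb {
    int id;
    enum sched_kind kind;  /* tag: records which union member is valid */
    union {
        struct prio_info prio;
        struct edf_info edf;
    } sched;               /* at least as large as its largest member */
};

/* Return a sort key for the thread; the tag selects the member to read. */
static uint32_t tcb_sort_key(const struct tcb *t)
{
    switch (t->kind) {
    case SCHED_KIND_PRIORITY:
        return t->sched.prio.priority;
    case SCHED_KIND_EDF:
        return t->sched.edf.deadline;
    }
    assert(0 && "unknown scheduler kind");
    return 0;
}
```

Both schedulers share the same bytes inside the TCB, so a priority-scheduled thread still pays for the larger edf_info.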

void pointers are type-less memory addresses, so a programmer can point one at any structure and later retrieve what was stored. However, as with unions, the programmer must know the type of the structure a void pointer refers to in order to use it, and the compiler cannot check that assumption. This usually is not a problem for the scenario I'm investigating.
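A minimal sketch of the void pointer approach, again with hypothetical names (tcb, edf_info, edf_attach, edf_deadline) that are not RTEMS code: the thread control block carries an opaque pointer, and only the scheduler that attached the data knows what it points to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread control block; the scheduler data is opaque to it. */
struct tcb {
    int id;
    void *sched_info;  /* points to whatever the active scheduler stored */
};

/* Per-thread data known only to a hypothetical EDF scheduler. */
struct edf_info {
    unsigned deadline;
    unsigned period;
};

/* The scheduler attaches its own structure to the thread. */
static void edf_attach(struct tcb *t, struct edf_info *info)
{
    t->sched_info = info;
}

/* The scheduler later recovers the structure. The conversion from void *
 * is implicit in C, and nothing verifies that the pointed-to object
 * really is an edf_info. */
static unsigned edf_deadline(const struct tcb *t)
{
    const struct edf_info *info = t->sched_info;
    assert(info != NULL);
    return info->deadline;
}
```

Unlike the union, the TCB only pays for one pointer, but the scheduler data has to be allocated and kept alive somewhere else.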

typedefs provide a way to give new names to structures and primitives. They allow programmers to define opaque types whose representation can change while the name used throughout the code stays the same. When the typedef names a structure, the compiler can type-check its uses, which is an advantage over unions and void pointers. The way to use typedef to provide extensibility is to combine it with C pre-processor conditional compilation to control which definition is used. This way, multiple definitions of the same type name can be written, and only one will be compiled in based on pre-processor defines. However, code that uses the generic typedef may need to provide multiple cases based on the different specific definitions provided.
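Here is a small sketch of the typedef approach. The macro USE_EDF_SCHEDULER and the names sched_info_t, tcb, tcb_init, and tcb_sort_key are assumptions made up for this example. Exactly one definition of sched_info_t survives pre-processing, and the code that touches its fields needs one case per variant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch: compile with -DUSE_EDF_SCHEDULER
 * to select the EDF variant; the default is fixed priority. */
#ifdef USE_EDF_SCHEDULER
typedef struct {
    unsigned deadline;
    unsigned period;
} sched_info_t;
#else
typedef struct {
    unsigned char priority;
} sched_info_t;
#endif

/* The thread control block only ever mentions the generic name. */
struct tcb {
    int id;
    sched_info_t sched;
};

/* Code that looks inside the type needs one case per variant. */
static void tcb_init(struct tcb *t, int id, unsigned key)
{
    assert(t != NULL);
    t->id = id;
#ifdef USE_EDF_SCHEDULER
    t->sched.deadline = key;
    t->sched.period = 0;
#else
    t->sched.priority = (unsigned char)key;
#endif
}

static unsigned tcb_sort_key(const struct tcb *t)
{
#ifdef USE_EDF_SCHEDULER
    return t->sched.deadline;
#else
    return t->sched.priority;
#endif
}
```

The conditional blocks can be confined to a few accessor functions like these, so the rest of the code stays free of #ifdef clutter and the compiled TCB contains only the fields for the selected scheduler.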

I may come back to this topic and provide some examples. I'm playing around with typedefs right now, and like this approach because the ugliness can be hidden and the end result is high-performance code with little bloat in the compiled version.

Monday, March 15, 2010

Introduction to RTEMS

I will probably have quite a few posts related to RTEMS, so I thought an introduction would be appropriate. I've been doing a lot of work recently on a project with Eugen, a fellow Ph.D. student, to port the RTEMS Operating System (related blog) to the UltraSPARC T1 Niagara, a 64-bit SPARC-v9 processor.

RTEMS is a real-time operating system (RTOS), which means that its operations can be precisely and accurately timed and that it supports applications that have strict timing requirements. Classes of such applications range from control systems to streaming data processors. Examples that we see (or don't see) every day are embedded in such things as planes, trains, and automobiles, or multimedia video and audio devices.

RTEMS is notable for a few reasons. First, it's free and open source. Second, it supports a large number of target architectures (platforms). Third, it is in space!  Some of the platforms that RTEMS supports are radiation-hardened for outer space, and it is on some of the NASA and ESA equipment floating around up there.

But I don't plan to be a rocket scientist. My interest in RTEMS is for its support of a variety of computer architectures, including the SPARC-v8 architecture. Eugen and I have identified the Niagara as a promising architecture with which to continue our current research direction, and we wanted to have a low-level OS that is small enough to understand. Thus we chose RTEMS and started porting the existing support for SPARC-v8 to the newer SPARC-v9. This has been an ongoing effort for a while, but we have made quite a bit of progress, and hope to contribute our work back to the RTEMS community.

If you want more info about RTEMS, check out its website, which is newly updated.

Sunday, March 14, 2010

Versioning my work

I've found it helpful to keep clean, working versions of the projects that I edit, especially when I am changing someone else's work. This is recommended practice for generating patches for open source projects, but I find it useful for both programming and document editing.

Of course, version control (CVS or Subversion) helps, but I don't always want to commit my changes back to the repository. Someday I should also investigate a distributed version control system that lets other users "check out" your branches. However, most of the projects I work on use CVS or Subversion. I also like to extract the changes that I've made, in order to keep track of my versions by hand. This is useful when I reach milestones in projects, and I want to make off-line backups.

Post 0

I've been thinking about starting a blog for a while, and unlike with some of my other impulses, I actually followed through this time. Although I've never been much into the "blog" scene, this might be a decent way to write down some of my random thoughts. I'll try to focus on issues related to my work, but I'll probably mix in some life-related postings.

What can you expect to find here? My plan is to post reflections, links to topics that interest me, and tips and tricks related to technology and life. Really this will be a repository for myself to look back and see some of the things that I've found to be interesting. But maybe someone else will benefit, too.

I'll do my best not to belabor points, although I may tend to wax philosophical, and I do enjoy writing. But I know that no one wants to read a lot of useless drivel.