Tuesday, October 16, 2012

Critical Bugs and Quality Assurance

Sebastian Huber recently posted a nasty RTEMS bug and fix. While simple, the bug manifested in their application as an increase in one task's latency from 20us to 170us! The cause of the problem was that RTEMS has two kinds of critical sections—dispatch and interrupt—and new SMP-aware code added a dispatch critical section to the thread dispatch code. The problem happens with overlapping a traditional interrupt disable critical section. The SMP dispatch code looks like:
void _Thread_Dispatch(void) {
  Thread_Disable_dispatch();
  SMP_Dispatch_other cores();
  Interrupt_Disable();
  ...
  ... do things, including context switch if needed
  ...
  Interrupt_Enable();
  // problem here!
  Thread_Unnest_dispatch(); // enable
  ...
}
The problem is that an interrupt can occur between Interrupt_Enable and Thread_Unnest_dispatch. Suppose a low-priority task (L) is executing _Thread_Dispatch, and the interrupt enables a high-priority task (H), but
H will not be dispatched because dispatching is disabled. Instead L enables dispatching and resumes executing, which is a priority inversion!

The fix reverts the changes made for SMP. For the SMP code, the priority inversion still exists and is unresolved. (RTEMS currently does not make real-time guarantees for the SMP support, so no one cares yet.)

In the broader picture, the bug seems like it should be easy to detect. The issue with free open-source software (FOSS) is that quality assurance (QA) is almost non-existent: the "many eyeballs" philosophy argues against QA. But what else can FOSS do? No one is going to pay for extensive testing, and if they do they have no incentive to share.

FOSS communities (and corporate developers) need better tools for software QA. This summer RTEMS had a GSOC student who was looking at testing. Testing is probably the first tool in the QA toolbox, and the only one most developers have a clue about; how about static analysis, path coverage, standards conformance, or certification? Some interesting work modeling, proving, and certifying systems is out there: Where is the undergraduate textbook and course on QA?

Saturday, October 13, 2012

Web site update

Last night I decided to provide an html version of my CV, and then I remodeled my website. The old version was ugly and broken; the main problem were my iframes. I simplified the design, tried to make it mobile-friendly, and reduced the content. I kept my basic design elements (boxes), and tried to eliminate cruft. I think the product is leaner and cleaner.

Friday, October 12, 2012

Version control for text/LaTeX?

I often use version control (VC) for text documents—especially LaTeX. Modern VC software is good for code, and even text when working alone, but collaboratively editing text with VC is a nightmare. The biggest challenge for text VC seems to be structure/formatting. Code is well-structured within and between lines: text less so. Words/sentences can be re-arranged within sentences/paragraphs, and lines are meaningless. With VC, line wrapping propagates small changes and frustrates reviewing and merging.

Word processors track revisions, but such tools are restrictive and won't work for markup languages. Cloud tools for collaboration such as Docs exist, including some for LaTeX, but requiring connectivity while editing is a non-starter for me.  I would even be happy with a custom LaTeX solution; the features I desire for text VC are similar to those for code:
  • Non-proprietary, application-agnostic, platform-independent FOSS
  • Revision history: see what has changed.
  • Revert: undo changes back to forever.
  • Lock-free: work in parallel, which requires...
  • Pain-free merge: help resolve conflicting commits.
  • Distributed and offline editing: no active (server) connections
  • External contributions: integrate changes made outside of VC
Most VC tracks changes either with changesets or snapshots. Changesets seem worthless because of the structural problems of text. Snapshots seem viable, but I haven't seen turn-key solutions for plain text or LaTeX, and handling external contributions seems especially troublesome.

Thursday, October 11, 2012

GSOC2012: MMU project and musings

I finally made a pass at merging my student's final code for his GSOC2012 project. The project was the culmination of multiple summer projects and some work I did. I'm excited to try improving it to add sparc64 support and use it in my research.

While revising the student's submission for reviewing and merging, I thought of a few ways to improve future new developer participation:
  • Better github integration. Staying up-to-date with rtems.git, tracking student progress, and getting code review from more developers would be helpful for quicker turnaround on submissions.
  • Style: documentation and code conventions. Clear, consistent guidelines and examples of proper/improper coding style would make reviewing and merging a lot easier.
  • Improved Git Workflow. Teaching students how to make useful branches and commits ahead of time would ease code merging, testing, and revising.
  • More submissions. We need to get code reviewed if not merged in smaller increments; this need is well-known and repeated.
These improvements could be addressed in part when students/developers are new to the scene. For example, instead of just making students prove they can build RTEMS and patch hello world, we could require them to document and fix the style of sample code using a branch, and submit a pull request for RTEMS github that contains their proof as separate commits on the same branch.

For getting students to submit and be reviewed more often will take more work on behalf of mentors, developers, and students. Something that may help would be requiring code review as part of the weekly status meetings we instituted this GSOC. Perhaps each student's weekly commits can be reviewed by their mentors as part of tracking progress and status.

Institutional support from RTEMS mentors and developers would help. For example, github integration requires developers and mentors to use github. Style consistency requires a style guide that we accordingly maintain and abide by. Teaching good workflow, and fixing the bad, takes effort: mentors need to (know how to) identify and correct a student who struggles. Increased submission frequency requires urging and commitment from mentors to review code. These improvements take effort, but I think they could substantially improve the participation, progress, and production of students.

Friday, October 5, 2012

My fall hiatus

I've been busy lately, between child-rearing and working on a grant proposal, and I haven't had the time and energy for much else. I'll be writing my thesis soon, too, but that might encourage me to write more here. Meanwhile, check out some fake ads for programming jobs.