Monday, December 19, 2011

Using partitions as a free list

Yesterday I discussed my search for some help with implementing a free list in RTEMS. Today I'm glad to say I was able to use partitions to achieve my objective of a growable free list—something akin to a slab allocator. A fundamental limit, the maximum number of partitions, still exists but partitions can have an arbitrary size; these limits might be circumvented safely by making each partition much larger than the previous or by combining old partitions into new ones. Further improvements can be made, but for now I'm satisfied.

Allocation
Iterate over available partitions until one returns a free object. If none succeeds, create a new partition (malloc a new set of objects) and get a free object from the new partition.

I have not handled what to do if partition creation fails. Performance could be improved by tracking the non-full partitions.

Deallocation
Return the object to its partition. I made this easy by storing a handle to the owning partition as part of each object. Non-invasive techniques, such as tracking address ranges for each partition, could be used for a generic solution.

Sunday, December 18, 2011

RTEMS free list?

I implemented a free list in an RTEMS application as a pre-allocated array of objects on a linked list, but I foresee the need to have a fail-over case for growing my free list dynamically. Extending my current implementation to support dynamically changing the size of the free list is easy enough, but I'm interested to see if such support exists. I found two likely candidates: Partitions and Regions.

Partitions implement a simple free list of objects and have low overhead in both time and space. However a partition cannot change the size of its memory so an allocation will fail after the partition exhausts the preallocated set of buffers. This behavior is exactly what I already have implemented for my application.

Regions implement a more complex memory manager that allows for carving up a chunk of memory, calling it a region, and allocating segments from within a given region. However since segments can be variable-sized, the region manager uses a first-fit allocator that coalesces free blocks. Such an allocator is overkill for a free list of fixed-size objects.

Neither are quite what I want; where is the middle ground. My suspicion is that no such mechanism exists already because (hard) real-time applications usually have well-known resource requirements; the number of objects in the system are known and the partition manager would be sufficient. But when an unknown number of same-sized objects are employed, the current mechanisms appear to be insufficient.

Interestingly the Object Manager, an internal RTEMS subsystem, does seem to have capabilities to support dynamically extending the size (up to a preconfigured but possibly unlimited maximum) of a free list of fixed-size objects. Unfortunately the Object Manager is not accessible directly from the application level without violating API visibility rules.

Update: See my follow-up post regarding using partitions to implement a growable free list.

Wednesday, December 7, 2011

Science and Government

I just finished reading a brief lecture by C.P. Snow entitled "Science and Government" that, among other things, examines the role of science in decision making.  Anyone who is involved in committees, politics (governmental, office, academic, or otherwise), or decision-making should read this lecture. I found its central story both entertaining and enlightening.

One of the main conclusions of the lecture is that decision-making ought not succumb to the euphoria of secrecy, and that decision-makers ought not be obsessed with gadgets: personally favored creations that have little value and lack a broad base of support. Snow says,
The euphoria of gadgets; the euphoria of secrecy...are the origin of 90 per cent of ill-judged scientific choices. Any scientist who is prone to these euphorias ought to be kept out of government decisions or choice-making, at almost any cost.
And as argument against obsessive covertness he claims that "societies at about the same level of technology will produce similar inventions."

He also discusses the roles of committees in decision-making---regardless of political organiziation committees are almost invariably present---and gives some advice for what may constitute an effective committee, namely
  1. The committee's objective must be clear and not too vast.
  2. The committee must be placed appropriately within the hierarchy of the larger (governmental) system. 
  3. The committee must have powers of action. Advisory committees in particular have no power and therefore cannot make decisions.
I found the read quick, accessible, and relevant despite being 50 years old.

Monday, December 5, 2011

RTEMS Memory Protection Middle Layer

This is a continuation of my prior post in which I described an API for memory protection. Here I will describe the middle layer that glues the user API to the CPU support code that eventually interacts with hardware to enforce the permission attributes.

For those familiar with RTEMS, this code lives in libcpu and the design mimics that of the cache manager. The middle layer is just 5 functions:
/*
 * Initialize the hardware to prepare for
 * memory protection directives.
 */
rtems_status_code _CPU_Memory_protection_Initialize( void );

/*
 * Make sure memory protection permission
 * is valid for this CPU
 */
rtems_status_code _CPU_Memory_protection_Verify_permission(
    rtems_memory_protection_permission permission
);

/*
 * Check if memory protection region size
 * is valid for this CPU
 */
rtems_status_code _CPU_Memory_protection_Verify_size(
    size_t size
);

/*
 * Install (enforce) the memory protection entry mpe
 */
rtems_status_code _CPU_Memory_protection_Install_MPE(
    rtems_memory_protection_entry *mpe
);

/*
 * Uninstall the memory protection entry mpe
 */
rtems_status_code _CPU_Memory_protection_Uninstall_MPE(
    rtems_memory_protection_entry *mpe
);
These functions are called via thin wrappers from the high-level API. The install and uninstall entry functions are called iteratively by the higher-level functions that install and uninstall protection domains. The two "verify" functions are used by the high-level API to avoid trying to install invalid entries proactively. The remaining functions interact with the hardware directly to support the memory protection mechanisms.

Next I will briefly explain what each function does in my sparc64 implementation.
  • _CPU_Memory_protection_Initialize
    • Disables interrupts
    • Selectively demaps pages that were loaded during system startup
    • Flushes the cache
    • Maps new pages for the kernel core services
    • Installs exception handlers for MMU misses
  • _CPU_Memory_protection_Verify_permission
    • Returns true; needs a little refinement.
  • _CPU_Memory_protection_Verify_size
    • Ensures the size (bounds) is a multiple of the supported TLB page size
  • _CPU_Memory_protection_Install_MPE
    • Adds a TLB entry for the memory region specified in the MPE
  • _CPU_Memory_protection_Uninstall_MPE
    • Removes the TLB entry for the specified memory region
These functions call BSP- and CPU-specific functions to handle the hardware (TLB) interactions and comprise the entire middle layer between the API for memory protection and whatever low-level code enforces the protection mechanism.

RTEMS Memory Protection API

I am working on a solution to introduce real-time memory protection to RTEMS. I started from the work of two GSOC projects related to MMU support in RTEMS, and you can see most of the raw code that I have at http://code.google.com/p/gsoc2011-rtems-mmu-support-project/. This post describes the current design of the (high-level) API layer for setting attributes on regions of memory. I plan to post in the future about the middle and low-level interfaces. Any suggestions about design/naming/implementation etc. are welcomed.

The high-level interface comprises a region and attributes to create a memory protection entry (MPE), a set of which constitute a memory protection domain (MPD). Three structures are visible at the API layer, although only regions should ever be directly accessed; MPEs and MPDs are only meant to be used as handles in userland.
/**
 * A region of contiguous memory
 */
typedef struct
{
 char  *name;
 void  *base;
 size_t bounds;
} rtems_memory_protection_region_descriptor;

typedef uint8_t rtems_memory_protection_permission;

/**
 * A memory protection entry (MPE) is a region of contiguous memory
 * (rtems_memory_protection_region_descriptor) with a set of permissions.
 * If it is currently being enforced then the installed field is true.
 */
typedef struct
{
 rtems_chain_node                          node;
 rtems_memory_protection_region_descriptor region;
 bool                                      installed; 
 rtems_memory_protection_permission        permissions;
} rtems_memory_protection_entry;

/**
 * A memory protection domain comprises a chain of active MPEs and 
 * a freelist of idle MPEs. MPE storage is allocated from the Workspace
 * and managed in the MPD.
 */
typedef struct
{
  rtems_memory_protection_entry  *mpe_array;
  size_t                          mpe_array_size;
  rtems_chain_control             active_mpe_chain;
  rtems_chain_control             idle_mpe_chain;    /* free list */
} rtems_memory_protection_domain;

Application developers can create regions, for example
rtems_memory_protection_region_descriptor text_region = {
    .name = "text",
    .base = TEXT_BEGIN,
    .bounds = TEXT_SIZE
  };
Where TEXT_BEGIN and TEXT_SIZE are specified somewhere. The advantage of using a name to identify a region is that eventually we could layer a filesystem over the MPD structure and open memory files with specified attributes, for example: int fd = open("/memory/text", O_RDWR);. I have not given thought to how best to create regions, but that interface should be orthogonal to the protection API.

Regions must be encapsulated within MPEs and added to a MPD to have permissions enforced.  The requisite functions are
/**
 *  Allocates an array of max_number_of_mpes 
 *  memory protection entries from the workspace.
 */
rtems_status_code rtems_memory_protection_initialize_domain(
  rtems_memory_protection_domain *domain,
  size_t max_number_of_mpes
);

/**
 * Create a new memory protection entry for the region with
 * permissions in perms. This function adds the newly created mpe
 * to the active chain of the_domain but does not install it. 
 */
rtems_status_code rtems_memory_protection_create_entry(
  rtems_memory_protection_domain* the_domain,
  rtems_memory_protection_region_descriptor * const region,
  const rtems_memory_protection_permission perms,
  rtems_memory_protection_entry** mpe_ret
);

/**
 * Install all mpes on the active list of the_domain.
 */
rtems_status_code rtems_memory_protection_install_domain(
  rtems_memory_protection_domain* the_domain
);
So to enforce the text_region from above:
rtems_memory_protection_domain *protection_domain;
rtems_memory_protection_entry *mp_entry;
rtems_memory_protection_permission permission;

rtems_memory_protection_initialize_domain( protection_domain, 8 );

permission = RTEMS_MEMORY_PROTECTION_READ_PERMISSION | 
             RTEMS_MEMORY_PROTECTION_EXECUTE_PERMISSION;

rtems_memory_protection_create_entry(
    protection_domain,
    &text_region,
    permission,
    &mp_entry
);

rtems_memory_protection_install_domain(protection_domain);
One aspect I'm not happy with is the permissions field, an 8-bit field with preprocessor macros defining masks. A better solution would improve usability. The remainder of the API layer are a handful of functions to manage MPEs (find mpe by address/MPD, delete mpe from mpd, set permission) and MPDs (uninstall all MPEs and finalize to free resources).

Update: continued here