Efficient CRC calculation with minimal memory footprint

Almost every form of digital information exchange can introduce communication errors. Sometimes, these errors can be ignored (for example, an erroneous pixel in a high-resolution video is merely unnoticeable). However, most of the time, they cannot be ignored, and we want to ensure that the receiver gets an absolutely correct stream of information.

In order to overcome the inherent inaccuracy of information transmission, a few methods for error detection and correction have been developed. Generally speaking, these methods introduce some redundancy to the actual message, which in turn can be used to detect errors and in some cases to recover the original data.

One of the most common methods is the use of the CRC (cyclic redundancy check) function, a family of codes commonly used to ensure data integrity in digital data streams by detecting errors due to noise or dropouts of bits in the message stream. The CRC calculates a series of bits (also commonly referred to as a CRC) that is appended to the data stream and transmitted along with the message. The receiver performs a CRC function on the message and compares the result with the received CRC code to verify the integrity of the message. CRC is commonly used in a number of applications such as digital communications and computer data storage systems.

The CRC is performed through binary polynomial division between the transmitted message and a polynomial divisor and is usually implemented using a linear feedback shift register (LFSR). An LFSR is a shift register where its next state is a linear combination of its previous state and input bits. In our context, the linear operators are Logical XOR and Logical AND. Since the operation of an LFSR circuit is deterministic, and the CRC is shorter than the message, typically few messages are mapped to each CRC value. A well-chosen polynomial ensures an evenly distributed mapping of messages to CRC values (for example, if all messages were mapped to the same CRC, detection of an error in a message bit would be impossible).

The trick in an embedded systems design is to implement this technique in the most efficient way possible and in the smallest possible footprint. In this article, we discuss aspects of the theory and implementation of LFSRs and CRCs, illustrated using a family of instructions on Analog Devices’ Blackfin processor specifically defined for addressing the task of efficient LFSR implementation. We also provide a method for converting an implementation from an Internal LFSR form to an External LFSR form.

Links: http://embedded.com/design/206901030

The Rule of Fifty

Though we all have a gut-sense that working too many hours is counterproductive, a very short paper by John Nevison called “Overtime Hours: The Rule of Fifty” at Project Solutions (where you need to register) cites data from four studies conducted over a half century to show how productivity either drops, or at best maxes out, as overtime increases.

The studies’ results vary quite a bit. One shows that at 50 hours/week workers do about 37 hours work, dropping to just over 30 once the workweek increases to 55. The “best,” if you can call it that, results were from a 1997 survey showed wielding the whip can have ever-increasing productive rewards, edging up to about 52 productive hours for a 70 hour week. But they all show a nearly-impenetrable barrier of around 50 useful hours or less, regardless of overtime.

Unsurprisingly the data shows a sharp drop in results for those working excessive OT for weeks on end, averaging around a 20% drop after 12 weeks of servitude. That means, as the author concludes, the Rule of Fifty is a best case estimate.

The 2005 Circadian Technologies Shiftware Practices survey showed that productivity can decrease by as much as 25% for a 60 hour workweek, which jibes pretty well with Nevison’s data. Circadian’s results also demonstrate that turnover is nearly three times higher among workers putting in a lot of OT, and absenteeism is twice the national average. I’m not sure what that result means, since it’s awfully hard for an absent worker to be putting in overtime.

Fred Brooks claims that the average software engineer devotes about 55% of his week to project work. The rest goes to overhead activities, responding to HR, meetings about the health insurance plan, and supporting other activities.

The German Embassy’s Washington web site claims on its web site that the nominal workweek in Germany is 37.5 hours because “The original reason for introducing this system was to combat rush-hour traffic congestion, but among the more direct gains are an improvement in employee morale, greater productivity, significant decreases in absenteeism, greater flexibility for women who juggle the demands of work, home and children, and the increased sense of individual dignity that the employees get from having a greater say in organizing their own time.”

Links:  http://embedded.com/columns/breakpoint/206902869

Taming software complexity

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

Folk wisdom, as reflected in Brian Kernighan’s clever riff on clever code, is that insight we all know but generally ignore. But how can we measure “cleverness?” Supreme Court Justice Potter Stewart, trying to characterize pornography, memorably lamented that it somewhat defies definition, but “I know it when I see it.” Likewise, most of us can sense excessively clever code but may disagree when examining specific code chunks.

We do know complicated functions are tough to understand, an agony to debug, and usually require excessive amounts of maintenance.

One of the tragedies of software engineering is that, in industry at least, we’ve successfully avoided the metrics-based quality revolution that transformed manufacturing. Few companies know their pretest bug rates in, say, defects per line of code. Or productivity in lines of code per hour. Or which are the worst functions in a project. In an experiment I conducted surreptitiously last year, only a third of correspondents could answer the seemingly simple question “how big is your code” in any metric they wanted to use (MB, lines of code, lines they developed, value in dollars, or, heck, the weight of the listings).

A management mantra states “if you can’t measure it, you can’t manage it.” Though surely true for building widgets, it’s a lot harder to apply to engineering. But some objective measures provide valuable insight into the art of developing firmware. Since we know complex functions are problematic, it seems logical that we measure and control complexity in functions.

And it turns out that’s pretty easy to do.

But first, there are two kinds of complexity: essential, which is a quality about the problem being solved, and incidental, which is the complexity introduced by our engineering approaches. Here we’re talking about incidental complexity, as that’s the only kind we as developers can manage.

Thirty-one years ago, Thomas McCabe proposed a metric called “cyclomatic complexity” to attach a quantitative measure to a function’s complexity. It’s one of many such measures but is by far the one most used. It measures the number of “edges” and “nodes” in a function, where a node is a computational statement and an edge represents the transfer of control between nodes. Cyclomatic complexity is then:

v(G)=edges–nodes+2

Links:  http://embedded.com/columns/technicalinsights/206901032

How to write secure C/C++ application code

Many flaws are exploitable because the address space of the vulnerable program has memory that is both writable and executable. As a result, an attacker can write to memory using a buffer overflow, for example, and then transfer control to the address of the injected code.

This problem can be addressed by ensuring that memory pages have the minimal permissions required for proper operation (an application of the principle of least privilege). This is not a comprehensive defense against all exploits, but simply a mechanism designed to extend your defense-in-depth strategy by increasing the difficulty for an attacker to exploit a vulnerability.

Several existing systems enforce reduced privileges in the kernel so that no part of the process address space is both writable and executable. This policy has been named “W xor X” or more concisely WAX. Enforcement of this policy results, for example, in a nonexecutable stack.

OpenBSD is an example of a system that implements a WAX policy. In OpenBSD, an application can still request explicit permissions with mprotect() to override the default.

The implementation of a WnX policy depends on the central processing unit (CPU) and memory management unit (MMU) architecture. For processors that feature an execute bit for each page of memory; OpenBSD implements two changes to make this possible:

1. The signal trampoline is removed from the stack. The signal trampoline page is given read and execute permissions, while the stack segment is given read and write permissions. By doing this, the stack becomes nonexecutable.

2. The mapping of shared libraries and heap memory space is changed so that the data segments do not contain objects which are read, write, and executable. This entails moving the shared library global offset table (GOT), the shared library procedure linkage table (PLT), C++ constructors (.ctors), and C++ destructors (.dtors).

The GOT and PLT are mapped onto their own pages, which are made nonwritable. Minor changes to the dynamic linker are needed to accommodate this change. The . dtors and . ctors sections are moved in with the GOT and consequently become nonwritable as well. Making these changes eliminates executable objects from the data segment.

The implementation for architectures that do not support a per-page execute bit (such as IA-32) is more involved but achieves a similar result.

To supplement this scheme, OpenBSD also reduces permissions on readonly strings and pointers stored as constant data. Historically, these objects were stored in the text segment and had permissions of PROT READ IPROT EXEC. This meant that constant data could potentially be executed as shell code.

Because OpenBSD uses the ELF object file format, a new segment, aptly named . rodata, was created to store this data. As a result, this behavior was changed so that now these objects only have PROT READ and they lose PROT EXEC.

Links: http://embedded.com/columns/technicalinsights/202802744

Lilo

LILO is the most used Linux Loader for the x86 flavor of Linux; I’ll call it Lilo rather than LILO here because I don’t appreciate uppercase. This file describes some typical Lilo installations. It’s intended as a supplement to the Lilo User’s Guide. I think examples are informative even if your setup isn’t much like mine. I hope this saves you trouble. Since Lilo’s own documentation is very good, who’s interested in the details is referred to /usr/doc/lilo* (once upon a time said gentlemen like Cameron Spitzer and Alessandro Rubini who have made early versions of this document)

LILO (LInux LOader) is a generic boot loader for Linux.

LILO was originally developed by Werner Almesberger, while its current developer is John Coffman.

LILO does not depend on a specific file system, and can boot an operating system (e.g., Linux kernel images) from floppy disks and hard disks. One of up to sixteen different images can be selected at boot time. Various parameters, such as the root device, can be set independently for each kernel. LILO can be placed either in the master boot record (MBR) or the boot sector of a partition. In the latter case something else must be placed in the MBR to load LILO.

At system start, only the BIOS drivers are available for LILO to access hard disks. For this reason, with very old BIOSes, the accessible area is limited to cylinders 0 to 1023 of the first two hard disks. For later BIOSes, LILO can use 32-bit “logical block addressing” (LBA) to access practically the entire storage of all the harddisks that the BIOS allows access to.

LILO was the default bootloader for most Linux distributions in the years after the popularity of loadlin. Today, most distributions use GRUB as the default bootloader.

Links:

http://tldp.org/HOWTO/LILO.html

http://en.wikipedia.org/wiki/LILO_(boot_loader)