Build a Super Simple Tasker

Almost all embedded systems are event-driven; most of the time they wait for some event such as a time tick, a button press, a mouse click, or the arrival of a data packet. After recognizing the event, the systems react by performing the appropriate computation. This reaction might include manipulating the hardware or generating secondary, “soft” events that trigger other internal software components. Once the event-handling action is complete, such reactive systems enter a dormant state in anticipation of the next event.1

Ironically, most real-time kernels or RTOSes for embedded systems force programmers to model these simple, discrete event reactions using tasks structured as continuous endless loops. To us this seems a serious mismatch, a disparity that’s responsible for much of the familiar complexity of the traditional real-time kernels.

In this article we’ll show how matching the discrete event nature typical of most embedded systems with a simple run-to-completion (RTC) kernel or “tasker” can produce a cleaner, smaller, faster, and more natural execution environment. In fact, we’ll show you how (if you model a task as a discrete, run-to-completion action) you can create a prioritized, fully preemptive, deterministic real-time kernel, which we call Super Simple Tasker (SST), with only a few dozen lines of portable C code.2
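To make the run-to-completion idea concrete, here is a minimal sketch of such a tasker. It is not the authors' SST code: every name in it (sst_post, sst_schedule, the 8-level priority range, the trace buffer) is a hypothetical choice for illustration. It also simplifies heavily. Each task has a single "ready" bit instead of an event queue, and the interrupt locking a real kernel needs around the ready set is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SST_MAX_PRIO 8u               /* assumption: priorities 1..8, 8 is highest */

typedef void (*sst_task_fn)(void);    /* a task is a function that runs to completion */

static sst_task_fn sst_task[SST_MAX_PRIO + 1u]; /* task table indexed by priority */
static uint8_t sst_ready_set;         /* bit (p - 1) set: task of priority p is ready */
static uint8_t sst_curr_prio;         /* priority of the running task, 0 means idle */

/* Register a task function at the given priority. */
void sst_task_init(uint8_t prio, sst_task_fn fn) {
    assert(prio >= 1u && prio <= SST_MAX_PRIO && fn != NULL);
    sst_task[prio] = fn;
}

/* Return the highest ready priority, or 0 when nothing is ready. */
static uint8_t sst_highest_ready(void) {
    uint8_t p = SST_MAX_PRIO;
    while (p > 0u && (sst_ready_set & (1u << (p - 1u))) == 0u) {
        --p;
    }
    return p;
}

/* Run every ready task whose priority is above the one that was running. */
void sst_schedule(void) {
    uint8_t saved = sst_curr_prio;
    uint8_t p;
    while ((p = sst_highest_ready()) > saved) {
        sst_ready_set &= (uint8_t)~(1u << (p - 1u));
        sst_curr_prio = p;
        sst_task[p]();                /* plain C call, so it nests on the same stack */
    }
    sst_curr_prio = saved;
}

/* Mark a task ready; preempt synchronously if it outranks the current task. */
void sst_post(uint8_t prio) {
    assert(prio >= 1u && prio <= SST_MAX_PRIO && sst_task[prio] != NULL);
    sst_ready_set |= (uint8_t)(1u << (prio - 1u));
    if (prio > sst_curr_prio) {
        sst_schedule();
    }
}

/* Demo: a low-priority task posts a high-priority one and is preempted. */
static char sst_trace[8];
static size_t sst_trace_len;

static void trace(char c) {
    if (sst_trace_len < sizeof sst_trace - 1u) {
        sst_trace[sst_trace_len++] = c;
    }
}

static void task_high(void) { trace('H'); }
static void task_low(void)  { trace('l'); sst_post(2u); trace('L'); }

/* Returns the order in which the two demo tasks executed. */
const char *sst_demo(void) {
    sst_trace_len = 0u;
    sst_task_init(1u, task_low);
    sst_task_init(2u, task_high);
    sst_post(1u);
    sst_trace[sst_trace_len] = '\0';
    return sst_trace;
}
```

Calling sst_demo() yields the trace "lHL": the low-priority task starts, the high-priority task runs to completion in the middle of it, and the low-priority task then resumes. Because preemption is an ordinary nested function call, no per-task stack is involved in this sketch.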

Such a real-time kernel is not new; similar kernels are widely used in the industry. Even so, simple RTC schedulers are seldom described in the trade press. We hope that this article provides a convenient reference for those interested in such a lightweight scheduler. But more importantly, we hope to explain why a simple RTC kernel like SST is a perfect match for execution systems built around state machines including those based on advanced UML statecharts. Because state machines universally assume RTC execution semantics, it seems only natural that they should be coupled with a scheduler that expects and exploits the RTC execution model.

We begin with a description of how SST works and explain why it needs only a single stack for all tasks and interrupts. We then contrast this approach with the traditional real-time kernels, which gives us an opportunity to re-examine some basic real-time concepts. Next, we describe a minimal SST implementation in portable ANSI C and back it up with an executable example that you can run on any x86-based PC. We conclude with references to an industrial-strength single-stack kernel combined with an open-source state machine-based framework, which together provide a deterministic execution environment for UML state machines. We’ll assume that you’re familiar with basic real-time concepts, such as interrupt processing, context switches, mutual exclusion and blocking, event queues, and finite state machines.

Links: http://embedded.com/columns/technicalinsights/190302110

Exactly When Do You Need Real Time?

Do most embedded projects still need a real-time operating system (RTOS)? It’s a good question, given the speed of today’s high-performance processors and the availability of patches for real-time Linux, Windows, and other general-purpose operating systems (GPOSs). The answer lies in the very nature of embedded devices.

Devices that, in most cases, are manufactured in the thousands, or millions, of units. Devices where even a $1 reduction in per-unit hardware costs can save the manufacturer a small fortune. Devices, in other words, that can’t afford multi-gigahertz processors or a large memory array. In the automotive telematics market, for instance, the typical 32-bit processor runs at about 200 MHz, a far cry from the 2 GHz or faster processors now common in desktops and servers.

In an environment like this, an RTOS designed to extract extremely fast (and predictable) real-time response times from lower-end hardware offers a serious economic advantage. Savings aside, the services provided by an RTOS make many computing problems easier to solve, particularly when multiple activities compete for a system’s resources.

Consider, for instance, a system where users expect (or need) immediate response to input. With an RTOS, a developer can guarantee that operations initiated by the user will execute in preference to other system activities, unless a more important activity (for instance, an operation that helps protect the user’s safety) must execute first.

Consider also a system that must satisfy quality of service (QoS) requirements, such as a device that presents live video. If the device depends on software for any part of its content delivery, it can experience dropped frames at a rate that users perceive as unacceptable – from the users’ perspective, the device is unreliable. But, with an RTOS, the developer can precisely control the order in which software processes execute and thereby ensure that playback occurs at an appropriate and consistent media rate.

RTOSs Aren’t “Fair”
The need for “hard” real time – and for OSs that enable it – remains prevalent in the embedded industry. The question is, what does an RTOS have that a GPOS doesn’t? And how useful are the real-time extensions now available for some GPOSs? Can they provide a reasonable facsimile of RTOS performance?

Let’s begin with task scheduling. In a GPOS, the scheduler typically uses a “fairness” policy to dispatch threads and processes onto the CPU. Such a policy enables the high overall throughput required by desktop and server applications, but offers no assurances that high-priority, time-critical threads will execute in preference to lower-priority threads.

For instance, a GPOS may decay the priority assigned to a high-priority thread, or otherwise dynamically adjust the priority in the interest of fairness to other threads in the system. A high-priority thread can, as a consequence, be preempted by threads of lower priority. In addition, most GPOSs have unbounded dispatch latencies: the more threads in the system, the longer it takes for the GPOS to schedule a thread for execution. Any one of these factors can cause a high-priority thread to miss its deadlines, even on a fast CPU.

In an RTOS, on the other hand, threads execute in order of their priority. If a high-priority thread becomes ready to run, it can, within a small and bounded time interval, take over the CPU from any lower-priority thread that may be executing. Moreover, the high-priority thread can run uninterrupted until it has finished what it needs to do – unless, of course, it is preempted by an even higher-priority thread. This approach, known as priority-based preemptive scheduling, allows high-priority threads to meet their deadlines consistently, no matter how many other threads are competing for CPU time.

Links: http://embedded.com/columns/technicalinsights/193001454

Introduction to Serial Peripheral Interface

SPI vs. I2C

Both SPI and I2C provide good support for communication with slow peripheral devices that are accessed intermittently. EEPROMs and real-time clocks are examples of such devices. But SPI is better suited than I2C for applications that are naturally thought of as data streams (as opposed to reading and writing addressed locations in a slave device). An example of a “stream” application is data communication between microprocessors or digital signal processors. Another is data transfer from analog-to-digital converters.

SPI can also achieve significantly higher data rates than I2C. SPI-compatible interfaces often range into the tens of megahertz. SPI really gains efficiency in applications that take advantage of its duplex capability, such as the communication between a “codec” (coder-decoder) and a digital signal processor, which consists of simultaneously sending samples in and out.

SPI devices communicate using a master-slave relationship. Because it lacks built-in device addressing, SPI requires more effort and more hardware resources than I2C when more than one slave is involved. But SPI tends to be simpler and more efficient than I2C in point-to-point (single master, single slave) applications for the very same reason: the lack of device addressing means less overhead.

Inside the box

SPI is a serial bus standard established by Motorola and supported in silicon products from various manufacturers. SPI interfaces are available on popular communication processors such as the MPC8260 and microcontrollers such as the M68HC11. It is a synchronous serial data link that operates in full duplex (signals carrying data go in both directions simultaneously).

Devices communicate using a master/slave relationship, in which the master initiates the data frame. When the master generates a clock and selects a slave device, data may be transferred in either or both directions simultaneously. In fact, as far as SPI is concerned, data are always transferred in both directions. It is up to the master and slave devices to know whether a received byte is meaningful or not. So a device must discard the received byte in a “transmit only” frame or generate a dummy byte for a “receive only” frame.
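The always-bidirectional exchange can be modeled in software as two 8-bit shift registers swapping contents over eight clocks. The sketch below is a simulation only and touches no real SPI peripheral. The names spi_device, spi_transfer, spi_write_only and spi_read_only are hypothetical, as is the choice of MSB-first ordering and of 0xFF as the dummy byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPI_DUMMY_BYTE 0xFFu   /* assumption: filler sent during "receive only" frames */

/* Each side of the link is modeled as one 8-bit shift register. */
typedef struct {
    uint8_t shift_reg;
} spi_device;

/*
 * Clock one 8-bit frame, MSB first. On every clock the master shifts a bit
 * out to the slave while the slave shifts a bit out to the master, so after
 * eight clocks the two registers have exchanged their contents.
 */
uint8_t spi_transfer(spi_device *master, spi_device *slave, uint8_t tx) {
    int clk;
    assert(master != NULL && slave != NULL);
    master->shift_reg = tx;
    for (clk = 0; clk < 8; ++clk) {
        uint8_t mosi = (uint8_t)(master->shift_reg >> 7);  /* master out, slave in */
        uint8_t miso = (uint8_t)(slave->shift_reg >> 7);   /* master in, slave out */
        master->shift_reg = (uint8_t)((master->shift_reg << 1) | miso);
        slave->shift_reg  = (uint8_t)((slave->shift_reg << 1) | mosi);
    }
    return master->shift_reg;   /* the byte the slave held before the frame */
}

/* "Transmit only" frame: a byte still arrives, and the master discards it. */
void spi_write_only(spi_device *master, spi_device *slave, uint8_t tx) {
    (void)spi_transfer(master, slave, tx);
}

/* "Receive only" frame: the master must still send something, so it sends a dummy. */
uint8_t spi_read_only(spi_device *master, spi_device *slave) {
    return spi_transfer(master, slave, SPI_DUMMY_BYTE);
}
```

If the slave register holds 0xA5 and the master sends 0x3C, spi_transfer returns 0xA5 and leaves 0x3C in the slave, which is the simultaneous two-way transfer described above.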

Links: http://embedded.com/columns/beginerscorner/9900483 

Internet Control Message Protocol

The Internet Control Message Protocol (ICMP) is one of the core protocols of the Internet protocol suite. It is chiefly used by networked computers’ operating systems to send error messages—indicating, for instance, that a requested service is not available or that a host or router could not be reached.

ICMP differs in purpose from TCP and UDP in that it is not used to send and receive data between end systems. It is usually not used directly by user network applications; notable exceptions are the ping and traceroute tools.

ICMP is defined in RFC 792 as part of the Internet protocol suite. ICMP messages are typically generated in response to errors in IP datagrams (as specified in RFC 1122) or for diagnostic or routing purposes.

The version of ICMP for Internet Protocol version 4 is also known as ICMPv4, as it is part of IPv4. IPv6 has an equivalent protocol, ICMPv6.

ICMP messages are constructed at the IP layer, usually from a normal IP datagram that has generated an ICMP response. IP encapsulates the appropriate ICMP message with a new IP header (to get the ICMP message back to the original sending host) and transmits the resulting datagram in the usual manner.

For example, every machine (such as intermediate routers) that forwards an IP datagram has to decrement the time to live (TTL) field of the IP header by one; if the TTL reaches 0, an ICMP Time to live exceeded in transit message is sent to the source of the datagram.
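The forwarding rule above can be sketched as a single decision function. This is an illustration under stated assumptions, not code from any real router: router_forward and fwd_action are hypothetical names, and a real implementation would also update the IP header checksum and actually build the ICMP message. The type and code values in the enum are those RFC 792 assigns to "time to live exceeded in transit".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ICMP Time Exceeded: type 11, code 0 means "time to live exceeded in transit". */
enum { ICMP_TYPE_TIME_EXCEEDED = 11, ICMP_CODE_TTL_EXCEEDED = 0 };

typedef enum {
    FWD_FORWARD,                  /* pass the datagram on to the next hop */
    FWD_DROP_SEND_TIME_EXCEEDED   /* discard it and notify the source via ICMP */
} fwd_action;

/*
 * Decrement the TTL of a datagram that is about to be forwarded and decide
 * what to do with it. The TTL is updated in place.
 */
fwd_action router_forward(uint8_t *ttl) {
    assert(ttl != NULL);
    if (*ttl > 0u) {
        --*ttl;
    }
    return (*ttl == 0u) ? FWD_DROP_SEND_TIME_EXCEEDED : FWD_FORWARD;
}
```

A datagram sent with a TTL of 2 is forwarded by the first router (TTL becomes 1) and dropped by the second, which is the point at which the Time Exceeded message goes back to the source.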

Each ICMP message is encapsulated directly within a single IP datagram, and thus, like UDP, ICMP is unreliable.

Although ICMP messages are contained within standard IP datagrams, ICMP messages are usually processed as a special case, distinguished from normal IP processing, rather than processed as a normal sub-protocol of IP. In many cases, it is necessary to inspect the contents of the ICMP message and deliver the appropriate error message to the application that generated the original IP packet, the one that prompted the sending of the ICMP message.

Many commonly-used network utilities are based on ICMP messages. The traceroute command is implemented by transmitting UDP datagrams with specially set IP TTL header fields, and looking for ICMP Time to live exceeded in transit (above) and “Destination unreachable” messages generated in response. The related ping utility is implemented using the ICMP “Echo request” and “Echo reply” messages.
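As a rough illustration of what ping puts on the wire, the sketch below builds an ICMP Echo request in a byte buffer and turns it into the matching Echo reply. It only formats the ICMP message. It opens no socket and adds no IP header. The function names and the buffer-based interface are hypothetical. The type values (8 for Echo request, 0 for Echo reply) come from RFC 792, and the checksum is the standard Internet checksum described in RFC 1071.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ICMP_ECHO_REPLY = 0, ICMP_ECHO_REQUEST = 8 };  /* type values from RFC 792 */
#define ICMP_HEADER_LEN 8u  /* type, code, checksum, identifier, sequence number */

/* Internet checksum: one's-complement sum of big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1u < len; i += 2u) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1u];
    }
    if (i < len) {                          /* odd length: pad with a zero byte */
        sum += (uint32_t)data[i] << 8;
    }
    while ((sum >> 16) != 0u) {             /* fold the carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)(~sum & 0xFFFFu);
}

/* Recompute the checksum field (bytes 2 and 3) over the whole message. */
static void icmp_store_checksum(uint8_t *msg, size_t len) {
    uint16_t csum;
    msg[2] = 0;
    msg[3] = 0;
    csum = inet_checksum(msg, len);
    msg[2] = (uint8_t)(csum >> 8);
    msg[3] = (uint8_t)(csum & 0xFFu);
}

/* Write an Echo request into buf; returns its length, or 0 if buf is too small. */
size_t icmp_build_echo_request(uint8_t *buf, size_t cap, uint16_t id, uint16_t seq,
                               const uint8_t *payload, size_t payload_len) {
    size_t total = ICMP_HEADER_LEN + payload_len;
    size_t i;
    assert(buf != NULL && (payload != NULL || payload_len == 0u));
    if (total > cap) {
        return 0;
    }
    buf[0] = ICMP_ECHO_REQUEST;             /* type */
    buf[1] = 0;                             /* code is always 0 for echo */
    buf[4] = (uint8_t)(id >> 8);
    buf[5] = (uint8_t)(id & 0xFFu);
    buf[6] = (uint8_t)(seq >> 8);
    buf[7] = (uint8_t)(seq & 0xFFu);
    for (i = 0; i < payload_len; ++i) {
        buf[ICMP_HEADER_LEN + i] = payload[i];
    }
    icmp_store_checksum(buf, total);
    return total;
}

/* Convert a request in place into the reply a host would send back. */
void icmp_make_echo_reply(uint8_t *msg, size_t len) {
    assert(msg != NULL && len >= ICMP_HEADER_LEN);
    msg[0] = ICMP_ECHO_REPLY;
    icmp_store_checksum(msg, len);
}
```

A receiver can validate either message by running inet_checksum over the whole thing, checksum field included: a result of 0 means the message arrived intact.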

Links:

http://candle.ctit.utwente.nl/wp5/tel-sys/exercises/icmp/icmp.html

http://en.wikipedia.org/wiki/Internet_Control_Message_Protocol

A sign of confusion

In C and C++, the unusual nature of char leaves many programmers puzzled about when to use plain char in preference to an explicitly signed or unsigned char.

All of the integer types in C and C++ come in signed and unsigned variants. In all cases but one, the signed variant is the default. For instance, the type specifier int is short for signed int, and long int is short for signed long int. The exception is the char types.

The plain char type has the same representation and behavior as either signed char or unsigned char, but plain char is nonetheless a distinct type. For example, even with a compiler that implements plain char the same as signed char, the following pointer assignment is an error:

char *pc;
signed char *psc;

pc = psc; // invalid conversion

Many compilers tolerate this conversion, but the language standards consider it to be an error.
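One way to see that the three char types really are distinct is a C11 generic selection, which rejects two associations naming compatible types. The sketch below assumes a C11 compiler. CHAR_KIND, the enum, and the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>

enum char_kind { KIND_PLAIN_CHAR, KIND_SIGNED_CHAR, KIND_UNSIGNED_CHAR };

/*
 * Listing all three types side by side compiles only because they are three
 * distinct types, even though plain char behaves like one of the other two.
 */
#define CHAR_KIND(x) _Generic((x),          \
    char:          KIND_PLAIN_CHAR,         \
    signed char:   KIND_SIGNED_CHAR,        \
    unsigned char: KIND_UNSIGNED_CHAR)

/* Which of its cousins plain char behaves like is the implementation's choice. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Self-check: each object is classified by its own type, not by its behavior. */
void check_char_kinds(void) {
    char c = 'x';
    signed char sc = 'x';
    unsigned char uc = 'x';
    assert(CHAR_KIND(c) == KIND_PLAIN_CHAR);
    assert(CHAR_KIND(sc) == KIND_SIGNED_CHAR);
    assert(CHAR_KIND(uc) == KIND_UNSIGNED_CHAR);
}
```

Whatever plain_char_is_signed() reports on a given compiler, CHAR_KIND still puts a plain char in its own category.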

The unusual nature of char–that it’s distinct from its signed and unsigned cousins, but not completely so–leaves many programmers puzzled about when to use plain char in preference to an explicitly signed or unsigned char. Too often, programmers guess wrong, and find themselves compounding the error by using casts. The following letter from a reader typifies the problem:

Recently I faced a problem where I was using an object declared as:

signed char *ptr;

I tried to do something such as:

if (ptr[0] == 0xFF)

Using the debugger, I could see that ptr[0] always had the value 0xFF but the condition in the if-statement was always false. When I looked at the disassembled code, the register containing ptr[0]’s value showed 0xffffffff.

I solved the problem by casting ptr[0] to unsigned char. Though I got the expression to evaluate to true, I’m not quite sure how it works.

As I’ve explained in past columns1,2, using a cast is often an indication that you’re doing something wrong. That’s the case here.

Here’s what’s happening with that conditional expression. The left operand, ptr[0], is a signed char. On a typical machine with 8-bit bytes and twos-complement arithmetic, a signed char has values in the range -128 to +127. If ptr[0] contains the bit pattern 0xFF, the decimal arithmetic value of ptr[0] is -1, not 255.

The right operand in the conditional expression, the literal 0xFF, is an int, or more precisely, a signed int.3 It’s not a signed char. As a signed int, 0xFF has the value 255 (decimal).

According to the standard, when an expression compares a signed char with a signed int, the program promotes the signed char to signed int prior to doing the compare. The resulting signed int has the same value as the signed char, which in this case is -1. On a 32-bit twos-complement machine, -1 (decimal) is represented as 0xFFFFFFFF.

In short, ptr[0] is a signed char whose value is -1, and 0xFF is a signed integer whose value is 255, and their values are not equal.

The way to avoid such surprising behavior is to use objects and literals whose types can be combined safely without explicit conversions. For example, when you test the value of a plain char, you should compare it with another plain char or character constant, not with an int.
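The reader's comparison can be reproduced with the promotion written out explicitly. This sketch assumes 8-bit bytes and twos-complement arithmetic, as in the discussion above. The function names are hypothetical. Declaring raw bytes as unsigned char is one way to apply the advice, offered here as an illustration rather than as the only fix.

```c
#include <assert.h>
#include <stddef.h>

/* The reader's test: the signed char is promoted to int before the compare. */
int matches_ff_signed(const signed char *ptr) {
    int promoted;
    assert(ptr != NULL);
    promoted = ptr[0];          /* -1 when the byte holds the bit pattern 0xFF */
    return promoted == 0xFF;    /* -1 == 255 is false */
}

/* Same test on an unsigned char: the promoted value is 255, so it can match. */
int matches_ff_unsigned(const unsigned char *ptr) {
    int promoted;
    assert(ptr != NULL);
    promoted = ptr[0];          /* 255 when the byte holds the bit pattern 0xFF */
    return promoted == 0xFF;
}

/* Plain char used for text, compared against a character constant. */
int starts_with_a(const char *text) {
    assert(text != NULL);
    return text[0] == 'a';
}
```

With a byte holding 0xFF, matches_ff_signed returns 0 and matches_ff_unsigned returns 1, and neither function contains a cast.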

Links: http://embedded.com/columns/programmingpointers/206107018
