Introduction to Real Time Operating Systems Part 1

Fabrice Beya
10 min read · May 2, 2021


One of the fundamental goals of an operating system is to make writing software easier. It does this by abstracting the low-level software and hardware resources and exposing them through APIs and services that make writing applications more convenient. Typical operating system functions include managing hardware resources such as memory and the CPU, co-ordinating system activities and providing various supporting services to applications. Depending on the software environment and industry, operating systems vary in how they perform these tasks. In this article we will outline why Real-Time Operating Systems are needed.

The Polled Loop Approach

Most embedded software engineers begin their journey with the traditional Polled Loop approach. Given that an embedded device continuously runs a single while loop from power on to power off, the Polled Loop approach involves placing all your software inside one main while loop. A common refinement is to subdivide the different functionalities of your system into separate functions that are called from the main while loop. This approach works well for simple systems.

Figure 1. Polled Loop Approach

Case Study: An Electric Toothbrush

As one can expect, there are many limitations to this approach. To demonstrate them, let us take a basic electric toothbrush as an example. Below is a system overview of this toothbrush.

Figure 2. Overview of an Electric Toothbrush System

System Description:

When you press the power button, the toothbrush turns on and the Power LED indicator lights up. When you press the Normal Speed button the toothbrush motor runs at 50% of full speed, and when you press the High Speed button the motor runs at 100% of full speed. When you press and hold the power button for more than 3 seconds, the Power LED goes off and the toothbrush turns off.

In typical embedded style we can derive a state diagram, which will be the input to our software logic.

Figure 3. State diagram of Electric Toothbrush

So now that we have a basic system overview, let’s apply the Polled Loop approach to this system. Figure 4 below depicts, in pseudo-code, how one would apply the Polled Loop approach to this system.

Figure 4. Polled Approach Pseudo-Code
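
As a rough companion to the pseudo-code, here is a minimal C sketch of the same idea. The state names, the `brush_inputs_t` struct and every function name are hypothetical, read off the system description rather than taken from the figure, and the hardware reads and writes are left as comments because they depend on the board.

```c
#include <stdbool.h>

/* Hypothetical states, read off the system description above. */
typedef enum { STATE_OFF, STATE_IDLE, STATE_NORMAL, STATE_HIGH } brush_state_t;

/* One snapshot of the inputs, sampled once per pass through the loop. */
typedef struct {
    bool power_pressed;   /* short press of the power button      */
    bool power_held_3s;   /* power button held for more than 3 s  */
    bool normal_pressed;  /* normal speed button                  */
    bool high_pressed;    /* high speed button                    */
} brush_inputs_t;

/* Decide the next state from the current state and the sampled inputs. */
brush_state_t brush_next_state(brush_state_t s, brush_inputs_t in)
{
    if (s == STATE_OFF) {
        /* While off, only the power button does anything. */
        return in.power_pressed ? STATE_IDLE : STATE_OFF;
    }
    if (in.power_held_3s)  return STATE_OFF;
    if (in.high_pressed)   return STATE_HIGH;
    if (in.normal_pressed) return STATE_NORMAL;
    return s;
}

/* Motor duty cycle in percent for a given state. */
int brush_motor_duty(brush_state_t s)
{
    switch (s) {
    case STATE_NORMAL: return 50;
    case STATE_HIGH:   return 100;
    default:           return 0;
    }
}

/* The Power LED is lit in every state except OFF. */
bool brush_led_on(brush_state_t s)
{
    return s != STATE_OFF;
}

/* The polled loop itself: sample, decide, drive, repeat.
 * Here it walks over recorded samples so it can run on a PC;
 * on a device this would be while (1) with real pin reads. */
brush_state_t brush_run(const brush_inputs_t *samples, int n)
{
    brush_state_t s = STATE_OFF;
    for (int i = 0; i < n; i++) {
        s = brush_next_state(s, samples[i]);
        /* On hardware: drive the LED from brush_led_on(s)
         * and the motor from brush_motor_duty(s). */
    }
    return s;
}
```

Keeping the decision logic in a pure function such as `brush_next_state` is a design choice of this sketch: it lets the state diagram be checked on a desktop before any hardware is involved.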

The Dead Time Problem

A key thing to keep in mind is that most embedded systems execute program steps sequentially, i.e. only one step is executed at a time. If you consider that in some cases you have to listen to 3 possible inputs and, based on those inputs, determine which outputs to drive, the first challenge emerges: dividing the processor’s input-monitoring attention across the 3 buttons so that each button has an equal probability of being detected. This means that there exist instants in time where one of the buttons is not detectable because the processor is busy checking another button. Let’s call this Input Dead Time: a period of time during which some buttons may be unresponsive. Notably, this dead time is closely linked to the processor speed, and for simple systems such as our electric toothbrush it may well be negligible given how cheap and fast modern processors have become.

The Delay Problem

The next problem to consider is an extension of the one above. Let’s have a look at how an embedded processor manages outputs, particularly when it comes to executing delays. We have a scenario where the processor needs to detect when a button has been pressed for 3 seconds, which implies that the processor needs to execute a 3 second delay. Under the Polled Loop approach, the processor is locked into this delay for 3 seconds, preventing it from doing any other task such as listening for additional button inputs. The overall impact is that the system becomes unreliable, in the sense that, depending on which button you press and when, some inputs will not get a response. In summary, the Polled Loop approach’s main weakness is its inability to manage processor attention in a manner that does not result in data being lost, which in turn makes the overall system unpredictable.
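
To put a number on the lost inputs, here is a small C model of the problem. It is only a sketch under a simplifying assumption: a press counts as lost when the button is pushed and released entirely while the loop is stuck in the delay, so the loop never gets a chance to sample it. The `press_t` type and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* A button press: when it starts and how long the button stays down. */
typedef struct {
    uint32_t start_ms;
    uint32_t duration_ms;
} press_t;

/* Count presses that begin and end inside the blocking delay window
 * [delay_start_ms, delay_start_ms + delay_ms). The polled loop cannot
 * sample any input during that window, so these presses are never seen. */
size_t presses_lost_to_delay(const press_t *presses, size_t n,
                             uint32_t delay_start_ms, uint32_t delay_ms)
{
    uint32_t delay_end_ms = delay_start_ms + delay_ms;
    size_t lost = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t press_end_ms = presses[i].start_ms + presses[i].duration_ms;
        if (presses[i].start_ms >= delay_start_ms &&
            press_end_ms <= delay_end_ms) {
            lost++;
        }
    }
    return lost;
}
```

With a 3000 ms delay starting at time 0, a 200 ms press at 100 ms is lost, while a 200 ms press at 2900 ms is still held down when the loop resumes and so can be detected.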

Introducing Interrupt Services

One way to resolve the Delay Problem presented above is to utilise a feature available in most processors called an interrupt. Simply put, an interrupt is a processor feature that allows it to be notified when a specific event has occurred, suspend its current task, execute the code written in response to that event (the interrupt handler, or interrupt service routine), and then return to the previously suspended task on completion. This adjusts our pseudo-code to something similar to Figure 5, where you are still using a Polled Loop, but at any given moment the processor can be interrupted to handle specific events mapped to our system inputs. In our electric toothbrush context this means that we can implement an interrupt handler for each one of our button inputs, which increases the responsiveness of the overall system by largely eliminating our dead time problem. Even if the processor is stuck executing a delay, it will be interrupted from that delay and execute the code related to a new button input. Another useful property of interrupt handlers is that, on processors that support nested interrupts, they can themselves be interrupted by a higher-priority interrupt. So in the case of our toothbrush, if a user selects normal speed and then high speed in quick succession, and the high speed interrupt has the higher priority, the processor will halt the normal speed handler and go on to execute the high speed handler. So for the system that we have, the Polled Loop with interrupts could be just about enough.

Figure 5. Polled Loop with Interrupts
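
One common way to structure this in C is to keep the handlers tiny: each one only records what was requested, and the main loop picks the request up on its next pass. The sketch below assumes that pattern; the handler names and `take_pending_duty` are hypothetical, and how the handlers get attached to the button pins is device-specific and not shown.

```c
#include <stdint.h>

/* Latest speed request from the buttons, in percent. 0 means "nothing
 * pending". It is volatile because the handlers change it behind the
 * main loop's back. */
static volatile uint8_t pending_duty = 0;

/* Hypothetical interrupt handlers, one per speed button. */
void normal_button_isr(void) { pending_duty = 50; }
void high_button_isr(void)   { pending_duty = 100; }

/* Called from the polled loop: fetch the pending request and clear it.
 * Returns 0 when no button has been pressed since the last call.
 * On real hardware, interrupts would be masked around these two lines
 * so that a press arriving between them is not wiped out. */
uint8_t take_pending_duty(void)
{
    uint8_t duty = pending_duty;
    pending_duty = 0;
    return duty;
}
```

Because the handlers run even while the main loop is busy, a press made during a long delay is remembered rather than lost. If normal and then high are pressed before the loop gets around to checking, the later request wins.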

Welcome to the 4th Industrial Revolution

Let’s get with the times and make our electric toothbrush Smart, aka let’s chuck every technology buzzword into our product so our marketing team can drop the infamous 4IR tagline. So let’s start by adding a nice OLED display to provide user-friendly feedback. We can also include a speech synthesiser that provides meaningful audio feedback, e.g. letting users know whether they have brushed their teeth long enough or whether more brush time needs to be applied in certain areas, and of course this intelligent speech guidance system must be powered by Machine Learning and AI. What’s 4IR without some form of spying, oops, I meant data mining? Let’s keep system logs of all our users’ actions and use them to build behavioural models of their oral hygiene. Let’s not forget IoT, so we need some Bluetooth connectivity to move all the local data from the toothbrush onto a mobile app, which will provide a nice interface for users to keep tabs on their oral health. This can also feed nicely into existing health apps, which are already doing a great job of tracking all the other elements of your health; now they have your oral data too, along with frequent reminders to look after your teeth. And of course, this can all feed into a glorious advertising marketplace, where we can sell our users’ oral data to Data Brokers, giving our product a solid revenue stream. In exchange, our users’ social media feeds can now be bombarded with amazing new products to improve their oral health, as we join the Data Industrial Complex.

Figure 6. Overview of 4IR Smart Toothbrush

The Concurrency Problem

Let’s consider whether or not a Polled Loop with interrupts will still be adequate for our new Smart Toothbrush. Our old toothbrush system has only two key outputs that need to be managed: the Power LED and the Motor Speed. The Power LED is driven by one input, the power button, and the Motor Speed is driven by two button inputs. Both cases share a many-to-one property, i.e. one or more inputs drive a single output. This is in contrast to our new Smart Toothbrush, which has multiple inputs driving multiple concurrent outputs. When the user selects high speed mode, the system needs to log this action, the OLED display needs to render some text indicating the new mode, the speech synthesiser needs to play some audio indicating that high speed mode is now active, and let’s not forget the motor speed, which is adjusted to 100%. Thus you have a single input which ideally needs to drive 4 outputs concurrently. This scenario raises a core issue with the Polled Loop with Interrupts approach: even though you can respond almost instantaneously to every input, the processor still needs an efficient way to provide simultaneous outputs, and as it stands it is only able to drive one output at a time. The more tasks you add to the system, the more the need for concurrency emerges, so a more scalable solution is needed to manage concurrency at an embedded level. It turns out that one of the most effective ways to resolve this problem is to introduce a Real Time Operating System.

What is Real Time

Not all software universes are equal. There are worlds where being 1 second late goes unnoticed, whereas in another world it will cost one or many lives. A simple example: when one logs onto their favourite app, the fact that it sometimes loads in 2 seconds and other times in 3 seconds might go unnoticed by most users. If you consider a car in the context of an accident, deploying the airbags 1 second later than needed could result in the loss of life. A deeper dive into this topic leads one to segment the software universe. A stark distinction exists between consumer products and what are termed Safety & Mission critical products, the latter generally being governed by standards which in many instances are legally enforced. Examples of Safety & Mission critical systems are medical devices, aeroplanes, military systems and almost all automotive systems, which include Traction and Braking systems, all of which share the property that incorrect system outputs have a high probability of causing harm to humans and the environment. To characterise this better, one needs to define what we mean by a correct output.

Systems can be characterised as real time or not by establishing whether the correctness of an output depends on its timing as well as the result. A real time system must have a known maximum time for each of the critical operations it performs.

In the spirit of academia, let’s get some terminology out of the way:

Determinism:

An application is deterministic if its timing can be guaranteed within a certain margin of error.

Hard Real-Time:

A system that guarantees a maximum operation time all of the time.

Soft Real-Time:

A system that meets its maximum operation time most of the time, but not always.

Jitter:

The amount of error in the timing of a task over subsequent iterations of a program.
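
To make the jitter definition concrete, here is a minimal C sketch that measures it for a periodic task. It assumes the start time of each iteration has been recorded in microseconds and that the task is meant to run every `period_us`. The function name and the choice to report the worst single deviation are assumptions of this sketch; other summaries, such as an average, are equally valid.

```c
#include <stddef.h>
#include <stdint.h>

/* Worst-case jitter of a periodic task: the largest difference between
 * an observed interval (start of one iteration to start of the next)
 * and the intended period. Returns 0 for fewer than two samples. */
uint32_t max_jitter_us(const uint32_t *start_us, size_t n, uint32_t period_us)
{
    uint32_t worst = 0;

    for (size_t i = 1; i < n; i++) {
        uint32_t interval = start_us[i] - start_us[i - 1];
        uint32_t error = (interval > period_us) ? interval - period_us
                                                : period_us - interval;
        if (error > worst) {
            worst = error;
        }
    }
    return worst;
}
```

For a task meant to run every 1000 µs with recorded start times 0, 1000, 2010, 2995 and 4000, the intervals are 1000, 1010, 985 and 1005, so the worst-case jitter is 15 µs. A hard real-time system would need that figure to stay under its stated bound on every run, not just on most runs.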

Real-Time Operating System

A Real-Time Operating System (RTOS) is an OS that is optimised to provide a low amount of jitter when programmed correctly. Figure 7 depicts a more realistic view of what jitter looks like in most operating systems: one can never guarantee zero jitter in a system, but one can ensure and, most importantly, demonstrate low jitter. As already alluded to, a car is a good example of a system that requires a Hard Real-Time operating system, given that a small error in timing, aka high jitter, can have catastrophic consequences. On the other hand, a smartphone is a good example of a system that only requires a Soft Real-Time operating system, given that delays in most of its operations can be tolerated without much harm to the end user or their environment.

Figure 7. Overview of Jitter

Like all operating systems, an RTOS is built around a core piece of governing software called a Kernel. The kernel governs all the low-level system resources such as the processor, along with other hardware and even low-level firmware that acts as a proxy to lower-level hardware systems. All activities in an RTOS are segregated into Tasks, which are managed by the kernel. Tasks can run concurrently using a feature called Time Slicing, where the kernel divides processor time between tasks so that their outputs appear concurrent to the end user. The kernel also allows us to specify task priorities and to pass messages between tasks. Depending on the RTOS, tasks can be set up to run forever, go to sleep, run at periodic intervals and more. All of these features come in very handy when dealing with a large system housing multiple features. In the end our final Smart Toothbrush will look something like Figure 8, where we have our kernel managing 8 different tasks, all running concurrently with different priority levels based on their importance.

Figure 8. Overview of our “Smart Toothbrush” Real Time Operating System
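
To give a feel for what the kernel does at each time slice, here is a minimal C sketch of one possible scheduling decision. The policy is an assumption for illustration: run the highest-priority task that is ready, and share the processor round-robin between ready tasks of equal priority. Real kernels differ in the details, and the `task_t` type, the function name and the task names in the usage note are hypothetical.

```c
#include <stdbool.h>

/* A task as this sketch sees it: a name, a priority (higher number
 * means more important) and whether it currently has work to do. */
typedef struct {
    const char *name;
    int priority;
    bool ready;
} task_t;

/* Choose which task gets the next time slice.
 * `last` is the index of the task that ran previously, or -1 at start-up.
 * Returns the index of the chosen task, or -1 if no task is ready. */
int pick_next_task(const task_t *tasks, int n, int last)
{
    int best = -1;

    /* Scan every task once, starting just after the one that ran last.
     * Only a strictly higher priority replaces the current choice, so
     * among equals the first one found after `last` wins, which
     * produces the round-robin behaviour. */
    for (int k = 1; k <= n; k++) {
        int i = (last + k) % n;
        if (!tasks[i].ready) {
            continue;
        }
        if (best < 0 || tasks[i].priority > tasks[best].priority) {
            best = i;
        }
    }
    return best;
}
```

With a display task and an audio task at the same priority, repeated calls alternate between them, which is what makes their outputs look simultaneous to the user. As soon as a higher-priority motor task becomes ready, it is picked ahead of both.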

Conclusion

In conclusion, we had a brief look at what an operating system is and built our way from the bottom up by looking at a case study of a system using a Polled Loop approach. We then looked at the benefits of introducing interrupts to this system and highlighted some of the key challenges with this approach. Lastly, we discussed the need for a way to manage concurrent tasks in a scalable manner, which led us to defining what a real time operating system is, along with the benefits it provides. This article is part of a five-part series on real time operating systems. In the next article I will dive deeper into the mechanics of an RTOS and unpack some of the benefits and drawbacks that come with using one.
