RK0

Embedded Real-Time Kernel '0'

Real-Time Model and Service Map in RK0

This post describes the principles that drive RK0 design and its service map (version 0.18.1).

Usage examples and design internals can be found on the Docbook.

Design Rationale

Concurrency is already difficult; when you also need to bound it to real-time responsiveness, you eventually ask yourself whether we have the right tools for the job. The crude answer is that we don't: response time emerges from implementation. To increase our chances of getting it right, we need a simple mental model.

Predictability is paramount in real-time systems, and service semantics affect predictability. This is an established fact, not an opinion.

Real-time system software – or cyber-physical systems – is highly domain-specific and hardware-dependent; yet we keep pursuing commonality at the hardware-software interface, despite the fact that an interface can express syntax and only some semantics. We have seen enough cases to conclude that correctness by interface-as-contract is dangerously limited.

RK0 starts from something that has been overlooked: the commonality among cyber-physical systems is their concurrency model, which is characterised by urgency and precedence conditions. Because of that, regardless of the domain or hardware, the application layer repeatedly deals with coordination problems that are neither exhaustive nor unknown. They can be expressed as services.

Still, many real-time kernels follow a general-purpose habit of over-generalisation: overloaded primitives whose meaning is derived from usage.

The absence of semantics can be desired neutrality, but often turns out to be displacement. Meaning is not removed. It is pushed into the application, usually without framing, where it degenerates easily.

Modelling Events, Signals and Messages

A system has state variables that define how it behaves. A change in a state variable is caused by an event. The periodic hardware interrupt (SysTick) that increments the kernel runtime count is an example.

The notion of execution progress on a digital computer arises from observable changes in state, whatever that state represents. Therefore, given two logical observation instants, if the observed state differs, at least one event must have occurred in (real) time. Note that if there is no difference, we cannot state that no event has happened.

In this sense, an event is a logical construct derived from observed reality. Time runs on a continuum; the computer samples reality with varying granularity. Computation, therefore, always lags behind. Aware of that, a real-time system’s goal is reacting to external stimuli so that a result is delivered to the environment while it is still useful.

On a real-time kernel, execution progress follows the urgency of tasks and precedence conditions. We design concurrent units (Tasks) and use kernel services to coordinate their execution so that it is ordered, producing a final response that is bounded in time.

This coordination is achieved by Inter-task Communication. In RK0 tasks send/receive information in the form of Signals or Messages.

A Signal (or a Signal token) signifies an occurrence. When a task checks for a signal, it is sufficient for the signal to be present for the task to progress. The operation of signalling another task does not affect the sender task. A signal is a notification, never a ‘request’.

A Message conveys structured, variable information. Progress emerges from how the sender and receiver handle the information in messages, as well as from the mechanism itself, as described on this page.


RK0 Services Map

| Service | Primary semantics | Secondary properties | Core/optional |
|---|---|---|---|
| Task Event | non-cumulative private occurrence | bitwise, consumable, task-local | Core |
| Task Mail | directed state transfer; can be retained or not | one-slot, task-to-task, latest wins, both blocking/non-blocking recv() | Core |
| Sleep Delay | relative temporal suspension | delay | Core |
| Sleep Until | compensated periodic sleep | local reference timing; returns immediately when overrun, prioritises execution count | Core |
| Sleep and Release | compensated phase-aligned periodic sleep | global phase-grid reference; skips release if overrun, prioritises phase over execution count | Core |
| Semaphore (binary) | non-cumulative occurrence or availability | one-bit availability / signalling / mutual exclusion (limited) | Opt. |
| Semaphore (counting) | cumulative occurrence or availability | cumulative availability / signalling | Opt. |
| Mutex Lock | exclusive ownership of shared state | priority inheritance, explicit owner | Opt. |
| Sleep Queue | pure waiting relation | non-latching, no retained occurrence | Opt. |
| Message Queue | preserved discrete message history | N:N capability; ring-buffered; blocking/non-blocking send()/recv(); ownership allowed | Opt. |
| Channel | synchronous request/reply procedure call | request pointers only; N:1 server, many clients may wait | Opt. |
| MRM | 1:N communication, asynchronous, last-data oriented | non-blocking send()/recv(); data-integrity rule: N buffers = N receivers + 2 | Opt. |
| Application Timer | deferred time-triggered callback | callback in system task context | Opt. |

Scheduler and Tasks

A Task is the Concurrency Unit. It follows the Thread model.

A static task assumes three important states: RUNNING, READY, or WAITING.

Knowing that the READY tasks are kept in a table of FIFO queues, where each row is associated with a priority:

The highest priority ready task is at the head of highest priority non-empty ready queue.

We can outline the scheduler policy:

  • A task must switch to the READY state before being eligible for scheduling.
  • After a task is dispatched it keeps RUNNING until it is preempted by a higher-priority task, blocks, or yields.
  • A yield will only switch tasks if there is another task with equal (or, very unlikely, higher) priority that is READY.
  • A task will switch from RUNNING to READY if yielding or if being preempted by a higher priority task. Otherwise it can only go to a WAITING state, and eventually switch back to READY.
  • When a task is preempted by a higher priority task, it switches from RUNNING to READY and is placed back on the head position of its Ready Queue. This means that it will be resumed as soon as it is the highest priority ready task again.
  • By contrast, if a task yields, it tells the scheduler it has completed its cycle. It is then enqueued at the tail of the ready queue – the last queue position.
  • When a task waits it is suspended until a condition is satisfied.
  • When the condition is satisfied, it switches from WAITING to READY, and is enqueued on the tail.
  • So, tasks with the same priority cooperate by either yielding or waiting.
  • Tasks do not run because they deserve CPU time. They run because there is a reason to run them. If a task is dispatched and never yields or waits, the scheduler will correctly keep it running, while there is no higher priority task to run. This is not incidental starvation; it is the absence of coordinated progress.
  • Tasks with the same priority are initially placed on the Ready Queue associated with that priority in the order they are created.
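The policy above can be sketched as a table of FIFO ready queues indexed by priority. The names and fixed sizes below are hypothetical, not RK0's actual internals; the point is only the pick rule and the head/tail enqueue distinction:

```c
#include <stddef.h>

/* Hypothetical sketch of a ready table: one FIFO queue per priority
   level, 0 being the highest priority. Not RK0's actual internals. */
#define N_PRIO  4
#define Q_DEPTH 8

typedef struct {
    int task[Q_DEPTH];   /* task IDs, FIFO order */
    int head, count;
} ReadyQueue;

static ReadyQueue readyTable[N_PRIO];

/* Enqueue at the tail: used when a task yields or leaves WAITING. */
static void ready_enqueue_tail(int prio, int id)
{
    ReadyQueue *q = &readyTable[prio];
    q->task[(q->head + q->count) % Q_DEPTH] = id;
    q->count++;
}

/* Enqueue at the head: used when a task is preempted, so it resumes
   as soon as it is the highest-priority ready task again. */
static void ready_enqueue_head(int prio, int id)
{
    ReadyQueue *q = &readyTable[prio];
    q->head = (q->head + Q_DEPTH - 1) % Q_DEPTH;
    q->task[q->head] = id;
    q->count++;
}

/* Pick = head of the highest-priority non-empty ready queue. */
static int scheduler_pick(void)
{
    for (int p = 0; p < N_PRIO; p++) {
        ReadyQueue *q = &readyTable[p];
        if (q->count > 0) {
            int id = q->task[q->head];
            q->head = (q->head + 1) % Q_DEPTH;
            q->count--;
            return id;
        }
    }
    return -1; /* no ready task (an idle task would run here) */
}
```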

Dynamic Tasks

Dynamic Tasks are created by other tasks, after the scheduler has started. They can be TERMINATED by other tasks or by themselves, and the memory dedicated to a terminated task can be reused by a new dynamic task. We understand that dynamic tasks should generally be avoided.


Partition Memory Allocator

The partition allocator provides deterministic allocation of fixed-size blocks.

Its semantics are not those of a general-purpose heap, but of a bounded pool of homogeneous objects.

It exists because, in constrained real-time systems, allocation cost and fragmentation behaviour matter.
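A minimal sketch of such a bounded pool: homogeneous blocks chained on a free list, giving O(1) allocation and freeing with no fragmentation. Names and sizes are hypothetical, not RK0's API:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a fixed-block partition allocator. Free blocks store the
   next-pointer of the free list in their own first bytes. */
#define BLOCK_SIZE   32u    /* must hold at least one pointer */
#define BLOCK_COUNT  4u

typedef struct {
    uint8_t  storage[BLOCK_COUNT][BLOCK_SIZE];
    void    *freeList;      /* singly-linked list through free blocks */
} Partition;

static void part_init(Partition *p)
{
    p->freeList = NULL;
    for (unsigned i = 0; i < BLOCK_COUNT; i++) {
        *(void **)p->storage[i] = p->freeList;
        p->freeList = p->storage[i];
    }
}

static void *part_alloc(Partition *p)
{
    void *blk = p->freeList;
    if (blk != NULL)
        p->freeList = *(void **)blk;   /* pop in O(1) */
    return blk;                        /* NULL when the pool is empty */
}

static void part_free(Partition *p, void *blk)
{
    *(void **)blk = p->freeList;       /* push in O(1) */
    p->freeList = blk;
}
```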


Sleep Timers

RK0 keeps temporal semantics distinct.

  • Sleep(delay) expresses relative delay: every call suspends the task for the same amount.

The two mechanisms below recompute sleep time based on the interval between the current call and the previous call, compensating drift. They serve different purposes.

  • SleepUntil(anchor, period): anchor is a task-local reference. The delta between calls uses the last saved anchor to recompute the remaining time for the period. If no time remains (the task is late), the call returns immediately and execution continues. It prioritises run count over phase.
  • SleepRelease(period) expresses phase-aligned periodic execution. The reference is global across tasks and defined when the scheduler starts. If compensation cannot meet the periodic rate, that run is skipped and scheduled for the next multiple of period.
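The two compensation rules differ only in their arithmetic. A single-threaded sketch, using tick counts and hypothetical helper names (not RK0's API), returning how long the caller would sleep:

```c
#include <stdint.h>

/* SleepUntil: task-local anchor; if the deadline has already passed,
   the task does not sleep at all (runs are never skipped, but the
   phase may drift). */
static uint32_t sleep_until_delta(uint32_t *anchor, uint32_t period,
                                  uint32_t now)
{
    uint32_t next = *anchor + period;
    *anchor = next;                           /* advance local reference */
    return (now >= next) ? 0u : next - now;   /* 0 = return immediately  */
}

/* SleepRelease: global phase grid defined when the scheduler starts
   (tick 0). If the slot was missed, that run is skipped and the task
   is aligned to the NEXT multiple of the period (phase wins). */
static uint32_t sleep_release_delta(uint32_t period, uint32_t now)
{
    uint32_t next = ((now / period) + 1u) * period;
    return next - now;
}
```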

Application Timer

An Application Timer expresses time-triggered deferred work.

It is intended for important but short periodic activities that run at high priority, such as soft keep-alive mechanisms: too important to miss, yet too short to justify a dedicated periodic task.


Task Event (Event Flags)

A Task Event is a private, bitwise, consumable signal mechanism.

It represents occurrence under strict semantics:

  • private to the receiving task
  • no payload
  • consumable on reception
  • bitwise composition of occurrences

In practice, it behaves like a private array of binary semaphores.

Task Events are deliberately restricted so that the 32-bit private field is not treated as a general-purpose variable.
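The strict semantics above can be simulated in a few lines; the names are hypothetical and the field here is a single global for brevity (in the kernel each task owns its own field):

```c
#include <stdint.h>

/* Sketch of Task Event semantics: a private 32-bit field of
   consumable occurrence bits. */
static uint32_t taskEvents;

/* A sender sets occurrence bits; setting an already-set bit is not
   cumulative: the occurrence is capped at "pending". */
static void event_set(uint32_t mask)
{
    taskEvents |= mask;
}

/* The owning task consumes any matched bits: they are cleared on
   reception, like an array of private binary semaphores. Returns
   the matched bits (0 = nothing pending; a real kernel could block). */
static uint32_t event_get(uint32_t mask)
{
    uint32_t matched = taskEvents & mask;
    taskEvents &= ~matched;
    return matched;
}
```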


Task Mail

In RK0 we define a mail as a 1-word message. It is often a pointer, but not necessarily.

Task Mail can be seen as the message counterpart of Task Events.

It is a directed, single-message transfer mechanism.

Its semantics are:

  • one task posts a 32-bit value to another task’s mail slot
  • post() is always non-blocking and overwrites unread mail
  • after post(), the slot is FULL
  • the receiver reads from its own slot
  • The read can be destructive, using a pend() operation (blocking or non-blocking). After a successful pend() the slot is EMPTY.
  • A non-destructive read uses peek(). Reading while keeping the slot FULL can be useful in some contracts between sender and receiver.
  • The state of a task's mail slot is reported by the query() operation.

Standard task mail is a generic pointer (VOID *), so by default messages are passed by reference. Messages that fit within 32 bits can, and should, be passed by copy.
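The one-slot, latest-wins discipline can be sketched as follows (hypothetical names, single global slot for brevity):

```c
#include <stddef.h>

/* Sketch of Task Mail: a one-word, one-slot, latest-wins mailbox. */
typedef enum { SLOT_EMPTY, SLOT_FULL } SlotState;

typedef struct {
    void     *mail;
    SlotState state;
} MailSlot;

/* post(): never blocks; overwrites unread mail (latest wins). */
static void mail_post(MailSlot *s, void *msg)
{
    s->mail  = msg;
    s->state = SLOT_FULL;
}

/* pend(): destructive read; the slot becomes EMPTY. Returns 0 on
   success, -1 when nothing is posted (a real kernel could block). */
static int mail_pend(MailSlot *s, void **out)
{
    if (s->state == SLOT_EMPTY)
        return -1;
    *out = s->mail;
    s->state = SLOT_EMPTY;
    return 0;
}

/* peek(): non-destructive read; the slot stays FULL. */
static int mail_peek(const MailSlot *s, void **out)
{
    if (s->state == SLOT_EMPTY)
        return -1;
    *out = s->mail;
    return 0;
}
```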


Semaphore

A Semaphore expresses availability.

In binary form, it may represent non-cumulative occurrence or one-bit availability. It can also guard a region where only one task may be active – that is, behave as a Mutex Semaphore (problems arise if the tasks do not have the same priority).

In counting form, it expresses cumulative availability.

Its semantics remain signal-like: what matters is occurrence/count, not transferred content.
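The binary/counting distinction is easy to see in a non-blocking sketch (hypothetical names; a real kernel's wait can block):

```c
#include <stdint.h>

/* Counting form: cumulative availability. Every signal is counted. */
static uint32_t count;

static void csem_signal(void)   { count++; }
static int  csem_try_wait(void) { if (count == 0) return -1; count--; return 0; }

/* Binary form: non-cumulative occurrence. Signalling twice still
   yields a single successful wait. */
static uint32_t flag;

static void bsem_signal(void)   { flag = 1; }
static int  bsem_try_wait(void) { if (flag == 0) return -1; flag = 0; return 0; }
```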


Mutex Lock

A Mutex expresses exclusive ownership of shared state. A mutex is not a signalling primitive.

A Mutex has an owner, and that ownership relation matters structurally. Mutexes in RK0 implement fully transitive priority inheritance.


Sleep Queue

A Sleep Queue is used when a task transitions directly into a waiting relation associated with that queue, which in turn is associated with an event in its broad sense. A suitable name for this primitive would be Condition Queue (not used, to avoid confusion with a full Condition Variable primitive). Another task is responsible for signalling or broadcasting to it. Because a Sleep Queue does not retain occurrences, a signal sent to an empty queue is lost.

Unlike a semaphore, the semantics are not “check whether an event has happened.” They are:

from now on, wait until this queue is signalled.

Because it carries no predicate, it finds real use when composed with Mutexes so that predicates over shared state can be protected, evaluated, and waited upon correctly. This yields the Condition-Variable Model, that is, a Monitor-like coordination mechanism.

Condition Variables are not exposed as a single heavy primitive by default, because the associated policy is better left application-specific. Exposing the bare Sleep Queue, and leaving the signalling discipline and the required atomicity (whether through scheduler lock or interrupt masking) under application control, is consistent with RK0's general approach: enough semantics are given upfront to make coordination meaningful.

(There are helpers using SchLock()/SchUnlock() and Mesa semantics.)
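The Mutex-plus-Sleep-Queue composition is the familiar Mesa monitor pattern. Sketched here with POSIX primitives as a stand-in for the RK0 helpers (the actual kernel calls differ): the predicate lives in shared state guarded by the lock, and the wait is always a while loop, because a wakeup is a hint, not a guarantee.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int dataReady = 0;     /* the predicate, protected by lock */

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    dataReady = 1;                    /* update the predicate...     */
    pthread_cond_signal(&cond);       /* ...then signal the queue    */
    pthread_mutex_unlock(&lock);
    return NULL;
}

static int consume(void)
{
    pthread_mutex_lock(&lock);
    while (!dataReady)                /* Mesa semantics: re-check    */
        pthread_cond_wait(&cond, &lock);
    dataReady = 0;                    /* consume under the lock      */
    pthread_mutex_unlock(&lock);
    return 1;
}
```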


Message Queue

A Message Queue expresses preserved discrete message history up to queue capacity when the producer uses blocking semantics.

If the producer occasionally outruns the consumer, a message queue amortises bursts or consumer lateness without missing data. A faster consumer, on the other hand, will eventually block on an empty queue; it does not drop messages. Drops occur when the producer outruns the consumer and uses non-blocking sends.

In practice, buffering gives time-data correlation only within queue capacity. Over long runs, effective throughput is bounded by the slowest stage, so either the producer blocks on a full queue or the system accepts that some data will be missed.

Rule of thumb: long-term throughput equals the slowest stage; buffers only absorb short-term mismatch and jitter.

Its semantics are:

  • individual messages are preserved
  • ordering matters
  • the receiver consumes discrete transferred units
  • it is public and can be used as an N:N buffered communication object (hardly needed; more often than not a bad idea)
  • ownership can be assigned, so a single task owns the queue and only that task can receive from it (the most common case)
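The preserved-history semantics, and the drop rule for non-blocking sends, reduce to a plain ring buffer. A sketch with hypothetical names (a real queue carries fixed-size messages, and the blocking variants suspend instead of returning -1):

```c
/* Sketch of ring-buffered Message Queue semantics: discrete messages,
   FIFO order preserved up to capacity. */
#define Q_CAP 3

typedef struct {
    int      buf[Q_CAP];
    unsigned head, count;
} MsgQueue;

/* Non-blocking send: returns -1 (message dropped) when full; a
   blocking send would suspend the producer here instead. */
static int mq_send(MsgQueue *q, int msg)
{
    if (q->count == Q_CAP)
        return -1;
    q->buf[(q->head + q->count) % Q_CAP] = msg;
    q->count++;
    return 0;
}

/* Non-blocking receive: returns -1 when empty; preserved history is
   consumed in order, one discrete unit at a time. */
static int mq_recv(MsgQueue *q, int *out)
{
    if (q->count == 0)
        return -1;
    *out = q->buf[q->head];
    q->head = (q->head + 1) % Q_CAP;
    q->count--;
    return 0;
}
```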

Channel

A Channel expresses synchronous procedure calls with a single server task.

Its semantics are:

  • requests are sent by pointer to a request envelope (from a bounded pool)
  • many clients may be pending on the same server
  • a client blocks until the server completes and calls kChannelDone()
  • while the server executes on behalf of a client, the server's priority is adjusted to the client's priority

Channels carry request pointers only; payload layout is user-defined.


MRM (Most-Recent Message)

MRM expresses latest-state publication on a 1:N, never-blocking communication channel.

It preserves data integrity, relevance, and the independent execution rates of the sender and the receivers.

MRM is appropriate when:

  • readers should observe the most recent valid state
  • stale data is actually harmful, as acting on it would use information that no longer represents the environmental stimuli we should react to

MRM therefore differs sharply from Message Queue semantics. It is particularly suitable for (cascaded) servo-control loops.
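The service map lists MRM's data-integrity rule as N buffers = N receivers + 2. A single-threaded sketch of that buffer discipline (hypothetical names, single-int payload for brevity): the writer always finds a buffer that is neither the published latest nor pinned by a reader, so no one is ever blocked and no snapshot is ever torn.

```c
/* Sketch of the MRM buffer discipline (a Cyclic Asynchronous Buffer). */
#define N_READERS 1
#define N_BUFS    (N_READERS + 2)   /* the data-integrity rule */

typedef struct {
    int value;
    int useCount;      /* readers currently holding this buffer */
} Buf;

static Buf bufs[N_BUFS];
static int mostRecent = -1;   /* index of the latest published buffer */

/* Writer: claim a buffer that is neither most-recent nor pinned. */
static int mrm_reserve(void)
{
    for (int i = 0; i < N_BUFS; i++)
        if (i != mostRecent && bufs[i].useCount == 0)
            return i;
    return -1;   /* cannot happen with N + 2 buffers */
}

/* Writer: publish; the previous most-recent simply stops being latest. */
static void mrm_publish(int i, int value)
{
    bufs[i].value = value;
    mostRecent = i;
}

/* Reader: pin the latest buffer; never blocks, never sees a torn value. */
static int mrm_get(void)
{
    bufs[mostRecent].useCount++;
    return mostRecent;
}

static void mrm_release(int i)
{
    bufs[i].useCount--;
}
```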

(This mechanism was originally coined as Cyclic Asynchronous Buffers (D. Clark, HIC, 1989).)

2 responses to “Real-Time Model and Service Map in RK0”

  1. Interesting! Even if I needed help (from Duck.ai) to understand “The absence of semantics can be desired neutrality, but often turns out to be displacement. Meaning is not removed. It is pushed into the application, usually without framing, where it degenerates easily.”

    Since I am so used to “channels” (and did not much dig into the others) I have some comments:

    Your channel is always synchronous, unbuffered. Like occam and XC. Client(s) block until it or one of them is “taken” by the server. Is it possible to establish “fairness” here, so that no “fast” client (in theory) can reuse the server indefinitely, leaving no time for the others? This is not “priority”, but “fairness”. Often fairness is solved by the server task once the concept is available. This is not really possible in golang (and the designers argue that it is not necessary), but in my experience on small HW I have needed it some times.

    I assume that each client will have its own data and unique pointers to the data it needs to send off. Only filled by the client, it’s ready and unchanged for the server once it’s running. After the kChannelDone() this data space may be reused. When all parts agree on the struct’s semantics, sending over variable length size data is also possible.

    I assume that the priority adjustment is needed in a system that relates to priorities. Could there be situations where the priority of each task may be static? Like with the server always higher than all of the clients? Since clients block until kChannelDone(), is the priority inheritance scheme really needed? Maybe the purpose is to get the server to run “as fast as” the client? But is one “as fast as” better than any other “as fast as”? A server may again use a channel to get work from another server (thus becoming a client); is the priority inheritance also needed there to avoid priority inversion, since the OS would then need to handle it? Will the scheme cause priority inversion never to kick in?

    Have the concepts been formally verified by some tool?

    There is no one-to-any or any-to-any channel, like in https://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp-1.1-rc4/jcsp-doc/ I assume? Not that I have missed them ever.

    A comparison with XC select or [[ordered] select and occam’s ALT or PRI ALT would of course (to me) be interesting. But then, I assume that none of us would have the time to do this, albeit interesting. And then, the purpose could be discussed.

    I have once upon a time tried to write something about fairness in https://www.teigfam.net/oyvind/home/technology/049-nondeterminism/


  2. Hey Øyvind, thank you for taking your time to read this and raise these smart questions.

    I feel I see where you're coming from, and I will try to answer in CSPish idiom.

    RK0 Channel is closer to a bounded synchronous procedure-call path. It is not unbuffered, or a rendezvous channel, see:

    A) It may rendezvous if the server is already blocked waiting when the client calls. Otherwise, the request is enqueued. Each Channel is associated with a server-owned message queue and a bounded pool of request buffers. Those buffers carry the metadata the server uses to execute on behalf of the client.

    B) Fairness is not the concern. Urgency is.
    RK0's execution model is based on progress for a reason that is expressed in the application: what, when and why. If a client monopolises the server, that is not something to be hidden – the application model or the server’s service discipline is wrong.

    C) The server does not inherit priority in the mutex sense.
    It executes the call at the caller’s urgency. If the server’s nominal priority is higher, it is effectively demoted; if lower, effectively promoted. The purpose is not fairness, but preserving the urgency of the call path.

    D) If fairness is required, it should be expressed on the application.

    P.S.:

    Note that a client's dominant blocking time is waiting for the reply, not waiting to deposit the message. The server does not block waiting for a client acknowledgement after replying.

    An unbuffered 1-1 scheme as I understand it, is here: https://antoniogiacomelli.github.io/RK0/#usage_example_unbuffered_message_passing_extended_rendezvous

    A similar approach is used on QNX, with a few differences.

