The GPIO latency test was merged into the latmus program. This code is paired with a fairly simple Zephyr-based application which implements the latency monitor we need for measuring the response time of a user-space thread running on the system under test to GPIO events the monitoring device generates. Using this combo, we can even measure the response time of non-EVL, strictly single-kernel configurations, to GPIO events.
Also, Dovetail and EVL were upgraded to kernel 5.5-rc4. Trivial merge, no issue.
Dovetail was ported from kernel 5.4 to kernel 5.5-rc2. Not much fuss, except maybe on the ARM side with the rebase of the arch-specific vDSO support (arch/arm/vdso) on the generic vDSO implementation (lib/vdso) which took place upstream. This turned out to be an opportunity to rebase the originally ARM-specific vDSO support for user-mappable clocksources to the generic vDSO implementation. Except for these, other changes were fairly trivial to merge into kernel 5.5-rc2.
As I was debugging some core Xenomai issue on
x86, I stumbled over the reason for
CONFIG_MAXSMP breaking the
interrupt pipelines for ages: enabling this kernel feature extends the
range of interrupt vectors to 512K. Guess what would happen with the
3-level interrupt log which cannot address more than 256K vectors both
the I-pipe and Dovetail were using so far? Both gained a 4-level
interrupt log this week to cover the entire vector space when
CONFIG_MAXSMP is enabled, which did not go without a vague feeling
of wearing a dunce cap.
EVL timers were documented.
EVL cross-buffers were documented.
The week the EVL core eventually got rid of the SMP scalability issue which plagues the Cobalt core. The ugly big spinlock (aka nklock) which still serializes all operations inside the former is now gone from the EVL core, after a long incremental process whick took place over several months, introducing per-CPU serialization in the core scheduler, rewriting portions of the EVL thread synchronization support like waitqueues and mutexes in the same move.
Implemented the EVL shim library which mimics the behavior of the EVL API based on plain POSIX calls from the native *libc. It comes in handy whenever the real-time guarantees delivered by the EVL core are not required for quick prototyping or debugging some application code.
Out-of-band IRQs are now flowing faster through the pipeline until the EVL core can reschedule upon them. We are now on par with the I-pipe performance-wise, but with a cleaner integration, a less intrusive implementation, and a much saner locking model for interrupt descriptors.
Since a companion core exhibits a separate scheduler, there is no point in waiting patiently for the main kernel logic to finish switching its own tasks before preempting it upon out-of-band IRQ. Several micro-benchmarks on ARM and arm64 revealed that a significant portion of the maximum latency was induced by switching contexts atomically. Dovetail now supports generic non-atomic context switching for the in-band stage which allows the EVL core to preempt during this process, reducing even further the wakeup latency of out-of-band threads under high memory stress, especially when sluggish outer cache controllers are part of the picture.
The proxy now supports a ‘write granularity’ feature, in order to define a fixed size for writing bulks of data to the target in-band file. Because of this, the proxy is available for interfacing with in-band files with special requirements, like writing exactly 64bit words to eventfd(2) objects, which turns the proxy into yet another mechanism for synchronizing out-of-band and in-band threads.
Eventually finalized the out-of-band polling services for EVL. Since each instance of an EVL element is represented by a regular file, we can wait for events on the corresponding file descriptor via a common synchronous multiplexing mechanism. For instance, we could wait for an event group to have some event pending, or a proxy to receive some data from the in-band peer.
The GPIOLIB framework can now handle requests from out-of-band EVL threads, which means that any GPIO chip driver based on this generic GPIO core can deliver on ultra-low latency requirements. This is the first illustration that we need no specific driver model with the EVL core in order to cope with real-time duties in drivers: we can leverage the out-of-band operations Dovetail provides us as part of the common file operations defined by the VFS.
Just finished porting Dovetail to x86_64, which almost enabled the EVL core on this architecture in the same move given the very few arch-specific bits this core requires. Enabling libevl on this architecture was a trivial task for the same reason. So we now have support for ARM, arm64 and x86_64.
Added a scheduling policy for temporal partitioning (SCHED_TP). Useful when Arinc653-like scheduling is required.
Implementation-wise, the support for EVL timers was merged as a sub-function of the clock element, which simplifies the code.
EVL logger and mapper features are now merged into the proxy element.
Spent some time writing unit tests for libevl. Not exhilarating, but required anyway.
Thirteen years to the day after I respinned the Xenomai project with Xenomai 2, it seems a good time to articulate the lessons learned from the strengths and weaknesses of Xenomai in a modern real-time core implementation based on Dovetail. After months originally spent working on the ‘steely’ core, I have decided to send the whole thing back to the drawing board instead of continuing this effort: this needs more than a revamping of the implementation, the way such companion core integrates into the main kernel has to be re-designed.
This decision eventually led to developing the EVL core, which originally reused portions of Cobalt’s proven infrastructure such as the scheduler and thread management system. On the other hand, the kernel-to-core and user-to-core interfaces were entirely rewritten, dropping RTDM and POSIX entirely. The portion of inherited bits was expected to decrease over time, especially in the wake of improving SMP scalability, which eventually happened a year later.
The groundwork for Dovetail was laid during this period, by introducing a new high-priority execution stage into the mainline kernel logic. On this so-called out-of-band stage, we can run an interrupt pipeline and provide alternate scheduling support to any companion software core we may want to embed. The main kernel can offload work which has to meet ultra-low latency requirements to such core, without requiring the entire kernel machinery to abide by the real-time rules only for a handful of threads which actually need this.
During this same period, a streamlined version of Xenomai’s Cobalt core codenamed ‘steely’ which would only provide a POSIX API was developed for the purpose of testing the Dovetail interface.
Started working on Dovetail.
Last modified: Sat, 04 Jan 2020 17:15:13 CET