IRQ injection

Sending out-of-band IPIs to remote CPUs

The pipeline exposes two generic IPI vectors which autonomous cores may use in SMP configurations for signaling the following events across CPUs:

  • RESCHEDULE_OOB_IPI, the cross-CPU task reschedule request. This is available to the core’s scheduler for kicking the task rescheduling procedure on remote CPUs whenever the state of their respective runqueues has changed. For instance, a task sleeping on CPU #1 may be unblocked by a system call issued from CPU #0: in this case, the scheduler code running on CPU #0 is supposed to tell CPU #1 that it should reschedule. Typically, the EVL core does so from its test_resched() routine.

  • TIMER_OOB_IPI, the cross-CPU timer reschedule request. Because software timers are in essence per-CPU beasts, this IPI is available to the core’s timer management code for kicking the hardware timer programming procedure on remote CPUs when the state of some software timer has changed. Typically, stopping a timer from a remote CPU, or migrating a timer from one CPU to another, should trigger such a signal. The EVL core does so from its evl_program_remote_tick() routine, which is called whenever the timer with the earliest timeout date enqueued on a remote CPU may have changed.

In addition, the pipeline core defines CALL_FUNCTION_OOB_IPI for its own use, in order to implement the smp_call_function_oob() routine. The latter is semantically equivalent to the regular smp_call_function_single() routine, except that it runs the callback on the out-of-band stage.
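
As a minimal sketch, a cross-CPU call issued this way could look as follows, assuming smp_call_function_oob() mirrors smp_call_function_single()’s signature as stated above; the helper names are hypothetical:

```c
#include <linux/smp.h>
#include <linux/atomic.h>

/* Runs on the out-of-band stage of the target CPU. */
static void sample_remote_call(void *arg)
{
	atomic_inc((atomic_t *)arg);
}

/* Hypothetical helper: run sample_remote_call() on @cpu and wait for it. */
static int run_on_remote_cpu(int cpu, atomic_t *counter)
{
	return smp_call_function_oob(cpu, sample_remote_call, counter, true);
}
```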

As their respective names suggest, those three IPIs can be sent from out-of-band context (as well as from in-band context) by calling the irq_send_oob_ipi() service.


void irq_send_oob_ipi(unsigned int ipi, const struct cpumask *cpumask)

  • ipi

    The IPI number to send. There are only three legit values for this argument: either RESCHEDULE_OOB_IPI, TIMER_OOB_IPI or CALL_FUNCTION_OOB_IPI. This is a low-level service with not much parameter checking, so any other value is likely to cause havoc.

  • cpumask

    A CPU bitmask specifying the target CPU(s) which should receive the IPI. The current CPU is silently excluded from this mask, so the calling CPU cannot send an IPI to itself using this call.

  • In order to receive these IPIs, an out-of-band handler must have been set for them, mentioning the [IRQF_OOB flag]({{< relref "dovetail/pipeline/irq_handling.md" >}}).

irq_send_oob_ipi() serializes callers internally so that it may be used from either stage: in-band or out-of-band.
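
For illustration, here is a minimal sketch of both sides of such an IPI, loosely modeled on what an autonomous core might do with TIMER_OOB_IPI. Requesting the IPI as a per-CPU interrupt with __request_percpu_irq() and IRQF_OOB is one plausible way to set the out-of-band handler; the per-CPU cookie and every other name not mentioned in the text above are hypothetical, and <linux/irq_pipeline.h> is assumed to provide the pipeline declarations:

```c
#include <linux/interrupt.h>
#include <linux/irq_pipeline.h>
#include <linux/percpu.h>
#include <linux/smp.h>

static DEFINE_PER_CPU(int, core_ipi_cookie);	/* hypothetical per-CPU cookie */

/* Runs on the out-of-band stage of the signaled CPU. */
static irqreturn_t timer_ipi_handler(int irq, void *dev_id)
{
	/* Reprogram the local hardware timer here (core-specific). */
	return IRQ_HANDLED;
}

/* Set an out-of-band handler for the IPI (done once at init time). */
static int core_setup_timer_ipi(void)
{
	return __request_percpu_irq(TIMER_OOB_IPI, timer_ipi_handler,
				IRQF_OOB, "core timer IPI",
				&core_ipi_cookie);
}

/* May be called from either stage; @cpu must not be the current CPU. */
static void core_kick_remote_tick(int cpu)
{
	irq_send_oob_ipi(TIMER_OOB_IPI, cpumask_of(cpu));
}
```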


Injecting an IRQ event for the current CPU

In some very specific cases, we may need to inject an IRQ into the pipeline by software, as if such a hardware event had happened on the current CPU. irq_inject_pipeline() does exactly this.


int irq_inject_pipeline(unsigned int irq)

  • irq

    The IRQ number to inject. A valid interrupt descriptor must exist for this interrupt.

  • irq_inject_pipeline() fully emulates the receipt of a hardware event, which means that the common interrupt pipelining logic applies to the new event:

    • first, any out-of-band handler is considered for delivery,

    • then the event may be passed down the pipeline to the common in-band handler(s), in the absence of out-of-band handler(s).

    The pipeline priority rules apply accordingly:

    • if the caller is in-band, and an out-of-band handler is registered for the IRQ event, and the out-of-band stage is unstalled, the execution stage is immediately switched to out-of-band for running the latter, then restored to in-band before irq_inject_pipeline() returns.

    • if the caller is out-of-band and there is no out-of-band handler, the IRQ event is deferred until the in-band stage resumes execution on the current CPU, at which point it is delivered to any in-band handler(s).

    • in any case, should the current stage receive the IRQ event, the virtual interrupt state of that stage is always considered before deciding whether this event should be delivered immediately to its handler by irq_inject_pipeline() (unstalled case), or deferred until the stage is unstalled (stalled case).

    This call returns zero on successful injection, or -EINVAL if the IRQ has no valid descriptor.

    If you are looking for a way to schedule the execution of a routine in the in-band interrupt context from the out-of-band stage, consider the extended irq_work API described below, which provides a high-level interface to this feature.
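
As a minimal sketch, a driver might use this service to replay an interrupt by software on the current CPU, letting the pipeline decide between immediate and deferred delivery; the wrapper name is hypothetical:

```c
#include <linux/irq_pipeline.h>
#include <linux/printk.h>

/* Hypothetical helper: inject @irq on this CPU as if the hardware had fired. */
static int acme_kick_irq(unsigned int irq)
{
	int ret = irq_inject_pipeline(irq);

	if (ret)	/* -EINVAL: no valid descriptor for @irq */
		pr_warn("acme: cannot inject IRQ%u (%d)\n", irq, ret);

	return ret;
}
```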


Direct logging of an IRQ event

Sometimes, running the full interrupt delivery logic implemented by irq_inject_pipeline() for feeding an interrupt into the pipeline may be overkill, when we can make assumptions about the current execution context and about which stage should handle the event. The following fast helpers can be used instead in this case:


void irq_post_inband(unsigned int irq)

  • irq

    The IRQ number to inject into the in-band stage. A valid interrupt descriptor must exist for this interrupt.

  • This routine may be used to mark an interrupt as pending directly into the current CPU’s log for the in-band stage. This is useful in either of these cases:

    • you know that the out-of-band stage is current, therefore this event has to be deferred until the in-band stage resumes on the current CPU later on. This means that you can simply post it to the in-band stage directly.

    • you know that the in-band stage is current but stalled, therefore this event can’t be immediately delivered, so marking it as pending into the in-band stage is enough.

    Interrupts must be hard disabled in the CPU before calling this routine.
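
As a minimal sketch of the first situation above, a caller running on the out-of-band stage could defer an event to the in-band stage as follows; the helper name and IRQ parameter are hypothetical, and hard_local_irq_save()/hard_local_irq_restore() are the hard interrupt protection services documented elsewhere in the Dovetail pages:

```c
#include <linux/irq_pipeline.h>

/* Hypothetical: called from the out-of-band stage for some mapped @irq. */
static void defer_to_inband(unsigned int irq)
{
	unsigned long flags;

	flags = hard_local_irq_save();	/* hard irqs must be off */
	/*
	 * We run on the out-of-band stage, so the event cannot be
	 * delivered right away: mark it as pending in this CPU's
	 * in-band log; it will be played when the in-band stage
	 * resumes on this CPU.
	 */
	irq_post_inband(irq);
	hard_local_irq_restore(flags);
}
```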


void irq_post_oob(unsigned int irq)

  • irq

    The IRQ number to inject into the out-of-band stage. A valid interrupt descriptor must exist for this interrupt.

  • This routine may be used to mark an interrupt as pending directly into the current CPU’s log for the out-of-band stage. This is useful in only one situation: you know that the out-of-band stage is current but stalled, therefore this event can’t be immediately delivered, so marking it as pending into the out-of-band stage is enough.

    Interrupts must be hard disabled in the CPU before calling this routine. Note that if the out-of-band stage is stalled as expected on entry to this helper, hard interrupts are required to be off in the CPU anyway.
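
A minimal sketch follows, assuming the caller already meets the documented preconditions (out-of-band stage current but stalled, hard irqs off), for instance within a stalled section of an autonomous core; all names except irq_post_oob() are hypothetical:

```c
#include <linux/irq_pipeline.h>

/*
 * Hypothetical: the out-of-band stage is current but stalled and hard
 * irqs are off, so @irq cannot be delivered immediately; mark it as
 * pending in this CPU's out-of-band log instead. It will be played
 * when the out-of-band stage is unstalled.
 */
static void mark_oob_event(unsigned int irq)
{
	irq_post_oob(irq);
}
```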


Extended IRQ work API

Because interrupts running out-of-band code behave like NMIs from the standpoint of the main kernel, such code might preempt in-band activities in the middle of a critical section. For this reason, it would be unsafe to call any in-band routine from an out-of-band context.

However, we may schedule execution of in-band work handlers from out-of-band code, using the regular irq_work_queue() and irq_work_queue_on() services, which have been extended by the IRQ pipeline core. A work request is scheduled from the out-of-band stage for running on the in-band stage, on the issuing/requested CPU, as soon as the out-of-band activity quiesces on this processor. As its name implies, the work handler runs in (in-band) interrupt context.

The interrupt pipeline forces the use of a synthetic IRQ as a notification signal for the IRQ work machinery, instead of a hardware-specific interrupt vector. This special IRQ is labeled "in-band work" when reported by /proc/interrupts. irq_work_queue() may invoke the work handler immediately only if called from the in-band stage with hard irqs on; in all other cases, the handler execution is deferred until the in-band log is synchronized.
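
As a minimal sketch, an out-of-band path could relay a notification to a handler running in in-band interrupt context like this; only the irq_work calls are part of the regular kernel API, the other names are hypothetical:

```c
#include <linux/irq_work.h>
#include <linux/printk.h>

static struct irq_work relay_work;

/* Runs later, in in-band (hard) interrupt context. */
static void relay_handler(struct irq_work *work)
{
	pr_info("out-of-band event relayed to the in-band stage\n");
}

/* Done once at init time. */
static void setup_relay(void)
{
	init_irq_work(&relay_work, relay_handler);
}

/* Out-of-band path: schedule relay_handler() on the current CPU. */
static void notify_inband(void)
{
	irq_work_queue(&relay_work);
}
```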

