Wait queue

An EVL wait queue is a data structure for managing EVL-controlled threads which are waiting for some condition to become true; similarly to regular kernel wait queues, EVL wait queues are the normal means by which EVL threads sleep in kernel space until some event occurs. When some other thread signals such an event, it can request that either one or all waiters be unblocked. To sum up, the wait queue logic is split into two parts:

  • the sleeping side, waiting for some condition to be met.

  • the signaling side, notifying waiters that some condition is met.

Wait queues are the primitive constructs on which other EVL synchronization mechanisms are built, such as kernel flags, semaphores and staxes.

Wait queue services

void evl_init_wait(struct evl_wait_queue *wq, struct evl_clock *clock, int flags)

This call initializes a kernel wait queue which EVL threads running on the out-of-band context can sleep on, waiting for some condition to become true. When the condition is met, waiters are awakened either in priority order or on a first-come, first-served (FIFO) basis, depending on flags.

  • wq

    evl_init_wait() constructs a wait queue descriptor, which contains ancillary information other calls will need. wq is a pointer to such a descriptor of type struct evl_wait_queue.

  • clock

    The descriptor of the EVL clock device which all timeouts passed to the timed calls on the new wait queue are based on. You would use &evl_mono_clock for the steady, monotonic clock device the EVL core provides.

  • flags

    Either EVL_WAIT_FIFO if you want threads to wait on the queue on a first-come, first-served basis, or EVL_WAIT_PRIO if threads should be queued by descending priority order.
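
  • As an illustration, the following sketch initializes a wait queue which queues waiters by priority and measures timeouts on the built-in EVL monotonic clock:

    static struct evl_wait_queue wq;

    /* Timed waits will be based on the EVL monotonic clock. */
    evl_init_wait(&wq, &evl_mono_clock, EVL_WAIT_PRIO);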


    void evl_destroy_wait(struct evl_wait_queue *wq)

    This call deletes a wait queue, dequeuing and waking up any thread which might be sleeping on it at the time of the call. If threads have been unblocked as a result, evl_destroy_wait() calls the rescheduling procedure internally, i.e. evl_schedule().

    Upon return, wq is stale and must not be accessed any longer.


    int evl_wait_event_timeout(struct evl_wait_queue *wq, ktime_t timeout, enum evl_tmode timeout_mode, condition)

    evl_wait_event_timeout() is a convenience macro which blocks the caller until a condition is met or a timeout occurs, whichever comes first. If such a condition is already met on entry, the call returns immediately with a success status code (i.e. zero). This macro ensures that no race can happen between event waiters and senders by enforcing proper locking internally. If you need fine-grained control over the lock-and-wait sequence, have a look at evl_add_wait_queue() instead.

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • timeout

    A time limit for the call. There are two special values: EVL_NONBLOCK tells the core NOT to block the caller if the condition is unmet on entry; EVL_INFINITE means no time limit, which only makes sense if timeout_mode is EVL_REL (see evl_wait_event()).

  • timeout_mode

    A timeout mode, telling the core whether timeout should be interpreted as a relative value representing a delay (EVL_REL), or an absolute date (EVL_ABS).

  • condition

    A C expression returning a value which can be interpreted as a boolean status. If true or non-zero, the condition is deemed met. As soon as the condition is met, one or more threads are unblocked (depending on the wake up call used). This expression is first evaluated on entry, then each time the caller resumes due to a wake up event afterwards.

  • To illustrate, the following is a piece of the EVL kernel semaphore support on the wait side. down_ksem() is evaluated on entry, then every time the caller wakes up upon a notification. As long as down_ksem() returns a boolean false value and no error happens, the wait continues.

    static bool down_ksem(struct evl_ksem *ksem)
    {
    	if (ksem->value > 0) {
    		--ksem->value;	/* Grab one unit. */
    		return true;	/* Condition met, stop waiting. */
    	}

    	return false;	/* No unit available, keep waiting. */
    }

    int evl_down_timeout(struct evl_ksem *ksem, ktime_t timeout)
    {
    	return evl_wait_event_timeout(&ksem->wait, timeout,
    				EVL_ABS, down_ksem(ksem));
    }
    

    evl_wait_event_timeout() returns zero on success, which means the condition was met within the time bound. If the call failed, a negated error code is returned instead:

    • -ETIMEDOUT The call timed out.

    • -EAGAIN EVL_NONBLOCK was given in timeout but the condition was unmet on entry.

    • -EIDRM The wait queue was deleted while the caller was sleeping on it. When this status is returned, the wait queue must be considered stale and should not be accessed anymore.

    • -EINTR The sleep was interrupted or forcibly unblocked.


    int evl_wait_event(struct evl_wait_queue *wq, condition)

    evl_wait_event() is a convenience macro which blocks the caller indefinitely until a condition is met. If such a condition is already met on entry, the call returns immediately with a success status code (i.e. zero). This macro ensures that no race can happen between event waiters and senders by enforcing proper locking internally. If you need fine-grained control over the lock-and-wait sequence, have a look at evl_add_wait_queue() instead.

  • condition

    A C expression returning a value which can be interpreted as a boolean status. If true or non-zero, the condition is deemed met. As soon as the condition is met, one or more threads are unblocked (depending on the wake up call used). This expression is first evaluated on entry, then each time the caller resumes due to a wake up event afterwards.
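
  • To illustrate, the untimed counterpart of the semaphore example shown for evl_wait_event_timeout() could look like this (a sketch reusing the down_ksem() helper above; the evl_down() name is only illustrative):

    int evl_down(struct evl_ksem *ksem)
    {
    	return evl_wait_event(&ksem->wait, down_ksem(ksem));
    }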

  • evl_wait_event() returns zero on success, which means the condition was met. If the call failed, a negated error code is returned instead:

    • -EIDRM The wait queue was deleted while the caller was sleeping on it. When this status is returned, the wait queue must be considered stale and should not be accessed anymore.

    • -EINTR The sleep was interrupted or forcibly unblocked.


    int evl_add_wait_queue(struct evl_wait_queue *wq, ktime_t timeout, enum evl_tmode timeout_mode)

    evl_add_wait_queue() belongs to the inner wait queue interface, which prepares the caller for waiting for a condition. This call is useful whenever you need to deal in a particular way with wait conditions and/or locking constructs, which the convenience evl_wait_event*() macros would otherwise handle for you.

    The wait queue must be locked by the caller before evl_add_wait_queue() is invoked. The call returns immediately after queuing the caller, which keeps running until it invokes evl_wait_schedule() to actually sleep on the wait queue. Neither step should be taken if the condition is already met on entry.

    This code fragment, which waits for a free buffer from a pool of socket buffers, illustrates a typical reason for needing the fine-grained control over the lock-and-wait sequence which evl_wait_event_timeout() does not allow. Here we want to open-code the wait loop using evl_add_wait_queue() and evl_wait_schedule(), polling a list for some free buffer at each round under the wait queue lock to prevent data races, and stopping when a buffer is available or on error, whichever comes first:

    for (;;) {
    	raw_spin_lock_irqsave(&est->pool_wait.lock, flags);

    	if (!list_empty(&est->free_skb_pool)) {
    		/* A free buffer is available: grab it and leave. */
    		skb = list_get_entry(&est->free_skb_pool, struct sk_buff, list);
    		est->pool_free--;
    		break;
    	}

    	if (timeout == EVL_NONBLOCK) {
    		/* Non-blocking request, no buffer: bail out. */
    		skb = ERR_PTR(-EAGAIN);
    		break;
    	}

    	/* Enqueue the caller, then drop the lock before sleeping. */
    	evl_add_wait_queue(&est->pool_wait, timeout, tmode);

    	raw_spin_unlock_irqrestore(&est->pool_wait.lock, flags);

    	ret = evl_wait_schedule(&est->pool_wait);
    	if (ret)
    		return ERR_PTR(ret);
    }

    /* The loop breaks out with the lock held: release it. */
    raw_spin_unlock_irqrestore(&est->pool_wait.lock, flags);
    

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • timeout

    A time limit for the call. EVL_INFINITE means no time limit, which only makes sense if timeout_mode is EVL_REL.

  • timeout_mode

    A timeout mode, telling the core whether timeout should be interpreted as a relative value representing a delay (EVL_REL), or an absolute date (EVL_ABS).

  • Passing EVL_NONBLOCK as a timeout value makes no sense with this call: a non-blocking caller should never reach the queuing step in the first place, as illustrated by the code fragment above which checks for this value before enqueuing. EVL_NONBLOCK would be interpreted as (ktime_t)-1 by evl_wait_schedule(), which would practically amount to an infinite sleep.


    int evl_wait_schedule(struct evl_wait_queue *wq)

    evl_wait_schedule() belongs to the inner wait queue interface, which blocks the caller until the wait queue is signaled by a call to evl_wake_up() or flushed by evl_flush_wait(). Upon return, the caller must check whether the condition is met, since spurious wakeups may happen: for this reason, evl_wait_schedule() is usually part of a wait loop. If the condition is met, or if evl_wait_schedule() returned a non-zero error code, the wait loop should be aborted. Otherwise, the caller should requeue itself and go sleeping again.

    Internally, the error codes this call might return are mapped to the optional information bits passed to the corresponding wake up request, describing the reason for wake up, as follows:

    • T_TIMEO yields -ETIMEDOUT
    • T_NOMEM yields -ENOMEM
    • T_RMID yields -EIDRM
    • T_BREAK yields -EINTR

    evl_wait_schedule() reschedules internally, blocking the caller as/if needed.

    For instance, the EVL core puts a thread to sleep until events are notified to an observable, as follows:

    /*
     * observable->oob_wait.lock guards the pending and free
     * notification lists of all observers subscribed to it.
     */
    raw_spin_lock_irqsave(&observable->oob_wait.lock, flags);
    
    for (;;) {
    [1]	if (list_empty(&observer->next)) {
    		/* Unsubscribed by observable_release(). */
    		nfr = ERR_PTR(-EBADF);
    		goto out;
    	}
    [2]	if (!list_empty(&observer->pending_list))
    		break;
    	if (!wait) {
    [3]		nfr = ERR_PTR(-EAGAIN);
    		goto out;
    	}
    [4]	evl_add_wait_queue(&observable->oob_wait, EVL_INFINITE, EVL_REL);
    	raw_spin_unlock_irqrestore(&observable->oob_wait.lock, flags);
    [5]	ret = evl_wait_schedule(&observable->oob_wait);
    	if (ret)
    		return ERR_PTR(ret);
    [6]	raw_spin_lock_irqsave(&observable->oob_wait.lock, flags);
    }
    

    This is a typical loop pattern evl_wait_schedule() may be involved in:

    1. first we check whether some runtime condition might cause the operation to abort [1] (e.g. the observable went stale), doing so at each iteration since this might happen while the thread sleeps.

    2. next we check whether some event arrived [2], breaking out from the loop (on success) if so. Otherwise, if the caller asked for a non-blocking wait, we bail out on error [3].

    3. otherwise we queue the caller to the wait queue wrapped into the observable element [4], expecting anyone who sends event(s) to the observable to signal the corresponding wait queue.

    4. eventually we call evl_wait_schedule() to block until a notification arrives for us [5]. On wake up, we branch back to step 1.

    The wait queue is guarded by grabbing its inner spinlock, in order to prevent data races and missed wake ups in the same move [6].

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • evl_wait_schedule() returns zero on success, which means the condition may have been met within the time bound specified by the enqueuing call. If the call failed, a negated error code is returned instead:

    • -ETIMEDOUT The call timed out.

    • -EIDRM The wait queue was deleted while the caller was sleeping on it. When this status is returned, the wait queue must be considered stale and should not be accessed anymore.

    • -EINTR The sleep was interrupted or forcibly unblocked.

    • -ENOMEM The sleep was aborted because no memory was available on the signaling side to complete a normal wake up operation involving memory allocation. As a result, the thread is unblocked on error.


    struct evl_thread *evl_wake_up(struct evl_wait_queue *wq, struct evl_thread *waiter, int reason)

    evl_wake_up() unblocks a particular waiter. This call is often paired with the evl_for_each_waiter() macro for selectively picking a sleeper, depending on a particular condition. The wait queue lock must be held by the caller across this call while the dequeuing operation is performed, as the examples below illustrate.

    You must call evl_schedule() - outside of any spinlocked section - to reschedule, accounting for the wake up. Also, there is NO check whatsoever about whether such a thread is actually linked to the wait queue: any wrong input will certainly lead to a kernel crash.

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • waiter

    The EVL thread to wake up, which may be a user-space thread or a kernel thread indifferently: both are based on a common thread descriptor type (aka struct evl_thread). NULL is an acceptable value, which tells the core to wake up the thread leading the wait queue.

  • reason

    A bitmask which gives additional information to the resuming thread about the reason why it was unblocked. In the common case, reason should be zero. A non-zero value contains a flag bit matching a particular situation, which translates to a specific error status for evl_wait_schedule().
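
  • To illustrate the pairing with the waiter iterators, here is a sketch which wakes up every thread matching some condition; should_wake() is a hypothetical predicate standing for any application-specific test, and the wait queue lock is held across the scan as required:

    struct evl_thread *waiter, *tmp;
    unsigned long flags;

    raw_spin_lock_irqsave(&wq->lock, flags);

    /*
     * Waking up a thread dequeues it, altering the wait queue
     * state, hence the _safe iterator (see
     * evl_for_each_waiter_safe()).
     */
    evl_for_each_waiter_safe(waiter, tmp, wq) {
    	if (should_wake(waiter)) /* hypothetical predicate */
    		evl_wake_up(wq, waiter, 0);
    }

    raw_spin_unlock_irqrestore(&wq->lock, flags);

    evl_schedule(); /* outside of any spinlocked section */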

  • This call returns the descriptor of the unblocked thread, which is equal to the waiter argument if non-NULL.


    struct evl_thread *evl_wake_up_head(struct evl_wait_queue *wq)

    evl_wake_up_head() unblocks the thread leading the wait queue (which is ordered either by priority or FIFO, see evl_init_wait()). If no thread is sleeping on the wait queue, the call is a nop. This is a shorthand for calling evl_wake_up() with default arguments, such as:

    	return evl_wake_up(wq, NULL, 0);
    

    For instance, the EVL core implements the kernel semaphore V operation as:

    raw_spin_lock_irqsave(&ksem->wait.lock, flags);
    ksem->value++;			/* Release one unit. */
    evl_wake_up_head(&ksem->wait);	/* Unblock the leading waiter, if any. */
    raw_spin_unlock_irqrestore(&ksem->wait.lock, flags);
    evl_schedule();			/* Commit the wakeup, outside of the locked section. */
    

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • This call returns the descriptor of the unblocked thread, or NULL if no thread was sleeping on the wait queue.


    void evl_flush_wait(struct evl_wait_queue *wq, int reason)

    evl_flush_wait() unblocks all the threads sleeping on the wait queue at the time of the call.

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().

  • reason

    A bitmask which gives additional information to the resuming threads about the reason why they were unblocked. In the common case, reason should be zero. A non-zero value contains a flag bit matching a particular situation, which translates to a specific error status for evl_wait_schedule().
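
  • For instance, a teardown path might forcibly unblock all sleepers before dismantling the object guarded by the wait queue. This is only a sketch: obj is a hypothetical object embedding the wait queue, and T_BREAK translates to -EINTR on the waiting side (see evl_wait_schedule()):

    evl_flush_wait(&obj->wait, T_BREAK);
    evl_schedule(); /* commit the wakeups */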


    void evl_flush_wait_locked(struct evl_wait_queue *wq, int reason)

    evl_flush_wait_locked() belongs to the inner interface. Unlike evl_flush_wait() which is based on this call, it expects the caller to hold the lock on the target wait queue. The arguments are the same as for evl_flush_wait().


    bool evl_wait_active(struct evl_wait_queue *wq)

    This call belongs to the inner interface: it tests whether threads are currently blocked on a wait queue. The lock guarding the wait queue must be held by the caller.


    struct evl_thread *evl_wait_head(struct evl_wait_queue *wq)

    This call belongs to the inner interface: it returns the descriptor of the thread leading the wait queue, or NULL if none. The lock guarding the wait queue must be held by the caller.
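
  • For instance, both inner calls may combine as follows to peek at the thread leading the wait queue, if any (a sketch; the wait queue lock is held as required):

    struct evl_thread *leader = NULL;
    unsigned long flags;

    raw_spin_lock_irqsave(&wq->lock, flags);

    if (evl_wait_active(wq))
    	leader = evl_wait_head(wq);

    raw_spin_unlock_irqrestore(&wq->lock, flags);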


    EVL_WAIT_INITIALIZER(name)

    A macro which expands to a static initializer you can use in a C statement defining an EVL wait queue.

  • name

    The C variable name to which the initializer should be assigned.

  • For instance:

    struct evl_wait_queue foo = EVL_WAIT_INITIALIZER(foo);
    

    evl_for_each_waiter(pos, wq)

    This macro belongs to the inner interface: it iterates over the thread(s) sleeping on the wait queue at the time of the call. The lock guarding the wait queue must be held by the caller. This convenience macro is based on the list_for_each_entry() macro from the regular kernel API.

    This macro is NOT suitable for loops which may alter the wait queue state while iterating over it. Typically, you would use evl_for_each_waiter_safe() instead to iterate over a wait queue, calling evl_wake_up() on some/all of the sleeping threads.

  • pos

    A variable which points at the current thread descriptor during the iteration (struct evl_thread *pos).

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().


    evl_for_each_waiter_safe(pos, npos, wq)

    This macro belongs to the inner interface: it iterates over the thread(s) sleeping on the wait queue at the time of the call, allowing for altering the wait queue state during the loop. The lock guarding the wait queue must be held by the caller. This convenience macro is based on list_for_each_entry_safe() macro from the regular kernel API.

  • pos

    A variable which points at the current thread descriptor during the iteration (struct evl_thread *pos).

  • npos

    A temporary variable used by the macro to enforce safety when scanning the wait queue (struct evl_thread *npos).

  • wq

    A wait queue descriptor previously initialized by a call to evl_init_wait().