The EVL network stack in kernel space communicates with applications running in user space via the common socket service access point. Passing the `SOCK_OOB` flag to the socket(2) system call enables out-of-band services for the new socket, provided the EVL network stack implements such support for the address and protocol family specified in the socket creation call. Currently, EVL implements `AF_INET`, `SOCK_DGRAM` (i.e. the IPv4 UDP protocol) and `AF_PACKET`, `SOCK_RAW` (raw Ethernet packets). Custom protocols can be implemented in the out-of-band network stack.
Since any oob-enabled socket is primarily a genuine socket with extended services provided by the EVL core, the complete set of standard ioctl(2) requests and socket options is available with oob-enabled sockets. In addition, the EVL core handles its own set of requests and options.
In-band options should be set and retrieved using the regular setsockopt(2) and getsockopt(2) system calls, which will be redirected to the EVL network stack from the in-band execution stage. Conversely, out-of-band options should be set and retrieved using the oob_setsockopt() and oob_getsockopt() system calls. The following options are recognized:
`SO_TIMESTAMP_OOB` configures packet timestamping from either the in-band or the out-of-band execution stage, indifferently. See the discussion below.
The EVL network stack can timestamp ingress and/or egress traffic upon request from the application. The general form for configuring the timestamping feature at socket level is as follows:
```
int tsflags = <timestamping-selection-flags>;
int ret;
/* From the in-band stage: */
ret = setsockopt(s, SOL_SOCKET, SO_TIMESTAMP_OOB, &tsflags, sizeof(tsflags));
/* Or, from the out-of-band stage: */
ret = oob_setsockopt(s, SOL_SOCKET, SO_TIMESTAMP_OOB, &tsflags, sizeof(tsflags));
```
The timestamping information is returned in the C `struct evl_net_iotimes`,
which is defined as follows:
```
struct evl_net_iotimes {
	/* Time at device<->netstack boundary (monotonic). */
	__u64 device_time;
	/* Time at RX/TX thread dequeuing/queuing point (monotonic). */
	__u64 queuing_time;
	/* Time at kernel/user boundary (monotonic). */
	__u64 delivery_time;
};
```
On TX, timestamps are collected asynchronously, and can be
retrieved by the application by passing the `MSG_TIMESTAMP`
operation flag to [oob_recvmsg()](/core/user-api/io/#oob_recvmsg), instead of receiving
traffic data.
```
#include <evl/net/socket.h>
struct evl_net_iotimes iotimes[16] = { 0 }; /* Collect up to 16 timestamps */
struct oob_msghdr msghdr;
struct iovec iov;
ssize_t ret;
iov.iov_base = iotimes;
iov.iov_len = sizeof(iotimes);
msghdr.msg_iov = &iov;
msghdr.msg_iovlen = 1;
msghdr.msg_control = NULL;
msghdr.msg_controllen = 0;
msghdr.msg_name = NULL;
msghdr.msg_namelen = 0;
msghdr.msg_flags = 0;
ret = oob_recvmsg(s, &msghdr, NULL, MSG_TIMESTAMP | MSG_DONTWAIT);
if (ret > 0) {
	/* Count of TX timestamps received is ret / sizeof(iotimes[0]). */
}
```
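The byte count returned by the call above can be turned into a record count and scanned, as in the following minimal sketch. `struct demo_iotimes` is a stand-in mirroring `struct evl_net_iotimes` so that the code builds without the EVL headers, and both helper names are hypothetical; real code would include `<evl/net/socket.h>` and use the genuine type.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Stand-in mirroring struct evl_net_iotimes (assumption: real code
 * includes <evl/net/socket.h> and uses the genuine type instead).
 */
struct demo_iotimes {
	uint64_t device_time;
	uint64_t queuing_time;
	uint64_t delivery_time;
};

/*
 * Convert the byte count returned when reading TX timestamps into a
 * number of complete records. Returns 0 on error or empty read.
 */
static size_t demo_tx_timestamp_count(ssize_t ret)
{
	if (ret <= 0)
		return 0;

	return (size_t)ret / sizeof(struct demo_iotimes);
}

/*
 * Return the largest (device_time - delivery_time) delta among the
 * records, skipping any record where the subtraction would underflow.
 */
static uint64_t demo_max_tx_delay(const struct demo_iotimes *ts, size_t count)
{
	uint64_t max = 0;

	for (size_t i = 0; i < count; i++) {
		if (ts[i].device_time >= ts[i].delivery_time) {
			uint64_t delta = ts[i].device_time - ts[i].delivery_time;
			if (delta > max)
				max = delta;
		}
	}

	return max;
}
```

A caller would pass the return value of the read to `demo_tx_timestamp_count()`, then hand the `iotimes` array and that count to `demo_max_tx_delay()`. The result is expressed in the same unit as the timestamps themselves.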
On RX, the timestamp for the received packet is returned by the
[oob_recvmsg()](/core/user-api/io/#oob_recvmsg) call directly, via the
control data area.
```
#include <evl/net/socket.h>
struct evl_net_iotimes iotimes = { 0 };
struct oob_msghdr msghdr;
struct iovec iov[...];
ssize_t ret;
msghdr.msg_iov = iov;
msghdr.msg_iovlen = <number-of-iov-cells>;
msghdr.msg_control = &iotimes;
msghdr.msg_controllen = sizeof(iotimes);
msghdr.msg_name = NULL;
msghdr.msg_namelen = 0;
msghdr.msg_flags = 0;
ret = oob_recvmsg(s, &msghdr, NULL, 0);
if (ret >= 0 && msghdr.msg_controllen == sizeof(iotimes)) {
	/* RX timestamp available along with the packet data (in iov[]). */
}
```
The EVL network stack timestamps ingress and/or egress packets if
`EVL_SOF_TIMESTAMP_RX` and/or `EVL_SOF_TIMESTAMP_TX` are set in
`tsflags` respectively. Conversely, timestamping is disabled for a
direction if the corresponding flag is cleared in
`tsflags`. Therefore, passing zero in `tsflags` disables
timestamping entirely.
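The direction selection described above amounts to plain bitmask composition, which might be sketched as follows. The `DEMO_SOF_*` macros and their bit values are placeholders invented for this illustration; they are not the values of `EVL_SOF_TIMESTAMP_RX` and `EVL_SOF_TIMESTAMP_TX`, which real code must take from the EVL headers.

```c
#include <stdbool.h>

/*
 * Placeholder bit values, for illustration only (assumption): real code
 * must use EVL_SOF_TIMESTAMP_RX and EVL_SOF_TIMESTAMP_TX from the EVL
 * headers, whose actual values are not reproduced here.
 */
#define DEMO_SOF_TIMESTAMP_RX	(1 << 0)
#define DEMO_SOF_TIMESTAMP_TX	(1 << 1)

/* Build the tsflags value selecting the directions to timestamp. */
static int demo_select_directions(bool rx, bool tx)
{
	int tsflags = 0;	/* zero disables timestamping entirely */

	if (rx)
		tsflags |= DEMO_SOF_TIMESTAMP_RX;
	if (tx)
		tsflags |= DEMO_SOF_TIMESTAMP_TX;

	return tsflags;
}
```

The returned value is what would be passed as `tsflags` when setting `SO_TIMESTAMP_OOB`. Calling the helper with both arguments false yields zero, which turns timestamping off for the socket.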
If `EVL_SOF_TIMESTAMP_DEVICE` is set, every packet is timestamped when the NIC driver hands it over to the network stack for receive (`EVL_SOF_TIMESTAMP_RX`), or when the network stack hands it over to the NIC driver for transmit (`EVL_SOF_TIMESTAMP_TX`).
If `EVL_SOF_TIMESTAMP_QUEUING` is set along with `EVL_SOF_TIMESTAMP_RX`, every incoming packet is timestamped when the RX thread in the network stack passes it to the receiving protocol layer. In this case, `queuing_time - device_time` measures the delay spent waiting in the RX queue. If `EVL_SOF_TIMESTAMP_QUEUING` is set along with `EVL_SOF_TIMESTAMP_TX`, every outgoing packet is timestamped when the network stack queues it for transmission by the TX thread, according to the queuing discipline in effect. In this case, `device_time - queuing_time` measures the delay spent waiting in the TX queue in addition to the qdisc handling.
If timestamping is enabled for a direction, a timestamp is collected unconditionally at the kernel <-> user boundary and stored into the `delivery_time` field of `struct evl_net_iotimes`. On RX, the delivery timestamp is taken when a packet is queued by the protocol layer for later consumption by oob_recvmsg(); therefore `delivery_time - device_time` measures the delay between receipt from the hardware and availability to the application. On TX, the delivery timestamp is taken when the network stack starts building the outgoing packet received from the application via a call to oob_sendmsg(); therefore `device_time - delivery_time` measures the delay between data receipt from the application and packet submission to the hardware for transmit. `latmus -E` measures the maximum delays observed in both directions.
`EVL_SOF_TIMESTAMPS` is a shorthand enabling all timestamping locations (RX and TX).
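The four deltas discussed above can be computed with a few one-line helpers, as in this minimal sketch. `struct demo_iotimes` is a stand-in mirroring `struct evl_net_iotimes` so that the code builds without the EVL headers, and the helper names are hypothetical. The sketch assumes the relevant timestamping locations were enabled, so that the fields involved hold valid values.

```c
#include <stdint.h>

/*
 * Stand-in mirroring struct evl_net_iotimes (assumption: real code
 * includes <evl/net/socket.h> and uses the genuine type instead).
 */
struct demo_iotimes {
	uint64_t device_time;
	uint64_t queuing_time;
	uint64_t delivery_time;
};

/* RX: delay spent waiting in the RX queue. */
static uint64_t demo_rx_queue_delay(const struct demo_iotimes *t)
{
	return t->queuing_time - t->device_time;
}

/* RX: delay from hardware receipt to availability to the application. */
static uint64_t demo_rx_delivery_delay(const struct demo_iotimes *t)
{
	return t->delivery_time - t->device_time;
}

/* TX: delay spent in the TX queue, including qdisc handling. */
static uint64_t demo_tx_queue_delay(const struct demo_iotimes *t)
{
	return t->device_time - t->queuing_time;
}

/* TX: delay from application submission to hand-over to the hardware. */
static uint64_t demo_tx_submission_delay(const struct demo_iotimes *t)
{
	return t->device_time - t->delivery_time;
}
```

Note the operand order flips between directions: on RX `device_time` is the earliest timestamp, whereas on TX it is the latest. Results are in the same unit as the timestamps.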
This service enables the out-of-band network port of a network device, allowing the calling application to send and/or receive packets from the out-of-band execution stage. A file descriptor to the out-of-band port is returned, so that further configuration may be applied. Enabling an already enabled port is a no-op which returns a success status.
As mentioned earlier, enabling an out-of-band port does not necessarily mean that real-time capability is granted: it all depends on whether the driver managing the interface is out-of-band capable.
The name of the network interface to enable for out-of-band I/O (e.g. “eth0”, “eth1.42”).
The number of buffers in the per-device pool maintained by EVL which should be pre-allocated for out-of-band use. The size of this pool depends on the volume of egress traffic: EVL consumes those buffers on the transmit path only, whereas receive buffers are allocated by the NIC drivers. Passing zero tells the core to use the default value defined by the network stack. You can inspect the current value with the `query` sub-command of the `evl-net` command.
The size (in bytes) of a buffer from the per-device pool. This value must accommodate the largest MTU you intend to use with your device. Passing zero tells the core to use the default value defined by the network stack. You can inspect the current value with the `query` sub-command of the `evl-net` command.
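As a rough sanity check on the buffer size, an application could compare its intended value with the MTU before enabling the port. The sketch below is only an illustration: the helper names are hypothetical, and the idea of adding a caller-supplied link-layer overhead on top of the MTU is an assumption of this example, not a rule stated by the EVL documentation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Smallest buffer size able to hold a full frame, assuming the caller
 * knows how many link-layer bytes come on top of the MTU (assumption).
 */
static size_t demo_min_bufsz(size_t mtu, size_t l2_overhead)
{
	return mtu + l2_overhead;
}

/*
 * Tell whether a requested buffer size looks usable. Zero defers to the
 * default chosen by the network stack, which this sketch cannot check.
 */
static bool demo_bufsz_acceptable(size_t bufsz, size_t mtu, size_t l2_overhead)
{
	if (bufsz == 0)
		return true;

	return bufsz >= demo_min_bufsz(mtu, l2_overhead);
}
```

For instance, with a 1500-byte MTU and 18 bytes of assumed overhead, a 1024-byte buffer is rejected while a 2048-byte buffer passes.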
On success, evl_net_enable_port() returns the file descriptor of the enabled out-of-band port. Otherwise, a negated error code is returned:
-EINVAL ifname is not a valid network interface name.
-ENOTSUP The network stack is not enabled in the EVL core (see CONFIG_EVL_NET).
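Since the call returns either a file descriptor or a negated error code, a caller typically tests the sign of the result first. The following minimal sketch shows one way to interpret such a value; the helper names and the hint strings are hypothetical, and only the two error codes listed above are handled specifically.

```c
#include <errno.h>
#include <string.h>

/* Non-zero if ret is a usable file descriptor, zero if it is an error. */
static int demo_port_enabled(int ret)
{
	return ret >= 0;
}

/* Map a negated error code to a short human-readable hint. */
static const char *demo_port_error_hint(int ret)
{
	switch (-ret) {
	case EINVAL:
		return "invalid network interface name";
	case ENOTSUP:
		return "EVL network stack not enabled (CONFIG_EVL_NET)";
	default:
		/* Fall back to the generic C library description. */
		return strerror(-ret);
	}
}
```

An application would store the return value in an `int`, check it with `demo_port_enabled()`, and print `demo_port_error_hint()` on failure before giving up on out-of-band I/O for that interface.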
This is the converse operation to evl_net_enable_port for disabling a port. Once this port is disabled, the traffic flows on the in-band stage exclusively, and the interface cannot be used for out-of-band traffic using the oob_sendmsg and oob_recvmsg calls.
This command may wait until all in-flight TX buffers conveyed on the out-of-band stage have been processed by the interface. Disabling an already disabled port is a no-op which returns a success status.
A file descriptor received from evl_net_enable_port() or evl_net_open_port().
On success, all the resources associated with the port have been released and evl_net_disable_port() returns zero. Otherwise, a negated error code is returned:
This call obtains a file descriptor to an enabled out-of-band port on the given interface, which may be used for configuring that port. Such a descriptor is also returned by the initial call to evl_net_enable_port().
The name of a network interface providing an active out-of-band I/O port (e.g. “eth0”, “eth1.42”).
On success, the call returns zero. Otherwise, a negated error code is returned:
This call installs or removes an eBPF program on/from an out-of-band port. The program should implement the rules for accepting ingress packets into the EVL network stack received from the device hosting that port.
Packets already accepted by the EVL network stack but still waiting for consumption by out-of-band reader(s) are not impacted by the filter change.
A file descriptor received from evl_net_enable_port() or evl_net_open_port().
The path to the eBPF module to install if non-NULL. If a previous module was installed, the new one replaces it. Passing NULL removes any previously installed filter.
-EBADF fd is not a valid file descriptor to an enabled port.
-ENOENT modpath does not exist.
-EACCES modpath may not be read.
Other error codes may be returned by the libbpf API.
This call ensures that the routing and MAC information (e.g. outgoing device and MAC address for Ethernet-based networking) needed to send data to a peer is pre-cached in the EVL front cache, before oob_sendmsg is called to talk to that peer. This guarantees that no demotion to in-band sending happens due to a lack of such information. Such demotion would create a weak point in the real-time path, causing unbounded latency. This is typically useful for IPv4 networking protocols as implemented by EVL, such as UDP.
In some cases, this request may take a few seconds to complete: this is the time the regular network stack needs to finish the operation before it can eventually feed the EVL front cache.
An out-of-band socket descriptor.
The address of the peer. EVL currently implements solicitation for `sockaddr_in` addresses only (IPv4).
A set of operation modifiers, ORed in a bitmask:
arp -d <entry>, which will also affect the EVL front cache.
This call returns zero on success. Otherwise, a negated error code is returned:
-EBADF s is not a valid file descriptor to an out-of-band capable socket (SOCK_OOB).
-ENODEV The device selected by the regular network stack for sending packets to this peer is valid, but no out-of-band port is enabled for it. See this document for more information.
-EINVAL peer refers to invalid data.
-EFAULT peer points to invalid memory.
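The peer argument is a regular IPv4 socket address, so it can be prepared with standard POSIX calls before the solicitation request is issued. The sketch below builds such an address from a dotted-quad string; the helper name is hypothetical, and including a port number is only for completeness since the text above does not say whether the port matters for solicitation.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill *peer from a dotted-quad IPv4 string and a port number.
 * Returns 0 on success, -EINVAL if the string is not a valid address.
 */
static int demo_make_ipv4_peer(const char *dotted, uint16_t port,
			       struct sockaddr_in *peer)
{
	memset(peer, 0, sizeof(*peer));
	peer->sin_family = AF_INET;
	peer->sin_port = htons(port);

	/* inet_pton() returns 1 on success, 0 for a malformed string. */
	if (inet_pton(AF_INET, dotted, &peer->sin_addr) != 1)
		return -EINVAL;

	return 0;
}
```

The resulting structure would then be passed, cast to a generic socket address pointer as needed, as the peer argument of the call described above. Validating the address early avoids the -EINVAL case listed in the error codes.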
This call retrieves the following statistics from an out-of-band port:
struct evl_net_devstat {
	__u64 rx_packets; /* Number of packets received */
	__u64 rx_bytes; /* Number of bytes received */
	__u64 tx_packets; /* Number of packets transmitted */
	__u64 tx_bytes; /* Number of bytes transmitted */
	__u32 skb_size; /* Size of a buffer from the per-device pool */
	__u32 skb_free; /* Number of free buffers in the per-device pool */
	__u32 skb_total; /* Total number of buffers in the per-device pool */
	__u8 oob_capable:1; /* Whether NIC driver is oob_capable */
};
The data above only and specifically applies to out-of-band I/O, which is accounted separately from in-band I/O statistics as displayed by ethtool.
A file descriptor received from evl_net_enable_port() or evl_net_open_port().
This call returns zero on success. Otherwise, a negated error code is returned:
-EBADF fd is not a valid file descriptor to an enabled port.
-EFAULT devs points to invalid memory.
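Once retrieved, the pool counters can be used to monitor how close the egress path is to running out of buffers. The following minimal sketch derives the pool occupancy from the statistics; `struct demo_devstat` is a stand-in mirroring `struct evl_net_devstat` so that the code builds without the EVL headers, and the helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Stand-in mirroring struct evl_net_devstat (assumption: real code
 * uses the genuine type from the EVL headers instead).
 */
struct demo_devstat {
	uint64_t rx_packets;
	uint64_t rx_bytes;
	uint64_t tx_packets;
	uint64_t tx_bytes;
	uint32_t skb_size;
	uint32_t skb_free;
	uint32_t skb_total;
	unsigned int oob_capable:1;
};

/* Number of buffers currently taken from the per-device pool. */
static uint32_t demo_skb_in_use(const struct demo_devstat *st)
{
	return st->skb_total - st->skb_free;
}

/* Pool occupancy as an integer percentage, 0 if the pool is empty. */
static unsigned int demo_skb_usage_percent(const struct demo_devstat *st)
{
	if (st->skb_total == 0)
		return 0;

	return (unsigned int)((uint64_t)demo_skb_in_use(st) * 100 / st->skb_total);
}
```

A sustained occupancy close to 100 percent would suggest that the pool is too small for the egress traffic, in which case a larger buffer count could be requested when enabling the port.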