Network stack

EVL networking in a nutshell

EVL features a simple network stack which currently implements the UDP protocol from the PF_INET domain (aka IPv4), and raw ethernet datagrams over sockets from the PF_PACKET domain, both over an ethernet medium. The goal is to provide datagram-oriented networking support to applications with real-time communication requirements.

To achieve this, EVL relies on Dovetail, which can redirect traffic between NIC drivers and the EVL core directly, from either the in-band or the out-of-band stage, depending on whether such drivers have been adapted to handle traffic from the out-of-band stage.

EVL can recognize the ethernet traffic it should handle based on two types of input filters:

  • VLAN tagging. Packets which belong to a so-called out-of-band VLAN are redirected to the EVL network stack, others are left for handling by the regular / in-band network stack. The EVL core provides a user interface for enabling a series of VLAN identifiers as out-of-band input ports in a network device.

Since the 802.1Q standard has been around for quite some time, most ethernet switches should pass on frames with the ethertype information set to the 802.1Q TPID “as is” to some hardware port, and they should also be able to cope with the four additional octets involved in VLAN tagging without having to lower the MTU everywhere (most equipment even supports jumbo frames these days).

  • eBPF filtering. An eBPF program can be installed on a network device controlled by EVL in order to decide whether an ingress packet should be processed by the EVL network stack, the regular / in-band stack, or dropped.

The eBPF program takes precedence over the VLAN matching rule, which always applies in the absence of such a program. By returning the appropriate status code, the filter program can decide to either:

  • Accept the packet for handling by the EVL network stack (EVL_RX_ACCEPT).
  • Hand over the packet to the regular / in-band stack instead (EVL_RX_SKIP).
  • Defer the decision to the VLAN matching rule (EVL_RX_VLAN).
  • Drop the packet, which won’t enter any network stack as a consequence (EVL_RX_DROP).
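
The precedence between the two input filters can be summarized with a small Python model. This is only a sketch of the decision flow described above, not EVL code: the verdict names come from the text, but their values here are placeholders, and vlan_id_of(), route_ingress() and tagged_frame() are hypothetical helpers. It also assumes the 802.1Q tag is still present in the frame data.

```python
import struct
from enum import Enum

ETH_P_8021Q = 0x8100  # 802.1Q TPID found in the ethertype field of tagged frames


class Verdict(Enum):
    # Names come from the text; the values are placeholders, not EVL's.
    EVL_RX_ACCEPT = "accept"
    EVL_RX_SKIP = "skip"
    EVL_RX_VLAN = "vlan"
    EVL_RX_DROP = "drop"


def vlan_id_of(frame: bytes):
    """Return the 12-bit VLAN ID of an 802.1Q-tagged frame, or None if untagged."""
    if len(frame) < 18:  # dst MAC + src MAC + TPID + TCI + inner ethertype
        return None
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != ETH_P_8021Q:
        return None
    return tci & 0x0FFF  # drop the priority and DEI bits


def route_ingress(frame: bytes, oob_vlans, filter_prog=None) -> str:
    """Model which stack gets the frame: 'oob', 'in-band' or 'dropped'."""
    # Without a filter program, the VLAN matching rule always applies.
    verdict = filter_prog(frame) if filter_prog else Verdict.EVL_RX_VLAN
    if verdict is Verdict.EVL_RX_ACCEPT:
        return "oob"
    if verdict is Verdict.EVL_RX_SKIP:
        return "in-band"
    if verdict is Verdict.EVL_RX_DROP:
        return "dropped"
    # EVL_RX_VLAN: fall back to matching against the out-of-band VLANs.
    return "oob" if vlan_id_of(frame) in oob_vlans else "in-band"


def tagged_frame(vid: int) -> bytes:
    """Build a dummy tagged frame (zeroed MACs and payload) for the demo."""
    return bytes(12) + struct.pack("!HHH", ETH_P_8021Q, vid, 0x0800) + bytes(46)


if __name__ == "__main__":
    oob = {42}
    print(route_ingress(tagged_frame(42), oob))
    print(route_ingress(tagged_frame(7), oob))
    print(route_ingress(tagged_frame(7), oob, lambda f: Verdict.EVL_RX_ACCEPT))
```

In this model, a filter returning EVL_RX_ACCEPT overrides the VLAN rule even for a frame whose tag is not in the out-of-band set, which mirrors the precedence stated above.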

The following diagram illustrates the flow of an incoming packet from the network interface controller to the service access point in the application, i.e. the socket.

[Diagram: flow of an incoming packet from the network interface controller to the application socket]

Typical use cases

Sharing a network interface between in-band and out-of-band traffic

The hardware platform might have a single network interface available, in which case VLAN tagging may come in handy to discriminate between the packets which should be handled by the EVL network stack and those which should be left for processing by the regular one. Obviously, this comes at a cost with respect to latency, since the in-band traffic might slow down the out-of-band packets at the hardware level. However, depending on the real-time requirements, that cost may still be within budget for many applications, without requiring any software proxy to mediate between in-band and out-of-band traffic in front of the network interface.

Dedicating a network interface to out-of-band traffic

Either VLAN tagging or an eBPF program can be used to filter input so as to dedicate a network interface to handling out-of-band traffic. This is useful in order to decrease latency when multiple interfaces are available from the hardware platform, with one or more reserved for real-time ethernet traffic. With an eBPF program, loading a filter which unconditionally returns EVL_RX_ACCEPT on the network interface creates a network path dedicated to real-time traffic.

Dealing with complex out-of-band detection rules

An eBPF program allows deep inspection of the packet data before issuing any decision about which network stack should handle the traffic. One may rely on this to implement complex out-of-band traffic detection rules.
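
As an illustration of such a rule, the following Python model inspects a frame down to the UDP header and tells whether it targets a given destination port. It is a sketch of the matching logic only: an actual filter would be an eBPF program returning EVL_RX_ACCEPT on a match and EVL_RX_SKIP otherwise, and how such a program is built and attached is not shown here. The port number and the helper names are hypothetical.

```python
import struct

ETH_P_IP = 0x0800
ETH_P_8021Q = 0x8100
IPPROTO_UDP = 17
OOB_UDP_PORT = 5000  # hypothetical port carrying the real-time traffic


def is_oob_udp(frame: bytes, port: int = OOB_UDP_PORT) -> bool:
    """True if the frame is IPv4/UDP addressed to 'port' (VLAN-tagged or not)."""
    if len(frame) < 14:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    l3 = 14  # offset of the IPv4 header
    if ethertype == ETH_P_8021Q:
        if len(frame) < 18:
            return False
        (ethertype,) = struct.unpack_from("!H", frame, 16)
        l3 = 18  # skip the four octets of the VLAN tag
    if ethertype != ETH_P_IP or len(frame) < l3 + 20:
        return False
    ihl = (frame[l3] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or frame[l3 + 9] != IPPROTO_UDP:
        return False
    # Non-initial fragments carry no UDP header: do not match them.
    (frag,) = struct.unpack_from("!H", frame, l3 + 6)
    if frag & 0x1FFF:
        return False
    l4 = l3 + ihl
    if len(frame) < l4 + 8:
        return False
    (dport,) = struct.unpack_from("!H", frame, l4 + 2)
    return dport == port


def udp_frame(dport: int, proto: int = IPPROTO_UDP) -> bytes:
    """Build a minimal untagged IPv4 frame for the demo (checksums left at 0)."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, proto, 0,
                     bytes([10, 10, 10, 10]), bytes([10, 10, 10, 11]))
    udp = struct.pack("!HHHH", 12345, dport, 8, 0)
    return eth + ip + udp


if __name__ == "__main__":
    print(is_oob_udp(udp_frame(5000)))
    print(is_oob_udp(udp_frame(53)))
```

Every access is preceded by a length check, in the spirit of the bounds checks an eBPF verifier would demand from a real filter.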

Out-of-band support in NIC drivers

Unlike Xenomai 3 with the RTnet stack, EVL provides a network stack which does not require EVL-specific drivers. Instead, the capabilities of the stock NIC drivers can be extended to support out-of-band I/O operations for dealing with ingress and egress traffic, based on facilities provided by Dovetail in that area. If a driver does not advertise out-of-band I/O capabilities, EVL automatically offloads the final I/O operations to the in-band network stack for talking to the network device, allowing the application code to keep running on the out-of-band stage in parallel.

Although EVL does not require the NIC driver code to be oob-capable, i.e. conveying ingress and egress traffic directly from the out-of-band execution stage, having such support in place is the only way to have a real-time networking path, from the ethernet wire to the application code. In other words, one may still use stock ethernet controller drivers along with the EVL network stack, at the expense of the real-time performance which would depend on the low-latency capabilities of the host kernel.

The EVL network stack for the impatient

The libevl tree comes with an example named oob-net-icmp, which illustrates basic usage of the EVL network stack by implementing an ICMPv4(ECHO) responder. In order to run this example, you need two computers on the same ethernet LAN: one is the target system running EVL (the responder), the other may be any box which can issue ICMPv4 packets (the issuer).

Configuring the ICMPv4 responder

As explained earlier, there are two ways for enabling out-of-band traffic to flow through a network device using EVL. We are going to illustrate the one using VLAN tagging. In this configuration, the following steps are required before you can send/receive network packets through the EVL network stack on the target system:

  1. Create a regular VLAN interface on top of a physical network device attached to the system.

  2. Turn the new VLAN interface into an EVL network port. This can be done either programmatically, or by writing to a device-specific pseudo-file in /sysfs. Under the hood, this adds an EVL-specific context to the underlying physical device supporting the VLAN interface, allocating the resources needed for dealing with out-of-band traffic.

  3. Add the new VLAN identifier to the set of out-of-band VLANs EVL monitors, so that it picks up incoming packets flowing on those instead of leaving them for processing by the regular network stack.

For instance, enabling out-of-band networking over VLAN #42 on the physical network interface named ’eth0’ would translate to the following shell commands:

Attach a VLAN device with tag 42 to the real 'eth0' device
# ip link add link eth0 name eth0.42 type vlan id 42

Assign an arbitrary address to that VLAN device, e.g. 10.10.10.11
# ip addr add 10.10.10.11/24 dev eth0.42

Tell EVL that the VLAN device is an out-of-band networking port:
# echo 1 > /sys/class/net/eth0.42/oob_port

Finally, tell EVL to pick packets tagged for VLAN 42 (you could ask EVL
to monitor multiple VLANs by passing a list of tags like '42-45,100,107'
the same way):
# echo 42 > /sys/class/evl/net/vlans
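
The list syntax mentioned above ('42-45,100,107') can be checked before it is written to the pseudo-file. The following Python helper is a minimal sketch which expands such a list into individual VLAN identifiers. The parse_vlan_list() name is hypothetical, and the 1..4094 bound is the usable 802.1Q range, which is assumed here to be what EVL accepts.

```python
def parse_vlan_list(spec: str) -> set:
    """Expand a list such as '42-45,100,107' into a set of VLAN IDs."""
    vids = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        first, sep, last = item.partition("-")
        lo = int(first)
        hi = int(last) if sep else lo  # a single tag is a one-element range
        if lo > hi:
            raise ValueError(f"descending range: {item!r}")
        for vid in range(lo, hi + 1):
            # 0 and 4095 are reserved by 802.1Q (assumed invalid here too)
            if not 1 <= vid <= 4094:
                raise ValueError(f"VLAN ID out of range: {vid}")
            vids.add(vid)
    return vids


if __name__ == "__main__":
    print(sorted(parse_vlan_list("42-45,100,107")))
```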

Configuring the ICMPv4 issuer

Now let’s run a ping command from the issuing box to the IP address of the VLAN device created earlier for the responder on the target system (that box does not have to run EVL). All you need to do is create a peer VLAN device on that box, attached to the same ethernet LAN, then ping the responder machine which runs the oob-net-icmp program, e.g. assuming ’eno2’ is the name of the physical network interface on such host:

# sudo ip link add link eno2 name eno2.42 type vlan id 42
# sudo ip addr add 10.10.10.10/24 dev eno2.42
# sudo ip link set eno2.42 up
# ping 10.10.10.11

Some NICs (e.g. Intel e1000) may need a delay between the moment the VLAN filter is updated and the link is enabled in their hardware. If in doubt, make sure to pause for a short while between both operations, especially if the corresponding ‘ip’ commands are part of a shell script.
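
When the issuer set-up is scripted, the pause can be made explicit. The following Python sketch replays the three ‘ip’ commands shown earlier and sleeps before enabling the link. The setup_issuer() helper is hypothetical, the one-second default is an arbitrary value rather than a figure required by any NIC, and the commands need root privileges when actually executed.

```python
import subprocess
import time


def setup_issuer(phys="eno2", vid=42, addr="10.10.10.10/24", pause=1.0,
                 run=None, sleep=time.sleep):
    """Create the peer VLAN device, pausing before the link is brought up."""
    # 'run' and 'sleep' can be replaced for a dry run; by default the
    # commands are executed for real and any failure raises an exception.
    run = run or (lambda argv: subprocess.run(argv, check=True))
    dev = f"{phys}.{vid}"
    run(["ip", "link", "add", "link", phys, "name", dev,
         "type", "vlan", "id", str(vid)])
    run(["ip", "addr", "add", addr, "dev", dev])
    sleep(pause)  # arbitrary delay between the VLAN update and link up
    run(["ip", "link", "set", dev, "up"])
    return dev


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    setup_issuer(run=lambda argv: print(" ".join(argv)),
                 sleep=lambda s: print(f"(pause {s}s)"))
```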

Finally, the test program running on the EVL-enabled machine should output traces as it replies to the ICMPv4(ECHO) requests, e.g.:

# /usr/evl/tidbits/oob-net-icmp -i eth0.42
listening to interface eth0.42
[0] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx
[1] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx
[2] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx
[3] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx
[4] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx
[5] count=84, proto=0x800, ifindex=2, type=0, halen=6, mac=xx:xx:xx:xx:xx:xx

Interpreting ping statistics

Even with a favorable LAN setup and traffic conditions (small LAN, fast switch, few hosts, low collision rate), do not over- or misinterpret the roundtrip times reported by the ping command on the issuer:

  • flooding the responder may cause delays due to verbose tracing: the traces go through an EVL proxy to stdout, whose buffer might saturate, blocking the caller until it drains. At the very least, turn on the silent mode (’-s’ option switch) of the responder program to prevent this.
  • when considering the average roundtrip time reported, you need to account for the fact that the ICMPv4 responder to ping is normally the remote kernel, not an application running in user-space. The fact that such application gains real-time support with EVL may not fully compensate for the cost of transitioning from kernel to user-space on average when the responder system is stressed.
  • when considering the maximum roundtrip time reported, you need to factor in whether the NIC driver on the responder system is out-of-band capable or not. If not, then a larger portion of the roundtrip does not benefit from real-time support.

In other words, if you really want to precisely measure and compare the latency involved in running traffic through the EVL network stack or not, you need 1) to have an out-of-band capable NIC driver, 2) to compare strictly identical issuer <-> responder set-ups, 3) to look for the worst-case figure: although the average figure is significant, the maximum latency and jitter figures are key in this context.
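
To look at the worst case rather than the average, the per-reply times printed by ping can be post-processed. The following Python sketch assumes the usual "time=... ms" field found in the output of common ping implementations. The rtt_summary() helper is hypothetical, jitter is taken here as the peak-to-peak spread (one possible definition among others), and the sample lines are synthetic, not measurements.

```python
import re
import statistics

# Matches the per-reply field, e.g. "time=0.250 ms" (or "time<1 ms").
RTT_RE = re.compile(r"time[=<]([0-9.]+) ms")

# Synthetic sample lines for the demo, not actual measurements.
SAMPLE = """\
64 bytes from 10.10.10.11: icmp_seq=1 ttl=64 time=0.250 ms
64 bytes from 10.10.10.11: icmp_seq=2 ttl=64 time=0.500 ms
64 bytes from 10.10.10.11: icmp_seq=3 ttl=64 time=0.750 ms
"""


def rtt_summary(ping_output: str) -> dict:
    """Extract per-reply round-trip times (ms) and summarize them."""
    rtts = [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]
    if not rtts:
        raise ValueError("no round-trip time found")
    return {
        "samples": len(rtts),
        "avg": statistics.fmean(rtts),
        "max": max(rtts),                 # the worst-case figure
        "jitter": max(rtts) - min(rtts),  # assumed definition: peak-to-peak
    }


if __name__ == "__main__":
    print(rtt_summary(SAMPLE))
```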


Last modified: Tue, 15 Oct 2024 16:38:40 +0200