Version: 3.27 (latest)

VPP dataplane troubleshooting

Big picture

This page describes the troubleshooting steps for the VPP dataplane. If you did not configure the VPP dataplane, this page is not for you!

If you're encountering issues with the VPP dataplane, feel free to reach out to us either on the #vpp channel on the Calico Slack, or by opening a new issue on GitHub.

Installing calivppctl

calivppctl is a helper bash script shipped alongside the VPP container images. It helps collect logs and debug a running cluster with the VPP dataplane installed, and can be installed on your host with any of the following methods.

  • With curl

    curl https://raw.githubusercontent.com/projectcalico/vpp-dataplane/v3.27.0/test/scripts/vppdev.sh | tee /usr/bin/calivppctl
    chmod +x /usr/bin/calivppctl
  • With docker (and a cluster with Calico VPP running)

    vppcontainer=$(docker ps | grep vpp_calico-vpp | awk '{ print $1 }')
    docker cp ${vppcontainer}:/usr/bin/calivppctl /usr/bin/calivppctl
  • With kubectl (and a cluster with Calico VPP running)

    vpppod=$(kubectl -n calico-vpp-dataplane get pods -o wide | grep calico-vpp-node- | awk '{ print $1 }' | head -1)
    kubectl -n calico-vpp-dataplane exec -it ${vpppod} -c vpp -- cat /usr/bin/calivppctl | tee /usr/bin/calivppctl > /dev/null
    chmod +x /usr/bin/calivppctl

Troubleshooting

Kubernetes Cluster

First you need to make sure Kubernetes is up and running.

  • service kubelet status should give you a first hint.
  • Issues should be reported in the kubelet logs, which you can check with this command if you are using systemd: journalctl -u kubelet -r -n200
note

By default, the kubelet refuses to start while swap is enabled on the node, so make sure swap is disabled (for example with swapoff -a).

Starting the calico-vpp-node DaemonSet

Once the cluster is correctly started, the next issue can come from the DaemonSet configuration. It is best to start by inspecting the pods: are they running correctly? Configuration issues (available hugepages, memory, ...) are usually reported here:

kubectl -n calico-vpp-dataplane describe pod/calico-vpp-node-XXXXX
note

If at this point you don't have enough hugepages, you'll have to allocate them and then restart kubelet so that it takes them into account (for instance with service kubelet restart).
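
To check how many hugepages a node currently exposes before restarting kubelet, the counters in /proc/meminfo can be read programmatically. The following is a minimal sketch, not part of calivppctl: the function name hugepage_summary and the SAMPLE text are illustrative assumptions, while the HugePages_Total, HugePages_Free and Hugepagesize fields are the standard Linux ones.

```python
def hugepage_summary(meminfo_text):
    """Return the hugepage counters found in /proc/meminfo-style text as ints."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    summary = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key in wanted and value.split():
            # Values look like "512" or "2048 kB"; keep only the number.
            summary[key] = int(value.split()[0])
    return summary


# Illustrative sample (made-up numbers), shaped like /proc/meminfo.
SAMPLE = """\
MemTotal:       16384000 kB
HugePages_Total:     512
HugePages_Free:      448
Hugepagesize:       2048 kB
"""

print(hugepage_summary(SAMPLE))
```

On a real node you would pass the contents of /proc/meminfo instead of SAMPLE, for example hugepage_summary(open("/proc/meminfo").read()).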

Having VPP up and running

Once the pods no longer report any issue, they should have started. Each node runs two containers: vpp-manager, which starts the VPP process and sets up connectivity, and the agent, which handles pod connectivity, service load balancing, BGP, policies, etc.

First check that VPP is running correctly. If the connectivity configuration or interface naming is not correct, it will be reported in VPP's log. Once VPP is running, you should be able to ping your other nodes through VPP.

# Print VPP's log : basic connectivity and NIC configuration
calivppctl log -vpp myk8node1

Then you can check for any issues reported by the agent (e.g. BGP listen issue if the port is already taken, or missing configuration pieces). If this doesn't show any errors, you should be able to nslookup kubernetes.default from pods.

# Print the logs for the Calico VPP dataplane agent, programming serviceIPs, BGP, ...
calivppctl log -agent myk8node1

If none of this helps, you can use the export command to generate an export.tar.gz bundle and ask for help on the #vpp channel:

calivppctl export

Accessing the VPP cli

For further debugging, tracing packets and inspecting VPP's internals, you can get a vpp shell using the following:

calivppctl vppctl myk8node1

Listing interfaces and basics

To list existing interfaces and basic counters use

vpp# show int
vpp# show int addr

To get more insights on the main interface (e.g. if you're using dpdk) you can check for errors & drops in

vpp# show hardware-interfaces

Other places to look for errors

vpp# show log       # VPP startup log
vpp# show err       # Prints out packet counters (not always actual errors, but includes drops)
vpp# show buffers   # You should have non-zero free buffers, otherwise traffic won't flow

Tracing packets

Internal network layout

For starters, here is a small schematic of what the network looks like (figure: k8-calico-vpp).

Container interfaces are named tun[0-9]+. You can find which one belongs to which container as follows:

  • Connect to VPP

    calivppctl vppctl NODENAME
  • List interfaces

    vpp# show interface
    Name               Idx    State  MTU (L3/IP4/IP6/MPLS)     Counter          Count
    avf-0/d8/a/0        1      up          9000/0/0/0          tx packets            2
                                                               tx bytes            216
    local0              0     down          0/0/0/0
    tap0                2      up           0/0/0/0            rx packets            9
    [...]
    tun3                5      up           0/0/0/0            rx packets            5
                                                               rx bytes            431
                                                               tx packets            5
                                                               tx bytes            387
                                                               ip4                   5
  • Show the route for address 11.0.166.132

    vpp# show ip fib 11.0.166.132
    ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport symmetric ] epoch:0 flags:none locks:[adjacency:1, default-route:1, ]
    11.0.166.132/32 fib:0 index:19 locks:5
    cnat refs:1 entry-flags:uRPF-exempt,interpose, src-flags:added,contributing,active, cover:-1 interpose:
    [@0]: [4] cnat-client:[11.0.166.132] tr:0 sess:1
    path-list:[26] locks:3 flags:shared, uPRF-list:24 len:1 itfs:[5, ]
    path:[32] pl-index:26 ip4 weight=1 pref=0 attached-nexthop: oper-flags:resolved, cfg-flags:attached,
    11.0.166.132 tun3 (p2p)
    [@0]: ipv4 via 0.0.0.0 tun3: mtu:9000 next:7
    [...]
  • The address is reachable through tun3. For more information about this interface (name in Linux, queues, descriptors, ...), run:

    vpp# show tun tun3
    Interface: tun3 (ifindex 5)
    name "eth0"
    host-ns "/proc/17675/ns/net"
    [...]

    tap0 is the interface providing connectivity to the host, using the original interface name on the Linux side (use show tap tap0 and show ip punt redirect).
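
The lookup above (address, then interface, then network namespace) can also be scripted against saved command output. The sketch below is an illustration only: the helper names interface_for_route and netns_pid are hypothetical, and the regular expressions assume the output looks like the show ip fib and show tun samples on this page, which may differ between VPP versions.

```python
import re


def interface_for_route(fib_text):
    """Return the interface from the first 'ipv4 via <addr> <ifname>:' entry, or None."""
    match = re.search(r"ipv4 via \S+ ([^\s:]+):", fib_text)
    return match.group(1) if match else None


def netns_pid(tun_text):
    """Return the PID embedded in a 'host-ns "/proc/<pid>/ns/net"' line, or None."""
    match = re.search(r'host-ns "/proc/(\d+)/ns/net"', tun_text)
    return int(match.group(1)) if match else None


# Excerpts copied from the sample outputs on this page.
FIB_SAMPLE = """\
11.0.166.132/32 fib:0 index:19 locks:5
11.0.166.132 tun3 (p2p)
[@0]: ipv4 via 0.0.0.0 tun3: mtu:9000 next:7
"""
TUN_SAMPLE = """\
Interface: tun3 (ifindex 5)
name "eth0"
host-ns "/proc/17675/ns/net"
"""

print(interface_for_route(FIB_SAMPLE), netns_pid(TUN_SAMPLE))
```

The PID obtained this way identifies a process in the pod's network namespace, which is the value nsenter -t expects.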

Capturing traffic inside the cluster

Let's take the case of two pods talking to each other in your cluster (see the schematic above). You might want to inspect the traffic at three different locations:

  • as it exits the pod (in Linux inside the first pod)
  • as it goes through VPP
  • as it is received in the second pod (in Linux again)

We cover these cases below: first inside VPP (depending on where your traffic is coming from: a pod or outside your host), then inside your pods (usually with tcpdump).

Traffic capture inside VPP

Traffic from a pod

The following snippet captures all traffic coming from containers on a particular node, greps for a specific packet, and shows what happened to it.

# Make sure that the trace buffer is clean in VPP
calivppctl vppctl NODENAME clear trace
# Add a trace from the virtio-input input-node
calivppctl vppctl NODENAME trace add virtio-input 500
# Generate some traffic, then dump the trace to a file
calivppctl vppctl NODENAME show trace max 500 > somefile
# Grep for your IPs
grep -A40 -B40 '1.2.3.4 -> 5.6.7.8' somefile

The output looks cumbersome at first because it contains the whole path of a packet through VPP, from reception (rx) to transmission (tx).

vpp# show trace
Packet 1

00:09:46:518858: virtio-input
This packet has been received on the interface number #2 (column Idx in `show int`)
and is 688 Bytes long
virtio: hw_if_index 2 next-index 1 vring 0 len 688
hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:09:46:518866: ip4-input
we read TCP header, addresses and ports
TCP: 20.0.0.1 -> 11.0.166.133
tos 0x00, ttl 64, length 688, checksum 0x1bc5 dscp CS0 ecn NON_ECN
fragment id 0x56fd, flags DONT_FRAGMENT
TCP: 6443 -> 34112
seq. 0xa1f93599 ack 0x818eb1c1
flags 0x18 PSH ACK, tcp header: 32 bytes
window 502, checksum 0x00b7
00:09:46:518870: ip4-lookup
fib 0 dpo-idx 5 flow hash: 0x00000000
TCP: 20.0.0.1 -> 11.0.166.133
tos 0x00, ttl 64, length 688, checksum 0x1bc5 dscp CS0 ecn NON_ECN
fragment id 0x56fd, flags DONT_FRAGMENT
TCP: 6443 -> 34112
seq. 0xa1f93599 ack 0x818eb1c1
flags 0x18 PSH ACK, tcp header: 32 bytes
window 502, checksum 0x00b7
00:09:46:518873: ip4-cnat-tx
We need to do some NATing as it's Kubernetes
found: session:[20.0.0.1;6443 -> 11.0.166.133;34112, TCP] => 11.96.0.1;443 -> 11.0.166.133;34112 lb:-1 age:4190
00:09:46:518879: ip4-rewrite
We rewrite the ip packet
mac addresses only when coming / going to a PHY, as tun interfaces are L3-only
tx_sw_if_index 6 dpo-idx 7 : ipv4 via 0.0.0.0 tun4: mtu:9000 next:8 flow hash: 0x00000000
00000000: 450002b056fd40003f0625650b6000010b00a68501bb8540a1f93599818eb1c1
00000020: 801801f620c700000101080a3f906c98fbaaba031703030277413d39
Output happens on the interface `tun4`
00:09:46:518880: tun4-output
tun4
00000000: 450002b056fd40003f0625650b6000010b00a68501bb8540a1f93599818eb1c1
00000020: 801801f620c700000101080a3f906c98fbaaba031703030277413d39b97817c1
00000040: 41392fdbe0e9d4886849851476cdb8986362ee2f789bfefd8a5c106c898d1309
00000060: 4f8f8cb89159d99e986813a48d91334930eb5eb10ca4248c
00:09:46:518881: tun4-tx
buffer 0x24cf615: current data 0, length 688, buffer-pool 1, ref-count 1, totlen-nifb 0, trace handle 0x1000000
ipv4 tcp hdr-sz 52 l2-hdr-offset 0 l3-hdr-offset 0 l4-hdr-offset 20 l4-hdr-sz 32
0x0b60: 40:00:3f:06:25:65 -> 45:00:02:b0:56:fd

Packet 2
[...]
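
Since grep with fixed context can cut a packet in half, it can help to split the saved trace on its "Packet N" headers and keep only the packets that mention a given address pair. This is a minimal sketch under the assumption that the trace file looks like the sample above; the function names and the shortened SAMPLE text are illustrative, not part of VPP or calivppctl.

```python
import re


def split_packets(trace_text):
    """Split 'show trace' output into {packet number: text} using 'Packet N' headers."""
    packets, current = {}, None
    for line in trace_text.splitlines():
        header = re.fullmatch(r"Packet (\d+)", line.strip())
        if header:
            current = int(header.group(1))
            packets[current] = []
        elif current is not None:
            packets[current].append(line)
    return {num: "\n".join(lines) for num, lines in packets.items()}


def packets_matching(trace_text, src, dst):
    """Keep the packets whose trace contains 'src -> dst' as whole addresses."""
    pattern = re.compile(
        r"(?<![\d.])" + re.escape(src) + " -> " + re.escape(dst) + r"(?!\d)"
    )
    return {n: t for n, t in split_packets(trace_text).items() if pattern.search(t)}


def nodes_visited(packet_text):
    """List graph nodes from lines such as '00:09:46:518858: virtio-input'."""
    return re.findall(r"^[ \t]*\d{2}:\d{2}:\d{2}:\d+: (\S+)", packet_text, flags=re.M)


# Shortened, partly made-up sample in the same shape as the trace above.
SAMPLE = """\
Packet 1

00:09:46:518858: virtio-input
  virtio: hw_if_index 2 next-index 1 vring 0 len 688
00:09:46:518866: ip4-input
  TCP: 20.0.0.1 -> 11.0.166.133
00:09:46:518880: tun4-output
  tun4

Packet 2

00:09:47:000001: virtio-input
  virtio: hw_if_index 3 next-index 1 vring 0 len 60
00:09:47:000005: ip4-input
  UDP: 11.0.166.132 -> 11.96.0.10
"""

for number, text in packets_matching(SAMPLE, "20.0.0.1", "11.0.166.133").items():
    print(number, nodes_visited(text))
```

In practice you would read the file produced by show trace (somefile in the commands above) instead of SAMPLE. The list of visited nodes gives a quick view of where a packet went, or where it stopped.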

Traffic from the phy

If you want to capture traffic coming from the physical NIC, use trace add with a different source node, such as dpdk-input, af-packet-input, af_xdp-input, or avf-input, instead of virtio-input.

show run should give you a hint of which input node to trace from.

For example, the show run output below indicates that avf-input is the node to trace from:

vpp# show run
Thread 1 vpp_wk_0 (lcore 25)
Time 1.9, 10 sec internal node vector rate 1.05 loops/sec 1074819.68
vector rates in 7.5356e0, out 7.5356e0, drop 0.0000e0, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
avf-input polling 2233530 0 0 8.24e1 0.00
ip4-cnat-snat active 1 1 0 5.35e3 1.00
ip4-cnat-tx active 14 15 0 1.18e3 1.07
[...]

So, as with traffic from a container, you can use:

# Make sure that the trace buffer is clean in VPP
calivppctl vppctl NODENAME clear trace
# Add a trace from the avf-input input-node
calivppctl vppctl NODENAME trace add avf-input 500
# Generate some traffic, then dump the trace to a file
calivppctl vppctl NODENAME show trace max 500 > somefile
# Grep for your IPs
grep -A40 -B40 '1.2.3.4 -> 5.6.7.8' somefile
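
Picking the input node from show run can be sketched as a simple scan of the saved output. This is a heuristic illustration only: KNOWN_INPUT_NODES contains just the physical-NIC input nodes named on this page, and candidate_input_nodes is a hypothetical helper, so other drivers would need to be added by hand.

```python
# Physical-NIC input nodes mentioned on this page (not an exhaustive list).
KNOWN_INPUT_NODES = ("dpdk-input", "af-packet-input", "af_xdp-input", "avf-input")


def candidate_input_nodes(show_run_text):
    """Return the known input nodes that appear in the first column of 'show run' output."""
    found = []
    for line in show_run_text.splitlines():
        fields = line.split()
        if fields and fields[0] in KNOWN_INPUT_NODES and fields[0] not in found:
            found.append(fields[0])
    return found


# Excerpt copied from the show run sample on this page.
SAMPLE = """\
Thread 1 vpp_wk_0 (lcore 25)
Name State Calls Vectors Suspends Clocks Vectors/Call
avf-input polling 2233530 0 0 8.24e1 0.00
ip4-cnat-snat active 1 1 0 5.35e3 1.00
ip4-cnat-tx active 14 15 0 1.18e3 1.07
"""

print(candidate_input_nodes(SAMPLE))
```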

With Wireshark

As an alternative to the trace, you can capture traffic and analyze it in Wireshark. You can do this with:

vpp# pcap dispatch trace on max 1000 file vppcapture buffer-trace dpdk-input 1000
vpp# pcap dispatch trace off

This will generate a file named /tmp/vppcapture.

Then on your host run:

calivppctl sh vpp NODENAME
root@server:~ mv /tmp/vppcapture /var/lib/vpp/
root@server:~ exit

The file should now be at /var/lib/vpp/vppcapture on your host 'NODENAME'. You can then scp NODENAME:/var/lib/vpp/vppcapture . on your machine and open it with Wireshark. More info about this here.

Traffic received in the pods

To inspect traffic actually received by the pods (if tcpdump is installed in the pod), simply run tcpdump -ni eth0 inside the pod. If tcpdump is not available in the pod, here are two options to still be able to capture pod traffic:

Tcpdump is available on the host

Provided that you have tcpdump installed on the host, you can use nsenter to attach to the pod's network namespace and use the host's tcpdump on the container's interface.

This works on Docker as follows:

  • Find the container ID you want to inspect

    docker ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    4c01db0b339c ubuntu:12.04 bash 17 seconds ago Up 16 seconds 3300-3310/tcp webapp
  • Get the container PID out of it

    docker inspect --format '{{ .State.Pid }}' 4c01db0b339c
    12345
  • Attach to the container's network namespace (you can then run tcpdump -ni eth0 from this shell)

    nsenter -t 12345 -n bash
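
If you prefer to run tcpdump directly rather than through an interactive shell, the nsenter invocation can take tcpdump as the program to execute. The sketch below only builds the command line; the helper name tcpdump_in_netns_argv is hypothetical, and the PID 12345 is the example value from the steps above.

```python
import shlex


def tcpdump_in_netns_argv(pid, ifname="eth0"):
    """Build the argv that runs the host's tcpdump in the network namespace of `pid`."""
    # int() also accepts the raw output of `docker inspect`, e.g. "12345\n".
    return ["nsenter", "-t", str(int(pid)), "-n", "tcpdump", "-ni", ifname]


argv = tcpdump_in_netns_argv(12345)
print(shlex.join(argv))
# To actually run it on the host (requires root):
#   import subprocess; subprocess.run(argv, check=True)
```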

No tcpdump, but Python is available

Open an AF_PACKET socket in Python with the following code and run it attached to the pod's network namespace as described previously.

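#!/usr/bin/env python
# Print addresses and ports of IPv4 packets received on IFNAME.
from socket import socket, htons, inet_ntop, AF_INET, AF_PACKET, SOCK_DGRAM
from struct import unpack

IFNAME = "eth0"
N_PKT = 50
MTU = 1500
ETH_P_IP = 0x0800  # EtherType for IPv4

# SOCK_DGRAM strips the link-layer header, so data starts at the IPv4 header.
# socket() expects the protocol in network byte order.
sock = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_IP))
sock.bind((IFNAME, ETH_P_IP))
for _ in range(N_PKT):
    data = sock.recvfrom(MTU, 0)[0]
    src_addr = inet_ntop(AF_INET, data[12:16])
    dst_addr = inet_ntop(AF_INET, data[16:20])
    # The offsets below assume a 20-byte IPv4 header (no options).
    src_port, = unpack("!H", data[20:22])
    dst_port, = unpack("!H", data[22:24])
    # Length and checksum sit at these offsets for UDP only.
    data_len, = unpack("!H", data[24:26])
    cksum, = unpack("!H", data[26:28])

    print("%s:%d -> %s:%d len %d cs %d" % (src_addr, src_port, dst_addr, dst_port, data_len, cksum))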

This requires privileges and thus is usually easier to run from the host. From the host, you can use echo "the python blob above" | nsenter -t <thePID> -n python to execute this code.
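
The script above reads the ports at fixed offsets, which is only correct when the IPv4 header carries no options. A slightly more careful decoder can use the header length field (IHL) to locate the transport header. The following is a sketch for Python 3, with hypothetical helper names; sample_udp_packet builds a made-up test packet so that the decoder can be tried without a socket, and the sketch does not check the fragment offset.

```python
import struct
from socket import AF_INET, inet_ntop, inet_pton


def decode_ipv4(data):
    """Decode addresses and, for TCP/UDP, ports from a raw IPv4 packet (bytes)."""
    ihl = (data[0] & 0x0F) * 4  # header length in bytes, 20 when there are no options
    proto = data[9]             # 6 = TCP, 17 = UDP
    info = {
        "proto": proto,
        "src": inet_ntop(AF_INET, data[12:16]),
        "dst": inet_ntop(AF_INET, data[16:20]),
        "sport": None,
        "dport": None,
    }
    # TCP and UDP both start with source port then destination port.
    if proto in (6, 17) and len(data) >= ihl + 4:
        info["sport"], info["dport"] = struct.unpack("!HH", data[ihl:ihl + 4])
    return info


def sample_udp_packet(options=b""):
    """Made-up test packet: UDP 11.0.166.132:34112 -> 11.96.0.10:53."""
    ihl_words = 5 + len(options) // 4
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x40 | ihl_words, 0, 20 + len(options) + 8, 0, 0, 64, 17, 0,
        inet_pton(AF_INET, "11.0.166.132"), inet_pton(AF_INET, "11.96.0.10"),
    )
    udp = struct.pack("!HHHH", 34112, 53, 8, 0)
    return header + options + udp


print(decode_ipv4(sample_udp_packet()))
print(decode_ipv4(sample_udp_packet(options=b"\x01\x01\x01\x01")))
```

Both calls report the same ports, whereas fixed offsets would misread the second packet because its header is 24 bytes long.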

Traffic to the kubelet agent

As the kubelet agent runs directly on the host without a network namespace, pods talking to it (e.g. coredns resolvers) would go through a specific path. Packets destined to it will be caught by VPP's punt mechanism, and will be forwarded to the host through a tap interface which will have the same name as the original interface in Linux.

To debug traffic within VPP, use the trace & check that traffic is correctly punted to the tap0 interface.

On the host, you can use tcpdump normally to check the traffic.

Crashes & coredumps

If VPP aborts unexpectedly, it will generate a coredump file, vppcore.vpp_main.<pid>, in the /var/lib/vpp/ directory. If you encounter this situation, please note the exact version of the vpp image that generated the corefile (using the image hash) to facilitate further troubleshooting.
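
When several corefiles have accumulated, the most recent one is usually the one of interest. A minimal sketch for locating it, assuming the file naming described above; the helper name newest_corefile is hypothetical.

```python
from pathlib import Path


def newest_corefile(directory="/var/lib/vpp"):
    """Return the most recently modified vppcore.vpp_main.* file, or None if there is none."""
    cores = sorted(
        Path(directory).glob("vppcore.vpp_main.*"),
        key=lambda path: path.stat().st_mtime,
    )
    return cores[-1] if cores else None


# Example (on a node): print(newest_corefile())
```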

To explore the corefile run:

docker run -it --entrypoint=bash -v /var/lib/vpp/vppcore.vpp_main.12345:/root/vppcore calicovpp/vpp:VERSION

You should now have a shell inside the vpp container.

apt update && apt install -y gdb
gdb vpp ./vppcore