Troubleshoot eBPF mode
This document gives some general troubleshooting guidance for the eBPF dataplane.
Troubleshoot access to services
If pods or hosts within your cluster have trouble accessing services, check the following:
Either Calico’s eBPF mode or kube-proxy must be active on a host for services to function. If you disabled kube-proxy when enabling eBPF mode, verify that eBPF mode is actually functioning. If Calico detects that the kernel is not supported, it will fall back to standard dataplane mode (which does not support services).
To verify that eBPF mode is correctly enabled, examine the log for a calico-node container. If eBPF mode is not supported, it logs an ERROR message that says:
BPF dataplane mode enabled but not supported by the kernel. Disabling BPF mode.
If BPF mode is correctly enabled, you should see an INFO message that says:
BPF enabled, starting BPF endpoint manager and map manager.
In eBPF mode, external client access to services (typically NodePorts) is implemented using VXLAN encapsulation. If NodePorts time out when the backing pod is on another node, check that your underlying network fabric allows VXLAN traffic between the nodes. VXLAN is a UDP protocol; by default it uses port 4789.
In DSR mode, Calico requires that the underlying network fabric allows one node to respond on behalf of another.
In AWS, to allow this, the Source/Dest check must be disabled on the node’s NIC. However, note that DSR only works within AWS; it is not compatible with external traffic through a load balancer. This is because the load balancer is expecting the traffic to return from the same host.
In GCP, the “Allow forwarding” option must be enabled. As with AWS, traffic through a load balancer does not work correctly with DSR because the load balancer is not consulted on the return path from the backing node.
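The log check described above can be scripted. The following is a minimal sketch, not part of Calico: the function name bpf_mode_status is hypothetical, and it only searches for the two log messages quoted above in whatever text it receives on standard input.

```shell
# Classify calico-node log text read from stdin (hypothetical helper).
# Prints "fallback" if the kernel-not-supported error is present,
# "enabled" if the BPF startup message is present, otherwise "unknown".
bpf_mode_status() {
  logs=$(cat)
  if printf '%s\n' "$logs" | grep -q 'BPF dataplane mode enabled but not supported by the kernel'; then
    echo fallback
  elif printf '%s\n' "$logs" | grep -q 'BPF enabled, starting BPF endpoint manager and map manager'; then
    echo enabled
  else
    echo unknown
  fi
}

# Example usage; the pod name is a placeholder, as elsewhere in this document:
#   kubectl logs -n calico-system calico-node-abcdef | bpf_mode_status
```

A result of "unknown" may simply mean that the startup messages have already rotated out of the available log, so it should not be read as a failure on its own.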
The calico-bpf tool
Since BPF maps contain binary data, the Calico team wrote a tool to examine Calico’s BPF maps. The tool is embedded in the calico/node container image. To run the tool:
- Find the name of the calico/node Pod on the host of interest using
kubectl get pod -o wide -n calico-system
- Run the tool as follows:
kubectl exec -n calico-system calico-node-abcdef -- calico-node -bpf ...
For example, to show the tool’s help:
$ kubectl exec -n calico-system calico-node-abcdef -- calico-node -bpf help
Usage:
  calico-bpf [command]

Available Commands:
  arp          Manipulates arp
  connect-time Manipulates connect-time load balancing programs
  conntrack    Manipulates connection tracking
  counters     Show and reset counters
  help         Help about any command
  ipsets       Manipulates ipsets
  nat          Manipulates network address translation (nat)
  routes       Manipulates routes
  version      Prints the version and exits

Flags:
      --config string      config file (default is $HOME/.calico-bpf.yaml)
  -h, --help               help for calico-bpf
      --log-level string   Set log level (default "warn")
  -t, --toggle             Help message for toggle
(Since the tool is embedded in the main calico-node binary, the --help option is not available, but running calico-node -bpf help does work.)
To dump the BPF conntrack table:
$ kubectl exec -n calico-system calico-node-abcdef -- calico-node -bpf conntrack dump ...
The same tool can also fetch various counters from the BPF dataplane, such as packets dropped by policy or different kinds of errors. For example, to dump the BPF counters of the eth0 interface:
$ kubectl exec -n calico-system calico-node-abcdef -- calico-node -bpf counters dump --iface=eth0
+----------+--------------------------------+---------+--------+-----+
| CATEGORY |              TYPE              | INGRESS | EGRESS | XDP |
+----------+--------------------------------+---------+--------+-----+
| Accepted | by another program             |       0 |      0 |   0 |
|          | by failsafe                    |       0 |      2 |  23 |
|          | by policy                      |       1 |      0 |   0 |
| Dropped  | by policy                      |       0 |      0 |   0 |
|          | failed decapsulation           |       0 |      0 |   0 |
|          | failed encapsulation           |       0 |      0 |   0 |
|          | incorrect checksum             |       0 |      0 |   0 |
|          | malformed IP packets           |       0 |      0 |   0 |
|          | packets with unknown route     |       0 |      0 |   0 |
|          | packets with unknown source    |       0 |      0 |   0 |
|          | packets with unsupported IP    |       0 |      0 |   0 |
|          | options                        |         |        |     |
|          | too short packets              |       0 |      0 |   0 |
| Total    | packets                        |      27 |    124 |  41 |
+----------+--------------------------------+---------+--------+-----+
dumped eth0 counters.
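When many interfaces need checking, it can help to reduce the counters table to the Dropped rows that are non-zero. The sketch below assumes the table layout shown above (pipe-separated columns, with the category printed only on the first row of each group). The function name dropped_nonzero is hypothetical and is not part of the calico-bpf tool.

```shell
# Print "<type>: <total>" for every Dropped row whose INGRESS + EGRESS + XDP
# total is non-zero. Reads `calico-node -bpf counters dump` output from stdin.
dropped_nonzero() {
  awk -F'|' '
    NF < 6 { next }                       # skip border and trailer lines
    {
      cat = $2; gsub(/^ +| +$/, "", cat)
      if (cat != "") current = cat        # category carries over to continuation rows
      if (current != "Dropped") next
      type = $3; gsub(/^ +| +$/, "", type)
      total = $4 + $5 + $6                # empty cells on wrapped rows count as 0
      if (total > 0) print type ": " total
    }
  '
}

# Example usage; the pod name is a placeholder:
#   kubectl exec -n calico-system calico-node-abcdef -- \
#     calico-node -bpf counters dump --iface=eth0 | dropped_nonzero
```

Type names that wrap onto a second table row (such as "packets with unsupported IP options") are reported using only the text of their first row.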
Check if a program is dropping packets
To check if an eBPF program is dropping packets, you can use either the calico-bpf or tc command-line tool. For example, if you are worried that the eBPF program attached to eth0 is dropping packets, you can use calico-bpf to fetch BPF counters as described in the previous section and look for one of the Dropped counters, or you can run the following command:
tc -s qdisc show dev eth0
The output should look like the following; find the clsact qdisc, which is the attachment point for eBPF programs. The -s option causes tc to display the count of dropped packets, which amounts to the count of packets dropped by the eBPF programs.
...
qdisc clsact 0: dev eth0 root refcnt 2
 sent 1340 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
...
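To track the drop count over time, the clsact figure can be pulled out of the tc output with a small filter. This is a sketch that assumes the output layout shown above, where the statistics line follows the qdisc line. The function name clsact_dropped is hypothetical.

```shell
# Print the "dropped" count reported for the clsact qdisc.
# Reads `tc -s qdisc show dev <iface>` output from stdin.
clsact_dropped() {
  awk '
    /^qdisc / { in_clsact = ($2 == "clsact") }   # track which qdisc we are inside
    in_clsact && match($0, /dropped [0-9]+/) {
      # "dropped " is 8 characters long; print only the number after it
      print substr($0, RSTART + 8, RLENGTH - 8)
      exit
    }
  '
}

# Example usage:
#   tc -s qdisc show dev eth0 | clsact_dropped
```

Running the filter twice a few seconds apart and comparing the two numbers shows whether packets are being dropped right now, rather than at some point since the qdisc was created.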
Debug high CPU usage
If you notice calico-node using high CPU:
- Check whether kube-proxy is still running. If kube-proxy is still running, you must either disable kube-proxy or ensure that the Felix configuration setting bpfKubeProxyIptablesCleanupEnabled is set to false. If the setting is set to true (its default), then Felix will attempt to remove kube-proxy’s iptables rules. If kube-proxy is still running, it will fight with Felix.
- If your cluster is very large, or your workload involves significant service churn, you can increase the interval at which Felix updates the services dataplane by increasing the bpfKubeProxyMinSyncPeriod setting. The default is 1 second. Increasing the value has the trade-off that service updates will happen more slowly.
- Calico supports endpoint slices, similarly to kube-proxy. If your Kubernetes cluster supports endpoint slices and they are enabled, then you can enable endpoint slice support in Calico with the corresponding Felix configuration setting.
eBPF program debug logs
Calico’s eBPF programs contain optional detailed debug logging. Although the logs can be very verbose (because the programs will log every packet), they can be invaluable to diagnose eBPF program issues. To enable the log, raise the bpfLogLevel Felix configuration setting to its debug level.
WARNING! Enabling logs in this way has a significant impact on eBPF program performance.
The logs are emitted to the kernel trace buffer, and they can be examined using the following command:
tc exec bpf debug
Logs have the following format:
<...>-84582  .Ns1 6851.690474: 0: ens192---E: Final result=ALLOW (-1). Program execution time: 7366ns
The parts of the log are explained below:
<...>-84582 gives an indication about what program (or kernel process) was handling the packet. For packets that are being sent, this is usually the name and PID of the program that is actually sending the packet. For packets that are received, it is typically a kernel process, or an unrelated program that happens to trigger the processing.
6851.690474 is the log timestamp.
ens192---E is the Calico log tag. For programs attached to interfaces, the first part contains the first few characters of the interface name. The suffix is either -I or -E, indicating “Ingress” or “Egress”. “Ingress” and “Egress” have the same meaning as for policy:
- A workload ingress program is executed on the path from the host network namespace to the workload.
- A workload egress program is executed on the workload to host path.
- A host endpoint ingress program is executed on the path from an external node to the host.
- A host endpoint egress program is executed on the path from the host to an external host.
Final result=ALLOW (-1). Program execution time: 7366ns is the message; in this case, it logs the final result of the program. Note that the timestamp is massively distorted by the time spent logging.
Poor performance
A number of problems can reduce the performance of the eBPF dataplane.
Verify that you are using the best networking mode for your cluster. If possible, avoid using an overlay network; a routed network with no overlay is considerably faster. If you must use one of Calico’s overlay modes, use VXLAN, not IPIP. IPIP performs poorly in eBPF mode due to kernel limitations.
If you are not using an overlay, verify that the Felix configuration parameters ipipEnabled and vxlanEnabled are set to false. Those parameters control whether Felix configures itself to allow IPIP or VXLAN, even if you have no IP pools that use an overlay. When set, the parameters also disable certain eBPF mode optimisations for compatibility with IPIP and VXLAN.
To examine the configuration:
kubectl get felixconfiguration -o yaml
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
  kind: FelixConfiguration
  metadata:
    creationTimestamp: "2020-10-05T13:41:20Z"
    name: default
    resourceVersion: "767873"
    uid: 8df8d751-7449-4b19-a4f9-e33a3d6ccbc0
  spec:
    ...
    ipipEnabled: false
    ...
    vxlanEnabled: false
kind: FelixConfigurationList
metadata:
  resourceVersion: "803999"
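On a cluster with a long FelixConfiguration, the two overlay parameters are easy to miss in the YAML. The filter below is a rough sketch that prints only the ipipEnabled or vxlanEnabled lines that are set to true. The function name overlay_flags_enabled is hypothetical, and the match is a plain text search rather than real YAML parsing.

```shell
# Print any ipipEnabled/vxlanEnabled lines set to true.
# Reads `kubectl get felixconfiguration -o yaml` output from stdin.
# Prints nothing (and returns non-zero) when neither parameter is true.
overlay_flags_enabled() {
  grep -E '^[[:space:]]*(ipipEnabled|vxlanEnabled):[[:space:]]*true[[:space:]]*$'
}

# Example usage:
#   kubectl get felixconfiguration -o yaml | overlay_flags_enabled
```

Empty output means that neither parameter is explicitly set to true in any FelixConfiguration resource. It does not distinguish a parameter set to false from one that is absent.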
If you are running your cluster in a cloud such as AWS, then your cloud provider may limit the bandwidth between nodes in your cluster. For example, most AWS nodes are limited to 5 Gbit/s per connection.