NAT Traversal Problems Inside Containers
Summary
Joining cluster members via the DH2i MatchMaking service and/or creating tunnels between subnets may fail in Docker/Kubernetes container environments if the container host does not have a firewall configured.
Information
When hosting containers, most Linux container runtimes will configure an internal network for the containers. External connectivity for this network is controlled using Linux netfilter, through either iptables or nftables. The netfilter IP Masquerade functionality allows the containers to be connected to the outside world using NAT, and uses the "conntrack" stateful firewall connection tracking subsystem to keep track of NAT associations. By default, IP Masquerade operates as a permissive port-restricted NAT, and DxEnterprise running inside a container is able to utilize NAT hole-punching for its UDP tunnel traffic.
If an existing cluster member is sitting behind a port-restricted NAT, that NAT will not know about a message path between the existing member and the new member, so messages sent from the new member to the existing member's external UDP endpoints will not be allowed to pass through the existing member's NAT router. To open the path, the existing member sends a message directly to the new member on its UDP endpoints, which typically causes the port-restricted NAT that the existing member sits behind to open itself up to traffic from the new member.
As mentioned earlier, the conntrack module on Linux is responsible for maintaining the table of NAT associations on that host. Two hard rules for conntrack:
- If an internal host sends a UDP message to an outside address and doesn't already have a conntrack entry for its path, create one (choosing an unused external port), apply SNAT, and forward the message outside. When choosing an external port, conntrack prefers the internal port; if that port is unavailable, it picks a random port. If the "random-fully" flag is set on the netfilter masquerade rule, it always picks a random port.
- If a message from outside is sent to the router host and it matches a conntrack entry for a known message path, apply DNAT and forward the message inside.
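The port-selection behavior in the first rule depends on how the masquerade rule is configured on the host. As a minimal sketch, assuming the container runtime manages its NAT rules through iptables in the nat table's POSTROUTING chain (some runtimes and Kubernetes setups use nftables or other chains instead), the hypothetical helper below reads rule listings on standard input and labels each masquerade rule by whether it carries the random-fully flag. The function name and labels are illustrative only.

```shell
#!/bin/sh
# report_masquerade (hypothetical helper): reads rule lines in the format
# printed by `iptables -t nat -S POSTROUTING` on stdin and prints each
# MASQUERADE rule, prefixed with the port-selection mode it implies.
report_masquerade() {
    awk '/-j MASQUERADE/ {
        # With --random-fully the external port is always randomized;
        # otherwise the internal port is preferred when it is free.
        mode = ($0 ~ /--random-fully/) ? "random-fully" : "prefers-internal-port"
        print mode ": " $0
    }'
}
```

On a container host it could be run as follows (root privileges are needed to list the rules):

    sudo iptables -t nat -S POSTROUTING | report_masquerade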
If a message from outside is sent to the router host and does not match a conntrack entry, the usual handling is to deliver the message to the router host's IP stack, i.e. check for a service running on the router that is listening on a UDP socket, and deliver the message to that socket. If no local UDP socket exists on the destination port, the message is dropped.
However, before checking for a local listening UDP socket on the host, a conntrack entry will be created for the conversation, mapping the remote host/port to the local UDP socket on the host. This conntrack entry prevents the member running on the internal network (i.e. the container) from using the local UDP port to communicate with the remote host. When the member on the internal network sends a message to the remote member, due to the collision with the conntrack entry created earlier, it will be assigned a different external UDP port.
If the host has a stateful firewall running, the conntrack entry will not be created, because the firewall will block the UDP message before it gets to the point of being released to the host IP stack.
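To check whether a host is affected, it can help to look for conntrack entries that were created by inbound messages addressed to the host itself. The sketch below is one possible way to do that, assuming the conntrack command-line tool (from conntrack-tools) is installed and prints entries in its usual text format. The helper name, the host address 198.51.100.2, and the other addresses are hypothetical; the 20000-30000 port range is the one used in the Resolution section.

```shell
#!/bin/sh
# find_inbound_udp HOST_IP LOW HIGH (hypothetical helper): reads lines in the
# format printed by `conntrack -L -p udp` on stdin and prints the entries whose
# original direction targets HOST_IP on a destination port in [LOW, HIGH].
find_inbound_udp() {
    awk -v host="$1" -v low="$2" -v high="$3" '
        $1 == "udp" {
            dst = ""; dport = -1
            for (i = 1; i <= NF; i++) {
                # The first dst= and dport= fields describe the original
                # direction; later ones describe the expected reply.
                if (dst == "" && $i ~ /^dst=/) dst = substr($i, 5)
                if (dport < 0 && $i ~ /^dport=/) dport = substr($i, 7) + 0
            }
            if (dst == host && dport >= low && dport <= high) print
        }'
}
```

A possible invocation, substituting the host's external address:

    sudo conntrack -L -p udp | find_inbound_udp 198.51.100.2 20000 30000

Any entries printed are candidates for the collision described above; an empty result suggests that no such entry currently exists.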
Resolution
To resolve this problem, do one of the following:
- Install a firewall that blocks unsolicited UDP traffic by default.
- Create an iptables rule to block unsolicited UDP traffic:
    sudo iptables -A INPUT -p udp --dport 20000:30000 -m state \! --state RELATED,ESTABLISHED -j DROP
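Running the append command more than once adds duplicate rules. As a hedged sketch, the hypothetical wrapper below uses the iptables -C (check) option to test for the rule before appending it. The function name and the IPTABLES override variable are assumptions made for illustration; the rule itself is the one shown above.

```shell
#!/bin/sh
# Assumption: IPTABLES can be overridden, e.g. IPTABLES="sudo iptables".
IPTABLES="${IPTABLES:-iptables}"

# ensure_udp_drop_rule LOW HIGH (hypothetical helper): appends the DROP rule
# for unsolicited UDP traffic on ports LOW..HIGH unless it already exists.
ensure_udp_drop_rule() {
    range="$1:$2"
    # -C exits with status 0 only when an identical rule is already in the chain.
    if $IPTABLES -C INPUT -p udp --dport "$range" \
            -m state ! --state RELATED,ESTABLISHED -j DROP 2>/dev/null; then
        echo "rule already present for UDP ports $range"
    else
        $IPTABLES -A INPUT -p udp --dport "$range" \
            -m state ! --state RELATED,ESTABLISHED -j DROP &&
            echo "rule added for UDP ports $range"
    fi
}
```

For example, after setting IPTABLES="sudo iptables", calling ensure_udp_drop_rule 20000 30000 applies the rule once. Because -A appends to the end of the INPUT chain, an earlier rule that accepts the same traffic would still take precedence, and the rule does not persist across reboots unless it is saved by whatever mechanism the distribution provides.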