Hadoop and ip_conntrack: table full, dropping packet
I'm pretty sure many folks have seen this error on Linux systems where iptables is enabled and the OS is handling thousands of incoming connections per second. In my case I ran into it on a Hadoop NameNode. Someone accidentally ran iptables -L to check whether the systems had any iptables rules enabled. This simple command causes the ip_conntrack module to be loaded into the kernel, which then has to track every connection into and out of the machine. As folks know, a NameNode is a very resource-intensive process with thousands of incoming connections, since it has to keep track of all HDFS operations in the cluster, so the conntrack table fills up quickly. This was also seen with the YARN Application Timeline Server and the Ambari Metrics Collector. Basically, once the table is full the kernel drops packets for new connections, and clients have to resend them until the kernel is able to accept them again.
Here is an example of a non-Hadoop user hitting the same issue: https://developers.soundcloud.com/blog/shoot-yourself-in-the-foot-with-iptables-and-kmod-auto-loading
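A quick way to see whether a host is close to the limit is to compare the kernel's current conntrack count with its maximum. Here is a rough sketch. conntrack_usage is just a helper name I made up, and the default paths assume a kernel that exposes nf_conntrack under /proc/sys/net/netfilter. Older kernels using ip_conntrack should have ip_conntrack_count and ip_conntrack_max under /proc/sys/net/ipv4/netfilter instead, so pass those paths as arguments if that is what you have.

```shell
# Print how full the connection tracking table is, as a whole percentage.
# Both arguments are optional and default to the nf_conntrack proc files
# (an assumption; adjust the paths for older ip_conntrack kernels).
conntrack_usage() {
    count_file="${1:-/proc/sys/net/netfilter/nf_conntrack_count}"
    max_file="${2:-/proc/sys/net/netfilter/nf_conntrack_max}"
    # The proc files only exist while the conntrack module is loaded.
    if [ ! -r "$count_file" ] || [ ! -r "$max_file" ]; then
        echo "conntrack not loaded"
        return 1
    fi
    count=$(cat "$count_file")
    max=$(cat "$max_file")
    echo "$(( count * 100 / max ))% ($count of $max)"
}
```

Calling conntrack_usage with no arguments reads the live values, and lsmod | grep conntrack tells you whether the module got loaded in the first place.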
So the moral of the story: if YOU are NOT using iptables, make sure it cannot load, both by blacklisting the modules and by running chkconfig iptables off and chkconfig ip6tables off. Another option, mentioned in the link above, is to use PREROUTING rules in the raw table to bypass connection tracking for specific high-traffic ports, say the NameNode RPC port (iptables -t raw -A PREROUTING -p tcp --dport 8020 -j NOTRACK). Or, if you really need connection tracking, increase the number of connections that can be tracked in /etc/sysctl.conf:
# Raise the conntrack table limit. Only matters once conntrack is loaded, e.g. after someone runs iptables -L.
net.nf_conntrack_max = 10000000
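If you go the NOTRACK route instead, keep in mind that the PREROUTING rule mentioned earlier only covers packets arriving at the port. Replies generated on the NameNode host pass through the raw table's OUTPUT chain. Below is a minimal sketch that prints both rules for a given port so you can review them before running anything. notrack_rules is a hypothetical helper, not something from the linked post, and on newer iptables versions the equivalent target is -j CT --notrack.

```shell
# Print (do not run) raw-table rules that exempt one TCP port from
# connection tracking. Review the output, then pipe it to sh as root.
notrack_rules() {
    port="$1"
    # Incoming packets addressed to the service port.
    echo "iptables -t raw -A PREROUTING -p tcp --dport $port -j NOTRACK"
    # Locally generated replies leaving from the service port.
    echo "iptables -t raw -A OUTPUT -p tcp --sport $port -j NOTRACK"
}
```

For example, notrack_rules 8020 prints the two rules for the NameNode RPC port. If your filter rules match on connection state, untracked packets will not match ESTABLISHED, so that port would likely need its own explicit ACCEPT rules.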
Hope this helps someone else avoid hitting this.