After successfully running a large Hadoop cluster for a period of time, I started to notice strange things, initially with the MapReduce Pi example, where tasks would be marked as failed. When I looked more closely and attempted to log on (via ssh or su) to a machine as the userid that was running the job, sshd/su would return:
-bash: fork: retry: Resource temporarily unavailable
I ignored this situation at first, then ran into the same condition with Pig when multiple Pig jobs were run under the same userid, not only on the node initiating the job but also on the YARN NodeManager nodes that were executing a number of map tasks, just as in the MapReduce example. The answer to this problem is increasing the nproc value.

One would not think nproc is the issue, but it goes back to how the Linux kernel sees a TID versus a PID. To the kernel, a TID is the schedulable object, and a PID is simply the group containing those TIDs, so the nproc limit effectively counts threads, not just processes. This issue has plagued me with Apache/IBM HTTP Server and WebSphere Application Server on Linux since the early 2000s. The only answer is increasing nproc to a higher value for the specified set of users. The issue bit more people when RHEL6/CentOS6 attempted to mitigate fork bombs by setting the default nproc to 1024. That may seem like a great idea for someone running simple shell tasks, but in a world of multithreaded processes it can cause serious problems.
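The TID-vs-PID point is easy to demonstrate on Linux: every thread a process spawns shows up as its own kernel task under /proc/self/task, and those tasks are what the nproc limit counts. A minimal sketch (the thread count of 5 is arbitrary):

```python
import os
import threading

def count_tasks():
    # On Linux, each kernel-schedulable task (TID) of this process
    # appears as an entry under /proc/self/task.
    return len(os.listdir("/proc/self/task"))

before = count_tasks()

# Park five threads on an event so they stay alive while we count.
ev = threading.Event()
workers = [threading.Thread(target=ev.wait) for _ in range(5)]
for t in workers:
    t.start()

after = count_tasks()
print(after - before)  # each new thread added one TID

ev.set()
for t in workers:
    t.join()
```

All five new TIDs belong to one PID, yet each one counts against the user's "max user processes" limit, which is why a heavily threaded JVM workload can exhaust a 1024 nproc limit with only a handful of actual processes.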
A Red Hat Bugzilla article discussing it:
To see how many tasks (TIDs) a given userid is actually consuming:

ps -eLf | grep -i userId | wc -l

The default limit, as reported by ulimit -a:

max user processes (-u) 1024
After raising it for everyone in the hdpusers group:
max user processes (-u) 30000
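The raise itself is typically made persistent via pam_limits, either in /etc/security/limits.conf or a drop-in file under /etc/security/limits.d/. A sketch, assuming the hdpusers group and the 30000 value above (the drop-in filename is illustrative; on RHEL6/CentOS6 also check /etc/security/limits.d/90-nproc.conf, which carries the 1024 default):

```
# /etc/security/limits.d/91-hdpusers.conf  (hypothetical filename)
# Raise max user processes (really: kernel tasks/TIDs) for the hdpusers group.
@hdpusers    soft    nproc    30000
@hdpusers    hard    nproc    30000
```

The limit takes effect on the next login session for the affected users; already-running services need a restart to pick it up.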