Category Archives: Hadoop

Hadoop and Tuning related to Hadoop


Over the past few years of working with Hadoop and HDFS, two types of requests have come up pretty regularly. One is whether we can move files from a Windows SMB/CIFS file share, usually containing thousands of CSV or XLSX/XLS files, into Hadoop/HDFS. The other use case was whether we can move files from a mainframe… Read More »

Hadoop, Java and HTTPD and /etc/security/limits.d/ nproc/pid-max

After successfully running a large Hadoop cluster for a period of time, I started to notice strange things occurring, initially with the MapReduce PI example, where tasks would be marked as failed. When looking more closely and attempting to log on (su/ssh) to a machine with the user ID that was running the job, sshd/su would return: -bash:… Read More »

Hadoop and ip_conntrack: table full, dropping packet

I’m pretty sure many folks have seen this specific error across multiple different Linux systems, specifically when iptables is enabled and the OS has thousands of connections coming in per second. In my case I ran into this with the Hadoop NameNode. Someone accidentally executed iptables -L to try to get a list… Read More »

Hadoop and Redhat System Tuning /etc/sysctl.conf

One of the most overlooked things after setting up a Hadoop cluster is probably OS system tuning. This entry will cover /etc/sysctl.conf, aka the Linux kernel parameters that can be tuned. /etc/sysctl.conf: ## ALWAYS INCREASE KERNEL SEMAPHORES especially IF using IBM JDK with SharedClassCache also a separate discussion # Controls the default maxmimum size of… Read More »

Integrating Apache Hadoop and Apache Flume with IBM MQ

Over the past 2 years of working with Apache Hadoop, a few things have come up, such as folks wanting to use Apache Kafka, which definitely has its place in the Hadoop, Big Data, and next-generation technology spheres. But there is also the need to integrate what a lot of folks would consider Legacy Messaging… Read More »

Why now?

What is the GS Tech Blog? It’s a place for me to discuss technology I’ve worked with over many years. After working as a Technology Systems Engineer for almost 20 years, I decided it’s time to create a blog to publish some of my personal ideas regarding technology. All opinions are my own and… Read More »