Apache Ranger Audit Logs stored in HDFS parsed with Apache Spark

Using Apache Spark to parse a large HDFS archive of Ranger audit logs to find and verify whether a user attempted to access files in HDFS, Hive or HBase. This eliminates the need to use a Hive SerDe to read these Apache Ranger JSON files and to have to create an external…

Site to Site VPN using Asus Merlin Router and Unifi USG-Pro4

I recently decided to replace my Asus RT-N66U. It served me well over many years, but I had become frustrated that Asus had stopped patching and maintaining the firmware. I also noticed that, over time, strange things would occasionally occur with the Asus router.…

Integrating Apache Nifi with IBM MQ

This is a continuation of the IBM MQ and Hadoop integration article I first posted a few years ago. It explains how to integrate IBM MQ with Apache Nifi or Hortonworks HDF. IBM MQ is extremely important when attempting to integrate new technologies with legacy environments, specifically mainframe environments…

Benefits of using IBM Java and JDK features

After working for many years with IBM WebSphere Application Server on Solaris, Linux on PSeries, XSeries and ZSeries, and Z/OS, I came to realize that the IBM version of Java has much better debugging tools and documentation for debugging and performance tuning. Examples of these features are the IBM AOT (Ahead of Time) compiler, which…

Hadoop, Java and HTTPD and /etc/security/limits.d/ nproc/pid-max

After successfully running a large Hadoop cluster for a period of time, I started to notice strange things occurring, initially with the MapReduce PI example, where tasks would be marked as failed. When looking more closely and attempting to logon/su/ssh to a machine with the userid that was running the job, sshd/su would return: -bash:…

Hadoop and ip_conntrack: table full, dropping packet

I’m pretty sure many folks have seen this specific error across many different Linux systems, specifically when iptables is enabled and the OS has thousands of connections coming in per second. In my case I ran into this with the Hadoop NameNode. Someone accidentally executed iptables -L to try to get a list…