java Archives - GS Tech Blog

Iterating a JSON using Jackson-Databind Library like JDOM for XML

by GS
in Apache Spark, Apache Spark Streaming, Hadoop, Hadoop XML Processing, jackson-databind, java, JSON
on December 17, 2018

I recently needed to iterate over a JSON message payload the way JDOM lets you walk an XML document, similar to what I do in my StAX XML MapReduce InputFormat. In this case you need to treat JSONArrays similar to XML…

Xml Processing with MapReduce/Spark using an Xml StaX Parser

by GS
in Apache, Apache Hadoop, Apache Spark, Hadoop, Hadoop Mapreduce, Hadoop XML Processing, HDFS, Hortonworks, java, StaX XML Parser, XmlInputFormat, XmlStaxInputFormat
on November 20, 2018

XmlStaxInputFormat / XmlStaxFileRecordReader GitHub project: https://github.com/gss2002/xml-stax-mr

Over time I noticed a gap in Hadoop MapReduce and Spark: the existing XmlInputFormat classes from Mahout use fseek and search for strings as the file is read in from HDFS. The ability to break up a large XML file becomes extremely important…
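To illustrate the difference between searching for tag strings and parsing with StAX, here is a minimal sketch using the JDK's `javax.xml.stream` API. It is not code from the xml-stax-mr project: the class name `StaxRecordSketch`, the method `readRecords`, and the sample XML are hypothetical, and it assumes each record element contains only text. It also ignores HDFS input splits entirely.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch; not taken from the xml-stax-mr project.
class StaxRecordSketch {

    // Collects the text of every element named recordTag by walking
    // parser events instead of scanning the raw bytes for "<recordTag>".
    static List<String> readRecords(String xml, String recordTag) {
        List<String> records = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && recordTag.equals(reader.getLocalName())) {
                        // Assumes text-only records; nested elements would throw here.
                        records.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return records;
    }

    public static void main(String[] args) {
        // The commented-out record would match a naive string search,
        // but the parser reports it as a single comment event.
        String xml = "<log><!-- <record>old</record> -->"
                + "<record>a</record><record>b</record></log>";
        System.out.println(readRecords(xml, "record"));
    }
}
```

Because the parser understands comments and other XML structure, the record inside the comment is skipped, whereas a plain search for the `<record>` string would have picked it up.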

Apache SolrCloud Kerberos Configuration

by GS
in Apache, Apache Solr, Hadoop, java, Kerberos, Linux, SolrCloud
on March 4, 2017

I’ve been working on securing Apache SolrCloud with Kerberos, which includes configuring ZooKeeper. After a lot of struggling and searching, I came up with a working Kerberized solution for SolrCloud with ZooKeeper, plus Apache Ranger for authorization. First I tried to secure a standalone Solr instance by updating to the Solr 6x branch which is a SNAPSHOT…

Benefits of using IBM Java and JDK features

by GS
in IBM, IBM HealthCenter, IBM HeapAnalyzer, IBM Java, IBM Javacore, IBM JDK, IBM System Dump, java, Linux, Tuning, Websphere
on March 17, 2016

After many years of working with IBM WebSphere Application Server on Solaris, Linux on pSeries, xSeries and zSeries, and z/OS, I came to realize that the IBM version of Java has much better tools and documentation for debugging and performance tuning. Examples of these features are the IBM AOT (Ahead-of-Time) compiler which…

Hadoop, Java and HTTPD and /etc/security/limits.d/ nproc/pid-max

by GS
in Apache, Hadoop, java, limits.d, Linux, nproc, pid, tid, Tuning
on March 1, 2016

After successfully running a large Hadoop cluster for a period of time, I started to notice strange things occurring, initially with the MapReduce PI example job, where tasks would be marked as failed. When I looked more closely and attempted to logon/su/ssh to a machine with the userid that was running the job, sshd/su would return: -bash:…

Integrating Apache Hadoop and Apache Flume with IBM MQ

by GS
in Apache, Flume, Hadoop, IBM, java, Linux, Messaging, MQ
on May 30, 2015

Over the past 2 years of working with Apache Hadoop, a few things have come up, such as folks wanting to use Apache Kafka, which definitely has its place in the Hadoop, Big Data and next-generation technology spheres. But there is also the need to integrate with…
