Category Archives: HDFS

Apache Ranger Audit Logs stored in HDFS parsed with Apache Spark

Using Apache Spark to parse a large HDFS archive of Apache Ranger audit logs to find and verify whether a user attempted to access files in HDFS, Hive or HBase. This eliminates the need to use a Hive SerDe to read these Apache Ranger JSON files and to have to create an external…
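As a rough illustration of the idea in this excerpt, here is a minimal PySpark sketch, not the post's actual code. It assumes the audit files hold one JSON object per line and that the requesting user is recorded in a `reqUser` field. The function names `is_access_by` and `find_user_access`, and the path in the usage note, are hypothetical.

```python
import json


def is_access_by(line, user):
    """Return True if one audit log line is a JSON object whose
    `reqUser` field (assumed field name) equals `user`."""
    try:
        record = json.loads(line)
    except ValueError:
        # Skip lines that are not valid JSON instead of failing the job.
        return False
    return isinstance(record, dict) and record.get("reqUser") == user


def find_user_access(spark, audit_path, user):
    """Return a DataFrame of audit events requested by `user`.

    `spark` is an existing SparkSession; `audit_path` is an HDFS path or
    glob covering the audit files (hypothetical layout, adjust as needed).
    """
    # spark.read.text yields one row per line, in a column named "value".
    lines = spark.read.text(audit_path).rdd.map(lambda row: row.value)
    hits = lines.filter(lambda line: is_access_by(line, user))
    # Let Spark infer the schema from the matching JSON lines only.
    return spark.read.json(hits)
```

With a running SparkSession, a call might look like `find_user_access(spark, "hdfs:///ranger/audit/hdfs/*/*.log", "alice").show()`, where both the path and the user name are placeholders. Filtering the raw lines before parsing keeps malformed records from breaking schema inference, at the cost of parsing each matching line twice.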