Improving Hadoop Security with Host Intrusion Detection (Part 2)

This is a continuation of our previous post on Hadoop security.

As we mentioned in our earlier post, we can use OSSEC to monitor the file integrity of existing Hadoop and HBase systems. OSSEC produces logs that a system administrator can use to check for various system events.

It’s worth noting that big data systems of all kinds, not just Hadoop and HBase, produce significant amounts of log data.  Installing a big data cluster is non-trivial, to say the least, and these logs play a crucial role in helping IT staff set up clusters and diagnose system problems. Big data system administrators are, in effect, already used to checking log files for potential problems.

Some of the important Hadoop security events that OSSEC can monitor are:

  • Failed HDFS operations
  • HBase logins
  • Kerberos ticket granting
  • Root logins to nodes
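For a concrete sense of what these event types look like in the raw logs, the sketch below matches a few illustrative patterns against a log line. The sample line and the patterns are assumptions based on typical log4j-style Hadoop/HBase output, not the exact strings every version emits:

```python
import re

# Illustrative patterns for the event types listed above (assumed wording).
EVENT_PATTERNS = {
    "failed_hdfs_op": re.compile(r"AuthorizationException|Permission denied"),
    "hbase_login": re.compile(r"Authentication successful for"),
    "kerberos_tgt": re.compile(r"TGT|Kerberos"),
    "root_login": re.compile(r"session opened for user root"),
}

def classify(line: str):
    """Return the names of event types whose pattern matches the line."""
    return [name for name, pat in EVENT_PATTERNS.items() if pat.search(line)]

# A made-up log4j-style line, for demonstration purposes only.
sample = ("2013-10-25 12:00:01,123 INFO SecurityLogger: "
          "AuthorizationException: user=alice, access=WRITE, inode=/data")
print(classify(sample))  # ['failed_hdfs_op']
```

In practice OSSEC's decoders and rules do this matching on the server, as described below; the point here is simply that each event type corresponds to recognizable text in the logs.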

Configuring an OSSEC agent to monitor one or more Hadoop log files involves adding the paths of the log files to the agent’s ossec.conf file. For an HDFS NameNode we want to monitor the hadoop-hdfs-namenode-{host}.log file, where {host} is the name or IP address of the NameNode. This file is normally located in the /var/log/hadoop-hdfs/ directory. Similarly, for an HMaster node we are interested in monitoring the hbase-hbase-master-{host}.log file in the /var/log/hbase directory. With this configuration in place, the OSSEC agents forward the Hadoop and HBase log entries to the OSSEC server.
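As a sketch, the relevant `<localfile>` entries in the agent’s ossec.conf might look like the following; the exact paths depend on your distribution and hostnames, and the wildcard stands in for the {host} portion of the file name:

```xml
<ossec_config>
  <!-- HDFS NameNode log (path assumes a /var/log/hadoop-hdfs install) -->
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/hadoop-hdfs/hadoop-hdfs-namenode-*.log</location>
  </localfile>

  <!-- HBase HMaster log -->
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/hbase/hbase-hbase-master-*.log</location>
  </localfile>
</ossec_config>
```

The `syslog` log format here simply tells OSSEC to treat each line as a generic single-line log entry, which fits the log4j-style output these daemons produce.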

The next step is to write decoders to parse the logs and alert rules to generate alerts based on their content. Decoders consist of regular expressions that the OSSEC server uses to find log lines of interest and to map parts of those lines to standard fields recognized by the server. Rules let the server examine the decoded fields for content that indicates important security events. When a log entry matches both a decoder and a rule, the server generates the alert defined by that rule.
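To illustrate, a decoder and rule pair for denied HDFS operations might be sketched as follows. The regexes, rule ID, and matched text here are illustrative assumptions, not the exact rules contributed to the project:

```xml
<!-- etc/local_decoder.xml: recognize log4j-style Hadoop log lines
     by their leading timestamp -->
<decoder name="hadoop-hdfs">
  <prematch>^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d</prematch>
</decoder>

<!-- etc/rules/local_rules.xml: alert when an HDFS operation is denied -->
<group name="hadoop,">
  <rule id="100100" level="10">
    <decoded_as>hadoop-hdfs</decoded_as>
    <match>AuthorizationException</match>
    <description>HDFS operation denied (possible unauthorized access)</description>
  </rule>
</group>
```

Rule IDs in the 100000 range are reserved for local, user-defined rules, which keeps custom Hadoop rules from colliding with the IDs shipped with OSSEC.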

Visualizing Hadoop Security Events

The simplest way to visualize OSSEC security alerts is to continuously display the alerts log file. Although this sort of works, it’s like looking at raw data in a spreadsheet: it is difficult, if not impossible, to spot trends in the data.
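Even a simple aggregation over the raw alerts file makes trends easier to see than scrolling text. The sketch below tallies alerts per rule ID; the alert layout in the sample is an assumption based on OSSEC’s default alerts.log format:

```python
import re
from collections import Counter

# Matches the "Rule: <id> (level <n>)" line of an OSSEC alert entry.
RULE_LINE = re.compile(r"^Rule: (\d+) \(level (\d+)\)")

def count_alerts_by_rule(alert_text: str) -> Counter:
    """Tally alerts per rule ID from raw alerts log text."""
    counts = Counter()
    for line in alert_text.splitlines():
        m = RULE_LINE.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts

# Abbreviated, made-up alert entries for demonstration.
sample = """\
** Alert 1382700000.1: - hadoop,
Rule: 100100 (level 10) -> 'HDFS operation denied'
** Alert 1382700060.2: - hadoop,
Rule: 100100 (level 10) -> 'HDFS operation denied'
** Alert 1382700120.3: - pam,
Rule: 5501 (level 3) -> 'Login session opened.'
"""
print(count_alerts_by_rule(sample))
```

A proper SIEM does this kind of summarization, and much more, out of the box, which is why forwarding alerts to one is the better approach.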

OSSEC comes equipped to send alert data via syslog to any SIEM (security information and event management) tool that provides syslog compatibility.  One SIEM that we like to use is Splunk, together with an open source application called Splunk for OSSEC. This can be installed on the OSSEC server directly from the Splunk application console.
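On the server side, forwarding alerts to a syslog-capable SIEM can be sketched with a `<syslog_output>` block in ossec.conf. The address, port, and minimum alert level below are placeholder assumptions to adapt to your environment:

```xml
<ossec_config>
  <!-- Forward alerts of level 7 and above to the SIEM's syslog listener -->
  <syslog_output>
    <server>192.168.1.50</server>  <!-- placeholder SIEM address -->
    <port>514</port>
    <level>7</level>
  </syslog_output>
</ossec_config>
```

The forwarder itself typically also needs to be switched on, e.g. with `/var/ossec/bin/ossec-control enable client-syslog`, before the server starts sending alerts out.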

Splunk for OSSEC is designed to summarize OSSEC alerts and perform trend analysis on them. An example of an OSSEC dashboard in Splunk is shown below. Here you can see summaries of events over time, including the HBase and HDFS events discussed earlier.

Figure 2. Splunk for OSSEC

(Image originally from http://vichargrave.com/securing-hadoop-with-ossec/)

Summary

Big data systems can benefit from the host intrusion detection services provided by a HIDS like OSSEC. A HIDS helps ensure the safety and security of big data deployments, which is essential to organizations adopting big data. We are contributing the OSSEC rules for Hadoop back to the OSSEC Project to promote their use in the OSSEC and Hadoop communities, in line with our previous support for open source projects.

Post from: Trendlabs Security Intelligence Blog – by Trend Micro

Story added 25 October 2013.