Future Generation Computer Systems 29 (2013) 27–45
A comprehensive vulnerability based alert management approach for large networks

Humphrey Waita Njogu, Luo Jiawei, Jane Nduta Kiere, Damien Hanyurwimfura

College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
Article history: Received 1 April 2011; received in revised form 30 March 2012; accepted 7 April 2012; available online 25 April 2012.

Keywords: Alert management; Alert verification; Vulnerability database; Alert correlation; Intrusion detection system
Abstract

Traditional Intrusion Detection Systems (IDSs) are known for generating large volumes of alerts despite all the progress made over the last few years. The analysis of a huge number of raw alerts from large networks is often time consuming and labour intensive because the relevant alerts are usually buried under heaps of irrelevant alerts. Vulnerability based alert management approaches have received considerable attention and appear extremely promising in improving the quality of alerts. They filter out any alert that does not have a corresponding vulnerability, thereby enabling the analysts to focus on the important alerts. However, the existing vulnerability based approaches are still at a preliminary stage and there are research gaps that need to be addressed. The act of validating alerts does not by itself guarantee alerts of high quality because the validated alerts may contain huge volumes of redundant and isolated alerts. The validated alerts also lack the additional information needed to enhance their meaning and semantics. In addition, the use of outdated vulnerability data may lead to poor alert verification. In this paper, we propose a fast and efficient vulnerability based approach that addresses the above issues. The proposed approach combines several known techniques in a comprehensive alert management framework in order to offer a novel solution. Our approach is effective and yields superior results in terms of improving the quality of alerts.
1. Introduction

Management of security in large networks is a challenging task because new threats and flaws are discovered every day. In fact, the number of exploited vulnerabilities continues to rise as computer networks grow. Intrusion Detection Systems (IDSs) are used to manage network security in many organisations. They provide an extra layer of defence by gathering and analysing information in a network in order to identify possible security breaches [1]. There are two broad types of IDSs: signature based and anomaly based. The former uses a database of known attack signatures for detection while the latter builds a model of normal system behaviour and detects deviations from it [2]. If an intrusion is detected, an IDS generates a warning known as an alert or alarm. IDSs are designed with the goal of delivering high quality alerts to analysts. However, traditional IDSs have not lived up to this promise. They trigger an overwhelming number of unnecessary alerts that are primarily false positives resulting from non-existent intrusions. Analysing alerts of this nature is a challenging task and therefore the alerts are prone to be misinterpreted, ignored or
delayed. The important alerts are often buried and hidden among thousands of other unverified, irrelevant and low priority alerts. There are several reasons why IDSs generate huge numbers of alerts, such as the IDSs being unaware of the network context they are protecting [3,4]. Signature based IDSs are often run with a default set of signatures. Therefore, alerts are generated for most attack attempts irrespective of their success or failure in exploiting a vulnerability in the network under consideration. In fact, signature based IDSs usually do not check the effectiveness of an attack against the local network context, thus contributing to a high number of false positive alerts [2]. Further, most traditional IDSs have limited observation abilities in terms of network space as well as the kind of attacks they can deal with [5]. Attack evidence against network resources can be scattered over several hosts. In fact, it is a challenging issue to have an IDS with properly deployed sensors able to detect attacker traces at different spots in the network and to find dependencies among them. Research has shown that most damage results from vulnerabilities existing in applications, services, ports and protocols of hosts and networks [6]. Fixing all the known vulnerabilities before damaging intrusions take place in order to reduce the number of alerts [7,8] may not be effective, especially in large networks, because of the following limitations:
• Time gap between vulnerability disclosure and the software patch released by software developers.
• Updating hosts from different vendors in a large network takes longer, thus exposing the hosts to intruders.
• Some vulnerabilities are protocol based, thus an immediate patch may not be available.

It is therefore demanding to apply vulnerability analysis to improve network security. The vulnerability based alert management approaches are popular and extremely promising in delivering quality alerts [2,9,10]. These approaches are widely accepted and used by many researchers to improve the quality of alerts, especially when processing a huge number of alerts from signature based IDSs. They improve the quality of alerts by eliminating alerts with low relevance in relation to the local context of a given network. They correlate network vulnerabilities with IDS alerts and filter out the alerts that do not have a corresponding vulnerability. Therefore, the vulnerability based approaches are able to remove the need for a complicated attack step library and reduce irrelevant alerts (irrelevant alerts correspond to attacks that target a non-existent service) [11]. Most of the vulnerability based approaches are able to produce alerts that are useful in the context of the network. However, these approaches are still at a preliminary stage and there are some research gaps that need to be addressed in order to produce better results. So far, little attention has been given to the following key issues.

The act of validating alerts may not guarantee alerts of high quality because the validated alerts may contain huge volumes of redundant alerts, as is evident in several vulnerability based works such as [2,4,7,9,12–16]. Generally, the analysts who review the validated alerts may take a long time to understand the complete security incident because it would involve evaluating each redundant alert. Consequently, the analysts may not only encounter difficulties when taking the correct decision but would also take longer to respond to the intrusions. In fact, it is common for attacks to produce thousands of similar alerts, so it is more useful to reduce the redundancy in the validated alerts. There is no practical use in retaining all the redundant alerts. In reference to several vulnerability based approaches such as [2,4,12–16], the validated alerts may also contain a massive number of isolated alerts that are very difficult to deal with. The presence of isolated alerts may hinder the potential of discovering the causal relationships in the validated alerts. Another challenging issue is that numerous vulnerability based approaches such as [12,13] depend on outdated vulnerability data to verify alerts and are hence likely to contribute to poor alert verification. New attacks and vulnerabilities in networks are discovered every day, hence the need to update the vulnerability data accordingly. In addition, the information found in the validated alerts is basic and insufficient and may not enhance the meaning and semantics of the validated alerts. In fact, the obvious alert features may not adequately describe alerts in terms of their relevance, severity, frequency and the confidence levels of their sources. Therefore, there is a need to supplement the information found on the alerts to further understand the validated alerts in order to reduce the overall amount of unnecessary alerts.

The primary focus of this paper is to address the above issues in order to improve the effectiveness of vulnerability based approaches. We developed a fast and efficient approach based on several known techniques (such as alert correlation and prioritisation) in a comprehensive alert management framework in order to offer a novel solution. The contributions of this work are summarised as follows:

• Development of an alert correlation engine to reduce the huge volumes of redundant and isolated alerts contained in the validated alerts. Redundant alerts are often generated from the same intrusion event or intrusions carried out in different stages. Thus, the correlation engine helps the analysts to quickly understand the complete security incident and take the correct decision.
• Construction of a comprehensive and dynamic threat profile known as Enhanced Vulnerability Assessment (EVA) data. EVA data represents all vulnerabilities present in a network. EVA data is queried to assert information about alerts and the context in which they occur, thereby improving the accuracy of alerts.
• Introduction of new metrics such as alert relevance, severity, frequency and source confidence, thus improving the semantics of alerts in order to offer better discriminative ability than the ordinary alert attributes when evaluating the alerts.
• Maintenance of a history of alerts that contains the recent and frequent meta alerts. Generally, IDSs produce alerts that may have similar patterns manifested by features such as frequent IP addresses and ports. Therefore, the history of alerts assists in handling the incoming related alerts, thus improving the processing speed.
• Application of fuzzy based reasoning to determine the interestingness of the validated alerts based on their metric values. This helps to identify the most important alerts.
The following terminology is used in this paper:
• Vulnerability: A flaw or weakness in a system which could be exploited by intruders.
• Attack: Any malicious attempt to exploit a vulnerability. An attack may or may not be successful in the network under consideration.
• Relevant attack: An attack which successfully exploits a vulnerability.
• Non relevant attack: An attack which fails to successfully exploit vulnerabilities in a network.
• Attack severity: The degree of damage associated with an attack in a network.
• Alert: A warning generated by an IDS on a malicious activity. An alert may be interesting or non interesting. An interesting alert represents a relevant attack, while a non interesting alert represents an unsuccessful attack attempt or any other alert considered not important.
• Meta alert: Summarised information of related alerts.
The rest of the paper is organised as follows: Section 2 describes the related work. Section 3 describes the proposed approach. Section 4 discusses the experiments and performance of the proposed approach. Finally, a conclusion and suggestions for future work are given in Section 5.

2. Related work

Over the last few years, research in intrusion detection has focused on the post processing of alerts in order to manage the huge volumes of alerts generated by IDSs. In this section, we analyse several vulnerability based correlation approaches. Traditional IDSs tend to generate overly general alerts that are not network specific because they are unaware of the network context they monitor. IDSs are often run with a default set of signatures and hence generate alerts for most intrusions irrespective of success or failure to exploit vulnerabilities in the network under consideration. According to Morin et al. [9], many alerts (especially false positives) involve actors which are inside the monitored information system and whose properties are consequently also observable. The use of vulnerability data is advocated as an important tool to reduce the noise in the alerts in order to improve their quality [9,10,13]. The vulnerability data helps to differentiate between successful and failed intrusion attempts. Identifying and eliminating failed intrusion attempts improves the quality of the final alerts. Gula [12] illustrates how vulnerability data elicits high quality alerts from a huge number of alerts that are primarily false
positives. The author states that a particular true intrusion targets a particular vulnerability and therefore correlating the alert with the vulnerability could reveal whether the vulnerability is exploited or not. The author further describes how alerts can be correlated with vulnerability data. Another similar approach to verify alerts is proposed by Kruegel and Robertson [13] in order to improve the false positive rate of IDSs. The focus of the work is to provide a model for alert analysis and generation of prioritised alert reports for security analysts. Despite some merits, the aforementioned approaches are preliminary and have not been integrated into an overall comprehensive correlation process. They also lack useful additional information on host architecture, software and hardware to support better alert verification. These approaches rely on information about the security configuration of the protected network that was collected at an earlier time using vulnerability scanning tools, and do not support dynamic mechanisms for alert verification. In addition, these approaches do not reduce the redundant and isolated alerts after alert verification. Eschelbeck and Krieger [14] propose an effective noise reduction technique for intrusion detection systems. This technique eliminates unqualified IDS alerts by correlating them with environmental intelligence about the network and systems. This work provides an overview of correlation requirements with a proposed architecture and solution for the correlation and classification of IDS alerts in real time. The implementation of this scheme has demonstrated a significant reduction of false alerts, but it does not reduce the redundant and isolated alerts after alert verification. A vulnerability based alert filtering approach is proposed by Porras et al. [15]. It takes into account the impact of alerts on the overall mission that a network infrastructure supports. This approach uses knowledge of the network architecture and the vulnerability requirements of different incident types to tag alerts with a relevance metric and then prioritises them accordingly. Alerts representing attacks against non-existent vulnerabilities are discarded. The work does not have a comprehensive vulnerability data model to fully integrate the vulnerability data in the network context. This work processes alerts from different sources such as IDSs, firewalls and other devices, while ours deals with IDS alerts only. Morin et al. [16] present the M2D2 model, which relies on a formal description (of information such as network context data and vulnerabilities) of sensor capabilities in terms of scope and positioning to determine if an alert is a false positive. The model is used to verify whether all sensors that could have been able to detect an attack agreed during the detection process, making the assumption that inconsistent detections denote the presence of a false alert. Although this approach benefits from a sound formal basis, it suffers from the limitation that false alerts can only be detected in those cases in which multiple sensors are able to detect the same attack and can participate in the voting process. In fact, many real-world IDSs do not provide enough detection redundancy to make this approach applicable. An extension of the M2D2 model, known as the M4D4 data model [9], is proposed to provide reasoning about the security alerts as well as the relevant context in a cooperative manner.
The extended model is a reliable and formal foundation for reasoning about complementary evidence, providing the means to validate the alerts reported by IDSs. Although the aforementioned approaches look promising, they have only been evaluated mathematically and not tested with real datasets. The M4D4 model suffers from some of the limitations of M2D2. In addition, the two approaches do not reduce the redundant and isolated alerts after alert verification. Valeur et al. [17] propose a general correlation model that includes a comprehensive set of components such as alert fusion, multi-step correlation and alert prioritisation. The authors observe that reduction of alerts is an important task of alert management
and note that when an alert management approach receives false positives as input, the quality of the results can be degraded significantly. Failure to exclude the alerts that refer to failed attacks may lead to false positive alerts being misinterpreted or given undue attention. Our approach is inspired by the logical framework proposed by the authors. We verify alerts prior to the classification and correlation processes because efforts to improve the quality of an alert should start in the early stages of alert management. We use different correlation schemas to handle alerts from different attacks. In addition, our approach is able to handle alerts from unknown attacks. Another interesting work is proposed by Yu et al. [18]. The authors propose a collaborative architecture (TRINETR) for multiple IDSs to work together to detect real-time network intrusions. The architecture is composed of three parts: collaborative alert aggregation, knowledge-based alert evaluation and alert correlation to cluster and merge alerts from multiple IDS products in order to achieve an indirect collaboration among them. The approach reduces false positives by integrating network and host system information into the evaluation process. However, the approach has a major shortcoming because it has not implemented the alert correlation part, which is crucial in generating condensed alert views. Chyssler et al. [19] propose a framework for correlating syslog information with alerts generated by host based intrusion detection systems (HIDSs) and network based intrusion detection systems (NIDSs). The process of correlation begins by eliminating alerts from non effective attacks. This involves matching both HIDS and NIDS events in order to determine the success of an attack attempt. This approach has some merits; however, correlating alerts from HIDSs and NIDSs is difficult because their response times are different. In addition, the approach does not consider the relevance of an attack in the context of the network and does not reduce the redundant and isolated alerts after alert verification. To reduce the number of false alerts generated by traditional IDSs, Xiao and Xiao [20] present an alert verification based system. The system distinguishes false positives from true positives, or confirms the confidence of an alert, by integrating context information of the protected network with the alerts. The proposed system has three core components: pre-processing, alert verification and alert correlation. The scheme looks promising but has several shortcomings. It does not give details on how alerts are verified and correlated. Moreover, the work is not accompanied by experiments, hence it is difficult to assess its effectiveness. Liu et al. [4] propose a collaborative and systematic framework to correlate alerts from multiple IDSs by integrating vulnerability information. The approach applies contextual information to distinguish between successful and failed intrusion attempts. After the verification process, the alerts are assigned a confidence value and the corresponding actions are triggered based on that confidence. The confidence values are 0 for a false alert and 1 for a true alert. Although the approach has some merits, it has several shortcomings. The scheme does not give details on the procedure used to validate the alerts and does not include details of how alerts are transformed into meta alerts. In addition, the scheme does not differentiate different levels of alert relevance.
Further, the scheme only handles alerts with reference numbers, thus lowering detection rates. Bolzoni et al. [21] present an architecture for alert verification in order to reduce false positives. The technique is based on anomaly based analysis of the system output, which provides useful context information regarding the network services. The assumption of this approach is that there should be anomalous behaviour seen in the reverse channel in the vicinity of the alert generation time. If these two are correlated, a good guess about the attack corresponding to the alert can be made. The effectiveness of this approach depends on the accurate identification of an output anomaly by the output
anomaly detector engine for alerts generated by the IDS. For every attack there may not be a corresponding output anomaly, and this could result in false negatives. Further, the time window used to look for the correlation is very critical for the correctness of the scheme; a very small time window may lead to missed attacks while a large time window may result in an increase of false positive alerts. Moreover, this approach does not reduce the redundant and isolated alerts after verification. In an effort to reduce the amount of false positives, Colajanni et al. [7] present a scheme to filter innocuous attacks by taking advantage of the correlation between IDS alerts and detailed information concerning the protected information systems. Some of the core units of the scheme are the filtering and ranking units. The authors extended this work by proposing a distributed architecture [22] to provide security analysts with selective and early warnings. One of its core components is the alert ranking unit that correlates alerts with vulnerability assessment data. According to this approach, alerts are ranked based on a match or mismatch between the alert and the vulnerability assessment data. We consider this type of ranking to be limiting for two reasons: first, it relies on the match or mismatch of only one feature (software or application) to decide whether an alert is critical or non critical; secondly, it does not show the degree of relevance of alerts, hence it does not offer much help to the analyst. In our approach, alerts are ranked at different levels according to their degree of interestingness. Further, the two aforementioned approaches do not reduce the redundant and isolated alerts contained in the validated alerts. Massicotte et al. [23] propose a possible correlation approach based on the integration of Snort, Nessus and Bugtraq databases. The approach uses reference numbers (identifiers) found on Snort alerts. The reference numbers refer to Common Vulnerabilities and Exposures (CVE), Bugtraq and Nessus scripts. The authors show a possible correlation based on the reference numbers. However, it is not effective because not all alerts have reference numbers. In addition, there is no guarantee that the lists provided by CVE and Bugtraq contain a complete listing of vulnerabilities. Neelakantan and Rao [24] propose an alert correlation approach comprising three stages: (i) finding vulnerabilities in the network under consideration, (ii) creating a database of all vulnerabilities having reference numbers (CVE or Bugtraq) and (iii) selecting signatures that correspond to the identified vulnerabilities. This approach requires reconfiguration of the signatures when there is a network change (such as hosts being added or removed), leading to downtime of the IDS engine. Moreover, it only handles alerts with reference numbers, thus lowering detection rates due to the incompleteness of vulnerability reference numbers. The approach does not reduce the redundant and isolated alerts contained in the validated alerts. There have been efforts to evaluate alerts using metrics based on vulnerability data, but these metrics have not been used all together as proposed in our paper. For example, Bakar and Belaton [25] propose an Intrusion Alert Quality Framework (IAQF) for improving alert quality. Central to this approach is the use of vulnerability information that helps to compute alert metrics (such as accuracy, reliability, correctness and sensitivity) in order to prepare them for higher level reasoning.
The framework improves the alert quality before alert verification. However, the approach does not reduce the redundant and isolated alerts after alert verification. A similar work is presented by Chandrasekaran et al. [26]. The authors propose an aggregated vulnerability assessment and response against zero-day exploits. The approach uses metrics such as deviation in number of alerts, relevance of the alerts and variety of alerts generated in order to prepare the final alerts. The work is limited to zero day attacks while our work handles different types of attacks. In addition, the authors have not given details on how the
threat profile is constructed and how the alerts are verified. Alsubhi et al. [27] present an alert management engine known as FuzMet, which employs several metrics (such as severity, applicability, importance and sensor placement) and a fuzzy logic based approach for scoring and prioritising alerts. Although some of the metrics are similar to the ones proposed in our work, there are several key differences. Their scheme tags metrics onto the unverified alerts (raw alerts), which contain unnecessary alerts, while in our work only the validated alerts are tagged with metrics, i.e. metrics are only tagged to the necessary alerts that show a certain degree of relevance to the network context in order to improve the accuracy of the final alerts. In addition, our work employs fewer metrics yet is very effective in improving the performance of alert management. Further, we use a simpler procedure to compute the alert metrics. Njogu and Jiawei [28] propose a clustering approach to reduce the unnecessary alerts. The approach uses vulnerability data to compute alert metrics. It determines the similarity of alerts based on the alert metrics, and the alerts that show similarity are grouped together in one cluster. We have extended this previous work by implementing a better architecture for building dynamic vulnerability data that draws information from three sources, i.e. network resource data, a known vulnerability database and scan reports from multiple vulnerability scanners. In addition, our new work correlates alerts based on both alert features and alert metrics. We have also introduced the concepts of alert classification and prioritisation based on alert metrics. A more recent and interesting work which is closer to ours is presented by Hubballi et al. [2]. The authors propose a false positive alert filter to reduce false alerts without manipulating the default signatures of IDSs. Central to this approach is the creation of a threat profile of the network which forms the basis of alert correlation. The method correlates the IDS alerts with network specific threats and filters out false positive alerts. This involves two steps: (i) alerts are correlated with vulnerabilities to generate correlation binary vectors and (ii) the correlation binary vectors are classified using neural networks. The idea of correlating alerts with vulnerabilities in this work is promising in delivering accurate alerts. However, the approach does not reduce the number of redundant and isolated alerts after alert verification. Further comparisons are presented in Section 3.1.4. Al-Mamory and Zhang [29] note that most of the general alert correlation approaches, such as Julisch [30], Valdes and Skinner [31], Debar and Wespi [32], Al-Mamory and Zhang [33], Sourour et al. [34], Jan et al. [35], Lee et al. [36] and Perdisci et al. [37], do not make full use of the information that is available on the network under consideration. The general alert correlation approaches usually rely on the information on the alerts, which may not be comprehensive and reliable enough to understand the nature of the attack and may lead to poor correlation results. In fact, correlating alerts that refer to failed attacks can easily result in the detection of whole attack scenarios that are nonexistent. Thus, it is useful to integrate the network context information in alert correlation in order to identify the exact level of threat that the protected systems are facing.
With respect to the related work, we noted that most vulnerability based approaches proposed in the literature are able to validate alerts successfully. However, validating alerts may not guarantee alerts of high quality. For example, the validated alerts may contain a massive number of redundant and isolated alerts from the same intrusion event and from intrusions carried out in different stages. Such issues have received little attention in the research field. In order to address the shortcomings of vulnerability based approaches, our work synergises several techniques in a comprehensive alert management framework in order to build a novel solution.
3. Proposed alert management approach

In this section, we describe a fast and efficient 3-stage alert management approach. Stage 1 involves alert collection, correlation of alerts against the meta alert history and verification of alerts against EVA data; Stage 2 involves classification of alerts based on the alert metrics; and Stage 3 involves correlation of alerts in order to reduce the redundant and isolated alerts. The 3 stages are illustrated below (refer to Fig. 1):

Fig. 1. Proposed alert management framework.

Stage 1
(i) The IDS sensor generates alerts on malicious activities.
(ii) The raw alerts are received and pre-processed by the alert receiver unit.
(iii) The pre-processed alerts are compared with meta alerts (meta alert history) using the Alert-Meta alert history correlator. If the alert under consideration matches any of the meta alerts, the alert is considered successful and forwarded to the alert correlator component. If there is no match, the alert is regarded as a suspicious alert and forwarded to the Alert-EVA data verifier.
(iv) The Alert-EVA data verifier uses EVA data to validate the alerts, eliminate the obvious non interesting alerts and compute alert metrics. Alerts are tagged with alert metrics (transformed alerts) and forwarded to stage 2. The verifier has 6 sub verifiers to process the suspicious alerts.

Stage 2
(v) Using the corresponding alert sub classifier, the transformed alerts (with alert metrics) are classified into one of the classes according to their alert metrics. The alert classifier has 6 sub classifiers. Alerts contained in these classes are further classified into 2 super classes (alert classes for ideal interesting alerts and alert classes for partial interesting alerts). These alert classes are forwarded to the alert correlator component.

Stage 3
(vi) The alert correlator component reduces the redundant and isolated alerts and finds the causal relationships in alerts. The correlator has sub correlators dedicated to each of the alert classes for every group of attacks. The correlated alerts are finally presented in the form of meta alerts. The frequent meta alerts are forwarded to the meta alert history for two reasons: to correlate future related alerts and to assist in modifying the IDS signatures.
(vii) The analyst receives the meta alerts and can view alerts in terms of their priority and the nature of the attack.

3.1. Stage 1

Stage 1 has 5 sub components that are discussed in the next sub sections.

3.1.1. Alert receiver

Generally, IDSs do not produce alerts in an orderly manner. The interesting alerts are buried under heaps of redundant, irrelevant and low priority alerts. The alert receiver unit receives and prepares the raw alerts appropriately. The important alert features such as IP addresses and port numbers are extracted and stored in a relational database for further analysis.
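To make the alert receiver concrete, the sketch below stores the extracted alert features in a relational table. It is a minimal illustration only; the table layout and the use of SQLite are our assumptions rather than details given in the paper, and the field names simply mirror the alert attributes that appear later in Table 2.

```python
import sqlite3

# Minimal sketch of the alert receiver: extract the important features of a
# raw alert and store them in a relational database for further analysis.
# The schema is an assumption that mirrors the alert attributes used later
# in the paper (reference id, name, sensor id, addresses, ports, etc.).
conn = sqlite3.connect("alerts.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS preprocessed_alerts (
        reference_id TEXT,
        name         TEXT,
        sensor_id    INTEGER,
        ip_dest      TEXT,
        port_dest    INTEGER,
        ip_src       TEXT,
        port_src     INTEGER,
        priority     TEXT,
        protocol     TEXT,
        class        TEXT,
        time         TEXT,
        application  TEXT
    )
""")

def receive_alert(raw_alert: dict) -> None:
    """Extract the features of interest from a raw IDS alert and persist them."""
    fields = ("reference_id", "name", "sensor_id", "ip_dest", "port_dest",
              "ip_src", "port_src", "priority", "protocol", "class",
              "time", "application")
    conn.execute(
        "INSERT INTO preprocessed_alerts VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
        tuple(raw_alert.get(f) for f in fields),
    )
    conn.commit()

# Example usage with a hypothetical Snort-style alert:
receive_alert({
    "reference_id": "CVE-1999-0001", "name": "Teardrop", "sensor_id": 1,
    "ip_dest": "192.168.1.3", "port_dest": 1238, "ip_src": "192.168.1.1",
    "port_src": 1238, "priority": "High", "protocol": "TCP",
    "class": "DoS", "time": "9:9:2011:10:12:05", "application": "Windows",
})
```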
3.1.2. Meta alert history

IDSs produce alerts that may have similar patterns [30,35,38,39]. Similar patterns are manifested by frequent IP addresses, ports and triggered signatures within a period of time. Some of the alert patterns may appear frequently and can last for a relatively long period of time. Julisch [30] states that large groups of alerts have a common root cause: most of the alerts are triggered by only a few signatures, and if a signature has triggered many alerts over a long period of time, it is likely to do so in the near future, generating many similar alerts with the same features. Vaarandi [38] adds that more than 85% of alerts are produced by the most prolific signatures. The meta alert history keeps the recent and frequent alerts (ideal interesting and partial interesting meta alerts) for two main reasons:
• To assist in handling the incoming related alerts, as illustrated by the following logic: "If an incoming alert with signature S originates from A to B AND there was previously a meta alert M with signature S originating from A to B AND M was classified as an ideal interesting (meta) alert THEN the incoming alert is an ideal interesting alert". The incoming alert is compared against the meta alerts by the Alert-Meta alert history correlator.
• To help the analysts in specifying the root causes behind the alerts. The analysts are able to identify signatures or vulnerabilities that need to be modified or fixed, thus improving
the future quality of alerts. A root cause (vulnerability) in the network can instigate an IDS to trigger alerts with similar features.

In this work, we focus on the first role of the meta alert history because it is easier to implement and gives better results. By identifying the frequent patterns of alerts, it is easier to predict and know how to handle any incoming related alert, hence improving the efficiency of alert management. The second role may not yield successful results because more false negatives could be introduced if it is not carefully implemented. The task of modifying signatures is very complex in terms of skills and time. Moreover, some vulnerabilities cannot be fixed immediately, as noted earlier, and therefore the second role is left as future work. The content of the meta alert history includes: meta alert type (either ideal interesting or partial interesting), name of alert, class of attack, source address (IP and port), destination address (IP and port), meta alert class and age of meta alert. To improve the efficiency and scalability of this sub component, we eliminate meta alerts that appear old (aged), have not been referenced or have not been matched with alerts for a relatively long time (the time value is to be determined by the analyst). The new meta alerts are forwarded by the meta alert correlator and appended to the meta alert history. Old meta alerts are replaced by new ones if they are similar. The details of the formation of meta alerts and how they are forwarded to the meta alert history are discussed later in this section.

3.1.3. Alert-Meta alert history correlator

The Alert-Meta alert history correlator uses the following procedure (a sketch is given after the steps):

Input: Pre-processed alerts, meta alerts from the meta alert history.
Output: Successful alerts (with inherited properties), suspicious alerts.

Step 1: Compare the features of the pre-processed alert with the meta alerts (in terms of IP addresses, ports and names).
Step 2: Search for potential ideal interesting meta alerts in the meta alert history. Get the IP destination of the alert (Alert.IPdest) to extract the potential meta alerts (MetaAlert.IP), i.e. MetaAlert.IP that match Alert.IPdest.
Step 3: Choose the meta alert that best represents the alert. The alert is matched with each of the potential ideal interesting meta alerts and the one showing a perfect match in terms of IP address, port and name is chosen.
Step 4: If a perfect match is established in Step 3, the successful alert inherits the properties of the parent ideal interesting meta alert, such as the name of the class, and is forwarded to the alert correlator in stage 3.
Step 5: If a perfect match is not established in Step 3, the pre-processed alert repeats Step 2 in order to search for potential partial interesting meta alerts. The alert is forwarded to Step 3 in order to choose the partial interesting meta alert that best represents the alert. If a perfect match is established, the successful alert inherits the properties of the parent partial interesting meta alert and is forwarded to the meta alert correlator in stage 3. If a perfect match is not established, the suspicious alert is forwarded to the Alert-EVA data verifier.
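The sketch below is one possible reading of Steps 1–5, assuming that alerts and meta alerts are plain dictionaries carrying the fields used above; the helper names and the two-pass search over ideal and partial interesting meta alerts are our own illustration, not code from the paper.

```python
# Minimal sketch of the Alert-Meta alert history correlator (Steps 1-5).
# An alert/meta alert is assumed to be a dict with at least:
#   'ip_dest', 'port_dest', 'name', and (for meta alerts) 'meta_type', 'class'.

def perfect_match(alert: dict, meta: dict) -> bool:
    """Step 3: a perfect match requires IP address, port and name to agree."""
    return (alert["ip_dest"] == meta["ip_dest"]
            and alert["port_dest"] == meta["port_dest"]
            and alert["name"] == meta["name"])

def correlate_with_history(alert: dict, history: list[dict]):
    """Return ('successful', enriched alert) or ('suspicious', alert)."""
    # Two passes: first the ideal interesting meta alerts, then the partial ones.
    for meta_type in ("ideal_interesting", "partial_interesting"):
        # Step 2: candidate meta alerts sharing the destination IP.
        candidates = [m for m in history
                      if m["meta_type"] == meta_type
                      and m["ip_dest"] == alert["ip_dest"]]
        for meta in candidates:
            if perfect_match(alert, meta):
                # Steps 4-5: inherit properties of the parent meta alert.
                enriched = dict(alert, inherited_class=meta["class"],
                                interestingness=meta_type)
                return "successful", enriched
    # No match at all: the alert is forwarded to the Alert-EVA data verifier.
    return "suspicious", alert
```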
3.1.4. EVA data

EVA data is a dynamic threat profile of a network. It represents the vulnerabilities of the network that are likely to be exploited by attackers. It lists all the vulnerabilities by their reference id, name, priority, IP address, port, protocol, class, time and applications. The main purpose of EVA data is to validate alerts and enhance the semantics of alerts to a sufficient level in order to deliver quality alerts. Fig. 2 illustrates how the EVA data is constructed.

Fig. 2. Construction of EVA data.

The network specific vulnerability generator is the engine that constructs EVA data. It does this by establishing relationships among all the elements of vulnerabilities drawn from three sources: a known vulnerability database, network scan reports and network context information. The generator uses the entity relationship (ER) model shown in Fig. 3 to capture and build EVA data. The elements of the ER model include: hosts, ports, applications, port threats, application threats, exploits, vulnerabilities and attack information. We established relationships among all the elements as follows. The relation between host and applications is one to many, as one host can run more than one application. One host has multiple ports, hence the relation is one to many. The vulnerability entity represents known vulnerabilities from various sources such as CVE (Common Vulnerabilities and Exposures) [40] and Bugtraq [41] in order to provide complete details of vulnerabilities. One vulnerability can lead to multiple exploits, hence the relation is one to many. The attribute 'age' refers to the age of the vulnerability. The attack information entity contains attack information that is associated with vulnerabilities. The threats are represented in two entities: the port threat entity contains all threats associated with ports while the application threat entity contains all threats associated with applications. Different vulnerability scanners such as Nessus and Protector-plus can help to populate information for some of the entities. Nessus can generate exploits while Protector-plus can generate a list of vulnerabilities for different applications in a network in order to build comprehensive EVA data. The scanners look for potential loopholes such as missing patches that are associated with particular applications, versions and service packs. The scanners are able to determine the vulnerable software, applications, services, ports and protocols. Different scanners use different techniques to detect vulnerabilities and can run periodically to maintain a consistent and recent threat view of the network and immediately forward it to the network specific vulnerability generator. Scripting languages such as Perl can easily process the reports from different scanners. The vulnerabilities are dynamic in nature, hence the need to install agents in the hosts to track changes in network resources. The agents forward the necessary information to the generator so that the EVA data is updated accordingly. The architecture for building the threat profile in Ref. [2] is almost similar to what we have proposed, as shown in Fig. 2. Specifically, we made the following contributions (a relational sketch of the EVA data model is given after this list):

• The threat profile of Ref. [2] is drawn from two major sources, i.e. a known vulnerability database and scan reports. Besides these two sources, our work has introduced a database for network resources in order to improve the management of network resources.
Fig. 3. Entity relationship diagram.
• Our approach has incorporated the concept of 'age' of vulnerabilities to eliminate vulnerabilities which appear outdated, old or have not been used for validation for a relatively long time (the time varies with different network environments), hence making the approach more efficient and scalable.
• Ref. [2] uses one verifier to process alerts of different attacks, while in our approach we use 6 sub verifiers tailored to validate alerts from specific groups of attacks. Therefore, our approach is able to improve the performance of the alert verification process.
• It is difficult to maintain a dynamic threat profile in Ref. [2] because updating the threat profile is done offline. In our approach, we use host agents to track changes in a network and relay the necessary information to the network specific vulnerability generator so that the EVA data is updated accordingly.
• In Ref. [2], correlation binary vectors are generated by a neural network based engine to represent the match and mismatch of the corresponding features between an alert and a vulnerability. An example of the binary vector is (1000001001), where '1' represents a match while '0' represents a mismatch. Our approach uses the Alert-EVA data verifier engine to validate alerts and compute the alert metrics, as illustrated in the next sub section.
• The quality of alert verification is highly dependent on various elements of information from IDS alerts and their corresponding vulnerabilities. The threat profile and an IDS product may use different attack details, such as reference ids, when referring to the same attack. In fact, there are no unique (standard) attack details such as reference ids to reference all types of attacks. In order to have comprehensive EVA data, we have introduced additional attack reference information from the IDS product into the EVA data. We use this information to play a complementary role when determining the relevance of alerts. This helps to ensure the completeness of attack details in the EVA data, because if attack details are missing then the computation of the alert score is affected.
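As a concrete reading of the ER model in Fig. 3, the sketch below expresses the EVA data entities and their one-to-many relations as relational tables. The table and column names are our assumptions, chosen to match the entities described above; the paper does not prescribe a particular schema or storage engine.

```python
import sqlite3

# Hypothetical relational sketch of the EVA data ER model (Fig. 3):
# one host has many applications and many ports, one vulnerability has many
# exploits, and threats are split into port threats and application threats.
schema = """
CREATE TABLE host        (host_id INTEGER PRIMARY KEY, ip_address TEXT, os TEXT);
CREATE TABLE application (app_id  INTEGER PRIMARY KEY, host_id INTEGER REFERENCES host,
                          name TEXT, version TEXT);
CREATE TABLE port        (port_id INTEGER PRIMARY KEY, host_id INTEGER REFERENCES host,
                          number INTEGER, protocol TEXT);
CREATE TABLE vulnerability (vuln_id INTEGER PRIMARY KEY, reference_id TEXT,  -- e.g. CVE id
                          name TEXT, priority TEXT, class TEXT, age INTEGER);
CREATE TABLE exploit     (exploit_id INTEGER PRIMARY KEY,
                          vuln_id INTEGER REFERENCES vulnerability, description TEXT);
CREATE TABLE attack_info (attack_id INTEGER PRIMARY KEY,
                          vuln_id INTEGER REFERENCES vulnerability, ids_reference TEXT);
CREATE TABLE port_threat (port_id INTEGER REFERENCES port,
                          vuln_id INTEGER REFERENCES vulnerability);
CREATE TABLE app_threat  (app_id  INTEGER REFERENCES application,
                          vuln_id INTEGER REFERENCES vulnerability);
"""

conn = sqlite3.connect("eva_data.db")
conn.executescript(schema)

# The network specific vulnerability generator would populate these tables from
# the known vulnerability database, the scan reports and the host agents, and
# periodically delete rows whose 'age' exceeds the analyst-defined limit.
conn.execute("DELETE FROM vulnerability WHERE age > ?", (365,))  # illustrative threshold
conn.commit()
```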
3.1.5. Alert-EVA data verifier

Threats are due to vulnerabilities in a network [42,43]. Each attack has the goal of exploiting a vulnerability on a particular application, service, port or protocol. As mentioned earlier, traditional IDSs run with their default signature databases and do not check the relevance of an intrusion to the local network context. Thus, IDSs generate huge volumes of raw alerts, the majority of which are non relevant alerts and not useful in the context of the network. In addition, the information provided by the alerts is basic and inadequate. Relying solely on this information may increase cases of important alerts being misinterpreted, ignored or delayed. For the above reasons, we introduce the Alert-EVA data verifier to validate alerts and compute the alert metrics. The proposed approach is designed to handle 5 different groups of attacks (DoS, Telnet, FTP, Mysql and Sql). Thus the verifier has 6 sub verifiers: the DoS verifier, Telnet verifier, FTP verifier, Mysql verifier, Sql verifier and Undefined attack verifier. For example, the DoS sub verifier handles all alerts reporting DoS attacks, as do the FTP, Telnet, Mysql and Sql sub verifiers for their respective groups, while the undefined attack sub verifier handles any alerts reporting new attacks that have not been defined during the design of the alert verification component. The verifier validates the suspicious alerts by measuring their similarity with the corresponding vulnerabilities contained in the EVA data. Both alerts and vulnerabilities have comparable features, hence the similarity is easy to measure. The validation process helps to determine the seriousness of alerts with respect to the network under consideration. In summary, the sub verifier uses the IP address of a given alert to search for the potential vulnerabilities in the EVA data and chooses the vulnerability that best represents the alert (with the highest alert score). The sub verifier uses Table 1 to compute the alert metrics and forwards the transformed alerts to the alert classification component (stage 2). The procedure to validate alerts is illustrated below.

Input: Suspicious alerts, vulnerabilities in EVA data.
Output: Transformed alerts (with alert metrics).

Step 1: To determine which sub verifier handles a given alert, the alert verifier component makes this decision based on the attack identifier (class) field in the alert and forwards the alert to the corresponding sub verifier. For example, an alert reporting a DoS attack is forwarded to the DoS sub verifier.
Step 2: The sub verifier compares the features of the pre-processed alert with the corresponding vulnerabilities in EVA data in this order:
IP => ReferenceId => Port => Time => Application => Protocol => Class => Name => Priority.

Step 3: The sub verifier searches for potential vulnerabilities in the EVA data, i.e. it gets the IP destination of the alert (Alert.IPdest) to extract the potential vulnerabilities, i.e. EVAdata.IP that match Alert.IPdest.
Step 4: The sub verifier chooses the vulnerability that best represents the alert. The alert is matched with each of the potential vulnerabilities and the vulnerability with the highest alert score is chosen. The process of matching alerts and potential vulnerabilities is as follows:

If Alert.IPdest matches EVAdata.IP Then IP_similarity = 1 Else IP_similarity = 0
If Alert.ReferenceId matches EVAdata.ReferenceId Then ReferenceId_similarity = 1 Else ReferenceId_similarity = 0
If Alert.Portdest matches EVAdata.Port Then Port_similarity = 1 Else Port_similarity = 0
If Alert.Time is greater than or equal to EVAdata.Time Then Time_similarity = 1 Else Time_similarity = 0
If Alert.Application matches EVAdata.Application Then Application_similarity = 1 Else Application_similarity = 0
If Alert.Protocol matches EVAdata.Protocol Then Protocol_similarity = 1 Else Protocol_similarity = 0
If Alert.Class matches EVAdata.Class Then Class_similarity = 1 Else Class_similarity = 0
If Alert.Name matches EVAdata.Name Then Name_similarity = 1 Else Name_similarity = 0
If Alert.Priority matches EVAdata.Priority Then Priority_similarity = 1 Else Priority_similarity = 0

Alert score = IP_similarity + ReferenceId_similarity + Port_similarity + Time_similarity + Application_similarity + Protocol_similarity + Class_similarity + Name_similarity + Priority_similarity.

Step 5: The sub verifier computes the alert metrics (Alert Relevance, Severity, Frequency and Alert Source Confidence) and tags the metrics to the alert (refer to Table 1).
Step 6: The sub verifier forwards the transformed alerts to the alert classification component, while the most obvious non interesting alerts (alerts with no similarity with EVA data) are eliminated.

Table 1
Alert metrics.

Alert relevance (importance of an alert)
  Attributes and their sources: Reference id, name, priority, IP address, port, protocol, class, time and applications; from the alert and EVA data.
  Metric validation rule: Derived from the alert score (Step 4 in Section 3.1.5).
  Scale: 0–9.

Alert severity (criticality of an alert)
  Attributes and their sources: Reference id, priority and timestamp; from the alert and EVA data.
  Metric validation rule: If the alert's priority (1) matches the EVA data's severity (High) then Alert Severity = 1; else if the alert's priority (2) matches the EVA data's severity (Medium) then Alert Severity = 2; else if the alert's priority (3) matches the EVA data's severity (Low) then Alert Severity = 3. NB: in cases where there is no match, e.g. the alert's priority is 1 or 2 while the EVA data's severity is Low or Medium, refer to the severity in the EVA data.
  Scale: 1–3.

Alert frequency (rate of alert occurrence)
  Attributes and their sources: Reference id, name, priority, IP address, port, protocol, class, time and applications; from the alert.
  Metric validation rule: An alert (sharing common features) belonging to a particular attack whose count exceeds a certain threshold number of alerts within a certain time is regarded as a frequent alert, while an alert belonging to a particular attack whose count is below the threshold within the specified period of time is regarded as a non frequent alert (the time window and the threshold number of alerts are dynamic).
  Scale: 0–1.

Alert source confidence (the reliability of the sensor)
  Attributes and their sources: Sensor id and timestamp, from both the alert and the sensor data.
  Metric validation rule: The value of this metric is stored in the alert source data. It is easy to extract the confidence value of a sensor because each alert has a sensorId field.
  Scale: 0–6.
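A minimal sketch of Steps 3–4 is shown below. It assumes that the alert and each EVA data entry are dictionaries keyed by the nine comparable fields listed above; the field names and the dictionary representation are our own illustration.

```python
# Hypothetical sketch of Steps 3-4: select candidate vulnerabilities by
# destination IP, score each candidate field by field, and keep the best one.

SCORED_FIELDS = ("reference_id", "port", "application", "protocol",
                 "class", "name", "priority")

def alert_score(alert: dict, vuln: dict) -> int:
    """Each matching field contributes 1 to the score (maximum score 9)."""
    score = 1 if alert["ip_dest"] == vuln["ip"] else 0   # IP_similarity
    # Time_similarity: alert time at or after the vulnerability time
    # (times are assumed to be comparable values, e.g. datetime objects).
    score += 1 if alert["time"] >= vuln["time"] else 0
    for field in SCORED_FIELDS:
        score += 1 if alert.get(field) == vuln.get(field) else 0
    return score

def verify_alert(alert: dict, eva_data: list[dict]):
    """Return (best matching vulnerability, relevance score), or (None, 0)."""
    candidates = [v for v in eva_data if v["ip"] == alert["ip_dest"]]  # Step 3
    if not candidates:
        return None, 0  # obvious non interesting alert: eliminated in Step 6
    best = max(candidates, key=lambda v: alert_score(alert, v))        # Step 4
    return best, alert_score(alert, best)
```

Applied to the snapshots in Tables 2 and 3 later in this section, the teardrop alert would match all nine fields and obtain a relevance score of 9, as in the worked example that follows those tables.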
Note that, for simplicity, any match is awarded a value of 1 while a mismatch is awarded a value of 0.

Alert metrics

Although some of the alert metrics have been used individually in previous works [15,26,27,44], they have not been used all together as proposed in this paper. The originality of our approach is the use of a dynamic threat profile (EVA data) to provide a reliable computation ground, thus ensuring the alert metrics are accurate enough to represent the alerts in terms of their relevance, severity, frequency and alert source confidence. We used a simple method that does not consume unnecessary resources. The metrics have a better discriminative ability than the ordinary alert features. We compute the metrics in this order: Alert Relevance => Alert Severity => Alert Frequency => Alert Source Confidence. The details of the alert metrics are as follows (refer to Table 1):

(i) Alert relevance: It indicates the importance of an alert in reference to the vulnerabilities existing in a network. It involves measuring the degree of similarity between the alert and EVA data, because they contain comparable data types (reference id, name, priority, IP address, port, protocol, class, time and applications). We use the alert score to determine the relevance of alerts. We performed a series of verifications using different alert scores to search for the threshold that best represents the alerts in terms of relevance. An alert with a higher score indicates that the number of matching fields is high, and is hence more relevant than an alert with a lower score.

(ii) Alert severity: It indicates the degree of severity (undesirable effect) associated with the attack reported by an alert. The verifier compares the alert and EVA data in terms of their priority (severity). Alert severity can be high, medium or low. An alert with high severity causes a higher degree of damage, medium severity causes a medium degree of damage and low severity causes the least degree of damage (trivial). Alert severity is represented in the range 1–3. A score closer to 1 indicates that the attack is very severe.

(iii) Alert frequency: It reflects the occurrence of related alerts (with common features) that are triggered by particular source(s) or attacks within a given period of time. An IDS may generate similar alerts that are manifested by frequent features such as IP addresses and ports within a particular time window. For example, a probe based attack may trigger
numerous repetitive alerts as the network is being intruded. The verifier has a counter to determine the frequency of alerts. An alert from a particular source (or attack) whose count exceeds a certain threshold within a specified time is regarded as a frequent alert, while an alert belonging to a particular source (or attack) whose count is below the threshold is regarded as a non frequent alert. An alert above the threshold is assigned the value 1 while one below the threshold is assigned 0. A value closer to 1 is considered very frequent while one closer to 0 is considered less frequent.

(iv) Alert source confidence: It indicates the value an organisation places on an IDS sensor. It is based on the sensor's past performance and reflects the ability of a given sensor to effectively identify an attack, hence making it possible to predict how the sensor will perform in future. The performance of any sensor is influenced by factors such as accuracy and detection rates, the version of the signatures and the frequency of updates. To further understand the performance of sensors, consider the following aspects:

• In large networks, it is very difficult to ensure all sensors are well tuned and updated. Therefore, some sensors are likely to be frequently updated and tuned while others are not. Differences in versions and frequency of updates may lead to different accuracy and detection rates of IDS sensors [25].
• IDS sensors are placed in different locations facing varied risks and exposure in a large network. Some locations are more prone to threats than others. For example, sensors installed facing the Internet directly or at the front line of internet connections are exposed to a higher degree of threats than those located within the network. Some sensors are placed in environments with emergency routes carrying the network traffic of third parties and are hence exposed to more threats.
• A consistent number of alerts over a long period of time indicates that the IDS sensor is likely to be stable and able to produce reliable results, while an inconsistent number of alerts indicates that the IDS sensor is likely to be unstable and may lead to unreliable results [26].
• A consistent ratio of interesting alerts to non interesting alerts over a period of time indicates that the IDS sensor is likely to be stable and able to produce reliable results, while an inconsistent ratio indicates that the IDS sensor is likely to be unstable and may lead to unreliable results.
• An acceptable false alert rate indicates the ability of an IDS sensor to produce reliable alerts, while an unacceptable false alert rate indicates that the sensor is likely to output unreliable alerts. A false alert is caused by legitimate traffic, which may lead to a high false rate. The true danger of a high false alert rate lies in the fact that it may cause analysts to ignore legitimate alerts. The goal of any sensor is to have a low false alert rate. Several studies indicate that the false alert rate is between 60% and 90% [35]. However, it is possible to have a false alert rate below 60% under normal conditions. Generally, the false alert rate can vary depending on the level of tuning, the technology of the sensor and the type of traffic on a network. Other works illustrating this concept are contained in [45,46].
• The analysts who review the alerts regularly are in a better position to identify the sources (sensors) that are well known for generating relatively high numbers of true alerts as well as the sources (sensors) known for generating relatively high numbers of false alerts, depending on their location and their threat levels [38].

Alerts are usually linked to their sensors by the sensorId field and therefore it is easy to extract the value of the alert source confidence from the sensor data. The range of the source confidence score is 0–6. A sensor with a score closer to 6 is considered to have higher confidence than one with a score closer to 0. The alert source confidence parameter is very useful in networks with multiple IDS sensors and where analysts have some knowledge of evaluating the performance of sensors.

To illustrate how the Alert-EVA data verifier works, consider the following example. Table 2 shows a snapshot of pre-processed alerts, Table 3 shows a section of EVA data and Table 4 shows a section of alert sensor data.

Table 2
Pre-processed alert snapshot (* indicates fields used in alert verification).

Alert 1: Reference Id* CVE-1999-0001; Name* Teardrop; SensorId* 1; IPdest* 192.168.1.3; Portdest* 1238; IPsrc 192.168.1.1; Portsrc 1238; Priority (severity)* High; Protocol* TCP; Class* DoS; Time* 9:9:2011:10:12:05; Application* Windows.
Alert 2: Reference Id* CVE-1999-0527; Name* Telnet resolv host conf.; SensorId* 2; IPdest* 192.168.1.2; Portdest* 1026; IPsrc 192.168.1.1; Portsrc 1026; Priority (severity)* Medium; Protocol* TCP; Class* U2R (Telnet); Time* 9:9:2011:10:12:06; Application* Telnet, Windows.

Table 3
EVA data snapshot.

Vulnerability 1: Reference Id CVE-1999-0001; Name Teardrop; IP address 192.168.1.3; Port 1238; Priority (severity) High; Protocol TCP; Class DoS; Time 8:9:2011:14:16:02; Application Windows.
Vulnerability 2: Reference Id CVE-1999-0527; Name Telnet resolv host conf.; IP address 192.168.1.2; Port 1026; Priority (severity) Medium; Protocol TCP; Class U2R (Telnet); Time 8:9:2011:14:16:02; Application Telnet, Windows.

Table 4
Alert sensor data snapshot.

Sensor Id 1: Sensor score 5.
Sensor Id 2: Sensor score 2.

In Table 2, the first alert reports a teardrop attack targeting a host with IPdest 192.168.1.3 on Portdest 1238. Using the alert IPdest 192.168.1.3, the DoS sub verifier extracts all potential vulnerabilities in EVA data matching the IP address. In this case, EVA data has only one vulnerability matching this particular alert IPdest. The sub verifier then matches the other features (reference id, port, severity, protocol, class, time, name and application) of the alert against the said vulnerability. The alert score is computed as follows: IP_similarity = 1 because the IP addresses match, and so do the rest of the features; ReferenceId_similarity = 1; Port_similarity = 1; Time_similarity = 1; Application_similarity = 1; Protocol_similarity = 1; Class_similarity = 1; Name_similarity = 1; Priority_similarity = 1. Therefore, the alert score (the total number of matches) is 9. The sub verifier tags the alert with the following: a relevance score of 9, an alert severity of 1, an alert frequency of 0 and a source confidence of 5. The validated alert is forwarded to the DoS sub classifier component in the form DoS (9,1,0,5).
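As one possible implementation of the frequency metric described in (iii) above, the sketch below counts related alerts inside a sliding time window and returns 1 once the count exceeds a threshold. The window length and threshold are illustrative placeholders; the paper only states that both values are dynamic.

```python
from collections import defaultdict, deque

# Hypothetical sliding-window counter for the alert frequency metric:
# alerts sharing common features (here: source IP, destination IP and name)
# are counted, and the metric is 1 once the count inside the window exceeds
# the analyst-defined threshold, otherwise 0.
class FrequencyCounter:
    def __init__(self, window_seconds: float = 60.0, threshold: int = 10):
        self.window = window_seconds      # assumed value, tuned by the analyst
        self.threshold = threshold        # assumed value, tuned by the analyst
        self.seen = defaultdict(deque)    # feature key -> timestamps

    def frequency(self, alert: dict, now: float) -> int:
        key = (alert["ip_src"], alert["ip_dest"], alert["name"])
        times = self.seen[key]
        times.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > self.window:
            times.popleft()
        return 1 if len(times) > self.threshold else 0

# Example: the eleventh related alert within one minute is marked as frequent.
counter = FrequencyCounter()
alert = {"ip_src": "192.168.1.1", "ip_dest": "192.168.1.3", "name": "Teardrop"}
for t in range(11):
    metric = counter.frequency(alert, now=float(t))
print(metric)  # 1
```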
Generally, the vulnerability based alert management approaches are not able to deal with unknown vulnerabilities and intrusions [12]. In this work, we designed a policy to reduce the influence of the unknown vulnerabilities and intrusions during alert verification. This policy minimises cases such as: relevant alerts being ignored or considered as irrelevant alerts simply because they fail to match the vulnerabilities in EVA data when they are being validated. This policy addresses the following four cases:
Table 5
Fuzzy rule base.
Rule    Relevance    Severity    Frequency    Source confidence    Class
1       High         High        High         High                 Class 1
2       High         High        High         Low                  Class 2
3       High         High        Low          High                 Class 3
...     ...          ...         ...          ...                  ...
N       Low          Low         Low          Low                  Class k
• Case 1: The IDS issues an alert on a non relevant attack but EVA data contains outdated vulnerabilities. As expected, the verifier identifies the corresponding (outdated) vulnerability in EVA data that matches the alert, even though the network is not vulnerable. This issue is difficult to address but can be minimised by frequently updating both the IDS and EVA data as well as removing outdated vulnerabilities. Alerts in this case are eliminated.
• Case 2: The IDS issues an alert on a non relevant attack and EVA data does not contain a corresponding vulnerability because the network is not vulnerable to the non relevant attack. For example, an IDS issues an alert in response to an attack that exploits a well-known vulnerability of a Windows operating system, but the system that the IDS is monitoring runs a Linux operating system. This is a desired case and such alerts are eliminated.
• Case 3: The IDS does not produce an alert (false negative) on a relevant attack and EVA data has not registered a vulnerability corresponding to the relevant attack. This case is very difficult to address because neither the IDS nor EVA data has any knowledge of the relevant attack. However, it is minimised by frequently updating both the IDS and EVA data.
• Case 4: The IDS issues an alert on a relevant attack but EVA data has not registered a vulnerability corresponding to the relevant attack. This is a worst case scenario and such alerts are eliminated. It is minimised by frequently updating EVA data.
3.2. Stage 2
3.2.1. Alert classification
A fuzzy logic based system reasons about data by using a collection of fuzzy membership functions and rules. Fuzzy logic is applied to draw better and clearer conclusions from imprecise information. Fuzzy logic differs from classical logic in that it does not require a deep understanding of the system, exact equations or precise numeric values. With fuzzy logic, it is possible to express qualitative knowledge using phrases such as very low, low, medium, high and very high, and these phrases can be mapped to exact numeric ranges. Determining the interestingness of alerts and then classifying them is a typical problem that can be handled by fuzzy logic. We chose fuzzy logic because it is conceptually easy to understand (being based on natural language), easy to use and flexible.
Fuzzy inference is the process of formulating the mapping from a given input to an output; the mapping then provides a basis from which decisions can be made. The process of fuzzy inference involves the following 5 steps: (1) fuzzification of the input variables, (2) application of fuzzy operators such as AND to the antecedent, (3) implication from the antecedent to the consequents, (4) aggregation of the consequents across the rules, and (5) defuzzification.
Our classifier has 6 sub classifiers tailored to handle alerts from 6 different groups of attacks: the DoS, Telnet, FTP, MySql, Sql and Undefined attack groups. For example, the DoS sub classifier handles all alerts reporting DoS attacks, and so do the FTP, Telnet, MySql and Sql sub classifiers for their respective groups, while the Undefined attack sub classifier handles alerts reporting new attacks that were not defined during the design of the alert classification component.
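As a concrete illustration of these five steps, the following is a minimal sketch of the inference for a single validated alert. It assumes Gaussian membership functions (as used in this work), the minimum as the AND operator and a weighted-average defuzzification; the membership centres, widths, class centroids and the pair of rules shown are illustrative assumptions rather than the tuned values and full rule base of our classifier.

// Minimal sketch of the fuzzy inference steps (fuzzify -> AND -> implication ->
// aggregation -> defuzzify). Centres, widths and class centroids are illustrative.
public class FuzzyAlertClassifierSketch {

    static double gaussian(double x, double centre, double sigma) {
        double d = (x - centre) / sigma;
        return Math.exp(-0.5 * d * d);          // degree of membership in [0, 1]
    }

    // "High" membership for each input metric, using its documented range.
    static double relevanceHigh(double r)  { return gaussian(r, 9.0, 3.0); }   // range 0-9
    static double severityHigh(double s)   { return gaussian(s, 1.0, 0.8); }   // 1 = most severe
    static double frequencyHigh(double f)  { return gaussian(f, 1.0, 0.4); }   // range 0-1
    static double confidenceHigh(double c) { return gaussian(c, 6.0, 2.0); }   // range 0-6

    public static void main(String[] args) {
        // Input produced by the DoS sub verifier: DoS(9, 1, 0, 5).
        double relevance = 9, severity = 1, frequency = 0, confidence = 5;

        // Rule firing strengths: AND is taken as the minimum of the memberships.
        // Rule 1 of Table 5: all four inputs High -> Class 1 (illustrative centroid 10).
        double r1 = Math.min(Math.min(relevanceHigh(relevance), severityHigh(severity)),
                             Math.min(frequencyHigh(frequency), confidenceHigh(confidence)));
        // Rule 3 of Table 5: Frequency Low, the rest High -> Class 3 (illustrative centroid 8).
        double r3 = Math.min(Math.min(relevanceHigh(relevance), severityHigh(severity)),
                             Math.min(1.0 - frequencyHigh(frequency), confidenceHigh(confidence)));

        // Aggregation and weighted-average defuzzification over the fired rules.
        double interestingness = (r1 * 10.0 + r3 * 8.0) / (r1 + r3);
        System.out.printf("Interestingness score: %.2f%n", interestingness);
    }
}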
The results from the alert verifier (alerts tagged with the four metrics) are used as input to the fuzzy logic inference engine in order to determine the interestingness of alerts, i.e. we consider the four alert metrics of an alert as the input metrics. The input metrics are fuzzified to determine the degree to which they belong to each of the appropriate fuzzy sets via membership functions. Membership functions are curves that define how each point in the input space is mapped to a degree of membership between 0 and 1. The membership functions of all input metrics are defined in this step. In summary, the membership functions of each input metric are: Relevance—Low or High; Severity—Low, Medium or High; Frequency—Low or High; and Source confidence—Low or High. In this work, we used a Gaussian distribution to define the membership functions.
We used domain experts to define a set of IF–THEN rules as illustrated in Table 5. In total, we established 11 classes that represent different levels of alert interestingness for every group of attacks. These rules represent all possible and desired cases in order to improve the efficiency of the inference engine; rules that were contradictory or meaningless were eliminated. Each rule consists of two parts: the antecedent or premise (between IF and THEN) and the consequent (following THEN). The antecedent is associated with the input while the consequent is associated with the output. The inputs are combined logically with operators such as AND to produce output values for all expected inputs. The general form of the classification rule that takes alert metrics as input and a class as output is illustrated below:
Rule: IF feature A is high AND feature B is high AND feature C is high AND feature D is high THEN Class = class 1.
From Table 5, we can extract a fuzzy rule as follows:
R1: IF Relevance is High AND Severity is High AND Frequency is High AND Source Confidence is High THEN class = Class 1.
In the above rule, relevance, severity, frequency and source confidence are the input variables while class is the output variable. In brief, we used fuzzy logic to determine the interestingness of the different alerts contained in the 11 classes for each group of attacks.
As illustrated in Fig. 4, the fuzzy inference system takes the input values from the four metrics. The input values take the form DoS(9,1,0,5), where DoS represents the group of attack, 9 the relevance score, 1 the severity, 0 the frequency and 5 the source confidence. As previously illustrated in Table 1, the ranges of the metrics are: relevance score 0–9 (least is 0 and highest is 9), severity 1–3 (1 is most severe and 3 least severe), frequency 0–1 (least is 0 and highest is 1) and source confidence 0–6 (least is 0 and highest is 6). The input metrics are fuzzified using the defined membership functions in order to establish their degrees of membership. The inference engine uses the set of rules to process the input metrics. All the outputs are combined into a single fuzzy set, which is then defuzzified to give a value that represents the interestingness of the alert. The alert is then put into the right class.
Unlike other classification schemes that label alerts as either true positive or false positive, our classifier determines the interestingness of alerts using several sub classifiers. In addition, most of the existing classification schemes require a lot of
training, human expertise and experience. We apply fuzzy based reasoning to determine the interestingness of validated alerts based on their metric values, which brings flexibility and ease when determining the interestingness of alerts. In addition, we use 6 sub classifiers tailored to classify alerts from specific groups of attacks. Therefore, our approach is able to improve the performance of the alert classification process.
3.2.2. Classifying ideal interesting alerts and partial interesting alerts
Alerts contained in the above classes are further sorted and classified into two super classes based on their interestingness score. The score ranges from 0 to 10. The ideal interesting alert super class has an interestingness score of 7 and above while the partial interesting alert super class has a score below 7. Alerts are retained in their respective classes and forwarded to the correlation component.
3.3. Stage 3
3.3.1. Alert correlation component
In this sub section, we discuss the details of the alert correlation component. As previously noted, the validated alerts of the existing vulnerability based approaches may contain a huge number of redundant alerts that need to be reduced. In recent years, the trend of multi-step attacks has been on the rise, leading to unmanageable levels of redundant and isolated alerts. A single attack such as a port scan is enough to generate a huge number of redundant alerts. Analysing each redundant alert after alert verification may not be practically possible, especially in large networks, because the analysts are likely to take a longer time to understand the complete security incident. Consequently, the analysts would not only encounter difficulties when making the correct decision but would also take a longer time to respond to the intrusions.
As discussed in the previous sub section, the alert classification component puts alerts into alert classes based on the alert metrics. The alerts in these classes may be isolated and redundant. The goal of the alert correlation component is to reduce the huge number of redundant and isolated alerts and establish the logical relationships of the classified alerts. The individual redundant alerts representing every step of an attack are correlated in order to obtain a big picture of the attack. The alert correlation component has 6 correlators for the 6 groups of attacks (DoS, FTP, Telnet, MySql, Sql and Undefined attacks). Each alert correlator has 11 sub correlators to handle the classified alerts from the 11 different classes in each group of attacks, as illustrated in Fig. 5. Each class forwards alerts to the corresponding sub correlator. For example, alerts in class 1 (reporting DoS attacks) are forwarded to sub correlator 1 of the DoS correlator.
The procedure of correlating alerts can be illustrated as follows. The sub correlator uses the IP address(es) of a classified alert to identify the potential meta alerts. To identify the best meta alert representing the alert under consideration, the sub correlator measures the similarity between the alert and the existing meta alerts. If a corresponding meta alert exists (both meta alert and alert have a perfect match in terms of IP, port and time), its details are updated, i.e. the number of alerts in the meta alert is incremented, the meta alert update time is updated and the meta alert non update time is reset to 0. Only one meta alert can be chosen from the potential meta alerts. However, if a corresponding meta alert does not exist, a new meta alert is created.
Details of the alert under consideration form the new meta alert, i.e. the number of alerts is set to 1 and the timestamp of the alert becomes the meta alert create time. To determine how long an existing meta alert remains active, we use the rules in steps 6 and 7 (illustrated later in this sub section). Meta alerts are later forwarded to the analyst for action. The sub correlator uses the exploit cycle time to determine the extent of an attack. It generates meta alerts regarding a particular
Fig. 4. Fuzzy inference process.
Fig. 5. DoS correlation schema.
attack within an exploit cycle time. Generally, a multi-step attack is likely to generate several meta alerts in a given exploit cycle time while a single step attack is likely to generate one meta alert within that time. However, one meta alert could also represent a multi-step attack, especially if the intruder already knows which network resources to exploit. Before going into the details of how alerts are correlated, we define some variables of each meta alert M as follows:
• Meta alert attack class (M.Class)—indicates the specific alert class where the alert is drawn from.
• Meta alert create time (M.createtime)—defines the create time of the meta alert. It is derived from the stamptime of the first alert that formed the meta alert.
• Number of alerts (M.NbreAlerts)—defines the number of alerts contained in the meta alert. Each time a new alert is fused to a meta alert, this number is incremented.
• Meta alert update time (M.updatetime)—defines the time of the last update in the meta alert. This variable represents the time of the most recent alert fused to the meta alert.
• Meta alert non update time (M.non-updatetime)—counts the time since the last update made on the meta alert. This time is reset to 0 each time a new alert is fused to the meta alert.
• Meta alert stop time (M.stoptime)—defines the time when the meta alert is assumed to have completely fused the needed alerts.
• Meta alert time out (M.Timeout)—measures the time that a meta alert should continue waiting for additional alert(s). It varies with the class of attack.
• Exploit Cycle Time (ECtime)—measures the time that an exploit is believed to take (the attack period). It varies with the class of attack.
• Additional data include the source address (M.IPsrc and M.Portsrc) and the destination address (M.IPdest and M.Portdest).
It should be noted that the M.Timeout parameter ensures that the necessary alerts are fused to a given meta alert, while ECtime ensures that all the meta alerts of a given attack are generated accordingly. In our approach, the values set for M.Timeout and ECtime vary with the class of attack; therefore, each sub correlator has its own M.Timeout and ECtime. The M.Timeout and ECtime of a given meta alert are not necessarily equal. M.Timeout could be lower than ECtime for the same meta alert because most events forming the different steps of an attack are generated in the early stage of an exploit cycle [47]; there is therefore no need to wait for the complete exploit cycle timeframe of an intrusion. Moreover, in environments prone to both single step and multi-step attacks, it would be inappropriate for single step attacks to have a longer M.Timeout, as this could lead to unreasonable delay of meta alerts. However, the ECtime of a given attack may expire before all steps of the corresponding attack scenario are executed. This is not a major disadvantage as such; the information collected on the meta alert(s) at that time shows the analyst what the intruder has performed and obtained, and hence the analyst can predict how the intruder will behave in the future. In addition, the values of M.Timeout and ECtime of a given attack can be adjusted to optimise the detection of all attack steps.
The procedure of reducing the redundant and isolated alerts is further illustrated below:
Input: Classified alerts from the alert classification component.
Output: Meta alerts.
Step 1: The correlator component decides which sub correlator handles a given alert based on the class of the validated alert. For example, the DoS correlator handling an alert in class 1 (reporting a DoS attack) forwards the alert to its sub correlator 1.
Step 2: The sub correlator uses the IP addresses (Alert.IPsrc and Alert.IPdest) of the classified alert to extract the potential meta alerts, i.e. the meta alerts whose M.IPsrc matches Alert.IPsrc.
Step 3: The sub correlator measures the similarity between the alert and the potential meta alerts:
• IP similarity—compare (Alert.IPsrc, M.IPsrc) and (Alert.IPdest, M.IPdest). If the IP addresses match then IP similarity = 1, else IP similarity = 0.
• Port similarity—compare (Alert.Portsrc, M.Portsrc) and (Alert.Portdest, M.Portdest). If the ports match then Port similarity = 1, else Port similarity = 0.
• Time similarity—if Alert.Time ≥ M.createtime and M.non-updatetime < M.Timeout then Time similarity = 1 (time is within close range), else Time similarity = 0 (time is not within close range).
The sub correlator chooses the meta alert with a perfect match with the alert (only one meta alert can be chosen).
Step 4: If a meta alert exists that fully corresponds to the current alert under consideration (in step 3), then the sub correlator updates the details of the existing meta alert to reflect the details of the current alert, e.g. M.NbreAlerts is incremented, M.non-updatetime is reset to 0 and M.updatetime is updated.
Step 5: If no meta alert exists that corresponds to the current alert under consideration, then the sub correlator creates a new meta alert (the details of the current alert form the new meta alert), i.e.
M.NbreAlerts is set to 1, M.updatetime is updated, M.non-updatetime is reset to 0, the timestamp of the alert becomes M.createtime , M.Class is updated. The new meta alert starts waiting for additional related alerts.
Step 6: If, for the existing meta alert(s), M.non-updatetime < M.Timeout, then the meta alert(s) in the sub correlator continue waiting for additional related alerts. When related alerts are received, the meta alert in the sub correlator updates its contents.
Step 7: If, for the existing meta alert(s), M.non-updatetime ≥ M.Timeout, then M.stoptime is updated and the meta alert stops waiting for additional alerts.
Step 8: Meta alerts in the sub correlator are forwarded to the analyst, who takes the adequate decision based on the forwarded information.
Note: Depending on the nature of the attack, alerts can be correlated based on the source address, the destination address or the attack class. We have used the addresses to illustrate this concept in steps 2 and 3 (a code sketch of these fusion rules follows the attack dimension patterns in Section 3.3.2). Our solution has the following distinctive characteristics:
• Our alert correlator is fed with classified alerts that have already been validated. As discussed earlier, the alert verification process helps to filter out any alert without a corresponding vulnerability. This means that the alert correlator processes alerts of high quality and thus gives better results, unlike other correlation methods such as [31,34,36,37], which are fed with unverified alerts. As noted earlier, if a correlation system receives unverified alerts (containing false positives) as input, the quality of the correlation results may degrade significantly, leading to false positive alerts being misinterpreted or given undue attention [17]. In addition, our approach drastically reduces the overhead (such as processing time and storage) associated with unverified and unnecessary alerts.
• Most of the previous correlation methods, such as Valdes and Skinner [31] and Lee et al. [36], use only one correlation schema to correlate different alerts. Our correlator is designed with respect to the alert classes of the different groups of attacks. An alert class contains alerts with similar metrics that are generated by one group of attacks. Our correlation component has 6 correlators, each with eleven sub correlators corresponding to the 11 alert classes, i.e. each alert class has a corresponding sub correlator that is adjusted dynamically and automatically. Handling a specific class of alerts with a dedicated sub correlator simplifies the process of alert correlation.
• Our approach is able to detect a wide range of attacks in their early stages, including single and multi-step attacks, and to deliver realistic results. Our approach delivers meta alerts that represent the attack dimensions, as illustrated in the next sub section. Moreover, our solution is able to detect new attacks using the undefined attack sub correlator.
• Finally, our solution is able to prioritise alerts accordingly. The alert correlator component separates alerts according to their degree of interestingness, as discussed in the following sub sections.
3.3.2. Attack dimensions
We use three general patterns of meta alerts to represent attack dimensions. Alerts are grouped according to source address, target address and attack id:
• 1:N meta alert: This meta alert is based on source address. It groups alerts in response to attacks that originate from a single source against single or multiple destinations in the network; • N:1 meta alert: This meta alert is based on target address. It groups alerts in response to attacks that originate from multiple sources against a single host in the network; • N:M meta alert: This meta alert is based on attack id (of class of attack). It groups alerts in response to attacks that originate from multiple sources against multiple hosts in the network.
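To make the fusion rules of Section 3.3.1 concrete, the following is a minimal sketch of a sub correlator handling two redundant alerts. The MetaAlert fields mirror the variables defined above (M.NbreAlerts, M.createtime, M.updatetime, M.Timeout); the matching key (source and destination IP only), the 90 s timeout and the sample alerts are illustrative assumptions, not the actual implementation.

import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the meta alert fusion rules (steps 2-7 of Section 3.3.1).
class MetaAlert {
    String ipSrc, ipDest, attackClass;
    int nbreAlerts;
    long createTime, updateTime, stopTime = -1;
    long timeout;                    // M.Timeout, varies with the class of attack

    MetaAlert(String ipSrc, String ipDest, String attackClass, long now, long timeout) {
        this.ipSrc = ipSrc; this.ipDest = ipDest; this.attackClass = attackClass;
        this.nbreAlerts = 1;         // step 5: the first alert forms the meta alert
        this.createTime = now; this.updateTime = now; this.timeout = timeout;
    }

    boolean active(long now)      { return stopTime < 0 && (now - updateTime) < timeout; }     // step 6
    boolean matches(String src, String dst, long now) {
        return active(now) && ipSrc.equals(src) && ipDest.equals(dst);                          // step 3 (simplified)
    }
    void fuse(long now)           { nbreAlerts++; updateTime = now; }                           // step 4
    void closeIfExpired(long now) { if (stopTime < 0 && (now - updateTime) >= timeout) stopTime = now; } // step 7
}

public class SubCorrelatorSketch {
    public static void main(String[] args) {
        List<MetaAlert> metaAlerts = new ArrayList<>();
        long timeoutMs = 90_000;     // illustrative M.Timeout within the 1-2 min range used later

        // Two redundant DoS alerts from the same source/destination, 10 s apart (illustrative).
        String[][] alerts = { {"10.0.0.7", "192.168.1.3", "DoS"}, {"10.0.0.7", "192.168.1.3", "DoS"} };
        long[] times = { 0, 10_000 };

        for (int i = 0; i < alerts.length; i++) {
            long now = times[i];
            MetaAlert match = null;
            for (MetaAlert m : metaAlerts) {
                m.closeIfExpired(now);
                if (match == null && m.matches(alerts[i][0], alerts[i][1], now)) match = m;
            }
            if (match != null) match.fuse(now);   // step 4: update the existing meta alert
            else metaAlerts.add(new MetaAlert(alerts[i][0], alerts[i][1], alerts[i][2], now, timeoutMs)); // step 5
        }
        System.out.println("Meta alerts: " + metaAlerts.size() + ", alerts fused: " + metaAlerts.get(0).nbreAlerts);
    }
}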
3.3.3. Meta alert prioritisation
Meta alerts are further grouped into 2 priority groups based on their specific alert classes: ideal interesting meta alerts are drawn from classes 1, 2, 3 and 4, which have the highest alert interestingness scores; partial interesting meta alerts are drawn from classes 5, 6, 7, 8, 9, 10 and 11.
3.3.4. Identifying the frequent meta alerts
The correlator identifies the most frequent meta alerts and forwards them to the alert history. The identification process is simple because each meta alert has a feature that defines the number of alerts (M.NbreAlerts) contained in the meta alert. Our approach is flexible because it allows the analysts to set the threshold value that defines the frequent meta alerts.
4. Experiment and discussion
4.1. Overview
The proposed approach takes IDS alerts as input, processes them and delivers the output in the form of meta alerts. It has three major stages, as discussed in Section 3. To validate the proposed solution, a heterogeneous test bed was set up with different operating systems such as Windows 98, Windows Server 2003 and 2000, Windows XP, Redhat 4.2, Fedora 9 and Ubuntu. Different applications such as an FTP server, Telnet, a MySql server and an SQL server were installed. The 6 target machines host the operating systems and applications. We used 4 Snort sensors [48] of different alert source confidence to monitor the network traffic; the sensors generate sets of alerts for comparison purposes. We used 1 server to store the raw alerts, EVA data and the meta alert history. Both the attacking and target machines are equipped with an Intel Core 2 Duo processor (P8400) running at 2.26 GHz with 2 GB of RAM. The attacking machines are loaded with widely known attacking tools such as the Metasploit tool [49] and Nmap [50], among others, in order to generate exploits.
Attacks are divided into two categories: relevant attacks and non relevant attacks. Relevant attacks are capable of exploiting the vulnerabilities of the test bed and consist of two types: (i) attacks that fully exploit the vulnerabilities and (ii) attacks that partially exploit the vulnerabilities. Non relevant attacks cannot exploit any vulnerability in the test bed. The attacks are drawn from 5 groups, namely DoS, FTP, SQL, MySql and Telnet. To generate the required sets of alerts, this experiment used the attacking machines to execute attacks using known techniques and exploits on applications, operating systems, ports and protocols. The attacks were executed using different attack dimensions; some attacks were run repeatedly and for a considerable period of time.
The following are examples of specific scenarios for multi-step attacks based on known techniques and exploits. The first scenario is a multi-step attack that exploits the vulnerabilities of an FTP server in the test bed. The attacking machine uses Nmap to try to detect whether an FTP service is running on the target machines (FTP attack). This attack originates from one source to multiple destinations. The second scenario is a multi-step attack that involves attacking machines in order to interrupt the TCP service running on the target machines. This attack is emulated with the aid of an ICMP flooder, which continuously transfers large packets to the target of the attack, hence disrupting the TCP service (DoS attack). This attack originates from many sources to multiple destinations.
The list of other specific attacks includes: DoS attacks (Teardrop, Land, Winnuke, Ping of death and Syndrop); FTP attacks (Finger redirect, Freeftpd username overflow and FTP format string); Telnet attacks (Telnet username buffer overflow
and Resolve host conf); SQL attacks (SQL server buffer overflow and SQL injection); and MySql attacks (SSL hello message overflow). About 80% of the attacks were non relevant, meaning they could not exploit any vulnerability of the test bed, and about 20% were relevant, meaning they were able to exploit the vulnerabilities of the test bed. We designed a program in Java to implement the various components, such as the Alert-Meta alert history correlator, the Alert-EVA data verifier and the alert correlator. The program is used together with the WEKA tools (FURIA) [51] to classify alerts.
4.2. EVA data
EVA data is a comprehensive database implemented on the MySql platform. It represents the threat profile of the test bed. EVA data lists network specific vulnerabilities by reference id, name, severity, IP address, port, protocol, class, time and application, as discussed in Section 3. To populate EVA data, we used various up-to-date vulnerability scanners such as GFI Languard [52] and Nessus [53]. They capture and report threats and vulnerabilities (in ports, applications etc.) of the test bed. They use different techniques to detect vulnerabilities and are set to run periodically to look for new vulnerabilities. We used Perl scripts to process the reports from the scanners. Because vulnerabilities are dynamic in nature, special agents (in the form of Perl scripts) are installed in the hosts to track changes in applications, services and configuration and then update EVA data.
EVA data and the IDS product (in this case Snort) may use different attack details, such as reference ids and names, when referring to the same attack. This could have an impact when computing alert scores. In fact, the standardisation of attack details is a global issue affecting alert management; there are no unique attack details, such as reference ids, to reference all types of attacks. Standardised schemes such as CVE have been developed to uniquely reference and directly map attack information to vulnerabilities, and IDS products such as Snort refer their signatures to the CVE standard. However, not all attacks have been assigned CVE ids as reference ids. For example, only 33% of the attack signatures in Snort 2003 have CVE ids. To address this, we examined the attack details (such as reference ids and names) of the vulnerabilities and attacks contained in EVA data and Snort. We determined the coverage overlap of standardised attack details between EVA data and Snort and then included additional attack details based on Snort into EVA data. That is, EVA data uses Snort attack details to complement CVE when referring to attacks. We limited the coverage of this exercise to the attacks used in the test bed. This solution helps to ensure the completeness of attack details in EVA data because, if attack details are missing, the computation of the alert score is affected negatively.
4.3. Alert sensor data
The alert sensor data keeps the profile (such as sensorId, sensor score and confidence value) of all Snort sensors in the test bed. It is implemented on MySql. We carried out experiments with the goal of establishing the confidence values of the sensors. We used attack and attack free data in controlled and uncontrolled environments to observe how sensors fared in terms of consistency of alert output, accuracy and detection rates, number of alerts, ratio of interesting alerts to non interesting alerts and false alert rates.
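As an illustration of how a processed scanner finding could be written into EVA data, the following is a minimal JDBC sketch. The database name, table name, column names, connection credentials and the sample record are illustrative assumptions: the paper lists the stored fields but not the actual schema, and the real population is done by Perl scripts.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Minimal sketch of pushing one processed scanner finding into EVA data (MySQL).
// Requires a MySQL JDBC driver on the classpath; all names below are assumed.
public class EvaDataInsertSketch {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:mysql://localhost:3306/eva_data";   // assumed database name
        try (Connection con = DriverManager.getConnection(url, "eva_user", "secret");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO vulnerabilities " +
                 "(reference_id, name, severity, ip, port, protocol, attack_class, detect_time, application) " +
                 "VALUES (?, ?, ?, ?, ?, ?, ?, NOW(), ?)")) {
            ps.setString(1, "REF-0001");               // placeholder; in practice a CVE id or Snort reference
            ps.setString(2, "FTP server buffer overflow");
            ps.setInt(3, 1);                           // 1 = most severe
            ps.setString(4, "192.168.1.10");
            ps.setInt(5, 21);
            ps.setString(6, "tcp");
            ps.setString(7, "FTP");
            ps.setString(8, "ftp-server");
            ps.executeUpdate();                        // one vulnerability row added to EVA data
        }
    }
}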
In order to obtain accurate confidence values, we considered aspects such as the current versions of signatures, the time when the last update occurred, alert output, accuracy and detection rates, the degree of threats and risks, previous experience of the analysts, the type of traffic being handled and the environment being monitored, among other aspects. These aspects explicitly demonstrate the confidence of sensors. We used the following policy, which takes into account the aforementioned aspects:
Step 1: If the signatures are frequently updated then assign value = 1, else assign value = 0.
Step 2: If the location is not risky then assign value = 1, else assign value = 0.
Step 3: If the number of alerts is consistent then assign value = 1, else assign value = 0.
Step 4: If the ratio of interesting alerts to non interesting alerts is consistent then assign value = 1, else assign value = 0.
Step 5: If the opinion of the analyst on reliability is high then assign value = 1, else assign value = 0.
Step 6: If the false alert rate is below 90% then assign value = 1, else assign value = 0.
Step 7: Compute the alert source confidence score:
Sensor score = Signatures value + Location value + Consistency of number of alerts value + Consistency of ratio of interesting alerts to non interesting alerts value + Opinion of analyst on reliability value + False alert rate value.
Note:
• For simplicity, we use 1's and 0's to represent the values of the above steps.
• Some parameters (such as the false alert rate) are based on observations made by other researchers (refer to Section 3.1.5).
We performed a number of verifications using different scores in order to search for the best thresholds that separate sensors with high confidence from those with low confidence. However, the conditions of the parameters and the threshold score can be adjusted to reflect the actual confidence of sensors depending on the level of tuning, the type of traffic and the network environment in general.
4.4. Performance of the system
The time taken to generate EVA data was influenced by factors such as the number of applications installed in each host, the type and number of scanners and the number of hosts to be scanned in the network. On average it took 2–6 min to scan one host. After building EVA data, the scanners were configured properly to look for new vulnerabilities in order to minimise unnecessary overhead. It is important to note that EVA data may require some adjustments, especially when the network changes and when vulnerabilities become obsolete. Even though this is done by agents, the time taken to do so plays a secondary role. To ensure EVA data and the meta alert history were effective, we verified and confirmed that the two were adequately representing the vulnerabilities and meta alerts accordingly.
We generated a total of 5506 alerts, as shown in Table 6. Each alert has features such as the sensor identifier (Id), reference id, source address (IP and port), destination address (IP and port), priority, protocol, timestamp and triggered signature. The alerts were collected and pre-processed in terms of extracting the important features using Perl scripts, as discussed in Section 3. The pre-processed alerts were fed as input to stage 1. First, the alerts were correlated with the meta alert history. The successful alerts were forwarded to the alert correlator. The remaining alerts (suspicious alerts) were forwarded to the alert verification component for validation and computation of alert metrics. The transformed alerts were forwarded to the alert classification component in stage 2. Alerts were classified according to their metrics and later forwarded to the alert correlation component in stage 3. The following parameters were set for the alert correlation component: M.Timeout = 1–2 min and ECtime = 3–15 min
depending on the class of attack. Table 6 shows the alerts generated by Snort according to the different groups of attacks.
4.4.1. Stage 1—Alert verification and alert history
To evaluate the performance of stage 1, we employed two parameters: detection rate and precision. The detection rate represents the proportion of attacks detected by the IDS among all relevant attacks that are generated. Precision represents the percentage of relevant attacks detected versus all cases detected as attacks by the IDS. We used the following abbreviations: True Positive (TP) represents the attacks that are correctly detected; False Positive (FP) represents the cases that are incorrectly detected as attacks; False Negative (FN) represents the attacks that are not detected. The equations are listed below:
Detection Rate = TP / (TP + FN)
Precision = TP / (TP + FP).
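As a worked example of these two measures, the following snippet reproduces the DoS figures reported in Table 6 (TP = 101, FP = 604, FN = 6); small differences with respect to the tabulated values are due to rounding.

// Worked example of the stage 1 evaluation measures using the DoS column of Table 6.
public class Stage1Metrics {
    public static void main(String[] args) {
        int tp = 101, fp = 604, fn = 6;
        double detectionRate = 100.0 * tp / (tp + fn);   // 101 / 107 ≈ 94.4% (reported as 94.3%)
        double precision     = 100.0 * tp / (tp + fp);   // 101 / 705 ≈ 14.3%
        System.out.printf("Detection rate = %.1f%%, Precision = %.1f%%%n", detectionRate, precision);
    }
}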
To establish how much improvement is achieved in stage 1, we computed the above parameters before and after the alerts were processed in stage 1. Table 6 shows the detection rates and precision of the raw alerts. The tabulated results indicate that most of the alerts are generated due to the ineffectiveness of IDS sensors. Generally, the Snort sensors detected most of the relevant attacks but did not perform well on non relevant attacks, hence generating high numbers of alerts. The precision of Snort alerts is between 12% and 17%, as reflected in the same table. Table 7 indicates the precision and detection rates after the alerts were processed in stage 1. Our system is able to improve the precision of Snort alerts by at least 80% while maintaining the detection rates of Snort.
The meta alert history significantly improved the performance of the proposed system. It reduces the alert load to be processed by the alert classification component by at least 19%. Table 8 shows that at least 19% of the input alerts were successfully correlated using the Alert-Meta alert history correlator and were forwarded to the alert correlation component. This means that our alert management system does not need to compute alert metrics and classify alerts that are already reflected in the meta alert history, hence saving resources such as memory, CPU and time. This is one of the reasons why alerts are processed within a very short time. Table 8 also indicates that most of the raw alerts were considered suspicious and forwarded to the alert verifier to be validated.
We tested the usefulness of the four alert metrics to see their impact on alert management. Each metric was employed to evaluate its impact on system performance. We noted that all four metrics had a positive influence when managing the huge volumes of alerts. We observed that sensors with high alert source confidence were likely to give reliable alert output, unlike sensors with low source confidence placed in the same environment. However, when the sensors are placed in different environments, the results may differ. To address this, we reviewed the threshold values of the alert metrics to ensure the metrics were truly representative in order to deliver optimal results.
The alert verifier successfully tagged the alerts with the correct alert metrics. It is interesting to note that the alert verifier eliminated alerts that failed to show any similarity with EVA data; such alerts were considered the most obvious non interesting alerts. In order to avoid cases where important alerts are eliminated simply because they are not reflected in EVA data, we used the policy (a guide on how to deal with alerts in the four cases) illustrated in Section 3.1.5. However, the policy is not fully effective and requires some modifications in order to address the cases involving unknown vulnerabilities and attacks, such as ''IDS issues an alert for
Table 6
Precision and detection rate of raw alerts (Snort).
Alerts/group                          DoS     Telnet   FTP     MySql   Sql
Attacks                               689     943      1387    1190    1168
Relevant attacks                      107     171      247     172     155
Non relevant attacks                  582     772      1140    1018    1013
Snort alerts collected (unverified)   705     973      1408    1217    1203
TP                                    101     165      242     169     149
FP                                    604     808      1166    1048    1054
FN                                    6       6        5       3       6
Precision (%)                         14.3    16.9     17.1    13.8    12.3
Detection rate (%)                    94.3    96.4     97.9    98.2    96.1
Table 7
Precision and detection rate of validated alerts (after stage 1).
Alerts/group                            DoS     Telnet   FTP     Mysql   Sql
Snort alerts collected (unverified)     705     973      1408    1217    1203
Alerts retained                         101     165      242     169     149
TP                                      96      160      236     164     142
FP                                      5       5        6       5       7
FN                                      11      11       11      8       13
Precision (%)                           95.0    96.9     97.5    97.0    95.3
Detection rate (%)                      89.7    93.5     95.5    95.3    91.6
Alerts eliminated by the verifier (%)   85.6    83       82.8    86.1    87.6
Table 8
Performance result—Stage 1.
Percentage/Group                                                                         DoS    Telnet   FTP    Mysql   Sql
Percentage of successful alerts correlated by Alert-Meta alert history correlator (%)   19.8   20.2     20.3   19.2    21.1
Percentage of transformed alerts (using Alert-EVA data verifier) (%)                    80.2   79.8     79.7   80.8    78.9
relevant attack but the EVA data has not registered a corresponding vulnerability for this attack''. Table 7 indicates that at least 82% of the input alerts were eliminated in stage 1. Therefore, the alert load of the alert classification component is significantly reduced.
4.4.2. Stage 2—Alert classification
The fuzzy based alert classification component classifies alerts according to the alert metrics. We implemented all aspects (such as the rule base) of the alert classifier in WEKA (FURIA). The classifier was trained by replaying the test data repeatedly to produce the desired results with high accuracy, and the classification rules were modified accordingly. It should be noted that this component is trained to handle alerts with alert metrics corresponding to all sets of attacks used in the test bed. Each class contains alerts reporting one group of attacks. We observed that the ideal interesting alerts form a small fraction compared to the partial interesting alerts. We also noted that the alert classification component successfully classified alerts according to the alert metrics. To demonstrate the importance of the proposed system, we compared the classification results produced manually (manual classification) with the results from the system and noted that manual classification not only took a longer time but also led to numerous cases where alerts were misclassified, ignored or misinterpreted. The result in Fig. 6 indicates that ideal interesting alerts represent at least 8% of the classified alerts while the rest are partial interesting alerts. Our classifier successfully separated the ideal interesting alert classes from the partial interesting alert classes.
Fig. 6. Ideal interesting alerts vs. partial interesting alerts.
4.4.3. Stage 3—Alert correlation
Alerts are classified before they are correlated in order to improve their quality. The alert classification component puts alerts into several alert classes based on the alert metrics. The classified alerts are usually redundant, isolated and unrelated and hence need to be correlated in order to get rid of the redundant alerts as well as establish their logical relationships. To further reduce the number of redundant alerts, the alert correlation component correlates the alerts in each class separately using the 11 dedicated sub correlators for every group of attacks. Alerts are correlated and presented to the analysts in the form of meta alerts. The analysts are able to see what kind of attack (1:N, N:1, M:N) is represented by the correlated alerts, as depicted in Tables 9 and 10.
To evaluate the effectiveness of the alert correlation component, we employed the alert reduction rate. It represents the percentage of alerts filtered by the system. We used the following abbreviations: N represents the total number of alerts and Nf represents the number of alerts reduced (filtered) by the system. The equation is listed below:
Reduction Rate = Nf / N.
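As a worked example, the following snippet reproduces the reduction rate of the DoS row of Table 9, interpreting Nf as the number of alerts absorbed into meta alerts (16 ideal interesting alerts fused into 3 meta alerts), which is consistent with the rates reported in that table.

// Worked example of the reduction rate for the DoS row of Table 9.
public class ReductionRate {
    public static void main(String[] args) {
        int totalAlerts = 16;                      // N: ideal interesting DoS alerts
        int metaAlerts = 3;                        // one 1:N, one N:1 and one M:N meta alert
        int filtered = totalAlerts - metaAlerts;   // Nf = 13
        double reductionRate = 100.0 * filtered / totalAlerts;
        System.out.printf("Reduction rate = %.2f%%%n", reductionRate);   // 81.25% (reported as 81.2%)
    }
}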
From the correlation results shown in Tables 9 and 10, the alert correlation component reduces the redundant alerts contained in the alert classes by at least 80% for the ideal interesting
Table 9
Meta alerts based on alert classes for ideal interesting alerts.
Group    Ideal interesting alerts   1:N meta alerts   N:1 meta alerts   M:N meta alerts   Reduction rate (%)
DoS      16                         1                 1                 1                 81.2
Telnet   17                         1                 1                 1                 82.3
FTP      30                         2                 2                 1                 83.3
Mysql    19                         1                 1                 1                 84.2
Sql      10                         1                 1                 0                 80.0
Table 10
Meta alerts based on alert classes for partial interesting alerts.
Group    Partial interesting alerts   1:N meta alerts   N:1 meta alerts   M:N meta alerts   Reduction rate (%)
DoS      85                           7                 4                 4                 82.3
Telnet   148                          9                 6                 5                 86.4
FTP      212                          15                8                 10                84.4
Mysql    150                          9                 6                 7                 85.3
Sql      139                          11                5                 7                 83.4
redundant alerts, while the partial interesting redundant alerts are reduced by at least 82%. It is important to note that these rates do not take into account the 4680 non interesting alerts that were eliminated during alert verification in stage 1. Therefore, the alert correlation component successfully reduces the redundant alerts of different attacks, as illustrated in Tables 9 and 10. The alert correlation component is able to put each alert into the correct sub correlator, and we also note that it runs the suitable correlation process for each sub correlator. When evaluating the meta alerts, we noted that long attacks generate more than one meta alert while shorter attacks generate only one meta alert. However, the alert correlation component may not be able to completely reveal attack patterns, especially when the real attack period is longer than the set exploit cycle time. This is explained by the fact that our approach only considers meta alerts within the set ECtime value. We also noted that huge volumes of alerts for multi-step attacks were generated by the IDS as soon as the attacks were launched, and this lasted for a short time. To illustrate this, we refer to the specific scenario of a multi-step attack that involves exploiting the vulnerabilities of an FTP server. In this scenario, we observed that the IDS sensors generated numerous alerts during the first three minutes as compared to the rest of the test period.
4.4.3.1. Meta alert prioritisation. Meta alerts are prioritised into two categories. As seen in Fig. 7, at least 3% of the meta alerts are Ideal Interesting Meta Alerts and the rest are Partial Interesting Meta Alerts. Ideal Interesting Meta Alerts are accorded the first priority due to their nature (higher alert interestingness score), while Partial Interesting Meta Alerts are considered the second important group of alerts. We manually reviewed the prioritised meta alerts and confirmed that the alert correlator component successfully prioritised the meta alerts. It is important to note that when our alert management system is initialised, the meta alert history is empty and the incoming alerts are not correlated using the Alert-Meta alert history correlator but instead are treated as suspicious alerts and forwarded to the Alert-EVA data verifier in order to verify their validity and compute alert metrics. To demonstrate the importance of the proposed system, we compared the results produced manually (manual prioritisation) with the results from the system and noted that manual prioritisation not only took a longer time but also led to numerous cases where alerts were wrongly prioritised, ignored or misinterpreted.
Fig. 7. Prioritised meta alert for simulated data.
4.5. Real world data
To further validate the proposed approach, we used a real world dataset generated from one of the computing centres in our university. The computing centre provides services for local users, such as backup applications, network disks and collaboration software, and connects to the rest of the university's intranets and the Internet. We constructed EVA data containing all the vulnerabilities in the computing centre. It took a relatively longer time to build EVA data compared to the test bed environment because of the size of the network and the number of hosts involved. We used 4 Snort sensors of different alert source confidence to monitor network traffic for a period of two weeks, placing the sensors in different segments of the computing centre's network. This experiment allowed us to see how the proposed approach processes alerts generated by a wide range of attacks, unlike the test bed, where we experienced a limited number of attacks. The sensors were configured to forward the raw alerts to a centralised database located in the alert server. We recorded 285,647 alerts in week 1 and 295,487 alerts in week 2. We used the alert server to keep EVA data, the meta alert history and the alert sensor data. Using a section of the alert output, we trained the system until it produced the desired output. Our system processed the alerts and delivered the output in the form of meta alerts.
From the experiment results, we noted that the real world data and the simulated data revealed very similar patterns of results. For example, at least 19% of the alerts from the real world data were successfully correlated using the meta alert history. At least 75% of the alerts were considered the most obvious non interesting alerts and hence were eliminated during alert verification. We also noted that the patterns of classified, correlated and prioritised alerts in the real world data were similar to those of the simulated data, as depicted in Fig. 8. We observed that the interesting alerts were the result of attacks directed at specific outdated versions of software such as web browsers and other applications. The non interesting
Table 11
Maximum processing time.
Alert stage                                                             Processing time for simulated data (s)    Processing time for real world data (s)
Stage 1: Correlating alerts using Alert-Meta alert history correlator   0.012                                     0.014
Verifying alerts using Alert-EVA data verifier                          0.033                                     0.035
Stage 2: Alert classification                                           0.004                                     0.005
Stage 3: Meta alert correlation                                         0.192                                     0.199
Total time (s)                                                          0.241 (via Alert-EVA data verifier) and 0.204 (successful alert correlated via Alert-Meta alert history correlator)    0.253 (via Alert-EVA data verifier) and 0.213 (successful alert correlated via Alert-Meta alert history correlator)
Fig. 8. Prioritised meta alerts for real world data.
alerts, such as the FULL XMAS scan, were generated as a result of the sensors interpreting the traffic of a faulty router as a scan; in fact, the faulty router was causing various unnatural packets to be sent. Other alert types with the highest counts were ICMP redirect host and http_inspect BARE BYTE UNICODE ENCODING. The precision recorded was not as high as that of the test bed. The test bed recorded higher precision because we included the attack details of the attacks used in the test bed into EVA data to serve a complementary role; as a result, there was a positive impact when computing alert scores during the alert verification process. This was done to demonstrate how enriching the vulnerability data can further assist in delivering quality results. The experiment also revealed that the alert reduction rate greatly depends on the location of the sensors. In fact, the sensor located in the intranet triggered huge numbers of non interesting alerts due to network management systems such as the antivirus applications present in the computing centre, hence its reduction rate was high, while the sensor located to monitor the Internet traffic triggered alerts that represented real attacks.
4.6. Processing time
To ensure the proposed approach is efficient, we recorded the maximum time taken to process alerts, as shown in Table 11. In the simulated data, an alert that is successfully correlated with the meta alert history takes 0.012 s, while a suspicious alert takes 0.045 s (inclusive of the time taken when the alert is correlated with the meta alert history). An alert takes 0.004 s in the alert classification component (stage 2) and 0.192 s in the alert correlation component (stage 3). Thus, an alert that is successfully correlated with the meta alert history can be processed completely in stages 1 and 3 within 0.204 s, while a suspicious alert takes 0.241 s in stages 1, 2 and 3. In the real world data, an alert that is correlated with the meta alert history takes 0.014 s while a suspicious alert takes 0.049 s. An alert that is successfully correlated with the meta alert history can be processed completely in stages 1 and 3 within 0.213 s, while a suspicious alert takes 0.253 s in stages 1, 2 and 3. The processing time for the real world data is almost equivalent to that of the simulated data; the small difference is explained by the difference in the number of alerts and the size of EVA data and the meta alert history involved. Our approach does not introduce noticeable delays and the analysts are able to react early enough to different attacks. The speed can be further improved by implementing better searching algorithms and better representations of meta alerts (in the meta alert history) and vulnerabilities (in EVA data).
4.7. Comparison with existing methods
The proposed approach offers superior performance. It improves the quality of alerts and reduces the redundant and isolated alerts generated by a wide range of attacks. The proposed solution helps the analysts when evaluating and making the right decision about the final alerts. In terms of configuration, this approach is very efficient and easy to use. It can be applied to other signature based IDSs without reconfiguring their default signatures; only EVA data has to be updated with the attack reference details of those IDS products. However, our approach experienced some challenges that affected its performance. Generally, it is difficult to measure the confidence of IDS sensors because a sensor may have different confidence values in different network environments. The experiment revealed that sensors with high confidence produced reliable alerts, unlike sensors with low confidence placed in the same environment. It is important to review the parameters (such as the false alert rate) and the sensor score thresholds in order to fit different environments. Other shortcomings that need to be addressed in the future are described in the conclusion section.
The proposed approach is compared with Hubballi's approach [2] as shown in Table 12. Both approaches use vulnerability data to validate alerts in order to improve their quality. Our approach is complementary to Ref. [2] in that it not only improves the quality of alerts but also reduces unnecessary alerts; in fact, reducing redundant and isolated alerts (unnecessary alerts) is the major focus of our work. The unnecessary alerts in this case refer to redundant ideal interesting and partial interesting alerts. We computed the comparison rates (such as reduction rates, precision and detection rates) using the functions discussed earlier in this section. The proposed approach has the best reduction rates in terms of false positives, redundant ideal interesting alerts and redundant partial interesting alerts. This is because our system has 11 dedicated sub correlators for each group of attacks that are well configured to reduce the redundant and isolated alerts. The precision and detection rates of the proposed approach were slightly lower than those of Ref. [2] because we used sensors of different versions (2.0, 2.4, 2.6, 2.9). In fact, the proposed system works well in environments using different versions of IDS sensors. This is explained by the fact that we included the attack details of the attacks used in the test bed into EVA data to serve a complementary role when alerts are being verified. Therefore, the proposed approach is reliable because sensors of different source confidence have little impact on its precision. Our work has other additional features: alerts are prioritised according to their interestingness and the final alerts indicate attack dimensions, hence assisting the analysts to manage alerts more effectively.
Table 12
Comparison with Ref. [2].
Parameter                                                    Hubballi et al. [2]                          Our approach
Technique                                                    Vulnerability based false alert reduction    Vulnerability based alert management
Initial alerts (input)                                       1756 (from test bed data)                    5506 (from test bed data)
Reduction rate of false positives (%)                        78.1                                         85.02
Reduction rate of redundant ideal interesting alerts (%)     –                                            82.2
Reduction rate of redundant partial interesting alerts (%)   –                                            84.36
Precision (%)                                                97                                           96.34
Detection rate (%)                                           95.4                                         93.12
5. Conclusion
This paper proposes a comprehensive approach to address the shortcomings of vulnerability based alert management approaches. We note that the act of validating alerts alone may not guarantee final alerts of high quality; for example, the validated alerts may contain a massive number of redundant and isolated alerts. The analysts who review the validated alerts are therefore likely to take a longer time to understand the complete security incident because it would involve evaluating each redundant alert. We have proposed a fast and efficient approach that improves the quality of alerts as well as reduces the volumes of redundant alerts generated by signature based IDSs. In summary, our approach has several components that are presented in three stages: Stage 1 involves alert pre-processing, correlation of alerts against the meta alert history and verification of alerts against EVA data; Stage 2 involves classification of alerts based on their alert metrics; and Stage 3 involves correlation of alerts in order to reduce the redundant and isolated alerts as well as discover the causal relationships among alerts. We conducted experiments to demonstrate the effectiveness of the proposed approach by processing alerts from five different classes of attacks. We recorded impressive results in alert reduction rates as well as precision rates while closely maintaining the detection rate of the IDS.
As part of our future work, we plan to address the following shortcomings. The way IDS products and vulnerabilities reference each other does not allow efficient correlation of alerts and their corresponding vulnerabilities. We designed a mapping facility that maps attack reference details into EVA data, but it is not as flexible and scalable as we wanted; however, it is a first step towards realising our desired goal of an automated facility that maps attack details into EVA data. In addition, we have tried to address the issue of unknown attacks and vulnerabilities in our study, but this concept needs to be explored further. A better mechanism to explore relationships among meta alerts, in order to gain a deeper understanding of the different classes of attacks, also needs to be studied. Finally, we plan to use the meta alert history to modify and tune the IDS signatures in order to further improve the future quality of alerts.
Acknowledgments
This work is partially supported by Hunan University. We are grateful to the anonymous reviewers for their thoughtful comments and suggestions to improve this paper.
References
[1] C. Thomas, N. Balakrishnan, Improvement in intrusion detection with advances in sensor fusion, IEEE Transactions on Information Forensics and Security 4 (3) (2009) 542–550. [2] N. Hubballi, S. Biswas, S. Nandi, Network specific false alarm reduction in intrusion detection system, Security and Communication Networks 4 (11) (2011) 1339–1349. John Wiley and Sons. [3] T. Pietraszek, Using adaptive alert classification to reduce false positives in intrusion detection, RAID'04, in: The Proceedings of the Symposium on Recent Advances in Intrusion Detection, vol. 3324, Springer, France, 2004, pp. 102–124.
References

[1] C. Thomas, N. Balakrishnan, Improvement in intrusion detection with advances in sensor fusion, IEEE Transactions on Information Forensics and Security 4 (3) (2009) 542–550.
[2] N. Hubballi, S. Biswas, S. Nandi, Network specific false alarm reduction in intrusion detection system, Security and Communication Networks 4 (11) (2011) 1339–1349.
[3] T. Pietraszek, Using adaptive alert classification to reduce false positives in intrusion detection, in: The Proceedings of the Symposium on Recent Advances in Intrusion Detection, RAID 2004, vol. 3324, Springer, France, 2004, pp. 102–124.
[4] X. Liu, D. Xiao, X. Peng, Towards a collaborative and systematic approach to alert verification, Journal of Software 3 (3) (2008).
[5] I.G.D. Ukon, S.Y.M. Certh, Visual analytic representation of large datasets for enhancing network security, in: Seventh Framework Programme, 2011.
[6] Y. Lai, P. Hsia, Using the vulnerability information of computer systems to improve the network security, Computer Communications 30 (2007) 2032–2047.
[7] M. Colajanni, D. Gozzi, M. Marchetti, Selective alerts for the run-time protection of distributed systems, in: The Proceedings of the Ninth International Conference on Data Mining, Protection, Detection and Other Security Technologies, DATAMINING 2008, Spain, May, 2008.
[8] J.H. Huh, J. Lyle, C. Namiluko, A. Martin, Managing application whitelists in trusted distributed systems, Future Generation Computer Systems 27 (2011) 211–226.
[9] B. Morin, L. Mé, H. Debar, M. Ducassé, A logic-based model to support alert correlation in intrusion detection, Information Fusion 10 (2009) 285–299.
[10] D.J. Chaboya, R.A. Raines, R.O. Baldwin, B.E. Mullins, Network intrusion detection: automated and manual methods prone to attack and evasion, IEEE Security and Privacy 4 (6) (2006) 36–43.
[11] C.V. Zhou, C. Leckie, S. Karunasekera, A survey of coordinated attacks and collaborative intrusion detection, Journal of Computer Security 29 (2010) 124–140.
[12] R. Gula, Correlating IDS alerts with vulnerability information, Tenable Network Security, Revision 4, Technical Report, 2011.
[13] C. Kruegel, W.K. Robertson, Alert verification: determining the success of intrusion attempts, in: The Proceedings of the Detection of Intrusions and Malware and Vulnerability Assessment, DIMVA 2004, Germany, July, 2004, pp. 25–38.
[14] G. Eschelbeck, M. Krieger, Eliminating noise from intrusion detection systems, Information Security Technical Report 8 (4) (2003) 26–33.
[15] P.A. Porras, M.W. Fong, A. Valdes, A mission impact based approach to INFOSEC alarm correlation, in: The Fifth International Symposium on Recent Advances in Intrusion Detection, RAID 2002, in: Lecture Notes in Computer Science, vol. 2516, Springer, Switzerland, 2002, pp. 95–114.
[16] B. Morin, L. Mé, H. Debar, M. Ducassé, M2D2: a formal data model for IDS alert correlation, in: Proceedings of the 5th International Symposium on Recent Advances in Intrusion Detection, RAID, Springer-Verlag, Berlin, Heidelberg, 2002, pp. 115–137.
[17] F. Valeur, G. Vigna, C. Kruegel, R.A. Kemmerer, A comprehensive approach to intrusion detection alert correlation, IEEE Transactions on Dependable and Secure Computing 1 (3) (2004) 146–169.
[18] J. Yu, Y.V.R. Reddy, S. Selliah, S. Reddy, V. Bharadwaj, S. Kankanahalli, TRINETR: an architecture for collaborative intrusion detection and knowledge-based alert evaluation, Advanced Engineering Informatics 19 (2005) 93–101.
[19] T. Chyssler, S. Burschka, M. Semling, T. Lingvall, K. Burbeck, Alarm reduction and correlation in intrusion detection systems, in: Proceedings of the International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises, 2004, pp. 229–234.
[20] M. Xiao, D. Xiao, Alert verification based on attack classification in collaborative intrusion detection, in: Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, Qingdao, China, July 30–August 1, 2007.
[21] D. Bolzoni, B. Crispo, S. Etalle, ATLANTIDES: an architecture for alert verification in network intrusion detection systems, 2007.
[22] M. Colajanni, D. Gozzi, M. Marchetti, Selective and early threat detection in large networked systems, in: The Proceedings of the Tenth IEEE International Conference on Computer and Information Technology, CIT 2010, Bradford, UK, June 2010.
[23] F. Massicotte, M. Couture, L. Briand, Y. Labiche, Context based intrusion detection using Snort, Nessus and Bugtraq databases, in: Proceedings of the Annual Conference on Privacy, Security and Trust, 2005, pp. 1–12.
[24] S. Neelakantan, S. Rao, A threat-aware signature based intrusion detection approach for obtaining network specific useful alarms, in: The Third International Conference on Internet Monitoring and Protection, Romania, June 29–July 5, 2008, pp. 80–85.
[25] N.A. Bakar, B. Belaton, Towards implementing intrusion detection alert quality framework, in: The Proceedings of the First International Conference on Distributed Framework for Multimedia Applications, USA, 2005, pp. 196–205.
[26] M. Chandrasekaran, M. Baig, S. Upadhyaya, AVARE: aggregated vulnerability assessment and response against zero-day exploits, in: The Proceedings of the 25th IEEE International Performance Computing and Communication Conference, IPCCC, USA, 10–12 April, 2006, pp. 603–610.
[27] K. Alsubhi, E. Al-Shaer, R. Boutaba, FuzMet: a fuzzy-logic based alert prioritization engine for intrusion detection systems, International Journal of Network Management (2011), Wiley.
[28] H.W. Njogu, L. Jiawei, Using alert cluster to reduce IDS alerts, in: The Third IEEE International Conference on Computer Science and Information Technology, China, 9–11 July, 2010, pp. 467–471.
[29] S.O. Al-Mamory, H. Zhang, A survey on IDS alerts processing techniques, in: 6th WSEAS International Conference on Information Security and Privacy, Tenerife, Spain, December 14–16, 2007, pp. 69–77.
[30] K. Julisch, Clustering intrusion detection alerts to support root cause analysis, ACM Transactions on Information and System Security 6 (4) (2003) 443–471.
[31] A. Valdes, K. Skinner, Probabilistic alert correlation, in: The Fourth International Symposium on Recent Advances in Intrusion Detection, RAID, in: Lecture Notes in Computer Science, vol. 3224, Springer, UK, 2001, pp. 54–68.
[32] H. Debar, A. Wespi, Aggregation and correlation of intrusion-detection alerts, in: Proceedings of the International Symposium on Recent Advances in Intrusion Detection, 2001, pp. 85–103.
[33] S.O. Al-Mamory, H. Zhang, Intrusion detection alerts reduction using root cause analysis and clustering, Computer Communications 32 (2) (2009) 419–430.
[34] M. Sourour, B. Adel, A. Tarek, Network security alerts management architecture for signature-based intrusion detection systems with a NAT environment, Journal of Network System Management (2010), Springer.
[35] N. Jan, S. Lin, S. Tseng, N. Lin, A decision support system for constructing an alert classification model, Expert Systems with Applications 36 (2009) 11145–11155.
[36] S. Lee, B. Chung, H. Kim, Y. Lee, C. Park, H. Yoon, Real-time analysis of intrusion detection alerts via correlation, Journal of Computer Security 25 (2006) 169–183.
[37] R. Perdisci, G. Giacinto, F. Roli, Alert clustering for intrusion detection systems in computer networks, Engineering Applications of Artificial Intelligence 19 (2006) 429–438.
[38] R. Vaarandi, Real-time classification of IDS alerts with data mining techniques, in: The Proceedings of the IEEE MILCOM Conference, vol. 7, 2009, pp. 1786–1792.
[39] A. Alharby, H. Imai, IDS false alarm reduction using continuous and discontinuous patterns, in: The Proceedings of ACNS, vol. 3531, 2005, pp. 192–205.
[40] Common Vulnerabilities and Exposures. http://www.cve.mitre.org/about.
[41] Bugtraq. http://www.securityfocus.com.
[42] A. Herrero, M. Navarro, E. Corchado, V. Julian, RT-MOVICAB-IDS: addressing real-time intrusion detection, Future Generation Computer Systems 29 (1) (2013) 250–261.
[43] B. Chang, D. Kim, H. Kim, J. Na, T. Chung, Active security management based on secure zone cooperation, Future Generation Computer Systems 20 (2004) 283–293.
[44] K. Alsubhi, E. Al-Shaer, R. Boutaba, Alert prioritization in intrusion detection systems, in: The 11th IEEE/IFIP Network Operations and Management Symposium, NOMS 2008, April 2008, pp. 33–40.
[45] H. El-Taj, O. Abouddalla, A. Manasrah, False positive alerts reduction by correlating the intrusion detection system alerts: investigation study, Journal of Communication and Computers 7 (3) (2010).
[46] G.C. Tjhai, S.M. Furnell, M. Papadaki, N.L. Clarke, A preliminary two-stage alarm correlation and filtering system using SOM neural network and K-means algorithm, Journal of Computer Security 29 (2010) 712–723.
[47] H.K. Browne, W.A. Arbaugh, J. McHugh, W.L. Fithen, A trend analysis of exploitations, in: The Proceedings of the 2001 IEEE Symposium on Security and Privacy, May 2001, pp. 214–229.
[48] Snort. http://www.snort.org.
[49] Metasploit. http://www.metasploit.com.
[50] Nmap. http://www.nmap.org.
[51] WEKA. http://weka.sourceforge.net/doc.packages/fuzzyUnorderedRuleInduction/.
[52] GFI LanGuard. www.gfi.com.
[53] Nessus. www.nessus.org.
Humphrey Waita Njogu is currently a Ph.D. student at the College of Information Science and Engineering, Hunan University, Changsha, China, and is expected to graduate in July 2012. He received his Master of Engineering degree in Computer Science from Hunan University in 2009, and his B.Sc. in Information Science from Moi University, Kenya. He has worked in several organisations in Kenya and China in the fields of computer networks, security and database management. His research interests include most aspects of computer security, with an emphasis on intrusion detection and vulnerability analysis. He has authored many articles in leading international journals.
Luo Jiawei is a full professor and vice dean at the College of Information Science and Engineering, Hunan University, Changsha, China. She holds Ph.D., M.Sc. and B.Sc. degrees in Computer Science. Her research interests include data mining, network security and bioinformatics. She has vast experience in implementing national projects on bioinformatics. She has authored many research articles in leading international journals.
Jane Nduta Kiere is currently a Computer Science student at the College of Information Science and Engineering, Hunan University, Changsha, China, and is expected to graduate in July 2012. She has worked in several organisations in Kenya and China in the fields of information technology and media. Her research interests include network security, data mining, digital broadcasting and digital media.
Damien Hanyurwimfura is currently a Ph.D. student at the College of Information Science and Engineering, Hunan University, Changsha, China, and a lecturer at the Kigali Institute of Science and Technology (KIST), Rwanda. He received his Master of Engineering degree in Computer Science and Technology from Hunan University in 2010. His research interests include wireless networks, security and data mining.