computers & security 26 (2007) 213–218
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/cose
Modeling and analyzing the spread of active worms based on P2P systems
Tao Li a,*, Zhihong Guan a, Xianyong Wu a,b
a Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P R China
b Department of Electronics and Information, Yangtze University, Jingzhou, Hubei 434023, P R China
article info
Article history:
Received 2 November 2005
Accepted 4 October 2006
Keywords:
Security
Modeling
P2P systems
Active worms

abstract
Active worms spread in an automated fashion and can flood the Internet in a very short time. Hit-list scanning is a technique for accelerating the initial spread of a worm. Due to the recent surge of Peer-to-Peer (P2P) systems with large numbers of users, P2P systems can also be a potential vehicle for active worms to achieve fast propagation in the Internet. When hit-list scanning is used on top of a P2P system, some new characteristics emerge. In this paper, we define an L system and an O system. Based on a model of the spread of active worms, we analyze how the spread differs between the L system and the O system. The results can help in designing and controlling P2P systems effectively, as well as in defending against the propagation of worms. © 2006 Elsevier Ltd. All rights reserved.
1. Introduction
In this paper, we analyze and evaluate the worm propagation performance of different systems based on P2P systems. A system that includes P2P systems is defined as an L system; otherwise, it is defined as an O system.

Active worms have been a persistent security threat to the Internet since the Morris worm arose in 1988. The Code Red and Nimda worms infected hundreds of thousands of systems and cost both the public and private sectors millions of dollars (Russell and Machie, 2001; Machie et al., 2001). To speed up the spread of active worms, Weaver presented the 'hit-list' idea (Staniford et al., 2002). Long before releasing the worm, the attacker gathers a list of potentially vulnerable machines with good network connections; the machines on this list are referred to as the 'hit-list'. After the worm has been released onto an initial machine on this list, it begins scanning down the list, so it first infects the machines on the list. Once the list has been exhausted, the worm starts infecting other vulnerable machines. After the worm has rapidly infected the hit-list, it uses these infected machines as 'stepping stones' to search for other vulnerable machines. An extension of the hit-list technique creates a flash worm, which appears capable of infecting the vulnerable population in tens of seconds: so fast that no human-mediated counter-response is possible (Staniford et al., 2002).

Due to the recent surge of many popular P2P systems with a large number of users (Slyck news), P2P systems can be a potential vehicle for an active worm attacker to achieve fast propagation. The propagation of active worms in the Internet enables an attacker to control thousands of hosts and to use them for launching distributed denial of service attacks, accessing confidential information, and destroying valuable data.

In this paper, we analyze the impact of Peer-to-Peer (P2P) systems on active worm propagation in the Internet. The goal of our work is to develop a mathematics-based method that can be used to better understand and compare the behavior of active worms in L systems and O systems. We believe that the results of our work can provide important guidance for P2P system design and control to address
* Corresponding author. E-mail address: [email protected] (T. Li).
0167-4048/$ – see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.cose.2006.10.003
the concerns of active worm propagation. The rest of this paper is organized as follows. We introduce the parameters for characterizing their propagation and present the model in Section 2. In Section 3, numerical analysis results and discussions are given. Conclusions of this paper are discussed in Section 4.
2. Modeling the spread of active worms

We assume that the system IP address space is the IPv4 address space, of size $2^{32}$. In the IPv4 address space, some valid IP addresses are not actually utilized, are non-routable, or are not even applicable to hosts (based on a previous statistical result (Zeitoun and Jamin, 2003), only 24% of available addresses are used by active hosts). To simplify our analysis, we assume that the total IP addresses are divided among a number of units and that each unit has the same number of IP addresses. Within the same unit, a worm-infected host can scan the target IP addresses without being blocked. A worm scan that crosses units is blocked with a certain probability by the security devices deployed at the unit network edge.

When the worm is detected, people will try to slow it down or stop it. A patch, which repairs the security hole of the machines, is used to defend against worms. When an infected or vulnerable machine is patched, it becomes an invulnerable machine. In this paper, we do not consider the time taken for an infected host to find the vulnerability of its victims, and we assume that infecting one victim takes one unit of time. At the initial time, we assume that there is a certain number of infected hosts in either system and, for the L system, that the infected hosts are already in the P2P system. Table 1 lists all parameters and notations in this paper.

Table 1 – Parameters and notations in this paper

| Parameters | Notations |
| --- | --- |
| $T$ | The total IP addresses in the system |
| $P_0$ | Probability of the IP address being utilized by the host |
| $U$ | Number of units in the system; each unit has $T/U$ IP addresses |
| $P_b$ | Probability of a scan being blocked by unit edge security devices |
| $M$ | The number of vulnerable hosts in the system |
| $P_1$ | Probability of the vulnerable host joining the P2P system |
| $P_2$ | Probability of the non-vulnerable host joining the P2P system |
| $R$ | Size of a P2P system ($R = MP_1 + (TP_0 - M)P_2$) |
| $S$ | Scan rate of a worm-infected host |
| $h$ | Size of hit-list (the number of infected machines at the beginning of the spread of active worms) |
| $N(i)$ | The number of infected hosts at time $i$ ($N(0)$ is the number of initially infected hosts in the system, $N(0) = h$) |
| $M(i)$ | The number of vulnerable hosts at time $i$ ($M(0)$ is the number of vulnerable hosts which can be infected at the system initial time) |
| $E(i)$ | The number of newly infected hosts added at step $i$ ($E(0) = 0$) |
| $d$ | The rate at which an infection is detected on a machine and eliminated without patching |
| $p$ | The rate at which an infected or vulnerable machine becomes invulnerable |

In this paper, we focus on comparing random scanning with the hit-list scanning approach based on a P2P system. In random scanning, worm-infected hosts have no prior knowledge of the vulnerability or the active/inactive state of other hosts. The worm host randomly selects the IP addresses of victim targets from the global IP address space and launches the worm attack. When a new host is infected, it continues attacking the system by the same method. Hit-list scanning is a technique for accelerating the initial spread of a worm. When the technique is used on top of a P2P system, some new characteristics appear. We assume that worms can simultaneously scan many machines and will not re-infect a machine that is already infected. Before the active worms spread ($i = 0$), $M(0) = M$ and $N(0) = h$.

Theorem 1. In an L system, if there are $M(i)$ vulnerable machines (including the infected ones) and $N(i)$ infected computers, then on average the next time tick will have the values given below. We assume $h < MP_1$. Here $MP_1(i)$ and $M(1-P_1)(i)$ denote the numbers of vulnerable hosts inside and outside the P2P system at time $i$ (written $M_1(i)$ and $M_2(i)$ in the proof).

If $\sum_{j=0}^{i} SN(j) \le MP_1 + (TP_0 - M)P_2$,

$$E(i+1) = \bigl(MP_1(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{SN(i)}\right]$$

$$N(i+1) = (1 - d - p)N(i) + \left[(1-p)^i MP_1 - N(i)\right]\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{SN(i)}\right]$$

If $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$,

$$E(i+1) = \bigl(M(1-P_1)(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}\right)^{SN(i)}\right]$$

$$N(i+1) = (1 - d - p)N(i) + \left[(1-p)^i M(1-P_1) - N(i)\right]\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}\right)^{SN(i)}\right]$$

and $MP_1(i > L) = 0$, $M(1-P_1)(i \le L) = 0$, where $L = \min(i)$ for $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$. This theorem refers to Theorem 3 in Yu et al. (2004).
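The two-phase recurrence of Theorem 1 can be iterated directly. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function name `simulate_l_system`, the value of `M`, and the clamp that keeps the uninfected pool non-negative are all illustrative choices that do not come from the paper.

```python
def simulate_l_system(T, P0, U, Pb, M, P1, P2, S, h, d, p, steps):
    """Iterate the Theorem 1 recurrence and return [N(0), ..., N(steps)]."""
    R = M * P1 + (T * P0 - M) * P2                      # P2P system size (Table 1)
    q_p2p = 1.0 / R                                     # per-scan hit chance inside the P2P system
    q_rest = (1 + (U - 1) * (1 - Pb)) / (U * (T - R))   # per-scan hit chance in the rest of the system
    infected = [float(h)]
    total_scans = 0.0
    for i in range(steps):
        n = infected[-1]
        total_scans += S * n                            # running sum of S*N(j) for j = 0..i
        if total_scans <= R:                            # first case: scan the P2P system
            pool, q = (1 - p) ** i * M * P1, q_p2p
        else:                                           # second case: random-scan the rest
            pool, q = (1 - p) ** i * M * (1 - P1), q_rest
        # Assumption (not in the paper): clamp the uninfected pool at zero
        # so the newly infected count never becomes negative.
        newly = max(pool - n, 0.0) * (1.0 - (1.0 - q) ** (S * n))
        infected.append((1 - d - p) * n + newly)
    return infected


if __name__ == "__main__":
    # M = 100_000 is a hypothetical value; the paper does not report M.
    curve = simulate_l_system(T=2_500_000, P0=0.2, U=1, Pb=1.0, M=100_000,
                              P1=0.01, P2=0.01, S=2, h=1,
                              d=0.00002, p=0.000002, steps=200)
    print(round(curve[-1], 1))
```

Because the cumulative scan count never decreases, the switch from the first case to the second happens exactly once, which mirrors the role of $L$ in the theorem.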
Proof. To prove the above theorem for the L system, we divide the worm scan into two steps: (1) attack the P2P system; (2) attack the rest of the system with random scanning.

Step 1: There are $M_1(0) = MP_1$ vulnerable hosts in the P2P system, and the total P2P size is $R = MP_1 + (TP_0 - M)P_2$. At time tick $i$ there are $M_1(i) - N(i)$ vulnerable hosts that have not been infected. Let $E(i)$ denote the number of newly infected machines at time tick $i$ ($i \ge 0$). The $N(i)$ infected machines can generate $SN(i)$ scans in an attempt to infect other machines. So if we can prove

$$E(i+1 \mid k) = \bigl(M_1(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{k}\right]$$

for any $k$ ($k > 0$) scans, then the equation also holds when $k = SN(i)$. We prove the above equation by induction on $k$.

When $k = 1$, since there are $M_1(i) - N(i)$ vulnerable machines that have not yet been infected, the probability that one scan adds a newly infected machine is $(M_1(i) - N(i))/R$, which is equivalent to $(M_1(i) - N(i))\left[1 - (1 - 1/R)^{1}\right]$.

Suppose that the claim is true for $k = j$, that is, $E(i+1 \mid k = j) = (M_1(i) - N(i))\left[1 - (1 - 1/R)^{j}\right]$. Then, when $k = j + 1$, we divide the $j + 1$ scans into two parts: the first $j$ scans and the last scan. There are two possibilities for the last scan: it adds a newly infected machine or it does not. Let $Y = 1$ if the last scan hits a vulnerable machine that has not yet been infected, and $Y = 0$ otherwise. Then,

$$
\begin{aligned}
E(i+1 \mid k = j+1) &= \bigl(E(i+1 \mid k = j) + 1\bigr)P(Y = 1) + E(i+1 \mid k = j)\,P(Y = 0) \\
&= \bigl(E(i+1 \mid k = j) + 1\bigr)\frac{M_1(i) - N(i) - E(i+1 \mid k = j)}{R} + E(i+1 \mid k = j)\left(1 - \frac{M_1(i) - N(i) - E(i+1 \mid k = j)}{R}\right) \\
&= \frac{M_1(i) - N(i)}{R} + \left(1 - \frac{1}{R}\right)E(i+1 \mid k = j) \\
&= \bigl(M_1(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1}{R}\right)^{j+1}\right],
\end{aligned}
$$

which means that the claim is also true for $k = j + 1$. Therefore, when $k = SN(i)$, we have the result: if $\sum_{j=0}^{i} SN(j) \le MP_1 + (TP_0 - M)P_2$,

$$E(i+1) = \bigl(M_1(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{SN(i)}\right].$$

That is, on the next time tick there will be this many expected newly infected machines. Given death rate $d$ and patching rate $p$, on the next time tick $dN(i) + pN(i)$ infected machines will change to either vulnerable machines without being infected or invulnerable machines, and the total number of vulnerable machines (including the infected ones) will be reduced to $(1-p)M_1(i)$. Therefore, on the next time tick the total number of infected machines will be

$$N(i+1) = N(i) + \bigl(M_1(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{SN(i)}\right] - (d + p)N(i).$$

At the same time, $M_1(i+1) = (1-p)M_1(i)$, which gives $M_1(i) = (1-p)^i M_1(0) = (1-p)^i MP_1$. That is,

$$N(i+1) = (1 - d - p)N(i) + \left[(1-p)^i MP_1 - N(i)\right]\left[1 - \left(1 - \frac{1}{MP_1 + (TP_0 - M)P_2}\right)^{SN(i)}\right],$$

where $i \ge 0$ and $N(0) = h < MP_1$. The recursion process stops when there are no more vulnerable machines left or when the worm cannot increase the total number of infected machines in the P2P system. When $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$, all the attack resources are applied to attacking the rest of the system: $M_1(i > L) = 0$ and $M_2(i \le L) = 0$, where $L = \min(i)$ for $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$.

Step 2: After attacking the P2P system, all infected hosts go on to attack the rest of the system. As the total P2P size is $MP_1 + (TP_0 - M)P_2$, the number of IP addresses that have not been attacked is $T - (MP_1 + (TP_0 - M)P_2)$. There are $M_2(i) - N(i)$ vulnerable machines that have not been infected at time tick $i$, with $M_2(0) = M(1 - P_1)$. The probability that one scan adds a newly infected machine is

$$\bigl(M_2(i) - N(i)\bigr)\frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}.$$

Similarly, if $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$, we have

$$E(i+1) = \bigl(M_2(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}\right)^{SN(i)}\right],$$

and, given death rate $d$ and patching rate $p$,

$$N(i+1) = N(i) + \bigl(M_2(i) - N(i)\bigr)\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}\right)^{SN(i)}\right] - (d + p)N(i).$$

At the same time, $M_2(i+1) = (1-p)M_2(i)$, which gives $M_2(i) = (1-p)^i M_2(0) = (1-p)^i M(1-P_1)$. That is,

$$N(i+1) = (1 - d - p)N(i) + \left[(1-p)^i M(1-P_1) - N(i)\right]\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{U\bigl(T - (MP_1 + (TP_0 - M)P_2)\bigr)}\right)^{SN(i)}\right],$$

where $MP_1(i > L) = 0$, $M(1-P_1)(i \le L) = 0$, and $L = \min(i)$ for $\sum_{j=0}^{i} SN(j) \ge MP_1 + (TP_0 - M)P_2$. The recursion process stops when there are no more vulnerable machines left or when the worm cannot increase the total number of infected machines in the rest of the system. □

Theorem 2. In an O system, if there are $M(i)$ vulnerable machines (including the infected ones) and $N(i)$ infected computers, then on average the number of infected machines at the next time tick will be:

$$N(i+1) = (1 - d - p)N(i) + \left[(1-p)^i MP_1 - N(i)\right]\left[1 - \left(1 - \frac{1 + (U-1)(1-P_b)}{TU}\right)^{SN(i)}\right]$$

Proof. We can easily prove the result by using similar approaches. □
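The closed form established by induction in the proof of Theorem 1 can be sanity-checked numerically. The sketch below is illustrative only: it treats each scan as a uniform draw over `R` addresses, of which the first `V` stand for the vulnerable, not-yet-infected hosts, and compares the average number of distinct vulnerable addresses hit in `k` scans with $V[1 - (1 - 1/R)^k]$. The function names and the small parameter values are assumptions made for the example.

```python
import random


def expected_new_infections(V, R, k):
    """Closed form from the induction: V * [1 - (1 - 1/R)^k]."""
    return V * (1.0 - (1.0 - 1.0 / R) ** k)


def simulate_new_infections(V, R, k, trials, seed=0):
    """Monte Carlo estimate of the mean number of distinct vulnerable hosts hit."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hit = set()
        for _ in range(k):
            addr = rng.randrange(R)      # one uniform scan over the R addresses
            if addr < V:                 # addresses 0..V-1 play the vulnerable hosts
                hit.add(addr)            # a host hit twice is only infected once
        total += len(hit)
    return total / trials


if __name__ == "__main__":
    V, R, k = 10, 50, 30                 # small hypothetical values for a quick check
    print(expected_new_infections(V, R, k))
    print(simulate_new_infections(V, R, k, trials=20_000))
```

The two printed values should agree closely, since each vulnerable address is missed by all `k` scans with probability $(1 - 1/R)^k$.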
3. Numerical results and discussions
In this section, we evaluate the numerical performance by using models with different parameters for the different
systems. We report the performance results along with observations.
3.1. Models
1. Performance metrics: the performance of the two kinds of systems is measured as the number of infected hosts (Y axis) against the time taken t (X axis).
2. Evaluation systems: a system is defined by a tuple ⟨A, T, P0, U, Pb, P1, P2, S, h, d, p⟩ representing the system configuration parameters. A determines whether the system is an L system or an O system. The other parameters have the same definitions as in Table 1. The following parameters are set to constant values (T = 2,500,000, P0 = 0.2, S = 2) in all our simulations.
3. Evaluation method: we use numerical analysis to obtain performance data.
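The configuration tuple above maps naturally onto a small record type. The following Python sketch is one possible encoding, not something taken from the paper: the class name `SystemConfig`, the field name `kind` for the entry A, and the extra field `M` (the number of vulnerable hosts, which the tuple does not list but which the P2P size formula in Table 1 needs) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemConfig:
    kind: str            # the entry A of the tuple: "L" or "O"
    U: int               # number of units
    Pb: float            # probability that a cross-unit scan is blocked
    P1: float            # probability that a vulnerable host joins the P2P system
    P2: float            # probability that a non-vulnerable host joins the P2P system
    h: int               # hit-list size, N(0)
    d: float             # death rate
    p: float             # patching rate
    M: int               # hypothetical: vulnerable hosts, not part of the paper's tuple
    T: int = 2_500_000   # constant in all of the paper's simulations
    P0: float = 0.2      # constant in all of the paper's simulations
    S: int = 2           # constant in all of the paper's simulations

    def p2p_size(self) -> float:
        """R = M*P1 + (T*P0 - M)*P2, as defined in Table 1."""
        return self.M * self.P1 + (self.T * self.P0 - self.M) * self.P2


if __name__ == "__main__":
    cfg = SystemConfig(kind="L", U=1, Pb=1.0, P1=0.01, P2=0.01,
                       h=1, d=0.00002, p=0.000002, M=100_000)
    print(cfg.p2p_size())
```

Keeping the three constants as defaults means that each experiment only has to state the parameters it actually varies.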
3.2. Performance results
In this section, we report the performance results along with observations.

1. The comparison between L system and O system. The general system is configured as ⟨*, 2,500,000, 0.2, U, Pb, P1, P2, 2, h, 0.00002, p⟩. We assume U = 1 and Pb = 100% in observations (1), (2) and (3); p = 0.000002 in (1), (3), (4) and (5); and h = 1 in (1), (2), (4) and (5). We set P1 = 0.01 and P2 = 0.01 in the L system; the O system does not depend on P1 and P2.
(1) In Fig. 1, the worms spread more quickly in the L system.
(2) In Fig. 2, p ∈ {0.000002, 0.0002}. As the patching rate grows, the spread of active worms slows down more quickly in either system. The L system is more visibly affected by the change in patching rate.
(3) In Fig. 3, h ∈ {1, 20}. As the size of the hit-list increases, it takes the worms less time to spread in either system.
(4) In Fig. 4, U ∈ {3, 10} and Pb = 70%. As U grows, the spread of active worms slows down in either system.
(5) In Fig. 5, Pb ∈ {70%, 90%} and U = 10. As Pb grows, the spread of active worms slows down in either system.

Fig. 1 – Comparison of L system with O system.
Fig. 2 – Effect of patching rate.
Fig. 3 – Effect of hit-list size.
Fig. 4 – Effect of the number of U.

2. The sensitivity to P2P system size in the L system. Fig. 6 shows the sensitivity of attack performance to different P2P system sizes in the L system. The general system is configured as ⟨L, 2,500,000, 0.2, 1, 0, P1, P2, 2, 1, 0.00002, 0.000002⟩. In this figure, (P1, P2) ∈ {(0.01, 0.01), (0.02, 0.02)} and then P2P size R ∈ {1000, 10,000}. The ratio between the number of vulnerable hosts and the number of invulnerable hosts is fixed. We observe that as the P2P size increases, the attack performance becomes consistently better.
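Observations (4) and (5) both act through the factor $(1 + (U-1)(1-P_b))/U$ that appears in the per-scan hit probabilities of Theorems 1 and 2. It simplifies algebraically to $1 - P_b + P_b/U$, which decreases as either $U$ or $P_b$ grows. The short Python sketch below evaluates it for the settings used in Figs. 4 and 5; the function names are illustrative and not from the paper.

```python
def cross_unit_factor(U, Pb):
    """Return (1 + (U - 1) * (1 - Pb)) / U, which equals 1 - Pb + Pb / U."""
    if U < 1 or not 0.0 <= Pb <= 1.0:
        raise ValueError("need U >= 1 and 0 <= Pb <= 1")
    return (1 + (U - 1) * (1 - Pb)) / U


def o_system_hit_probability(T, U, Pb):
    """Per-scan term of Theorem 2: (1 + (U - 1)(1 - Pb)) / (T * U)."""
    return cross_unit_factor(U, Pb) / T


if __name__ == "__main__":
    # Settings from observations (4) and (5): U in {3, 10}, Pb in {70%, 90%}.
    for U, Pb in [(3, 0.7), (10, 0.7), (10, 0.9)]:
        print(U, Pb, round(cross_unit_factor(U, Pb), 4))
```

For these three settings the factor is roughly 0.53, 0.37 and 0.19, so each scan becomes less effective as more units or stronger edge blocking are introduced. With U = 1 the factor is 1, which matches the setting used for observations (1) to (3).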
4. Performance evaluation and conclusion
In this paper we have analyzed the characteristics of the spread of active worms between L system and O system.
Fig. 6 – The attack performance sensitivity to P2P system size.
P2P systems can be a potential vehicle for active worms to achieve fast propagation in the Internet. The worm propagates more efficiently in the L system. Installing worm patches in time can effectively defend against the propagation of worms in either system, especially in the L system during the early stage of infection. Increasing the number of local units, or the probability of a scan being blocked by unit edge security devices, slows down the spread of the worm. Conversely, increasing the size of the hit-list accelerates the initial spread of a worm in either system. With the same ratio of vulnerable hosts to invulnerable hosts in the P2P system, a larger P2P system achieves better worm propagation performance.
Acknowledgment
This work is supported by the National Natural Science Foundation of China under Grant 60573005.
Fig. 5 – Effect of the number of Pb.

references
Machie A, Roculan J, Russell R, Velzen MV. Nimda Worm Analysis. Incident Analysis, SecurityFocus, Technical Report. September 2001.
Russell R, Machie A. Code Red II Worm. Incident Analysis, SecurityFocus, Technical Report. August 2001.
Staniford S, Paxson V, Weaver N. How to own the Internet in your spare time. In: Proceedings of the 11th USENIX Security Symposium (Security'02); 2002.
Slyck news.
Yu W, Boyer C, Xuan D. Analyzing impacts of peer-to-peer systems on propagation of active worm attacks. Technical report. The Department of Computer Science and Engineering, The Ohio State University; 2004.
Zeitoun A, Jamin S. Rapid exploration of Internet live address space using optimal discovery path. In: Proceedings of IEEE GLOBECOM (Next Generation Networks and Internet), San Francisco, CA; December 2003.
Tao Li received the M.S. degree from Yangtze University, Jingzhou, China, in 1997. He is currently working toward the Ph.D. degree in the Department of Control Science and Control Engineering, Huazhong University of Science and Technology, Wuhan, China. His current research interests include network and information security, impulsive control, and nonlinear systems.

Zhihong Guan received the Ph.D. degree from South China University of Technology, Guangdong, China, in 1994. He is currently a professor at Huazhong University of Science and Technology, Wuhan, China. His current research interests include nonlinear complex network systems, analysis and applications of impulsive systems and hybrid systems, control of nonlinear dynamics, chaos synchronization and control, networked control systems, and robotics and applications.

Xianyong Wu received the M.S. degree from Huazhong University of Science and Technology, Wuhan, People's Republic of China, in 1997. He is currently working toward the Ph.D. degree in the Department of Control Science and Control Engineering, Huazhong University of Science and Technology, Wuhan, China. He is also an assistant professor in the College of Electronics and Information, Yangtze University, Jingzhou, China. His current research interests include information hiding, chaotic control and chaotic synchronization, and secure communication.