Design of a sliding window scheme for detecting high packet-rate flows via random packet sampling

Computer Networks 55 (2011) 1351–1363 Contents lists available at ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet ...



Takanori Kudo ⇑, Tetsuya Takine

Department of Information and Communications Technology, Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita 565-0871, Japan

Article info

Article history: Received 19 May 2010. Received in revised form 31 October 2010. Accepted 15 December 2010. Available online 21 December 2010. Responsible Editor: A. Popescu.

Keywords: Random packet sampling; Sliding window scheme; High packet-rate flows

Abstract

We discuss the design of a sliding window scheme for detecting high packet-rate flows via random packet sampling. We determine the values of control parameters, such as the sampling rate and window length, to minimize the false positive ratio, while keeping the false negative ratio sufficiently low and making on-line processing possible. Under mild assumptions, we formulate this problem as a nonlinear program and provide its numerically feasible global optimal solution. We then conduct sampling experiments with public trace data and discuss the fundamental characteristics of the sliding window scheme with random packet sampling. © 2010 Elsevier B.V. All rights reserved.

⇑ Corresponding author. Tel.: +81 6 6879 7742; fax: +81 6 6875 5901. 1389-1286/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2010.12.019

1. Introduction

In recent years, accurate traffic measurement and monitoring have come to be regarded as crucial for network management, traffic engineering, and security tasks. With the rapid growth of link speed, however, it is very hard to capture all packets in backbone traffic. To reduce overhead in traffic measurement (e.g., router CPU, memory, IO, and bandwidth in exporting data/reports) and to keep traffic monitoring scalable, packet sampling is considered to be a promising technique [1] and it is often deployed at routers [2].

In this paper, we consider the on-line detection of high packet-rate flows via random packet sampling. A flow is defined as a set of packets with the same key attributes that do not alter during their transmission through the network. Typical key attributes include source/destination IP address, source/destination port number, and transport layer protocol. Although this 5-tuple is commonly used to define flows, flows can be defined arbitrarily and, as we will see, our discussion is applicable to any definition of flows.

It is well known that a small number of flows account for a large portion of the total traffic volume, and typically their packet rates are very high [3]. Such flows are also called elephant flows or heavy hitters. Because high packet-rate flows have a great impact on network performance, it is important to detect and identify them promptly in network management and traffic engineering [4]. Note that high packet-rate flows also appear due to denial of service (DoS) attacks such as SYN flooding [5,6] and smurf attacks [5].

Usually, flow-level measured data is collected at routers and periodically sent to the collection point, where network operators process those data. This scheme, however, limits how quickly flows can be detected and may waste bandwidth. We implicitly assume that each router can sample packets, update and process the sliding window, and send only the result of the data analysis to the collection point. Such a function is particularly useful for tier-1 ISPs because the number of measurement points could be a few thousand or more.

So far a lot of research efforts have been made to infer flow characteristics [3,7,8], to estimate ranking [9] or top-k queries [10] from sampled data, and to detect high


packet-rate/elephant ﬂows [4,11,12] and trafﬁc anomalies [13,14], to name but a few. To the best of our knowledge, however, most of them discuss statistical inference methodologies for a given set of packets sampled in a certain interval of time. Note that if sufﬁcient computational resources are available, off-line data analytic algorithms can be combined directly with a jumping window scheme, where a set of old data is replaced periodically by a disjoint set of new data. For example, Estan and Varghese propose a jumping-window based on-line algorithm, called sample and hold, for identifying large ﬂows in terms of byte volume [11]. The algorithm processes headers of all packets and if the ﬂow ID of a packet does not exist in the memory, the new entry is created with probability proportional to the packet size. In continuous monitoring of trafﬁc, however, the set of sampled packets should be updated gradually. Lu et al. [12] propose ElephantTrap, which can detect elephant ﬂows only with a small cache. The idea behind the algorithm is similar to Frequent [15,16], a well-known streaming algorithm for ﬁnding frequent items. Note that ElephantTrap does not attempt to estimate the size of detected ﬂows, even though it is a light-weight algorithm with random packet sampling. To update the set of sampled packets, we employ a sliding window scheme, where the time window (called sliding window) of a ﬁxed length TSW is divided into K unit times called basic windows, and the sliding window is updated every unit time (i.e., basic window). Note that the jumping window scheme is a special case of the sliding window scheme with K = 1. The sliding window scheme is very simple and therefore it seems to be a natural scheme in applying off-line data analytic algorithms to continuous monitoring of trafﬁc. In most cases, we have to maintain data only in one sliding window and data in old basic windows can be discarded. 
Note here that data collected in each sliding window should be processed completely within a unit time on average. This requirement adds a new constraint to the implementation of off-line algorithms as on-line algorithms, and the constraint makes it more difﬁcult to design monitoring systems. In this paper, we consider a scenario that a network operator continuously monitors packets transmitted through a high-speed link and attempts to detect high packet-rate ﬂows in a timely manner within an allowable false negative ratio, where high packet-rate ﬂows are deﬁned as ﬂows whose packet rates are equal to or greater than a predeﬁned threshold R [packet/s]. From a viewpoint of statistical inference, detecting high packet-rate ﬂows might be one of the simplest problems, yet there is a difﬁculty of avoiding high false positive ratio inherent in random packet sampling with low sampling rate [9]. In the sliding window scheme with random packet sampling, we have to determine the length TSW of sliding windows, the number K of basic windows in a sliding window, and the sampling rate f of packets. These control parameters affect the performance of the scheme. Roughly speaking, large values of TSW and/or f yield many sampled packets, so that the accuracy of statistical inference will be improved, whereas the processing overhead becomes

large. On the other hand, a small value of K (i.e., a large basic window) yields a sufﬁcient time for data processing, whereas the time granularity in monitoring becomes coarse. We formulate this design problem as a nonlinear program with a numerically feasible global optimal solution. Many works have been done on efﬁcient data processing of sampled packets [17,18] and their implementation environments [19,20]. In our formulation, however, we do not specify the data processing scheme and its implementation environment explicitly. Essentially, the larger the number of sampled packets/ﬂows is, the longer it takes time to process them. We apply this principle to our formulation. More speciﬁcally, we introduce a function G(x), which represents the post-processing time of x sampled packets in a basic window, and through the function G(x), the employed data processing scheme and its implementation environment are taken into account. As a result, our formulation yields a generic design problem of the sliding window scheme with random packet sampling. The rest of the paper is organized as follows. Section 2 describes our scheme and its design problem. In Section 3, we formulate the design problem as a nonlinear program and derive its global optimal solution. Section 4 provides some experimental results using public trace data. Finally, concluding remarks are provided in Section 5.

2. Our scheme and problem statement

In this section, we first discuss random packet sampling and our sliding window scheme briefly, and then we state the design problem considered in this paper.

2.1. Random packet sampling

In random packet sampling, packets are sampled independently with probability f (0 < f ≤ 1). Let X and Y denote random variables representing the number of packets and the number of sampled packets, respectively, in a randomly chosen flow observed in an interval of length TSW [s]. We then have, for y = 0, 1, ..., x,

\[ \Pr[Y = y \mid X = x] = \binom{x}{y} f^{y} (1-f)^{x-y}. \]

Recall that we attempt to detect flows whose packet rates are not less than R [packet/s], i.e., X ≥ R TSW. Let x* = ⌈R TSW⌉. Hereafter, we call flows with x* or more packets target flows and, among them, we call flows with exactly x* packets threshold flows. In order to control the false negative ratio in detecting target flows, we introduce the allowable false negative ratio ε (0 < ε < 1) for threshold flows. We regard a flow as a target one if the number of sampled packets is not less than y*, where y* = y*(R, TSW, f, ε) is given by

\[ y^{*} = \max_{y} \bigl\{ y;\ \Pr[Y \le y-1 \mid X = x^{*}] \le \varepsilon \bigr\}
        = \max_{y} \Bigl\{ y;\ \sum_{i=0}^{y-1} \binom{x^{*}}{i} f^{i} (1-f)^{x^{*}-i} \le \varepsilon \Bigr\}. \tag{1} \]
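As an illustration of (1), the following is a minimal Python sketch that evaluates the binomial sums directly. The function names (`binom_cdf`, `detection_threshold`, `false_positive_prob`) are hypothetical and not taken from the paper, and the sketch assumes 0 < f < 1 and 0 < ε < 1. The last helper gives the probability that a flow with z packets yields at least y* samples.

```python
import math


def binom_cdf(x, f, k):
    """Pr[Y <= k] for Y ~ Binomial(x, f), assuming 0 < f < 1."""
    if k < 0:
        return 0.0
    term = math.exp(x * math.log1p(-f))   # Pr[Y = 0] = (1 - f)^x
    odds = f / (1.0 - f)
    cdf = 0.0
    for i in range(min(k, x) + 1):
        cdf += term
        term *= (x - i) / (i + 1) * odds  # Pr[Y = i + 1] from Pr[Y = i]
    return min(cdf, 1.0)


def detection_threshold(x_star, f, eps):
    """Largest y such that Pr[Y <= y - 1 | X = x_star] <= eps, as in (1)."""
    if not (0.0 < f < 1.0 and 0.0 < eps < 1.0):
        raise ValueError("expected 0 < f < 1 and 0 < eps < 1")
    y = 0
    # While Pr[Y <= y] <= eps, the candidate y + 1 still satisfies (1).
    while binom_cdf(x_star, f, y) <= eps:
        y += 1
    return y


def false_positive_prob(z, f, y_star):
    """Pr[Y >= y_star] for a flow with z packets sampled at rate f."""
    return 1.0 - binom_cdf(z, f, y_star - 1)
```

The recurrence between consecutive binomial terms avoids large binomial coefficients, which matters because x* is of the order of 10^4 to 10^5 packets in the settings considered here.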


Note that most target flows consist of more than x* packets. Thus the overall false negative ratio in the above random packet sampling would be much smaller than ε. Note also that once y* is fixed, the false positive ratio for a flow with z (z < x*) packets is given by

\[ 1 - \sum_{i=0}^{\min(y^{*}-1,\,z)} \binom{z}{i} f^{i} (1-f)^{z-i}. \]

Fig. 1 shows y* as a function of f, where R = 4000 [packet/s] and ε = 0.01. Because y* is a natural number, it is a step function of f when TSW is fixed.

Fig. 1. y* as a function of f (R = 4000, ε = 0.01); curves for TSW = 5, 10, and 20.

2.2. Sliding window scheme

In order to detect target flows from sampled packets, we have to collect, analyze, and update data of sampled packets repeatedly. For this purpose, we utilize a sliding window scheme whose window length is denoted by TSW [s]. A sliding window consists of K successive basic windows of fixed length TBW = TSW/K [s]. When data processing in the current sliding window is completed, the sliding window is updated by discarding data in the oldest basic window and adding data collected in the new basic window. Therefore successive sliding windows overlap each other when K ≥ 2, and each packet appears in K successive sliding windows.
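The window update described in Section 2.2 can be sketched as follows in Python. This is only an illustrative data structure under our own assumptions: the class name `SlidingWindowCounter`, its methods, and the use of per-flow counters are hypothetical and not prescribed by the paper, which leaves the data processing scheme unspecified.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Per-flow counts of sampled packets over the last K basic windows."""

    def __init__(self, num_basic_windows):
        self.k = num_basic_windows
        self.windows = deque()   # one Counter per basic window, oldest first
        self.totals = Counter()  # counts over the current sliding window

    def push_basic_window(self, sampled_flow_ids):
        """Add a newly collected basic window and discard the oldest one."""
        new = Counter(sampled_flow_ids)
        self.windows.append(new)
        self.totals.update(new)
        if len(self.windows) > self.k:
            old = self.windows.popleft()
            self.totals.subtract(old)
            for flow_id in old:
                if self.totals[flow_id] <= 0:
                    del self.totals[flow_id]  # keep memory bounded

    def detected(self, y_star):
        """Flows with at least y_star sampled packets in the sliding window."""
        return {fid for fid, n in self.totals.items() if n >= y_star}
```

Each call to `push_basic_window` touches only the newest and the oldest basic window, which matches the observation that data in old basic windows can be discarded.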

Venkataraman et al. [21] propose streaming algorithms with random packet sampling for detecting superspreaders. Their goal is to ﬁnd source IPs that send packets to more than k different destination IPs. They set parameters under the constraints of false negative ratio and false positive ratio, while we set parameters under the constraints of false negative ratio and processing times, and our targets are high packet-rate ﬂows that can be deﬁned arbitrarily.

2.3. Problem statement

The scheme considered in this paper combines the sliding window scheme with random packet sampling, which is characterized by three control parameters (the length TSW of sliding windows, the number K of basic windows in a sliding window, and the sampling rate f of packets) as well as the predefined parameters R and ε. To determine those control parameters, we consider a special case of threshold flows, called a reference flow, which is defined as follows. The first packet in the reference flow arrives at an arbitrary time tR, and inter-arrival times of subsequent packets are fixed to 1/R [s]. With this reference flow, we aim to determine the control parameters TSW, K, and f by solving the following optimization problem:

min   the false positive ratio
s.t.  the false negative ratio for the reference flow ≤ ε,
      the reference flow is detected by time tR + TD_max, and
      the scheme works as an on-line scheme,

where TD_max denotes the maximum allowable detection delay, which is also a predefined parameter, and the constraints on the false negative ratio and the detection delay are given in terms of the reference flow. Note here that once the sampling rate f and the length TSW of sliding windows are determined, the first constraint on the false negative ratio can be fulfilled by setting y* according to (1). Thus, in the next section, we exclude this constraint and formulate the resulting problem as a nonlinear program.

3. Nonlinear programming formulation

3.1. Constraints

We consider the last two constraints in our optimization problem stated in Section 2.3. The length TD of the detection delay of the reference flow is bounded above:

\[ T_{D} \le T_{SW} + T_{SW}/K + \tau \le T_{D\_max}, \tag{2} \]

where τ denotes the maximum post-processing time of data in a sliding window. The second term TSW/K on the right hand side of (2) represents the length of a basic window, which is required for the following reason. The first packet in the reference flow may appear in the middle of a basic window, say BWi, and the packet rates in sliding windows that contain the basic window BWi would be less than R. Thus the detection of the reference flow can be delayed by one basic window. On the other hand, in order to ensure on-line processing, we have to finish post-processing of data in the current sliding window before all data in the new basic window is collected. Therefore we have

\[ \tau \le T_{SW}/K. \tag{3} \]

We now evaluate the maximum post-processing time τ. When the sliding window is updated, data in the new basic window is added. Therefore we assume that the post-processing time of data in a sliding window is bounded above by a positive, strictly increasing function G(·) of the number of packets sampled in a basic window. Note that when n packets are transmitted in a basic window of length TSW/K, fn packets are sampled on average. Let Cmax denote the maximum packet rate. We then evaluate τ as follows:


\[ \tau = G\!\left( \frac{f C_{max} T_{SW}}{K} \right), \tag{4} \]

where CmaxTSW/K represents the maximum number of packets transmitted in a basic window. Because the number of flows is not greater than the number of packets, fCmaxTSW/K is also regarded as an upper bound of the mean number of sampled flows in a basic window.

3.2. Objective function

Next we consider the objective function of our optimization problem. Let PWD(r) denote the false positive ratio of a flow whose packet rate is equal to r [packet/s]:

\[ P_{WD}(r) = \Pr[Y \ge y^{*} \mid X = r T_{SW}], \]

where r (r < R) is set in such a way that rTSW is an integer. Because PWD(r) = 0 for r such that X = rTSW < y*, we assume rTSW ≥ y*. We now transform the objective function PWD(r) into an analytically tractable one. Note first that

\[ P_{WD}(r) = 1 - \sum_{y=0}^{y^{*}-1} \binom{r T_{SW}}{y} f^{y} (1-f)^{r T_{SW} - y}. \]

We thus apply the Poisson approximation of the binomial distribution (e.g., Section VI.5 of [22]): for a sufficiently large x ≫ 1 and a sufficiently small f ≪ 1, a binomial distribution can be approximated well by a Poisson distribution with the same mean,

\[ \binom{x}{y} f^{y} (1-f)^{x-y} \approx e^{-fx} \frac{(fx)^{y}}{y!}. \]

We then have

\[ P_{WD}(r) \approx 1 - \sum_{y=0}^{y^{*}-1} e^{-r f T_{SW}} \frac{(r f T_{SW})^{y}}{y!}. \tag{5} \]

For mice flows (i.e., flows with r ≪ R), the Poisson approximation gives the upper bound of PWD(r).

Theorem 1. Suppose rTSW is an integer. If ⌈rfTSW⌉ ≤ (y* − 1)/2 (i.e., y* ≥ 2⌈rfTSW⌉ + 1), we have

\[ P_{WD}(r) < 1 - \sum_{y=0}^{y^{*}-1} e^{-r f T_{SW}} \frac{(r f T_{SW})^{y}}{y!}. \tag{6} \]

The proof of Theorem 1 is given in Appendix A. Note here that rfTSW > 0 represents the mean number of sampled packets of a flow with packet rate r in a sliding window. Thus the condition of Theorem 1 implicitly assumes that y* ≥ 3 because ⌈rfTSW⌉ ≥ 1. It also suggests that a large y* is preferable because the range of r in which (6) holds widens with the increase of y*. This conjecture will be discussed with experimental results in Section 4.3.

Eq. (5) suggests that for arbitrarily fixed y* and r, the false positive ratio of non-target flows is a strictly decreasing function of fTSW. Because the expected amount of information we can obtain increases with the mean number rfTSW of sampled packets, this feature coincides with the intuition that the more samples we obtain, the more the accuracy improves. Thus we claim that maximizing fTSW is approximately equivalent to minimizing the false positive ratio of non-target flows with any packet rate r (r < R).

To validate this claim, we provide some numerical results. Fig. 2 shows the false positive ratio of flows with packet rate r = 2000 [packet/s] as a function of f, where R = 4000 and ε = 0.01. We observe that PWD(r) is not a strictly decreasing function. Note first that y* is a natural number. Further, when TSW is fixed, the number of sampled packets increases with f. Thus the false positive ratio increases within each range of f where y* remains constant. In the same setting as in Fig. 2, Fig. 3 shows the false positive ratio of flows with packet rate r = 2000 [packet/s] as a function of fTSW. It is interesting to observe that regardless of the value of TSW, the false positive ratio has a very similar characteristic and it is bounded above by a strictly decreasing function of fTSW. We observed the same phenomenon for r = 1000 and 3000, too. Thus we conclude that the product fTSW is an essential quantity to control the false positive ratio of non-target flows.

Fig. 2. False positive ratio PWD(2000) as a function of f (R = 4000, r = 2000, ε = 0.01); curves for TSW = 5, 10, and 20.

Fig. 3. False positive ratio PWD(2000) as a function of fTSW (R = 4000, r = 2000, ε = 0.01); curves for TSW = 5, 10, and 20.

3.3. Global optimal solution

So far, we have shown that the design of our scheme can be formulated as the following nonlinear program.


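\[ \mathrm{P}: \quad
\begin{aligned}
\max\ \ & f\, T_{SW} \\
\text{s.t.}\ \ & T_{SW} > 0,\quad f > 0,\quad K \text{ is a natural number}, \\
& \frac{K+1}{K}\, T_{SW} + G\!\left( \frac{f C_{max} T_{SW}}{K} \right) \le T_{D\_max}, \\
& G\!\left( \frac{f C_{max} T_{SW}}{K} \right) \le \frac{T_{SW}}{K},
\end{aligned} \]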

where the last two constraints come from (2)–(4). Recall that G(x) (x ≥ 0) is a positive, strictly increasing function of x. To make things tractable, we assume that G(x) is differentiable, which implies that the inverse function G^{-1}(x) of G(x) exists and G′(x) > 0 (x ≥ 0). Suppose (f, TSW) is a feasible solution of problem P. It follows from the last constraint that

\[ \frac{T_{SW}}{K} \ge G\!\left( \frac{f C_{max} T_{SW}}{K} \right) > G(0), \]

and therefore from the second last constraint

\[ T_{D\_max} \ge (K+1)\, G\!\left( \frac{f C_{max} T_{SW}}{K} \right) + G\!\left( \frac{f C_{max} T_{SW}}{K} \right) > (K+2)\, G(0), \]

which is a necessary condition of problem P being feasible. Note that G(0) is considered as overhead in processing data in a sliding window, and the maximum allowable detection delay should be set in such a way that

\[ T_{D\_max} > 3\, G(0), \tag{7} \]

because K is a natural number. In what follows, we assume TD_max > 3G(0) and define a non-empty finite set 𝒦 of natural numbers as the set of K satisfying the necessary condition above:

\[ \mathcal{K} = \bigl\{ K \in \{1, 2, \ldots\} : (K+2)\, G(0) < T_{D\_max} \bigr\}. \]

Note that problem P is feasible if K ∈ 𝒦.

Theorem 2. Given K ∈ 𝒦, there exists the unique global optimal solution (f*, T*SW) of problem P:

\[ f^{*} = \frac{K+2}{C_{max} T_{D\_max}}\, G^{-1}\!\left( \frac{T_{D\_max}}{K+2} \right), \qquad
   T_{SW}^{*} = \frac{K}{K+2}\, T_{D\_max}. \]

The proof of Theorem 2 is given in Appendix B. In the proof, the last two constraints in problem P are shown to be active for a given K ∈ 𝒦 and therefore these constraints hold with equality when f = f* and TSW = T*SW. The optimal value f*T*SW of the objective function in problem P is proportional to K G^{-1}(TD_max/(K + 2)). As a result, the optimal K = K* is given by

\[ K^{*} = \arg\max_{K \in \mathcal{K}}\ K\, G^{-1}\!\left( \frac{T_{D\_max}}{K+2} \right), \tag{8} \]

and the optimal f = f* and TSW = T*SW of problem P are given by

\[ f^{*} = \frac{K^{*}+2}{C_{max} T_{D\_max}}\, G^{-1}\!\left( \frac{T_{D\_max}}{K^{*}+2} \right), \tag{9} \]

\[ T_{SW}^{*} = \frac{K^{*}}{K^{*}+2}\, T_{D\_max}, \tag{10} \]

respectively.

As shown in Section 3.2, the false positive ratio is an increasing function of fTSW in the range of f where y* remains constant. Taking account of it, we set the three control parameters K, TSW, and f according to the following procedure.

1. Set the values of K = K* and TSW = T*SW by (8) and (10), respectively,
2. set the temporary value of f = f* by (9),
3. obtain y* by (1), and
4. set the final value of f = f**, where

\[ f^{**} = \arg\min_{f} \Bigl\{ \sum_{i=0}^{y^{*}-1} \binom{\lceil R T_{SW}^{*} \rceil}{i} f^{i} (1-f)^{\lceil R T_{SW}^{*} \rceil - i} \le \varepsilon \Bigr\}. \]

4. Experiments with trace data

In this section, we provide experimental results for public trace data and discuss some fundamental characteristics of the sliding window scheme. We use CAIDA trace data [23], which was measured in a backbone link of 10 Gbps from 6:00 to 6:05 on March 31, 2009. We regard a collection of packets with the same 5-tuple as a flow. The CAIDA trace data of 300 s measurement contains 152,593,821 packets and 13,603,014 flows.

4.1. Assumptions in experiments

To conduct our experiments, we assume that the post-processing time of data in a sliding window is a linear function of the number of sampled flows in a basic window (which is bounded above by the number of sampled packets in the basic window), i.e.,

\[ G(x) = D_{1} x + D_{2}, \]

where D1 [s] denotes the post-processing time of an individual sampled flow and D2 [s] denotes the post-processing time of data in a sliding window, which is independent of the number of sampled flows. Note that D2 includes the time needed for freeing up memory space of the oldest basic window and summarizing/exporting the detection result in the current sliding window to the collection point. We also assume TD_max > 3D2, which comes from the feasibility condition (7). The optimal K* is then given by

\[ K^{*} =
\begin{cases}
\displaystyle \arg\max_{K \in \{K_{-},\, K_{+}\}} \Bigl\{ \frac{K}{K+2}\, T_{D\_max} - K D_{2} \Bigr\}, & T_{D\_max} > 9 D_{2}/2, \\[2ex]
1, & 3 D_{2} < T_{D\_max} \le 9 D_{2}/2,
\end{cases} \]

where

\[ K_{-} = \left\lfloor \sqrt{\frac{2\, T_{D\_max}}{D_{2}}} \right\rfloor - 2, \qquad
   K_{+} = \left\lceil \sqrt{\frac{2\, T_{D\_max}}{D_{2}}} \right\rceil - 2. \]

Further the optimal T*SW is given in (10) and the optimal f* in (9) becomes

\[ f^{*} = \frac{T_{D\_max} - (K^{*}+2)\, D_{2}}{C_{max}\, D_{1}\, T_{D\_max}}. \]


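Table 1. Control parameters and threshold values (TD_max = 10): K* = 61, T*SW = 9.6825.

| R | ε = 0.01: f** (×10^-4) | ε = 0.01: y* | ε = 0.05: f** (×10^-4) | ε = 0.05: y* | ε = 0.1: f** (×10^-4) | ε = 0.1: y* |
|---|---|---|---|---|---|---|
| 1000 | 8.679 | 3 | 9.452 | 5 | 9.577 | 6 |
| 2000 | 8.985 | 9 | 9.401 | 12 | 9.181 | 13 |
| 4000 | 9.511 | 24 | 9.613 | 28 | 9.604 | 30 |

Table 2. Control parameters and threshold values (TD_max = 20): K* = 87, T*SW = 19.5506.

| R | ε = 0.01: f** (×10^-4) | ε = 0.01: y* | ε = 0.05: f** (×10^-4) | ε = 0.05: y* | ε = 0.1: f** (×10^-4) | ε = 0.1: y* |
|---|---|---|---|---|---|---|
| 1000 | 9.605 | 10 | 9.312 | 12 | 9.696 | 14 |
| 2000 | 9.737 | 25 | 9.522 | 28 | 9.513 | 30 |
| 4000 | 9.720 | 57 | 9.653 | 62 | 9.657 | 65 |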

In our experiments, we set D1 = 5 × 10^-4 [s], D2 = 5 × 10^-3 [s], and Cmax = 2 × 10^6 [packet/s], and consider two cases: TD_max = 10 and 20 [s]. Further, for each case, we consider three threshold values of high packet-rate flows, R = 1000, 2000, and 4000, and three allowable false negative ratios, ε = 0.01, 0.05, and 0.1, for the reference flow with packet rate R. Tables 1 and 2 show the values of control parameters computed according to the procedure at the end of Section 3.
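The four-step procedure can be sketched in Python for the linear post-processing time G(x) = D1 x + D2. This is a minimal sketch under our own assumptions, not the authors' implementation: the names `binom_cdf` and `design` are hypothetical, K* is found by a brute-force scan over all K with (K + 2) D2 < TD_max rather than by the closed form, and f** is found by bisection, relying on the binomial sum in (1) being decreasing in f.

```python
import math


def binom_cdf(x, f, k):
    """Pr[Y <= k] for Y ~ Binomial(x, f), assuming 0 < f < 1."""
    if k < 0:
        return 0.0
    term = math.exp(x * math.log1p(-f))   # Pr[Y = 0]
    odds = f / (1.0 - f)
    cdf = 0.0
    for i in range(min(k, x) + 1):
        cdf += term
        term *= (x - i) / (i + 1) * odds
    return min(cdf, 1.0)


def design(t_dmax, d1, d2, c_max, rate, eps):
    """Return (K*, T_SW*, f*, y*, f**) for G(x) = d1 * x + d2."""
    if t_dmax <= 3 * d2:
        raise ValueError("feasibility requires T_D_max > 3 * D2")
    # Step 1: maximise K * G^{-1}(T_D_max / (K + 2)) over feasible K.
    best_k, best_obj = 0, -1.0
    k = 1
    while (k + 2) * d2 < t_dmax:
        obj = k * (t_dmax / (k + 2) - d2) / d1
        if obj > best_obj:
            best_k, best_obj = k, obj
        k += 1
    t_sw = best_k / (best_k + 2) * t_dmax
    # Step 2: temporary sampling rate f*.
    f_star = (t_dmax - (best_k + 2) * d2) / (c_max * d1 * t_dmax)
    if not 0.0 < f_star < 1.0:
        raise ValueError("sketch assumes 0 < f* < 1")
    # Step 3: threshold y* for the reference flow with x* packets.
    x_star = math.ceil(rate * t_sw)
    y_star = 0
    while binom_cdf(x_star, f_star, y_star) <= eps:
        y_star += 1
    # Step 4: smallest f that still satisfies the false negative constraint.
    lo, hi = 0.0, f_star
    if y_star > 0:
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if binom_cdf(x_star, mid, y_star - 1) <= eps:
                hi = mid
            else:
                lo = mid
    return best_k, t_sw, f_star, y_star, hi
```

With the settings above (D1 = 5 × 10^-4, D2 = 5 × 10^-3, Cmax = 2 × 10^6, TD_max = 10, R = 1000, ε = 0.01), the sketch should give values in line with the corresponding row of Table 1, and the returned f* makes constraints (2) and (3) hold with equality.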

4.2. Performance measures

We evaluate the performance of our scheme by comparing it with the performance of the ideal scheme with f = 1.0 (i.e., sampling all packets), where K and TSW are identical in the two schemes. Let t1 and t2 denote the detection times of a target flow in the ideal scheme and in our scheme, respectively. For simplicity, we regard the end of a sliding window that detects the target flow for the first time as its detection time. In other words, the post-processing times of data in a sliding window are assumed to be identical in both schemes.

Based on the detection result, we classify target flows into four classes: (i) t1 > t2, (ii) t1 = t2, (iii) t1 < t2 < ∞, and (iv) t2 = ∞. Note that both the false positive ratio and the false negative ratio in the ideal scheme are equal to zero because f = 1.0. On the other hand, they would be positive in our scheme. Therefore some target flows may be classified into class (i); our scheme may detect target flows before their packet rates reach the threshold R. As far as target flows are concerned, however, flows both in class (i) and in class (ii) are considered to be those detected properly within TD_max.

Class (iii) implies that our scheme detects those target flows but their detection times may be beyond the maximum allowable detection delay TD_max. Note that class (iii) can include target flows detected within TD_max. For example, consider a target flow with a constant packet rate of 2R. After the generation of the first packet, say, at time t0, the packet rate of this flow (measured over an interval of length TSW) increases linearly with the update of sliding windows, and it reaches the threshold R at time t1, where

\[ t_{1} \le t_{0} + \left( 1 + \left\lceil \frac{K}{2} \right\rceil \right) \frac{T_{SW}}{K}. \]

Suppose this flow is not detected by time t1 but it is detected in one of the ⌊K/2⌋ subsequent sliding windows, where K ≥ 2. In such a case, it is considered to be a flow detected within TD_max because from (2)

\[ \left( 1 + \left\lceil \frac{K}{2} \right\rceil \right) \frac{T_{SW}}{K} + \left\lfloor \frac{K}{2} \right\rfloor \frac{T_{SW}}{K}
   = \left( 1 + \frac{1}{K} \right) T_{SW} \le T_{D\_max}. \]

Finally, class (iv) implies false negative, i.e., we fail to detect those ﬂows. On the other hand, as for non-target ﬂows, we adopt the false positive ratio (FPR) to evaluate the performance of our scheme.

\[ \mathrm{FPR} = \frac{\text{the number of detected non-target flows}}{\text{the total number of non-target flows}}. \tag{11} \]
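The two performance measures are simple to compute once detection times are known. The following Python sketch is illustrative only: the function names are hypothetical, and an undetected flow is represented by an infinite detection time.

```python
import math


def classify_target_flow(t1, t2):
    """Class of a target flow from detection times t1 (ideal) and t2 (sampled)."""
    if math.isinf(t2):
        return "iv"    # never detected by the sampled scheme (false negative)
    if t2 < t1:
        return "i"     # detected before the ideal scheme
    if t2 == t1:
        return "ii"    # detected at the same time as the ideal scheme
    return "iii"       # detected later than the ideal scheme


def false_positive_ratio(num_detected_non_target, num_non_target):
    """FPR as defined in (11)."""
    return num_detected_non_target / num_non_target
```

For example, `classify_target_flow(61.0, math.inf)` falls into class (iv), while a flow detected by the sampled scheme one basic window earlier than by the ideal scheme falls into class (i).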

We will also discuss the packet rate of wrongly detected non-target flows.

4.3. Experimental results

We first show the fundamental characteristics of the CAIDA trace data [23] used in our experiments. Figs. 4 and 5 show the ranking of the maximum packet rates of flows when TSW = 9.6825 [s], K = 61, and f = 1 (cf. Table 1) and when TSW = 19.5506 [s], K = 87, and f = 1 (cf. Table 2), respectively. Note that both horizontal and vertical axes of Figs. 4 and 5 are in logarithmic scale. Those figures clearly indicate the elephant/mice phenomenon (i.e., the power law) in the packet rate distribution. Tables 3 and 4 show the sampling experiment results for TD_max = 10 and 20, respectively, where N denotes the total number of target flows. In these tables, the averages of 1000 independent sampling experiments and 95% confidence intervals are shown. Regardless of the values of the maximum allowable detection time TD_max and the


allowable false negative ratio ε, our scheme detects most target flows before the ideal scheme does. Note that this is a typical phenomenon in detecting high packet-rate flows by means of the sliding window scheme with random packet sampling (cf. [21]). For some time after the generation of the first packet in a target flow, its packet rate would not reach the threshold R. As time goes by, however, the sliding window is updated successively, and the packet rate of the target flow increases and eventually exceeds the threshold at some time t1. Note that the ideal scheme with f = 1.0 detects the target flow at time t1 with probability one. Fig. 6 shows a typical example of such a transient behavior of the packet rate of a target flow, where the unit time is taken as the length TBW of basic windows and the horizontal axis is labeled in such a way that the packet rate exceeds the threshold R = 1000 at time t1 = 61TBW (= TSW) for the first time. From this figure, we observe that in some sliding windows, the packet rate of the target flow is below yet close to the threshold R. As a result, our scheme detects many target flows before the ideal scheme does.

On the other hand, the numbers of target flows in classes (iii) and (iv) are quite small. Recall that control parameters are set in such a way that the reference flow with packet rate R is detected with probability 1 − ε. Because most target flows have higher packet rates than R (see Figs. 4 and 5), they are detected with probability larger than 1 − ε.

Fig. 4. Ranking of maximum packet rates of flows (TD_max = 10); axes: rank versus maximum packet rate.

Fig. 5. Ranking of maximum packet rates of flows (TD_max = 20); axes: rank versus maximum packet rate.

Table 3. The number of detected target flows and its ratio (TD_max = 10).

| ε | R | N | (i) t1 > t2 | (ii) t1 = t2 | ratio, (i) and (ii) | (iii) t1 < t2 < ∞ | ratio, (iii) | (iv) t2 = ∞ | ratio, (iv) |
|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 1000 | 58 | 57.388 ± 0.047 | 0.226 ± 0.030 | 0.993 ± 0.001 | 0.317 ± 0.034 | 5.47×10^-3 ± 5.86×10^-4 | 0.069 ± 0.016 | 1.19×10^-3 ± 2.79×10^-4 |
| 0.01 | 2000 | 14 | 13.813 ± 0.027 | 0.081 ± 0.017 | 0.992 ± 0.001 | 0.084 ± 0.019 | 6.00×10^-3 ± 1.36×10^-3 | 0.022 ± 0.009 | 1.57×10^-3 ± 6.50×10^-4 |
| 0.01 | 4000 | 3 | 2.946 ± 0.014 | 0.027 ± 0.010 | 0.991 ± 0.003 | 0.024 ± 0.009 | 8.00×10^-3 ± 3.16×10^-3 | 0.003 ± 0.003 | 1.00×10^-3 ± 1.13×10^-3 |
| 0.05 | 1000 | 58 | 54.880 ± 0.108 | 0.869 ± 0.056 | 0.961 ± 0.002 | 1.843 ± 0.083 | 3.18×10^-2 ± 1.44×10^-3 | 0.408 ± 0.040 | 7.03×10^-3 ± 6.86×10^-4 |
| 0.05 | 2000 | 14 | 13.147 ± 0.055 | 0.318 ± 0.035 | 0.962 ± 0.003 | 0.442 ± 0.041 | 3.15×10^-2 ± 2.96×10^-3 | 0.093 ± 0.019 | 6.64×10^-3 ± 1.35×10^-3 |
| 0.05 | 4000 | 3 | 2.777 ± 0.028 | 0.092 ± 0.019 | 0.956 ± 0.007 | 0.110 ± 0.020 | 3.67×10^-2 ± 6.73×10^-3 | 0.021 ± 0.009 | 7.00×10^-3 ± 2.96×10^-3 |
| 0.1 | 1000 | 58 | 51.849 ± 0.142 | 1.680 ± 0.078 | 0.923 ± 0.002 | 3.593 ± 0.108 | 6.19×10^-2 ± 1.86×10^-3 | 0.878 ± 0.058 | 1.51×10^-2 ± 1.00×10^-3 |
| 0.1 | 2000 | 14 | 12.261 ± 0.077 | 0.558 ± 0.045 | 0.916 ± 0.005 | 0.967 ± 0.060 | 6.91×10^-2 ± 4.30×10^-3 | 0.214 ± 0.027 | 1.52×10^-2 ± 1.92×10^-3 |
| 0.1 | 4000 | 3 | 2.594 ± 0.036 | 0.175 ± 0.025 | 0.923 ± 0.009 | 0.196 ± 0.026 | 6.53×10^-2 ± 8.56×10^-3 | 0.035 ± 0.011 | 1.12×10^-2 ± 3.80×10^-3 |

Table 4. The number of detected target flows and its ratio (TD_max = 20).

| ε | R | N | (i) t1 > t2 | (ii) t1 = t2 | ratio, (i) and (ii) | (iii) t1 < t2 < ∞ | ratio, (iii) | (iv) t2 = ∞ | ratio, (iv) |
|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 1000 | 22 | 21.712 ± 0.034 | 0.140 ± 0.023 | 0.993 ± 0.001 | 0.130 ± 0.023 | 5.91×10^-3 ± 1.03×10^-3 | 0.018 ± 0.008 | 8.18×10^-4 ± 3.75×10^-4 |
| 0.01 | 2000 | 7 | 6.875 ± 0.021 | 0.071 ± 0.016 | 0.992 ± 0.002 | 0.044 ± 0.013 | 6.29×10^-3 ± 1.82×10^-3 | 0.010 ± 0.006 | 1.42×10^-3 ± 8.81×10^-4 |
| 0.01 | 4000 | 1 | 0.984 ± 0.008 | 0.008 ± 0.006 | 0.992 ± 0.006 | 0.008 ± 0.006 | 8.00×10^-3 ± 5.52×10^-3 | 0 | 0 |
| 0.05 | 1000 | 22 | 20.651 ± 0.069 | 0.519 ± 0.043 | 0.962 ± 0.003 | 0.741 ± 0.053 | 3.37×10^-2 ± 2.42×10^-3 | 0.089 ± 0.018 | 4.05×10^-3 ± 8.22×10^-4 |
| 0.05 | 2000 | 7 | 6.508 ± 0.042 | 0.244 ± 0.030 | 0.965 ± 0.005 | 0.208 ± 0.030 | 2.97×10^-2 ± 4.22×10^-3 | 0.040 ± 0.012 | 5.71×10^-3 ± 1.74×10^-3 |
| 0.05 | 4000 | 1 | 0.922 ± 0.017 | 0.041 ± 0.012 | 0.963 ± 0.012 | 0.037 ± 0.012 | 3.70×10^-2 ± 1.17×10^-2 | 0 | 0 |
| 0.1 | 1000 | 22 | 19.382 ± 0.094 | 0.912 ± 0.056 | 0.923 ± 0.004 | 1.515 ± 0.074 | 6.89×10^-2 ± 3.38×10^-3 | 0.186 ± 0.025 | 8.45×10^-3 ± 1.13×10^-3 |
| 0.1 | 2000 | 7 | 6.067 ± 0.054 | 0.375 ± 0.036 | 0.920 ± 0.006 | 0.459 ± 0.039 | 6.56×10^-2 ± 5.53×10^-3 | 0.099 ± 0.019 | 1.41×10^-2 ± 2.73×10^-3 |
| 0.1 | 4000 | 1 | 0.890 ± 0.019 | 0.056 ± 0.014 | 0.946 ± 0.014 | 0.054 ± 0.014 | 5.40×10^-2 ± 1.40×10^-2 | 0 | 0 |

Tables 5 and 6 show the number NWD of wrongly detected non-target flows and FPR in (11) for TD_max = 10 and TD_max = 20, respectively, where ε = 0.01, 0.05, and 0.1. From Table 5, we observe that there are many wrongly detected flows, whose number NWD increases with the decrease of ε, and this tendency is emphasized when R is small.

Fig. 7 shows the cumulative distributions of packet rates of non-target flows when they are detected wrongly, where TD_max = 10 and ε = 0.01. We observe that fairly low packet-rate flows are detected wrongly. The wrong detection of non-target flows comes from two factors. One is due to the distribution of packet rates. As shown in Figs. 4 and 5, the packet rate ranking of flows follows the power law. Although the probability of

detecting each low packet-rate flow is very small, there are a large number of such flows, and therefore some of them are detected wrongly. Thus, as discussed in [9], the wrong detection of low packet-rate flows is inevitable in random packet sampling. The other is a factor inherent in on-line monitoring. Suppose there exists a long-lived, low packet-rate flow. Even though the wrong detection probability of such a flow in a single sliding window is very small, the flow appears in many successive sliding windows and could eventually be detected wrongly. To demonstrate this phenomenon, we consider the wrong detection probability $P_{WD}$ of a long-lived, constant packet-rate flow, where the wrong detection probability of the flow in a sliding window is assumed to be 0.005. Note that this flow appears in $n$ ($n \ge 1$) successive sliding windows at full rate when the lifetime $LT$ of the flow is equal to $LT = T_{SW} + (n-1)T_{BW}$ (i.e., the flow appears in $K + (n-1)$ successive basic windows at full rate). Table 7 shows the wrong detection probability $P_{WD}$ of the flow with lifetime $LT$ ($\ge T_{SW}$), where $K = 61$ and $P_{WD}$ is calculated by (12).

Fig. 6. Transient behavior of the packet rate of a typical target flow (K = 61, R = 1000). Horizontal axis: time (×T_BW); vertical axis: packet rate.

Table 5 Wrongly detected non-target flows (TD_max = 10).

| ε | R | # of non-target flows | N_WD (FPR) |
|---|---|---|---|
| 0.01 | 1000 | 13,602,956 | 2279.53 ± 5.41 (1.68×10^-4 ± 3.98×10^-7) |
| 0.01 | 2000 | 13,603,000 | 84.48 ± 1.05 (6.21×10^-6 ± 7.73×10^-8) |
| 0.01 | 4000 | 13,603,011 | 7.59 ± 0.39 (5.58×10^-7 ± 2.90×10^-8) |
| 0.05 | 1000 | 13,602,956 | 526.91 ± 3.35 (3.87×10^-5 ± 2.46×10^-7) |
| 0.05 | 2000 | 13,603,000 | 40.50 ± 0.66 (2.98×10^-6 ± 4.88×10^-8) |
| 0.05 | 4000 | 13,603,011 | 4.71 ± 0.23 (3.46×10^-7 ± 1.72×10^-8) |
| 0.1 | 1000 | 13,602,956 | 296.38 ± 2.60 (2.18×10^-5 ± 1.91×10^-7) |
| 0.1 | 2000 | 13,603,000 | 29.32 ± 0.59 (2.16×10^-6 ± 4.35×10^-8) |
| 0.1 | 4000 | 13,603,011 | 3.30 ± 0.24 (2.43×10^-7 ± 1.78×10^-8) |

Table 6 Wrongly detected non-target flows (TD_max = 20).

| ε | R | # of non-target flows | N_WD (FPR) |
|---|---|---|---|
| 0.01 | 1000 | 13,602,992 | 144.43 ± 1.51 (1.06×10^-5 ± 1.11×10^-7) |
| 0.01 | 2000 | 13,603,007 | 10.86 ± 0.37 (7.98×10^-7 ± 2.69×10^-8) |
| 0.01 | 4000 | 13,603,013 | 1.52 ± 0.18 (1.12×10^-7 ± 1.29×10^-8) |
| 0.05 | 1000 | 13,602,992 | 72.54 ± 1.05 (5.33×10^-6 ± 7.70×10^-8) |
| 0.05 | 2000 | 13,603,007 | 6.25 ± 0.30 (4.59×10^-7 ± 2.23×10^-8) |
| 0.05 | 4000 | 13,603,013 | 0.67 ± 0.13 (4.93×10^-8 ± 9.40×10^-9) |
| 0.1 | 1000 | 13,602,992 | 47.35 ± 0.93 (3.48×10^-6 ± 6.84×10^-8) |
| 0.1 | 2000 | 13,603,007 | 4.11 ± 0.26 (3.02×10^-7 ± 1.89×10^-8) |
| 0.1 | 4000 | 13,603,013 | 0.45 ± 0.12 (3.31×10^-8 ± 8.54×10^-9) |
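The FPR values in parentheses in Tables 5 and 6 are consistent with dividing the average number of wrongly detected flows by the number of non-target flows. The following minimal sketch checks this for the ε = 0.01 rows of Table 5; the function name `false_positive_ratio` and the dictionary layout are hypothetical and used only for illustration.

```python
def false_positive_ratio(n_wrongly_detected: float, n_non_target: int) -> float:
    """Assumed definition: FPR = N_WD / (number of non-target flows)."""
    return n_wrongly_detected / n_non_target


# (N_WD, number of non-target flows) for eps = 0.01, TD_max = 10, taken from Table 5.
table5_eps_001 = {
    1000: (2279.53, 13_602_956),
    2000: (84.48, 13_603_000),
    4000: (7.59, 13_603_011),
}

for rate, (n_wd, n_non_target) in table5_eps_001.items():
    # Prints the ratio in scientific notation with three significant digits.
    print(rate, f"{false_positive_ratio(n_wd, n_non_target):.2e}")
```

The printed ratios can be compared directly with the tabulated FPR column.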

Fig. 7. Distribution of packet rates of wrongly detected non-target flows (TD_max = 10). Horizontal axis: x; vertical axis: P(r ≤ x); curves for R = 1000, 2000, 4000 and ε = 0.01, 0.05, 0.10.

Fig. 8. Distribution of packet rates of wrongly detected non-target flows (TD_max = 20). Horizontal axis: x; vertical axis: P(r ≤ x); curves for R = 1000, 2000, 4000 and ε = 0.01, 0.05, 0.10.

Table 7 Detection probability of long-lived flows (K = 61).

| Lifetime (×T_SW) | Detection prob. | Lifetime (×T_SW) | Detection prob. |
|---|---|---|---|
| 1 | 0.005 | 6 | 0.784 |
| 2 | 0.267 | 7 | 0.841 |
| 3 | 0.460 | 8 | 0.883 |
| 4 | 0.602 | 9 | 0.914 |
| 5 | 0.707 | 10 | 0.937 |

Table 8 Threshold y* (former) and the number N_WD of wrongly detected flows (latter).

$$P_{WD} = 1 - (1 - 0.005)^{(LT - T_{SW})/T_{BW} + 1}. \tag{12}$$
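Formula (12) is easy to evaluate numerically. The sketch below assumes, as in the text, that a sliding window consists of K = 61 basic windows (so T_SW = K·T_BW) and that the per-window wrong detection probability is 0.005; the function and parameter names are hypothetical.

```python
def wrong_detection_prob(lifetime_in_tsw: float, k: int = 61,
                         p_window: float = 0.005) -> float:
    """Evaluate (12) for a flow whose lifetime is `lifetime_in_tsw` * T_SW.

    With T_SW = K * T_BW, the flow is seen at full rate in
    (LT - T_SW) / T_BW + 1 = (lifetime_in_tsw - 1) * K + 1 sliding windows.
    """
    n_windows = (lifetime_in_tsw - 1) * k + 1
    return 1.0 - (1.0 - p_window) ** n_windows


# Lifetimes of 1, ..., 10 sliding window lengths, as in Table 7.
for m in range(1, 11):
    print(m, round(wrong_detection_prob(m), 3))
```

Under these assumptions the printed values can be checked against Table 7; for instance, a lifetime of 10·T_SW corresponds to 550 sliding windows at full rate.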

Note that the above formula takes account only of sliding windows at full rate, and therefore $P_{WD}$ in (12) is regarded as a lower bound of the exact wrong detection probability. We observe, for example, that if the lifetime $LT$ of the flow is equal to $10T_{SW}$ ($= 610T_{BW}$), it appears in $9T_{SW}/T_{BW} + 1$ ($= 550$) successive sliding windows at full rate, and in every sliding window the flow has an opportunity to be detected wrongly. As a result, $P_{WD}$ is at least about 0.937 when $LT = 10T_{SW}$.

However, these phenomena can be mitigated by setting a larger $T_{D\_max}$. See Table 6, where $T_{D\_max}$ is set to 20, twice as large as in Table 5. We observe that increasing $T_{D\_max}$ from 10 to 20 dramatically decreases the number $N_{WD}$ of wrongly detected non-target flows. In general, $fT_{SW}$ increases with $T_{D\_max}$, so that the expected number of sampled packets increases, which leads to a more accurate detection in each sliding window (see Fig. 3). Also, we expect that the increase of $T_{SW}$ weakens the influence of long-lived flows because, as shown in (12), the wrong detection probability $P_{WD}$ of long-lived, low-rate flows is a decreasing function of the sliding window length $T_{SW}$.

Fig. 8 shows the cumulative distributions of packet rates of non-target flows when they are detected wrongly, where $T_{D\_max} = 20$ and ε = 0.01. Compared with the case of $T_{D\_max} = 10$ in Fig. 7, the wrong detection of low packet-rate flows is suppressed.

It might be interesting to observe that the number $N_{WD}$ of wrongly detected flows is closely tied to the threshold $y^*$: the larger $y^*$ is, the smaller $N_{WD}$ becomes. See Table 8, which shows the threshold $y^*$ and the number $N_{WD}$ of wrongly detected flows for

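| TD_max | R | ε = 0.01 (y* / N_WD) | ε = 0.05 (y* / N_WD) | ε = 0.1 (y* / N_WD) |
|---|---|---|---|---|
| 10 | 1000 | 3 / 2279.53 | 5 / 526.91 | 6 / 296.38 |
| 10 | 2000 | 9 / 84.48 | 12 / 40.50 | 13 / 29.32 |
| 10 | 4000 | 24 / 7.59 | 28 / 4.71 | 30 / 3.30 |
| 20 | 1000 | 10 / 144.43 | 12 / 72.54 | 14 / 47.35 |
| 20 | 2000 | 25 / 10.86 | 28 / 6.25 | 30 / 4.11 |
| 20 | 4000 | 57 / 1.52 | 62 / 0.67 | 65 / 0.45 |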

various combinations of parameter values of $T_{D\_max}$, $R$, and ε. In this specific CAIDA trace data, $N_{WD}$ is small enough (say, a dozen or less) if $y^* \ge 25$. We conducted sampling experiments with another set of trace data [24] and made a very similar observation. Note that $y^*$ is an increasing function of $T_{D\_max}$, so that we can suppress the wrong detection of non-target flows by enlarging $T_{D\_max}$, i.e., at the expense of the rapidity of detection.

4.4. Effectiveness of optimal parameters

Finally, we confirm the effectiveness of the optimal parameters obtained by our design method. To do so, let $T_{D\_max} = 20$ and ε = 0.05, and we fix the length $T_{SW}$ of sliding windows and the number $K$ of basic windows in a sliding window to be optimal, i.e., $T_{SW} = 19.5506$ and $K = 87$, as shown in Table 2. As a result, every target (resp. non-target) flow has the same number of opportunities to be detected correctly (resp. wrongly), regardless of the value of $f$. Under this setting, we vary the sampling rate $f$ in the feasible region $f \le f^*$, where the threshold $y^*$ is adjusted according to (1). Therefore the false negative ratio is always less than ε = 0.05 in the following results.

Fig. 9. The average detection precision and its 95% confidence interval (f = f*/c, TD_max = 20, ε = 0.05). Horizontal axis: c; vertical axis: detection precision; curves for R = 1000, 2000, 4000.

In order to evaluate the effectiveness of the optimal parameters, we define the detection precision as the ratio of the number of detected target flows to the number of all detected flows (including wrongly detected flows). Fig. 9 shows the average detection precision versus the parameter $c$, which controls the sampling rate $f$ through $f = f^*/c$. In the figure, 95% confidence intervals are also shown, even though most of them are too narrow to be visible. Note that the results with the optimal parameters correspond to $c = 1$. The detection precision is not a strictly decreasing function of $c$, for the same reason as the false positive ratio (see Section 3.2). From the figure, we observe that the detection precision deteriorates rapidly as $c$ becomes large. Therefore we conclude that the parameter optimization is very important.
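As a small illustration of two quantities used in this subsection, the sketch below evaluates the closed form $T^*_{SW} = K\,T_{D\_max}/(K+2)$ derived in Appendix B and the detection precision as defined above. The function names are hypothetical, and the flow counts passed to `detection_precision` are made-up numbers for illustration, not measurements from the trace.

```python
def optimal_window_length(k: int, td_max: float) -> float:
    """Closed form from Appendix B: T_SW* = K * T_D_max / (K + 2)."""
    return k * td_max / (k + 2)


def detection_precision(n_target_detected: float, n_wrongly_detected: float) -> float:
    """Detected target flows divided by all detected flows."""
    total = n_target_detected + n_wrongly_detected
    if total == 0:
        raise ValueError("no flows were detected")
    return n_target_detected / total


# K = 87 and T_D_max = 20 are the values used in this subsection.
print(round(optimal_window_length(87, 20.0), 4))

# Hypothetical counts: 20 detected target flows and 5 wrongly detected flows.
print(detection_precision(20.0, 5.0))
```

With K = 87 and T_D_max = 20, the first value agrees with the window length T_SW = 19.5506 quoted from Table 2.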

5. Concluding remarks

We considered the design of a sliding window scheme for detecting high packet-rate flows via random packet sampling. Even though our design problem originated from the minimization of the false positive ratio in detecting high packet-rate flows, it can be viewed as a maximization problem of the expected amount of information obtained in a sliding window, under the constraint that the sliding window scheme works as an on-line scheme. Note that in this formulation, the post-processing time of data in a sliding window is modeled by an arbitrary, increasing function $G(\cdot)$, and overhead due to the statistical inference and data processing algorithms is specified only through $G(\cdot)$. Therefore the problem is quite general and its solution is applicable to other inference problems as well, e.g., the flow length distribution and ranking of the flow lengths.

In general, the sliding window scheme with random packet sampling suffers from a large number of wrongly detected non-target flows. Even though this is inevitable in random packet sampling, we showed that it can be mitigated at the expense of the rapidity of detection (i.e., by enlarging the sliding window length). This principle is consistent with the implication of our problem formulation: an increase in the amount of information improves the accuracy of the detection. Even though there remain some wrongly detected flows, the scheme can serve at least as a filter that prepares the input to finer analysis.

Acknowledgements

The authors thank Dr. Ryoichi Kawahara, Dr. Noriaki Kamiyama, and Dr. Shigeaki Harada of NTT Corporation and Dr. Nobuo Yamashita of Kyoto University for their helpful comments. Research of the second author was also supported in part by Grant-in-Aid for Scientific Research (C) of Japan Society for the Promotion of Science under Grant No. 22500056.

Appendix A. Proof of Theorem 1

We first rewrite (6) to be

$$\sum_{y=y^*}^{rT_{SW}} \binom{rT_{SW}}{y} f^y (1-f)^{rT_{SW}-y} < \sum_{y=y^*}^{\infty} \frac{(rfT_{SW})^y}{y!}\, e^{-rfT_{SW}}, \tag{A.1}$$

which holds if $rT_{SW} < y^*$, so that we assume $rT_{SW} \ge y^*$ hereafter. It is clear that a sufficient condition for (A.1) is given by

$$\binom{rT_{SW}}{y} f^y (1-f)^{rT_{SW}-y} \le \frac{(rfT_{SW})^y}{y!}\, e^{-rfT_{SW}},$$

for all $y = y^*, y^*+1, \ldots, rT_{SW}$. For simplicity in description, we define $x(r)$ and $\lambda(r)$ as

$$x(r) = rT_{SW}, \qquad \lambda(r) = frT_{SW}.$$

We then have

$$
\begin{aligned}
\binom{x(r)}{y} f^y (1-f)^{x(r)-y}
&= \frac{x(r)!}{y!\,(x(r)-y)!}\, f^y (1-f)^{x(r)-y} \\
&= \frac{x(r)!}{y!\,(x(r)-y)!} \left(\frac{\lambda(r)}{x(r)}\right)^{y} \left(1-\frac{\lambda(r)}{x(r)}\right)^{x(r)-y} \\
&= \left(1-\frac{\lambda(r)}{x(r)}\right)^{x(r)} \frac{\lambda(r)^y}{y!} \cdot \frac{x(r)!}{(x(r)-y)!\,(x(r)-\lambda(r))^y} \\
&\le e^{-\lambda(r)}\, \frac{\lambda(r)^y}{y!} \cdot \frac{x(r)!}{(x(r)-y)!\,(x(r)-\lambda(r))^y},
\end{aligned}
\tag{A.2}
$$

where the last inequality follows from $1 - x \le \exp(-x)$ for all $x \ge 0$. Note here that

$$
\frac{x(r)!}{(x(r)-y)!\,(x(r)-\lambda(r))^y}
= \left( \prod_{k=0}^{\lceil\lambda(r)\rceil-1} \frac{x(r)-k}{x(r)-\lambda(r)} \right)
\frac{x(r)-\lceil\lambda(r)\rceil}{x(r)-\lambda(r)}
\left( \prod_{k=\lceil\lambda(r)\rceil+1}^{y-1} \frac{x(r)-k}{x(r)-\lambda(r)} \right).
\tag{A.3}
$$

Fractions in the first factor on the right-hand side of (A.3) are greater than one, the fraction in the middle is not greater than one, and fractions in the last factor are less than one. Because $y^* \ge 2\lceil\lambda(r)\rceil + 1$ by assumption, (A.3) is rewritten to be

$$
\frac{x(r)!}{(x(r)-y)!\,(x(r)-\lambda(r))^y}
= \left( \prod_{k=1}^{\lceil\lambda(r)\rceil} \frac{x(r)-\lceil\lambda(r)\rceil+k}{x(r)-\lambda(r)} \cdot \frac{x(r)-\lceil\lambda(r)\rceil-k}{x(r)-\lambda(r)} \right)
\frac{x(r)-\lceil\lambda(r)\rceil}{x(r)-\lambda(r)}
\left( \prod_{k=2\lceil\lambda(r)\rceil+1}^{y-1} \frac{x(r)-k}{x(r)-\lambda(r)} \right),
$$

for all $y = y^*, y^*+1, \ldots, rT_{SW}$. Note that for $x$, $a$, and $b$ such that $0 < a \le b \le x$,

$$\frac{x+a}{x} \cdot \frac{x-b}{x} < 1,$$

we have

$$\frac{x(r)-\lceil\lambda(r)\rceil+k}{x(r)-\lambda(r)} \cdot \frac{x(r)-\lceil\lambda(r)\rceil-k}{x(r)-\lambda(r)} < 1,$$

and therefore

$$\frac{x(r)!}{(x(r)-y)!\,(x(r)-\lambda(r))^y} < 1. \tag{A.4}$$

Theorem 1 now follows from (A.1), (A.2), and (A.4). □

Appendix B. Proof of Theorem 2

First, for a fixed $K \in \mathcal{K}$, we formally convert the original problem to the corresponding minimization problem as follows:

$$
\begin{aligned}
\mathrm{P}:\quad \min\ & -fT_{SW} \\
\text{s.t.}\ & \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - T_{D\_max} \le 0, \\
& G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - \frac{T_{SW}}{K} \le 0, \\
& f > 0, \quad T_{SW} > 0.
\end{aligned}
$$

We then consider the relaxed problem P′ of the original problem P:

$$
\begin{aligned}
\mathrm{P}':\quad \min\ & -fT_{SW} \\
\text{s.t.}\ & \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - T_{D\_max} \le 0, \\
& G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - \frac{T_{SW}}{K} \le 0,
\end{aligned}
$$

and show that there exists the global optimal solution of P′. Suppose the constraints hold with equality. We then have

$$T_{SW} = \frac{K}{K+2}\, T_{D\_max} > 0, \tag{B.1}$$

$$f = \frac{K}{C_{max}T_{SW}}\, G^{-1}\!\left(\frac{T_{SW}}{K}\right) = \frac{K+2}{C_{max}T_{D\_max}}\, G^{-1}\!\left(\frac{T_{D\_max}}{K+2}\right) > 0, \tag{B.2}$$

where the inequality in (B.2) follows from the assumption $G(0) < T_{D\_max}/(K+2)$. Thus the relaxed problem P′ has a feasible solution with $-fT_{SW} < 0$. We then consider the following mathematical program.

$$
\begin{aligned}
\mathrm{P}'':\quad \min\ & -fT_{SW} \\
\text{s.t.}\ & \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - T_{D\_max} \le 0, && \text{(B.3)} \\
& G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - \frac{T_{SW}}{K} \le 0, && \text{(B.4)} \\
& -fT_{SW} \le -\zeta, && \text{(B.5)}
\end{aligned}
$$

where $\zeta > 0$ denotes a sufficiently small positive constant. It follows from (B.4) and (B.5) that

$$G\!\left(\frac{\zeta C_{max}}{K}\right) \le G\!\left(\frac{fC_{max}T_{SW}}{K}\right) \le \frac{T_{SW}}{K},$$

because $G(x)$ is a strictly increasing function of $x$. Also, from (B.3) and (B.5), we have

$$\frac{K+1}{K}T_{SW} + G\!\left(\frac{\zeta C_{max}}{K}\right) \le \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) \le T_{D\_max}.$$

Therefore we obtain

$$K\,G\!\left(\frac{\zeta C_{max}}{K}\right) \le T_{SW} \le \frac{K}{K+1}\left( T_{D\_max} - G\!\left(\frac{\zeta C_{max}}{K}\right) \right). \tag{B.6}$$

Also we have from (B.5)

$$f \ge \frac{\zeta}{T_{SW}}.$$

On the other hand, it follows from (B.3) and the lower bound on $T_{SW}$ in (B.6) that

$$(K+1)\,G\!\left(\frac{\zeta C_{max}}{K}\right) + G\!\left(fC_{max}\,G\!\left(\frac{\zeta C_{max}}{K}\right)\right) \le \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) \le T_{D\_max},$$

so that

$$G\!\left(fC_{max}\,G\!\left(\frac{\zeta C_{max}}{K}\right)\right) \le T_{D\_max} - (K+1)\,G\!\left(\frac{\zeta C_{max}}{K}\right).$$

Therefore, with (B.6), we obtain

$$
\frac{(K+1)\,\zeta}{K\left(T_{D\_max} - G\!\left(\frac{\zeta C_{max}}{K}\right)\right)} \le \frac{\zeta}{T_{SW}} \le f \le \frac{G^{-1}\!\left(T_{D\_max} - (K+1)\,G\!\left(\frac{\zeta C_{max}}{K}\right)\right)}{C_{max}\,G\!\left(\frac{\zeta C_{max}}{K}\right)}.
\tag{B.7}
$$

Eqs. (B.6) and (B.7) imply that the feasible region of $(f, T_{SW})$ in P″ is compact. Because any continuous function has a global minimum on a compact set, there exists a global minimum solution $(f^*, T^*_{SW})$ in P″. Note that $(f^*, T^*_{SW})$ is also a global minimum solution in the relaxed problem P′ for the following reason. Suppose $(f^*, T^*_{SW})$ is not a global minimum solution in the relaxed problem P′ and there exists a feasible solution $(\tilde f, \tilde T_{SW})$ in P′ such that $-\tilde f \tilde T_{SW} < -f^* T^*_{SW} < -\zeta$. This implies that $(\tilde f, \tilde T_{SW})$ is also a feasible solution of P″, which contradicts the global optimality of $(f^*, T^*_{SW})$ in P″. Thus there exists a global minimum solution $(f^*, T^*_{SW})$ in the relaxed problem P′ and it satisfies

$$-f^* T^*_{SW} < 0.$$


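Therefore there exists a local minimum solution $(f^*, T^*_{SW})$ such that $f^* T^*_{SW} > 0$. We then introduce the Lagrangian $L(f, T_{SW}, \lambda_1, \lambda_2)$ for the relaxed problem P′:

$$
L(f, T_{SW}, \lambda_1, \lambda_2) = -fT_{SW}
+ \lambda_1 \left( \frac{K+1}{K}T_{SW} + G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - T_{D\_max} \right)
+ \lambda_2 \left( G\!\left(\frac{fC_{max}T_{SW}}{K}\right) - \frac{T_{SW}}{K} \right),
$$

where $\lambda_1$ and $\lambda_2$ denote Lagrange multipliers. From the KKT conditions for a local minimum solution $(f^*, T^*_{SW})$ such that $f^* T^*_{SW} > 0$, we have

$$
\left.\frac{\partial L}{\partial f}\right|_{(f, T_{SW}) = (f^*, T^*_{SW})}
= -T^*_{SW} + (\lambda_1 + \lambda_2)\, \frac{C_{max}T^*_{SW}}{K}\, G'\!\left(\frac{f^* C_{max}T^*_{SW}}{K}\right) = 0,
\tag{B.8}
$$

$$
\left.\frac{\partial L}{\partial T_{SW}}\right|_{(f, T_{SW}) = (f^*, T^*_{SW})}
= -f^* + \frac{K+1}{K}\lambda_1 - \frac{\lambda_2}{K} + (\lambda_1 + \lambda_2)\, \frac{f^* C_{max}}{K}\, G'\!\left(\frac{f^* C_{max}T^*_{SW}}{K}\right) = 0.
\tag{B.9}
$$

We then have from (B.8)

$$
(\lambda_1 + \lambda_2)\, \frac{C_{max}}{K}\, G'\!\left(\frac{f^* C_{max}T^*_{SW}}{K}\right) = 1,
\tag{B.10}
$$

from which and (B.9), it follows that

$$(K+1)\lambda_1 = \lambda_2. \tag{B.11}$$

Because $G'(x) > 0$ for all $x$ ($x \ge 0$), (B.10) implies $\lambda_1 + \lambda_2 > 0$. Thus, with (B.11), we conclude $\lambda_1 > 0$ and $\lambda_2 > 0$. Therefore both constraints in the relaxed problem P′ are active because of the KKT complementarity condition. As a result, it follows from (B.1) and (B.2) that the local minimum solution $(f^*, T^*_{SW})$ in P′ is determined uniquely by

$$
f^* = \frac{K+2}{C_{max}T_{D\_max}}\, G^{-1}\!\left(\frac{T_{D\_max}}{K+2}\right) > 0,
\qquad
T^*_{SW} = \frac{K}{K+2}\, T_{D\_max} > 0,
$$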

which give the global minimum solution $(f^*, T^*_{SW})$ in the relaxed problem P′. Furthermore, it is also the global minimum solution of the original problem P because $f^* > 0$ and $T^*_{SW} > 0$. □

References

[1] N. Duffield, C. Lund, M. Thorup, Charging from sampled network usage, in: Proceedings of ACM SIGCOMM IMW’01, 2001, pp. 245–256.
[2] Cisco Netflow, 2010. Available from: .
[3] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, F. True, Deriving traffic demands for operational IP networks: methodology and experience, IEEE/ACM Transactions on Networking 9 (2001) 265–280.
[4] T. Mori, T. Takine, J. Pan, R. Kawahara, M. Uchida, S. Goto, Identifying heavy-hitter flows from sampled flow statistics, IEICE Transactions on Communications E90-B (2007) 3061–3072.
[5] J. Mirkovic, P. Reiher, A taxonomy of DDoS attack and DDoS defense mechanisms, ACM SIGCOMM CCR 34 (2004) 39–53.
[6] V.A. Siris, F. Papagalou, Application of anomaly detection algorithms for detecting SYN flooding attacks, Computer Communications 29 (2006) 1433–1442.
[7] B. Krishnamurthy, S. Sen, Y. Zhang, Y. Chen, Sketch-based change detection: methods, evaluation, and applications, in: Proceedings of ACM SIGCOMM IMC’03, 2003, pp. 234–247.
[8] A. Kumar, J. Xu, Sketch guided sampling – using on-line estimates of flow size for adaptive data collection, in: Proceedings of IEEE INFOCOM, 2006, pp. 1–11.
[9] C. Barakat, G. Iannaccone, C. Diot, Ranking flows from sampled traffic, in: Proceedings of ACM CoNEXT’05, 2005, pp. 188–199.
[10] E. Cohen, N. Grossaug, H. Kaplan, Processing top k queries from samples, Computer Networks 52 (2008) 2605–2622.
[11] C. Estan, G. Varghese, New directions in traffic measurement and accounting: focusing on the elephants, ignoring the mice, ACM Transactions on Computer Systems 21 (2003) 270–313.
[12] Y. Lu, M. Wang, B. Prabhakar, F. Bonomi, ElephantTrap: a low cost device for identifying large flows, in: Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects (HOTI), 2007, pp. 99–108.
[13] D. Brauckhoff, B. Tellenbach, A. Wagner, M. May, A. Lakhina, Impact of packet sampling on anomaly detection metrics, in: Proceedings of ACM SIGCOMM IMC’06, 2006, pp. 159–164.
[14] J. Mai, A. Sridharan, C.-N. Chuah, H. Zang, T. Ye, Impact of packet sampling on portscan detection, IEEE Journal on Selected Areas in Communications 24 (2006) 2285–2298.
[15] E.D. Demaine, A. López-Ortiz, J.I. Munro, Frequency estimation of internet packet streams with limited space, in: Proceedings of the 10th Annual European Symposium on Algorithms, 2002, pp. 348–360.
[16] R.M. Karp, C.H. Papadimitriou, S. Shenker, A simple algorithm for finding frequent elements in streams and bags, ACM Transactions on Database Systems 28 (2003) 51–55.
[17] L. Golab, M.T. Özsu, Issues in data stream management, ACM SIGMOD Record 32 (2003) 5–14.
[18] J. Li, D. Maier, K. Tufte, V. Papadimos, P.A. Tucker, No pane, no gain: efficient evaluation of sliding-window aggregates over data streams, SIGMOD Record 34 (2005) 39–44.
[19] D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, S. Zdonik, Monitoring streams – a new class of data management applications, in: Proceedings of the 28th VLDB Conference, 2002, pp. 215–226.
[20] R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, R. Varma, Query processing, resource management, and approximation in a data stream management system, in: Proceedings of the 2003 CIDR Conference, 2003.
[21] S. Venkataraman, D. Song, P.B. Gibbons, A. Blum, New streaming algorithms for fast detection of superspreaders, in: Proceedings of Network and Distributed System Security Symposium (NDSS), 2005.
[22] W. Feller, An Introduction to Probability Theory and Its Applications, third ed., vol. 1, John Wiley & Sons, New York, 1968.
[23] Colby Walsworth, Emile Aben, K.C. Claffy, Dan Andersen, The CAIDA Anonymized 2009 Internet Traces – , 2010. Available from: .
[24] WIDE: the MAWI Working Group, 2010. Available from: .

Takanori Kudo received B.E. and M.E. degrees in information communication engineering from Osaka University. He is currently a Ph.D. candidate in Information and Communications Technology at Osaka University. His main research interests are design of trafﬁc measurement schemes and management of measured data.

Tetsuya Takine is currently a Professor in the Department of Information and Communications Technology, Graduate School of Engineering, Osaka University. His research interests include queueing theory, emphasizing numerical computation, and its application to performance analysis of computer and communication networks. He is now serving as an area editor of Operations Research Letters and an associate editor of Queueing Systems, Stochastic Models, and International Transactions in Operational Research. He received the Telecom System Technology Award from The Telecommunications Advancement Foundation in 2003 and 2010, and Best Paper Awards from ORSJ in 1997, from IEICE in 2004 and 2009, and from ISCIE in 2006. He is a fellow of ORSJ and a member of IEICE, IPSJ, ISCIE, and IEEE.
