

Wavelet based Denial-of-Service detection

Glenn Carl a, Richard R. Brooks b,*, Suresh Rai c,1

a Department of Electrical Engineering, The Pennsylvania State University, University Park, PA 16801, USA
b Holcombe Department of Electrical and Computer Engineering, Clemson University, 313-C Riggs Hall, P.O. Box 340915, Clemson, SC 29634-0915, USA
c EE Department, LSU, Baton Rouge, LA 70803, USA

article info

Article history:
Received 17 May 2005
Revised 30 June 2006
Accepted 3 August 2006

Keywords:
CUSUM
DDoS/DoS
Network security
Performance testing
Haar transform

abstract

Network Denial-of-Service (DoS) attacks that disable network services by flooding them with spurious packets are on the rise. Criminals with large networks (botnets) of compromised nodes (zombies) use the threat of DoS attacks to extort legitimate companies. To fight these threats and ensure network reliability, early detection of these attacks is critical. Many methods have been developed with limited success to date. This paper presents an approach that identifies change points in the time series of network packet arrival rates. The proposed process has two stages: (i) statistical analysis that finds the rate of increase of network traffic, and (ii) wavelet analysis of the network statistics that quickly detects the sudden increases in packet arrival rates characteristic of botnet attacks.

Most intrusion detection systems are tested using data sets from special security testing configurations, which leads to unacceptable false positive rates when they are used in the real world. We test our approach using data from both network simulations and a large operational network. The true and false positive detection rates are determined for both data sets, and receiver operating curves use these rates to find optimal parameters for our approach. Evaluation using operational data proves the effectiveness of our approach.

© 2006 Elsevier Ltd. All rights reserved.

1. Introduction

Network attackers have many goals. These include thrills for 'script kiddies' or intentional damage for employees, criminals, or political organizations. It is unlikely that exploitation of software errors and system misconfiguration will ever stop. The increasing number of always-on broadband connections administered by untrained users is likely to further this trend. Intrusion detection systems (IDS) identify and report security violations. Early detection supports attack mitigation before a system and its users are affected. Many commercial IDS are available. Newman et al. compared many of these commercial products and stated: ''One thing that can be said with certainty about network-based intrusion-detection systems is that they're guaranteed to detect and consume all your available bandwidth. ... Because no product distinguished itself, we are not naming a winner.'' This result expresses the danger of using systems with high false positive rates. Anyone familiar with Aesop's fable of ''The Boy Who Cried Wolf'' knows that systems with high false positive rates consume system resources when no

* Corresponding author. Tel.: +1 864 656 0920; fax: +1 864 656 1347. E-mail addresses: [email protected], [email protected] (R.R. Brooks), [email protected] (S. Rai).
1 Tel.: +1 225 578 4832; fax: +1 225 578 5200.
0167-4048/$ – see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.cose.2006.08.017

computers & security 25 (2006) 600–615

attacks exist, and when real attacks occur alarms are almost certain to be ignored.

This paper presents a new algorithm for detecting packet flooding Denial-of-Service (DoS) intrusions. For background on DoS IDS, readers are directed to the survey in Carl et al. (2006). Our approach combines the benefits of both statistical and wavelet analysis techniques. Statistical methods detect features in network traffic patterns that are indicative of DoS attacks. If properly tuned, statistical approaches can have relatively low false positive rates, but (as shown in Section 8) the time required to signal that a DoS attack is underway can be unacceptably long; attacks are signaled only after the damage has been done and users inconvenienced. The use of wavelets, on the other hand, can detect attacks quickly, but often at the cost of higher false positive rates. In Section 8 we show that, by combining both techniques, detection efficiency is better than with a pure wavelet approach and detection delay is lower than with purely statistical techniques. Our method can be tuned to fit the needs of a specific installation.

Another major issue with current intrusion detection approaches is the lack of effective testing. It is difficult to find operational networks for intrusion detection testing, since almost every enterprise is unwilling to tolerate the effects of a live attack on its network. The researcher can at best set up detection software on the network and hope an attack will occur. Scientific analysis of these results is almost impossible, since the researcher has no reliable ground truth for comparison. Most researchers resort to testing their solutions using network simulators like ns-2 (VINT Project), testbeds constructed for network intrusion research (McHugh, 2000), or honeypots. We will see in Section 8 that the background traffic in all of these systems is nowhere near as chaotic as the traffic in operational networks.
Tests that do not use operational networks almost always produce artificially low false positive rates. A fuller discussion of these issues can be found in Brooks (2005). The survey in Carl et al. (2006) focuses specifically on the testing of DoS detection software. In Sections 5–8 of this paper we present a new test methodology for validating DoS detection software. It has the strong points of both network simulation (or specialized test facility) testing and of using operational networks. In this approach, we use background traffic collected from a typical corporate setting and merge it with network DoS attack traffic data.

The contributions of this paper are twofold. First, we present a new method for detecting network flooding DoS attacks. The approach has lower false positive rates and shorter reaction times than competing approaches. Second, we evaluate our approach using techniques that insert attack data into traffic sets containing realistic background traffic. This testing approach goes beyond the current art and suggests ways that researchers can provide believable test results for intrusion detection systems.

The rest of the paper is organized as follows. Section 2 provides background on Denial-of-Service (DoS) attacks. Section 3 reviews previous work. Section 4 describes our wavelet-based DoS detection method. Sections 5–7 detail test data generation, detection method implementation, and testing procedures, respectively. Section 8 presents detection results. Finally, Section 9 discusses our results and concludes the paper.

2. Denial-of-Service (DoS) attacks

Denial-of-Service (DoS) attacks attempt to prevent legitimate use of a networked resource. Their target can be an entire network or an individual machine. Many kinds of attacks exist. UDP attacks saturate the network links and queues en route, starving the target of legitimate traffic through congestion. TCP SYN flood attacks consume end-system resources with numerous TCP connection requests (Ricciuli et al., 1999). Malicious nodes initiate, but do not complete, the three-way handshake TCP uses to initiate connections. This leaves many stale TCP connection requests on the victim. The connection is never completed, and continues to consume end-system resources until its eventual time out. During both attacks (UDP and TCP SYN flood) legitimate users are 'denied service.' This paper considers only DoS flooding attacks that induce network congestion. A TCP SYN flood targets queue management on the node and is outside the scope of this paper.

To increase attack effectiveness, Distributed Denial-of-Service (DDoS) flooding attacks are becoming increasingly common. DDoS flooding attacks start with an attacker compromising individual hosts. The attacker inserts a ''zombie'' process on the node that can be triggered remotely to send large volumes of network traffic to a victim node. The zombie hosts are typically aggregated into a large ''botnet'' that the intruder commands using attack software, like the Stacheldraht tool. These botnets can be massive, some containing over a million nodes. Frequently, botnets are used to threaten legitimate enterprises with DoS attacks and extort money (Information Week, 2005; MSNBC, 2004). When launched, the DDoS attack slows down or crashes the target. The target has no advance warning until an attack is initiated.

3. Previous work

There exist a number of DoS attack studies (Ricciuli et al., 1999; Paxson, 1999; Spatscheck and Petersen, 1999; Vigna and Kemmerer, 1999; Dittrich, 2000; Strother, 2000). Most of these studies address vulnerabilities or possible countermeasures, but few focus on attack detection. Xu and Lee (2003) isolate and protect web servers against DDoS attacks. They address the attack and countermeasure issue by using a game-theoretic framework that models the system's performance as the minimax solution between conflicting goals, i.e. their solution configures the server in the way that provides the best service possible while under attack. More recent reports (Anderson et al., 2004; Lakshminarayanan et al., 2004; Kreibich and Crowcroft, 2004) look at ways of preventing and constraining DoS attacks. Anderson et al. (2004) rely on the use of a 'send-permission-token' to restrict DoS attacks. Lakshminarayanan et al. (2004) have a host insert a packet filter at the last-hop IP router. The 'Honeycomb' technique (Kreibich and Crowcroft, 2004) uses a decoy computer, pattern-matching techniques, and protocol conformance checks to create intrusion detection signatures. Recently a number of companies, like Tipping Point, are commercializing intrusion prevention systems. Unlike


intrusion detection systems, which are typically software solutions, intrusion prevention systems use custom hardware that is situated inside the corporate firewall. They perform stateful monitoring of all data streams passing through the hardware, and remove packets that violate protocols. Since these are commercial offerings, their technologies have not been subject to peer review, and critical comparison of their performance is difficult to find. The marketing information that is available contrasts so extremely with independent evaluations, like the one in Newman et al., as to be suspect.

Attack detection research is limited. In Moore et al. (2001), IP traffic is monitored for unsolicited 'backscatter' packets. Backscatter packets are a non-collocated victim's response to specific attacks (e.g. TCP SYN flood, closed-port probe). This approach quantified global DoS activity at around 12,000 attacks over a three-week period. Allen and Marin (2004) use estimates of the Hurst parameter to identify attacks which cause a decrease in the traffic's self-similarity. This requires statistics of network traffic self-similarity before the attack. Network traffic is self-similar, in that it has been shown to be fractal (multi-fractal) under coarse (fine) scales (Leland et al., 1993; Abry et al., 2002). The high-speed core network is not self-similar, since the traffic in that domain tends to follow a Poisson distribution (Cao et al.). The approach in Allen and Marin (2004) is made more difficult by the lack of a widely accepted metric for computing the fractal dimension (self-similarity metric) of network traffic dynamics.

Two statistical methods of analyzing network traffic to find DoS attacks are provided in Feinstein et al. (2003). One monitors the entropy of the source addresses found in packet headers, while the other monitors the average traffic rates of the 'most' active addresses. In Barford et al.
(2002), wavelets are used to spectrally separate time-localized anomalous 'signals' from normal traffic 'noise.' In Wang et al. (2002) and Blazek et al. (2001), change-point detection analysis with a cumulative sum algorithm is used to identify traffic changes. Our method builds on the approach in Blazek et al. (2001). Our use of wavelets is similar to Barford et al. (2002), but we use wavelets for detecting change-points in the cumulative sum statistic and not for traffic signal decomposition. Section 4 describes our approach in detail and also provides insights into its precursors (Barford et al., 2002; Blazek et al., 2001).

4. DoS detection

IDS are classified as either signature or anomaly detection systems. Signature detection systems observe and compare patterns with known templates. If there is confidence in a template match, a known security breach is declared. Signature systems can be limited by the size and maturity of the template database. Most current virus scanners are template based. The drawback to this approach is that it can only detect known malware implementations; it is typically unable to identify brand new exploits. Anomaly detection systems compare observed patterns against a nominal model. Security violations are declared when observed patterns deviate from the expected model.

Our approach is an anomaly detection system. Anomaly detection systems can detect new attacks by observing their effect on the running system, but they also tend to have higher false alarm rates than template-based systems. Many traffic characteristics can be measured: rate, protocol, packet sizes, source/destination addresses, and others. For a TCP SYN attack, a system queue (or the number of half-open connections) can be measured. Protocol modifications have addressed TCP SYN attacks, making them unlikely; detecting these attacks is therefore not very interesting. For UDP flood attacks, the number of packets sensed at a monitoring location can be used to detect a DoS attack. Our approach for detecting packet flooding DoS attacks counts the number of incoming packets in uniform time intervals. This is applicable anywhere in the network. At the target or en route, a DoS attack is determined by looking for an abrupt, positive change in the number of incoming packets. Slow, ramping DoS attacks may avoid detection by our approach, but they are also less deadly, since their slow onset provides time for network administrators to detect them through normal network performance analysis.

4.1. Change-point detection

DoS attacks affect the statistical properties (e.g. mean, variance) of the arrival traffic, but with unknown magnitude and temporal variation. Target proximity, the number of zombies, and attack coordination are factors that modify these properties. We monitor the mean and variance of network traffic volume to identify the onset of a DoS attack. We assume that the mean of the baseline traffic varies smoothly, and that the change due to attack traffic is greater than normal changes in the baseline. Note that an effective attack will require a large increase in traffic volume in most realistic scenarios.(2) Any network link or end-system node can be monitored for DoS attack traffic. Let Npt(k) denote a time-series representing the number of recorded packets during discrete time interval k. The non-overlapping time intervals are of uniform length I; therefore Npt(k) can be considered the packet arrival rate during interval k. Interval length I is experimentally chosen such that the monitored series Npt(k) is neither smeared nor dominated by large fluctuations. If a DoS attack begins at time l, the packet time series Npt(k) will be characterized by a significant increase for k ≥ l. Let

(2) It does occur, though, that nodes are occasionally overwhelmed by legitimate traffic. This is sometimes known as a ''flash event'' or the ''Slashdot effect.'' It occurs when sites suddenly become overwhelmingly popular with their legitimate user community. For example, a new Linux distribution may be released, or a notice may be posted about an item on a popular website. This leads to a sudden increase in traffic that the server is unprepared to handle. It is almost impossible to tell the difference between flash events and DoS attacks by botnets, since the operational effect is the same: the server is overwhelmed by traffic and unable to perform its work. We feel that it is unrealistic to expect a network monitor to tell the difference between the two phenomena, since the difference between the two lies in the intentions of the users and not in the traffic generated.


Npt(k) be the superposition of normal (or baseline) traffic N0pt(k) and attack traffic N1pt(k):

  Npt(k) = N0pt(k)             for 0 < k < l
  Npt(k) = N0pt(k) + N1pt(k)   for k ≥ l                         (1)

Assuming N0pt(k) and N1pt(k) are independent, the average rate m(k) of Npt(k) is the sum of the individual traffic means m0(k) and m1(k):

  m(k) = m0(k)             for 0 < k < l
  m(k) = m0(k) + m1(k)     for k ≥ l                             (2)

Due to the DoS attack, the average arrival rate m(k) will increase for k ≥ l. To identify this increase, change-point detection based on a cumulative sum (CUSUM) statistic is used. The CUSUM algorithm can be described as repeated testing of the null and alternative hypotheses (Basseville and Nikiforov, 1993) through a sequential probability ratio test (SPRT). For time-series Npt(k), let P0(Npt(k)) and P1(Npt(k)) denote the probability of no-change and change, respectively. The SPRT process is then

  Ssprt(k) = [Ssprt(k − 1) + log(P1(Npt(k)) / P0(Npt(k)))]+,  Ssprt(0) = 0     (3)

where (x)+ = max(0, x). The decision rule accepts the null hypothesis (no change) when Ssprt(k) ≤ h, where h is a positive number. Conversely, when Ssprt(k) > h, the alternative hypothesis (change) is accepted. If γ is a given false alarm rate, then the optimal threshold h is given by:

  P0(Ssprt(k) > h) = γ                                            (4)

The CUSUM algorithm in Eq. (3) is optimal when the time-series Npt(k)'s pre-change and post-change distributions are independent and identically distributed. Since no model of the arrival process Npt(k) is available, the i.i.d. assumption of Eq. (1) does not hold. Furthermore, the pre-attack and post-attack change distributions, P0(Npt(k)) and P1(Npt(k)), cannot be determined. The CUSUM of Eq. (3) therefore cannot be applied directly to time-series Npt(k) to detect a DoS attack. Blazek addressed this issue by proposing the following non-parametric CUSUM-based method (Blazek et al., 2001):

  S(k) = [S(k − 1) + Npt(k) − m(k)]+,  S(0) = 0                   (5)

where m(k) is the estimated average packet rate given by the recursion, for 0 ≤ ε < 1:

  m(k) = ε·m(k − 1) + (1 − ε)·Npt(k),  m(0) = 0                   (6)

S(k) is sensitive to changes in the input Npt(k). Before a change, S(k) is a zero-mean process with bounded variance. After a change, S(k) will have a larger variance due to the added attack traffic, and will become a positive process with growing variance. At each k, S(k) is the cumulative sum of the difference between the current and (estimated) average packet rates. S(k) increases (decreases) when the packet rate increases (decreases) more quickly than the average rate. The non-linear operator (·)+ limits S(k) to positive values, since decreasing arrival rates are of no significance in detecting a DoS attack. To reduce high-frequency noise in the arrival process, Npt(k) is low-pass filtered using a windowed averaging technique, for 0 < a ≤ 1:

  Ñpt(k) = a·Npt(k) + (1 − a)·Ñpt(k − 1),  Ñpt(0) = 0             (7)

Substituting Ñpt(k) from Eq. (7) for Npt(k) in S(k) leads to a modified CUSUM algorithm:

  S̃(k) = [S̃(k − 1) + Ñpt(k) − m(k) − c]+,  S̃(0) = 0               (8)

where Ñpt(k) is the windowed average of Npt(k) and c = ce·m(k) is a correction factor. Parameter ce reduces the variance of both noise and slowly varying traffic by removing their contribution to the CUSUM. Also, c decays S̃(k) back toward zero during periods of normal traffic activity. Parameters ε and (1 − a) determine the amount of past history captured in m(k) and Ñpt(k), respectively; large values of ε and (1 − a) imply long memory dependence. In Section 8, the values of ε, a, and ce are experimentally determined. The CUSUM statistic follows the incoming packet rate increase due to a DoS attack (Fig. 1b). When the CUSUM exceeds an appropriate threshold, a DoS attack is detected (Blazek et al., 2001).

4.2. Wavelet analysis of CUSUM statistic

DoS threshold settings and false alarm rates for a CUSUM detection approach are provided in Blazek et al. (2001). To lower false alarm rates and increase the true detection rate, our

Fig. 1 – (a) ns-2 packet time series with a DoS attack starting at k = 6 × 10^4; (b) CUSUM statistic S̃(k); (c) 6th-level wavelet decomposition coefficient (d6,0) of the CUSUM statistic S̃(k).


approach uses wavelet analysis of the modified CUSUM statistic of Eq. (8). Wavelet analysis describes input signals in terms of frequency components rather than statistical measures. Wavelets provide a concurrent time and frequency description, which can determine the time at which certain frequency components are present. These abilities lower detection time and increase accuracy through time localization and filtering capabilities, respectively. Fourier analysis, in contrast, only indicates whether certain frequency components are present and is better suited to periodic signals whose frequency content is stationary. The property of vanishing moments gives some wavelets the ability to suppress polynomial signals. Packet time-series contain many traffic dynamics, most of which are non-periodic or periodic over time-varying intervals; wavelets are therefore well suited for analysis of packet time-series.

Wavelets are oscillating functions defined over small intervals of time. The family of discrete (orthogonal) wavelet functions used for analysis is

  Ψj,l(k) = 2^(j/2) Ψ(2^j (k − l)),   l, j ∈ Z                    (9)

where j is the scaling index, l is the time translation index, k is the time interval, and Ψ is the mother wavelet. We used the Haar 'mother wavelet' for its simplicity. Scale j is bounded by 2^j ≤ M, where M is the number of signal samples available at k. Wavelet Ψj,l(k) is a zero-mean bandpass function of scale j, localized at time interval k. The effect of j is to compress the time window over which the mother wavelet Ψ is defined. In the frequency domain, time compression is equivalent to stretching the wavelet spectrum and shifting it by 2^j (Jaffard, 2001). The higher the value of the scaling (wavelet decomposition level, WDL) j, the smaller the widths of the bandpass filters, resulting in finer frequency analysis. Adjustment of parameters j and l determines the spectral range of the wavelet filter Ψj,l(k). Wavelet analysis is performed through the discrete wavelet transform (DWT). For a signal X(k), the DWT is defined through the inner product of the signal and a wavelet:

  DWTX( j, l) = dj,l(k) = ⟨X(k), Ψj,l(k)⟩                        (10)

The values dj,l(k) are the discrete wavelet coefficients of X(k) at time position l and scale j, where l = 0, …, 2^j − 1; they capture the localized energy present in the wavelet's bandpass range. If the coefficients contain 'enough' energy in a spectral range of interest, then a signal composed of these frequency components is declared to be present. Haar wavelets calculate their coefficients iteratively through averaging and differencing of two data values. From these two operations, low-pass (average) and high-pass (difference) signal characteristics are stored in the Haar wavelet coefficients (Nievergelt, 1999). Depending on the scale j, the data values are either from the input signal ( j = 1) or previously obtained wavelet coefficient values ( j > 1). As an example, high-pass filter coefficient dj,0 is the difference between the two coefficients containing the average signal value over [k, (k + 2^j)/2] and [(k + 2^j)/2, k + 2^j], respectively. Coefficient dj,0 denotes the average amount of signal change over a time window of 2^j samples. Wavelet analysis can separate a time-varying signal of interest from 'noise' through localized filtering. At each time instance, wavelet analysis captures the energy of the active

frequency components in multiple, non-overlapping spectral windows. Ideally, the signal and noise components will be captured in separate spectral windows. In the detection method of Barford et al. (2002), wavelet analysis is used to separate a DoS 'signal', N1pt(k), from background traffic 'noise', N0pt(k), both of which are present in the input time series Npt(k) of Eq. (1). The DoS signal N1pt(k) is an abrupt, positively increasing signal. Abrupt signal changes and noise contain most of their energy in high-frequency components. For packet time-series Npt(k), DoS attack detection by direct wavelet analysis may therefore be susceptible to high failure rates. If Npt(k)'s high-frequency noise is nontrivial, the DoS signal may not be detectable through the coefficients, due to a low signal-to-noise ratio (Wang, 1995). Analysis of the coefficients' values may then lead to high false alarm rates or missed detections. Thus, to improve detection efficiency, change-point detection via wavelet analysis of the modified CUSUM statistic, S̃(k), is performed. In S̃(k), Npt(k)'s arrival noise is filtered through parameters a and ce, enhancing the DoS change-point.

DoS attacks cause a sharp increase in the CUSUM statistic S̃(k), followed by a linear increase. Detecting this event is equivalent to change-point detection. Wavelets have been successfully applied to change-point detection in time series (Basseville and Nikiforov, 1993; Antoniadis and Gijbels, 1997; Wang, 1999; Wang, 1995; Lavielle, 1999), and generally outperform traditional statistical methods. By appropriate selection of scale j, the change-point's signal energy is concentrated in a few coefficients. Our approach uses wavelets to identify S̃(k)'s change-point due to a DoS attack. With a vanishing moment of one, the Haar wavelet coefficients dj,l capture abrupt and linear increases of the CUSUM. When dj,l is larger than a threshold value, a DoS attack is detected. In Fig. 1c, the wavelet coefficient (d6,0) of the CUSUM is shown. d6,0 significantly increases at the CUSUM change, which is due to a DoS attack at interval k = 6 × 10^4.
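To make the two-stage method concrete, the following is a minimal, self-contained Python sketch of the modified CUSUM of Eqs. (6)–(8) followed by a windowed Haar detail coefficient and threshold test. It is an illustration, not the paper's C implementation: all parameter values (ε, a, ce, window size, threshold) are placeholders rather than the tuned settings of Section 8, and the sign convention of the Haar coefficient is chosen here so that an increase yields a positive value.

```python
import math

def cusum_series(npt, eps=0.9, a=0.5, ce=0.1):
    """Modified non-parametric CUSUM (after Blazek et al.):
    m(k)    = eps*m(k-1) + (1-eps)*Npt(k)                # Eq. (6), rate estimate
    Ntil(k) = a*Npt(k) + (1-a)*Ntil(k-1)                 # Eq. (7), smoothed input
    S(k)    = max(0, S(k-1) + Ntil(k) - m(k) - ce*m(k))  # Eq. (8), with c = ce*m(k)
    """
    m = ntil = s = 0.0
    out = []
    for x in npt:
        m = eps * m + (1 - eps) * x
        ntil = a * x + (1 - a) * ntil
        s = max(0.0, s + ntil - m - ce * m)
        out.append(s)
    return out

def haar_detail(window):
    """Haar coefficient d_{j,0} over a window of 2**j samples: the scaled
    difference between second-half and first-half averages (sign chosen so
    an increasing signal yields a positive value; conventions vary)."""
    n = len(window)
    half = n // 2
    first = sum(window[:half]) / half
    second = sum(window[half:]) / half
    return (second - first) * math.sqrt(n) / 2

def detect(npt, window=64, threshold=50.0):
    """Return the index of the last interval of the first window whose
    Haar coefficient of the CUSUM statistic exceeds the threshold,
    or None if no alarm fires."""
    s = cusum_series(npt)
    for k in range(window, len(s) + 1):
        if haar_detail(s[k - window:k]) > threshold:
            return k - 1
    return None
```

On a flat series the correction term c = ce·m(k) drains S̃(k) to zero and no alarm fires; a sustained step in the packet rate makes S̃(k) grow and the windowed coefficient crosses the threshold a few intervals after onset.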

5. Packet time-series collection

Test data were collected from multiple sources: (i) network simulators, (ii) DARPA evaluation test sets (MIT), (iii) isolated laboratory testing, and (iv) live data captured from operational networks. None of (i), (ii), or (iii) provides test sets representative of operational networks, for reasons including: poor mathematical modeling (Floyd and Paxson, 2001; Willinger and Paxson, 1998), limited topology configurations, short data lengths, and low traffic rates (McHugh, 2000). In particular, the background traffic does not reflect naturally occurring traffic dynamics; higher amounts of burstiness and time-of-day variations are needed (Leland et al., 1993). Accurate modeling of realistic network traffic is an open question. We present limited results based on synthetic data and focus our evaluation on live captured data.

5.1. ns-2 simulation

Simulated time series were constructed using the ns-2 simulator (VINT Project). Our ns-2 simulations had 75 nodes, 15 of which could be DoS attack victims. The topology contained highly connected interior nodes surrounded by simply


connected edge nodes. The interconnection between interior nodes used high bandwidth links (100–233 Mbps). Low bandwidth (10–33 Mbps) connections were used between the edge and interior nodes. Zombies were edge nodes. Targets were highly connected interior nodes. This topology is qualitatively similar to the Internet (Brooks, 2005).

Background traffic was a mix of bursty and constant bit rate (CBR) TCP traffic. The bursty traffic used a Pareto distributed random variable, a packet size of 2048 bytes, a burst time of 10 ms, and a shape parameter of 1.5. Background CBR traffic was built with 2048 byte packets every 10 ms. DoS attacks generated CBR traffic of 404 byte UDP packets every 0.5 ms. Fig. 1 shows a UDP flood attack produced by 20 zombies targeting a single victim. The time-series time interval is 10 ms. The attack produces an abrupt 300% traffic increase at interval k = 6 × 10^4. Also notable is the lack of burstiness in the ns-2 background traffic, an issue common to synthetic time-series.

5.2. Live network traffic

To address our concerns with simulated data, live TCP data were collected from Penn State University's operational network. Traffic was passively collected at a subnet firewall using tcpdump (TCPDump Group). Test data were collected for two weeks, yielding 119 packet time-series. Each sample averaged 8 h in length. Privacy restrictions allowed only the counting of TCP packets and their arrival timestamps. From each recording, a time-series counted the number of TCP packets within consecutive, uniformly sized time intervals of 1 s (Fig. 2a).
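The per-interval counting used to form these time-series can be sketched in a few lines of Python; the function name and the sample timestamps below are our own illustration, not the paper's tooling:

```python
def packet_rate_series(timestamps, interval=1.0):
    """Bin packet arrival timestamps (in seconds) into consecutive,
    uniformly sized intervals of length `interval`, yielding the
    per-interval packet counts Npt(k)."""
    if not timestamps:
        return []
    t0 = min(timestamps)
    n_bins = int((max(timestamps) - t0) // interval) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - t0) // interval)] += 1
    return counts

# Hypothetical arrivals (seconds), binned at I = 1 s:
# packet_rate_series([0.1, 0.4, 0.9, 1.2, 2.7, 2.8, 2.9]) -> [3, 1, 3]
```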

5.3. DoS attack modeling

Our 'live' time-series have higher traffic rates and burstiness than the ns-2 simulations (Fig. 1a). We did not identify any DoS flood events in the raw data. Like the testing approach in Feinstein et al. (2003), we augmented operational time-series with a DoS attack modeled as an abrupt traffic surge or step function of random length. Additionally, randomly selected sections of other time-series were added as noise during the attack's duration (see Appendix A). Actual DoS attack data would be preferable, and we are continuing our monitoring efforts to collect data samples containing actual network attacks and representative background traffic.

Fig. 3 shows a live time-series containing a model DoS attack of scale 7 (see Appendix A). Our DoS event is qualitatively similar to the ns-2 (Fig. 1) and DARPA/MIT (Fig. 4, MIT) simulations. The DoS attack is a steep increase in average packet rate, overlaid on operational traffic of high variance and rate. These traffic characteristics are not present in the synthetic data (ns-2, DARPA/MIT), and are representative of operational Internet traffic (Floyd and Paxson, 2001; Willinger and Paxson, 1998; McHugh, 2000).
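A simplified stand-in for this augmentation can illustrate the idea (the actual construction is given in Appendix A; the function name, window choices, and parameter names below are ours):

```python
import random

def inject_attack(series, noise_pool, scale, seed=7):
    """Overlay a model DoS attack on a live packet-rate series:
    a step of height (scale - 1) * mean over a random-length window,
    plus a randomly chosen slice of another series added as noise.
    Returns the augmented series and the attack's start and length."""
    rng = random.Random(seed)
    out = list(series)
    mean = sum(series) / len(series)
    length = rng.randint(len(series) // 10, len(series) // 3)
    start = rng.randint(0, len(series) - length)
    noise_at = rng.randint(0, len(noise_pool) - length)
    for i in range(length):
        out[start + i] += (scale - 1) * mean + noise_pool[noise_at + i]
    return out, start, length
```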

6. Detection method implementation

The CUSUM and wavelet analysis algorithms were implemented in C. Sun (Windows) platform support is provided through the PCAP (WinPcap) libraries. ns-2 or tcpdump file formats are accepted. All input time-series used in testing were in saved file format. PERL and MATLAB scripts automated testing, analysis, and data presentation.

Fig. 2 – (a) Live packet time series; (b) CUSUM statistic; (c) wavelet decomposition coefficient d6,0.

7. Testing procedures

Four test data sets of multiple time series were formed. One data set was from ns-2 simulations and three from live time-series. The sizes of the ns-2 and live test data sets were 8 and 238 time-series, respectively. Half of the time-series within each data set contain a single DoS attack; the remaining time-series contain no attacks. Each data set is summarized in Table 1. The column 'scale' represents the multiplicative factor by which the average packet rate increases due to a DoS attack. For the SNS2 data set, scale was three; the DoS attack by 20 zombies produces traffic (30 packets per second) that is triple the background rate (10 packets per second). In the live data sets, the average traffic rate increase caused by a DoS attack is determined by the


Fig. 3 – Live time-series with a modeled DoS attack. (a) Time series; (b) CUSUM statistic; (c) coefficient d10,0.

Fig. 4 – DARPA/MIT synthetic time-series.

parameter scale (Appendix A). Data sets SLIVE4, SLIVE7, and SLIVE10 were constructed using scale values of 4, 7, and 10, respectively. Each time-series s within data set Si, where i ∈ {ns-2, LIVE4, LIVE7, LIVE10}, was evaluated by the two-stage detection method. The algorithm is adjusted by a set of operating parameters P. The elements of set P include:

- Wavelet decomposition level (WDL),
- Wavelet coefficient threshold (T),
- CUSUM estimated packet rate memory (ε),
- CUSUM local averaging memory (a), and
- CUSUM noise correction factor (ce).

The detection method outputs the number of DoS events detected within the time-series s under test. This is equivalent to the number of times the wavelet decomposition coefficients exceed the wavelet threshold T. The wavelet decomposition level and threshold are defined in set P. DoS analysis of the data sets was performed over multiple sets of P. In each testing iteration of time-series s, only one element of P was varied while the others remained constant. The range of the single parameter variation was determined through experimentation. For any test data set S_i, where i ∈ {ns-2, LIVE4, LIVE7, LIVE10}, the optimum detection rate is achieved under those conditions P which jointly maximize the true detection percentage p_true^{S_i}(P) and minimize the false positive percentage p_false^{S_i}(P). The derivation of p_true^{S_i}(P) and p_false^{S_i}(P) is provided in Appendix B. The set P = P* under which p_true^{S_i}(P) is maximized and p_false^{S_i}(P) is minimized is the detector’s optimum operating point. The values p_true^{S_i}(P*) and p_false^{S_i}(P*) represent the detector’s best overall average performance.
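Counting DoS alerts as threshold exceedances of the wavelet coefficients, as described above, can be sketched as follows; each upward crossing of T is counted as one event. The coefficient values and threshold below are illustrative only.

```python
def count_alerts(coeffs, T):
    """Count DoS alerts: the number of times the wavelet coefficient
    series rises above threshold T (each upward crossing of T is
    counted as a single event)."""
    alerts = 0
    above = False
    for d in coeffs:
        if d >= T and not above:
            alerts += 1       # a new excursion above the threshold
        above = d >= T
    return alerts
```

For example, a coefficient series that climbs above T, falls back, and climbs above it again yields two alerts regardless of how long each excursion lasts.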

Table 1 – Test data sets

Label      Source  Number of time-series  Scale  Number of DoS events
S_NS2      ns-2    8                      3      4
S_LIVE4    Live    238                    4      118
S_LIVE7    Live    238                    7      118
S_LIVE10   Live    238                    10     118


8.

Testing results

Receiver Operating Curves (ROC) illustrate our test results. The detection method’s false positive vs. true detection percentage is plotted against a varying parameter from P. The ROC illustrate the performance gain between detection methods, the effects of parameter variations, and the best possible detection efficiency. Each ROC is built from the values of the pair (p_false^{S_i}(P), p_true^{S_i}(P)). A datapoint represents the likelihood of detecting a true DoS attack or a false positive. Probabilities were determined by testing all data set samples under a common parameter set P. The datapoint with minimum distance to the upper left point (0, 1) represents the best possible detection efficiency. Each test iteration varied one parameter value from P. At the most efficient value of (p_false^{S_i}(P), p_true^{S_i}(P)), the varying element from P is recorded. This parameter value is the best possible and becomes an element of the set P*. For subsequent parameter variation testing of the same data set, parameters defined in P* are used in P. After all parameters of P have been exercised, the set P* is complete. Set P* determines the detection method’s operating parameters, which jointly produce the maximum true and minimum false positive rates.
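Selecting the most efficient datapoint — the one with minimum Euclidean distance to the ideal corner (0, 1) — can be sketched as follows, with detection rates expressed as fractions and hypothetical ROC points.

```python
import math

def best_operating_point(roc_points):
    """Return the (false_rate, true_rate) pair, rates as fractions,
    with minimum Euclidean distance to the ideal ROC corner (0, 1)."""
    return min(roc_points,
               key=lambda p: math.hypot(p[0] - 0.0, p[1] - 1.0))
```

The parameter value that produced the winning point is then recorded into P*, as described above.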

8.1.

ns-2 simulations

For ns-2 data sets, detection efficiency was ideal (Fig. 5). All DoS attacks within the data set were correctly detected with zero false positives. This is visually evident as each ROC plot has a datapoint at (0, 1), representing p_false^{ns-2}(P) = 0% and p_true^{ns-2}(P) = 100%. The set P corresponding to this datapoint is the best

Fig. 5 – Efficiency vs. wavelet threshold for ns-2 simulations. Varying parameters: wavelet threshold (T) and wavelet decomposition level (WDL); static parameters: CUSUM estimated average memory (e) = 0.97, CUSUM local averaging memory (a) = 0.99, CUSUM correction factor (ce) = 0.20.


Table 2 – ns-2 parameter set P*

Parameter                            Setting
Wavelet decomposition level (WDL)    {5, 6, 7}
Wavelet coefficient threshold (T)    [30, 130]
CUSUM estimated average memory (e)   0.97
CUSUM local averaging memory (a)     0.99
CUSUM correction factor (ce)         0.20

possible parameter set P*. In Fig. 5, each ROC trace uses a unique setting for the wavelet decomposition level (WDL) over a varying wavelet threshold (T). CUSUM parameter variations (e, a, ce) were not investigated, as no increase over the already ideal detection performance could be gained. For the ns-2 data set, the optimal parameter set P* is stated in Table 2. Detection efficiency under set P* is ideal. This ideal performance is likely an artifact of inadequate ns-2 traffic modeling, as discussed in earlier sections.

8.2.

Live traffic data sets

Testing results from the live data sets are more representative of the detection method’s field performance. Affected by realistic background traffic, the detector is more challenged, and the true and false positive detection percentages are less than ideal. Discussion of the distinctly high false positive rates is deferred until Section 9.

8.2.1.

Statistical (CUSUM) vs. wavelet thresholding

Wavelet analysis of the CUSUM statistic is proposed to obtain higher detection performance. To support this claim, live data

Fig. 6 – Efficiency vs. CUSUM/wavelet thresholding. Varying parameters: wavelet threshold (T), CUSUM threshold; static parameters: wavelet decomposition level (WDL) = 6th, CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.97, CUSUM correction factor (ce) = 0.20.


sets were analyzed under both CUSUM and wavelet thresholding. For CUSUM thresholding, DoS detection is equivalent to S̃(k) exceeding a threshold T_CUSUM, which is the basis for detection in Blazek et al. (2001). Wavelet thresholding evaluates the coefficients d_{j,l}(k), obtained from wavelet analysis of S̃(k), against the wavelet threshold T. Fig. 6 is a ROC with two traces. The lower trace provides detection likelihoods under CUSUM threshold variations, while the upper trace denotes wavelet threshold variations. Datapoints on the curves are measures of average detection efficiency. An increase in either threshold T or T_CUSUM can lower both the true detection and false detection percentages. The distance between the two curves is of interest. The data set was S_LIVE4. Datapoints for the wavelet thresholding trace lie closer to the ideal point (0, 1): higher detection efficiency is achieved from the additional wavelet processing of the CUSUM statistic. Extracted from the ROC of Fig. 6, Table 3 shows the maximum average detection rates for each analysis method. The ratio of true detection to false positive percentage for the wavelet analysis is 56% higher than for CUSUM processing alone. Wavelet analysis also has equal or greater average true detection rates for a given false positive rate. This was also seen for data sets S_LIVE7 and S_LIVE10 (not shown).

8.2.2.

Variation of the wavelet decomposition level (WDL)

Fig. 7 – Efficiency vs. wavelet decomposition. Varying parameters: wavelet threshold (T) and wavelet decomposition level (WDL). Static parameters: CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.97, CUSUM correction factor (ce) = 0.20.

Wavelet analysis of the CUSUM statistic provides better detection efficiency than CUSUM analysis alone. Wavelet analysis is also capable of enhanced change-point detection through its scaling parameter, the wavelet decomposition level (WDL). WDL equals the value of j in Eq. (9). Fig. 7 shows the detection analysis of the S_LIVE4 data set over variation of the wavelet decomposition level (WDL) and threshold (T). Each point represents a different wavelet threshold value, whereas each curve represents a different wavelet decomposition level (WDL). The 6th level of decomposition was the baseline, since it was determined earlier to be better than CUSUM thresholding. Increasing the WDL produces datapoints closer to (0, 1), and thus better detection ratios. A maximum is reached at WDL = 10. Further WDL increases have an adverse effect, producing datapoints which lie below the 6th level. Higher WDL performs poorly in low signal-to-noise environments (Wang, 1995). Table 4 summarizes the true detection and false positive rates for WDL = 6, 10, and 12.
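A Haar decomposition to a given level via Mallat's pyramid algorithm can be sketched as below. The unnormalized sums/differences and the sign convention (later sample minus earlier) are our assumptions for illustration; under them, coefficient magnitudes grow with level, which is consistent with the large thresholds reported in this section.

```python
def haar_detail(signal, level):
    """Detail coefficients d_{level,l} of an unnormalized Haar pyramid:
    pairwise sums form the next approximation; pairwise differences
    (odd-index minus even-index sample) are the detail coefficients."""
    approx = list(signal)
    detail = []
    for _ in range(level):
        if len(approx) < 2:
            raise ValueError("signal too short for this decomposition level")
        detail = [approx[2*i + 1] - approx[2*i] for i in range(len(approx) // 2)]
        approx = [approx[2*i + 1] + approx[2*i] for i in range(len(approx) // 2)]
    return detail
```

A step increase located mid-signal concentrates in a single high-level coefficient: for a length-16 series that jumps from 0 to 1 at its midpoint, the details are all zero up through level 3, while at level 4 the single coefficient d_{4,0} equals 8.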

8.2.3.

Detection efficiency uncertainty

The uncertainty of the average detection rates is defined by e_true^{S_i}(P) and e_false^{S_i}(P) (see Appendix B). This is shown by the error bars superimposed on the ROC datapoints of Fig. 8. For low threshold values, the error bars show about a 10% spread around the datapoint. The horizontal error bars representing false positive uncertainty increase as the threshold is lowered. At lower values of the wavelet threshold, noise is more likely to cause a false positive, leading to more spread in detection uncertainty.

8.2.4.

Variation of DoS attack strength

The detection method was evaluated against multiple DoS scales. DoS scale is a measure of the traffic increase incurred during an attack; the higher the scale, the stronger the attack. Data sets S_LIVE4, S_LIVE7, and S_LIVE10 were constructed with DoS scale factors of 4, 7, and 10, respectively (Appendix A). Fig. 9 shows detection results for WDL = 10 across the three live test data sets (S_LIVE4, S_LIVE7, S_LIVE10). As expected, when the DoS scale is increased, the ROC traces approach the upper left corner, indicating better detection efficiency. The kink in the trace of S_LIVE10 needs further

Table 3 – Wavelet vs. CUSUM best detection rate

Analysis                      True detection rate (%)  False positive rate (%)  Detection ratio
CUSUM only                    15                       18                       0.83
CUSUM with wavelet            40                       30                       1.3
Detection ratio increase (%)                                                    56

Table 4 – WDL variation on detection percentages

WDL   Threshold  True detection rate (%)  False positive rate (%)  Ratio
6th   43,000     39                       30                       1.3
10th  30,000     47                       21                       2.2
12th  6000       30                       39                       0.7


Fig. 8 – Efficiency vs. wavelet thresholding. Varying parameter: wavelet threshold (T); static parameters: CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.97, CUSUM correction factor (ce) = 0.20.

Fig. 9 – Efficiency vs. DoS scaling. Varying parameter: wavelet threshold (T); static parameters: wavelet decomposition level (WDL) = 10th, CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.97, CUSUM correction factor (ce) = 0.20.

investigation. Table 5 shows the maximum detection efficiency per DoS scale variation.

All parameters for the statistical (CUSUM) and wavelet analysis have been evaluated. Table 6 states the resulting parameter set P* acquired from testing our live data sets.

8.2.5.

Variation of CUSUM parameters

The first stage of processing used the CUSUM statistic of Eq. (8) with three parameters: estimated average memory (e), local averaging memory (a), and a noise correction factor (ce). Each of these parameters was varied to determine its effect on detection efficiency. The S_LIVE7 data set was used, along with the best possible wavelet parameters determined from previous sections: WDL = 10, wavelet threshold T = 30,000.

(a) Estimated average memory (e): parameter e determines the amount of past history used in estimating the average packet rate m(k), which is a dominant term in the CUSUM statistic S̃(k). The effect of varying e on detection efficiency is shown in Fig. 10. The best setting for parameter e is 0.98811.

(b) Local averaging memory (a): parameter a determines the amount of past history used to filter arrival noise within the input packet time-series Npt(k). The effect of varying a on detection efficiency is shown in Fig. 11. Below 0.11, the true detection and false positive rates linearly approach 0. Little variation in true detection rate is seen for a values between 0.22 and 1. The best value for a is 0.22.

(c) Correction factor (ce): parameter ce reduces the variance of the input traffic on the CUSUM statistic. Its effect on detection efficiency is shown in Fig. 12. The optimum setting for ce is about 0.13. The removal of noise and variance from the arrival process by parameter ce has a positive effect on detection percentages.
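Eq. (8) itself is not reproduced in this section, but a CUSUM-style statistic in the spirit of Blazek et al. (2001), with the three parameters above, can be sketched as follows. The exact update rule — EWMA estimates for the long-term rate m(k) and the smoothed local rate, and ce discounting the accumulated excess — is our assumption for illustration, not the paper's Eq. (8).

```python
def cusum_series(packets, e=0.988, a=0.97, ce=0.13):
    """Illustrative CUSUM-style statistic: an EWMA with memory e
    estimates the long-term packet rate m(k), an EWMA with memory a
    smooths the instantaneous rate, and positive deviations
    (discounted by the noise correction ce) are accumulated."""
    mu = packets[0]          # estimated average rate m(k)
    local = packets[0]       # noise-filtered local rate
    s = 0.0
    out = []
    for x in packets:
        mu = e * mu + (1 - e) * x
        local = a * local + (1 - a) * x
        s = max(0.0, s + local - mu - ce * mu)   # accumulate excess rate
        out.append(s)
    return out
```

On a constant-rate series the statistic stays at zero; after an abrupt rate step, the faster local average outruns the slow estimate and the statistic grows, which is the change-point behavior the wavelet stage then sharpens.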

8.3.

Detection delay

Another useful performance metric is detection delay. For a time series in which both the CUSUM and wavelet thresholding indicated a DoS attack, a delta delay measurement was recorded. Detection delay is only valid when both the wavelet and CUSUM thresholds were crossed. Let k1 and k2 indicate the time intervals at which the wavelet and CUSUM processing, respectively, first declared a DoS attack:

k1 = min_z { z ∈ Z+ : d_{j,0}(z) ≥ T }   (11)

k2 = min_z { z ∈ Z+ : S̃(z) ≥ T_CUSUM }   (12)

where j is the wavelet decomposition level. The delta delay measurement is defined as d = k2 − k1. A positive d indicates how much earlier the wavelet analysis detected the DoS event relative to the CUSUM-only processing.
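Eqs. (11) and (12) and the delta delay d can be sketched directly; the series and thresholds below are illustrative.

```python
def first_crossing(series, threshold):
    """Smallest index z with series[z] >= threshold (Eqs. 11-12),
    or None if the threshold is never crossed."""
    for z, v in enumerate(series):
        if v >= threshold:
            return z
    return None

def delta_delay(wavelet_coeffs, cusum_stat, T, T_cusum):
    """Delta delay d = k2 - k1: positive when the wavelet analysis
    detects the DoS event earlier than CUSUM-only processing."""
    k1 = first_crossing(wavelet_coeffs, T)
    k2 = first_crossing(cusum_stat, T_cusum)
    if k1 is None or k2 is None:
        return None           # delay only defined when both detect
    return k2 - k1
```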

Table 5 – DoS scale performance

DoS scale  Data set  True detection rate (%)  False positive rate (%)  Ratio
4          S_LIVE4   46                       21                       2.1
7          S_LIVE7   68                       24                       2.8
10         S_LIVE10  78                       25                       3.1


Fig. 10 – Efficiency vs. e variation. Varying parameter: CUSUM estimated average memory (e). Static parameters: wavelet decomposition level (WDL) = 10th, wavelet threshold (T) = 30,000, CUSUM local averaging memory (a) = 0.97, CUSUM correction factor (ce) = 0.20.

Fig. 11 – Efficiency vs. a variation. Varying parameter: CUSUM local averaging memory (a); static parameters: wavelet decomposition level (WDL) = 10th, wavelet threshold (T) = 30,000, DoS scale = 7, CUSUM estimated average memory (e) = 0.988, CUSUM correction factor (ce) = 0.20.

The set of d measurements was captured from the S_LIVE4 data set. The number of samples was limited (<60), due to low CUSUM detection rates (<25% in Fig. 6). For WDL = 6, the histogram of Fig. 13 shows the distribution of d delays. The median d delay gain is 11 s, with a granularity of the sampling interval (1 s). Eighty-six percent of the DoS attacks were detected in a shorter amount of time by the wavelet analysis. For WDL = 10, the wavelet analysis results in poorer detection delay (Fig. 14). The median d delay ‘loss’ is 90 s, with only 14% of samples showing a positive delay gain. Increasing the WDL increases the wavelet analysis’s detection time. The latency is due to the increased amount of signal data required for calculation of higher-level wavelet coefficients. Table 7 outlines the tradeoff between increasing the WDL for detection efficiency and the corresponding detection delays.
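The histogram summaries used above — the median delta delay and the percentage of attacks the wavelet stage detected earlier — can be reproduced as follows; the sample deltas are hypothetical.

```python
import statistics

def delay_summary(deltas):
    """Median delta delay and the percentage of DoS events the
    wavelet stage detected earlier than CUSUM (positive delta)."""
    median = statistics.median(deltas)
    pct_earlier = 100.0 * sum(1 for d in deltas if d > 0) / len(deltas)
    return median, pct_earlier
```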

9.

Discussion and conclusion

Wavelet analysis of the CUSUM has better detection efficiency for DoS attacks than the CUSUM approach alone. The increase in detection ratio is 56% (Table 3). Even higher detection ratios are possible with higher levels of wavelet processing (Table 4). The first stage of CUSUM processing lowers the amount of noise in the arrival process, while the subsequent wavelet analysis finds the change-point of the time-series. The change-point is the onset of a DoS attack. We performed extensive testing to determine the true detection rate, false positive rate, and detection delay metrics. For comparison, consider other CUSUM detection results (Wang et al., 2002; Blazek et al., 2001). In Blazek et al. (2001), the ns-2 simulator was used to construct a network of 100 nodes, of which 4 were core transit nodes and the remaining 96 were distributed into 12 stub domains. Background traffic was also ns-2 generated, representing over 75% of the traffic. Three flood attack simulations were performed using the TCP SYN, UDP, and ICMP protocols. Each attack reached 20% of the total aggregate traffic through either linear or abrupt increases. We found such synthetic test data to lead to inflated

Fig. 12 – Efficiency vs. CUSUM correction ce. Varying parameter: CUSUM correction factor (ce); static parameters: wavelet decomposition level (WDL) = 10th, wavelet threshold (T) = 30,000, DoS scale = 7, CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.22.


Table 6 – Set P* for S_LIVE7 data sets

Parameter                            Setting
Wavelet decomposition level (WDL)    10th
Wavelet coefficient threshold (T)    30,000
CUSUM estimated average memory (e)   0.98811
CUSUM local averaging memory (a)     0.22
CUSUM correction factor (ce)         0.13

performance results. As expected, all three attacks were detected. False alarm rates ranged from under 1 to 6 alarms per 100 packets monitored. These ideal detection results for ns-2 simulations are comparable to our report in Section 8.1. Another CUSUM algorithm was used to detect TCP SYN attacks in Wang et al. (2002). Their three test data sets included a large company’s wide-area Internet access point and 10 Mb/s and 622 Mb/s university Internet access points. From the test data, TCP control traffic (SYN, ACK, RST) was extracted into a time-series, then overlaid with SYN floods of constant intensity. Rates of 33–100 TCP SYNs per second were used. For attacks above 35 SYNs per second the detection probability was 100%, and at 33 SYNs per second it was 70%. Detection delays ranged from 8 min down to 20 s. Although this work also used live operational data, comparisons are not valid due to the differing protocols contained in the time-series. Our work attempts to find a protocol-invariant DoS attack within time-series containing any Internet protocol. Wang et al. (2002) use time-series of TCP control traffic only, without the added burstiness of TCP data packets (e.g. web traffic). We believe our results would be similar if excited by, and tuned to, TCP control traffic only. The wavelet analysis of Barford et al. (2002) evaluated 6 months of router SNMP and IP flow records, sampled at 5 min

Fig. 13 – Detection delay distribution. Static parameters: wavelet decomposition level (WDL) = 6th, wavelet threshold (T) = 30,000, CUSUM threshold = 140,000, CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.22, CUSUM correction factor (ce) = 0.13.

Fig. 14 – Detection delay distribution. Static parameters: wavelet decomposition level (WDL) = 10th, wavelet threshold (T) = 30,000, CUSUM threshold = 140,000, CUSUM estimated average memory (e) = 0.988, CUSUM local averaging memory (a) = 0.22, CUSUM correction factor (ce) = 0.13.

intervals. The monitoring point was a university wide-area access point. Network engineering analysis of the log data identified over 100 anomalous events, including network malfunctions, suspected attacks, and flash crowds. A subset of 39 high-confidence anomalies was used for detection evaluation, although it is not clear how many of the anomalies were specifically DoS flood attacks. Only one of the anomalies was missed by their direct application of wavelet analysis to monitored traffic. No false positive results were provided. Detection time had an ambiguity of 1.5 h. Fair comparisons here would be difficult. Only a small number of network events were analyzed, possibly leading to less than general tuning of their detector’s decision rules. More important is the lack of a reported false positive rate, without which comparisons are inappropriate. Different sources of test data were used. Various synthetic data sources were investigated, but reliance was placed on captured live data. Realistic data provides better confidence in test results. Lacking availability of a captured DoS attack, a simple attack model was superimposed on live traffic of high variance and rate. Live traffic dynamics will present more difficulties to anomaly detection systems, irrespective of the DoS model. Therefore, we suggest our DoS modeling is reasonable when superimposed on Internet ‘noise’, but not absolute. More accurate approaches to DoS attack modeling should be explored. Detection results from the ns-2 and live data sets were noticeably different. ns-2 detection was ideal over a large range of parameter settings (Table 2). This is due to the synthetic nature of the ns-2 test data, which does not contain enough background traffic variability. Live data does not attain ideal detection efficiency over any set of parameter settings. The background traffic of the live data sets challenges the detection system,


thus reducing its true detection rate and producing a non-zero false positive rate. Tradeoffs in detection efficiency are possible through parameter ‘tuning’, as indicated in the ROC graphs of Section 8. Each parameter has the effect of removing various amounts of arrival process noise or burstiness from the network time-series. This increases the signal-to-noise ratio, allowing the wavelet analysis to detect the abrupt change due to the DoS attack. Detection delay is dependent on the amount of wavelet processing. Low wavelet decomposition levels (WDL) provide better detection delay than a purely CUSUM approach, but at higher levels of wavelet analysis, the delay may increase. A tradeoff exists between increasing the WDL for detection efficiency gains and possible detection delay losses (Table 7). Further iterative testing on independently created data sets should be performed to ensure consistency of tuning parameters. DoS flooding attacks cause significant changes in the amount of network traffic. Similarly, regular network activity can have large variations. Traffic fluctuations can occur from topology changes, user activity, or network management. Examples include flash crowds, data backups, network maintenance, and administrative changes. A regular network event was declared as a false positive in several live traffic time-series. This recurring false positive event is shown in Fig. 15 at intervals k = 1.3 × 10^4 and 7.3 × 10^4. This regularly timed event, possibly a data backup, mimics a DoS attack through a large packet count increase. Such events should be analyzed more closely for better understanding and modeling. It is not clear such events have been explicitly included in the testing of other works (Barford et al., 2002; Wang et al., 2002; Blazek et al., 2001). The wavelet-based DoS detection method in its current form cannot distinguish these events as normal activity. New detection rules or algorithms are needed to avoid these false positives.
For a live data set, 78% true detections and 37% false positives were determined, for a detection ratio of 2.1 (Fig. 12). We obtained 105 false-detect events while testing 239 time-series, each of which was on average 8 h in duration. The false positive rate equates to 1.11 per day. Although high, approximately five false positives per week are due to our network’s ‘regular’ events described above. For a deployable solution, future improvements are needed, but they should be flexible, as detection of these normal events may be of some interest. Other open issues include limitations of this approach when applied to low-intensity attacks and attacks that ramp up slowly. Other malicious network events also need to be modeled (or captured) for detection evaluation; viruses, worms, and slow-start DoS attacks are examples. Once detected, attack response countermeasures may be effective or could cause further problems. For example, packet blocking

Table 7 – Detection delay vs. WDL

WDL    True detection rate (%)  False positive rate (%)  Detection ratio  Delta detection delay (s)
6th    39                       30                       1.3              11
10th   47                       21                       2.2              −90
CUSUM  15                       18                       0.83             –

Fig. 15 – Live traffic false positive. (Top) Live packet time series; (Middle) CUSUM statistic; (Bottom) wavelet decomposition coefficient (d_{10,0}).

from a subnet is a common response to a DoS attack. Blocking removes both attack and legitimate traffic. If the detection threshold is set too low, the packet filter can be engaged prematurely and obstruct legitimate traffic. An adequate response to attack detection is an open issue. It appears that our detection approach could be placed anywhere in the network. Placement at natural chokepoints, like firewalls, is likely to be a good strategy. Conversely, finding proper locations for sampling network traffic seems like a fruitful topic for future research and is outside the scope of this paper. In our wavelet analysis, the Haar wavelet was selected for its simplicity and previous application to change-point detection. Other mother wavelet functions should be evaluated; better detection efficiency, delay, or other performance gains may be achieved. The use of matching wavelets looks promising for this task (Bukkapatnam, 1999; Bukkapatnam et al., 1999; Bukkapatnam et al., 2002; Bukkapatnam et al., 2003).

The reasonable computational complexity of the wavelet-based algorithm allows for on-line implementation. The CUSUM algorithm of Eq. (8) requires little memory and involves simple arithmetic operations. Depending on the operating environment, alternative implementations of the windowing Eq. (7) and estimation Eq. (6) functions, with less floating point multiplication, can be sought. Wavelet analysis by the discrete wavelet transform has a memory requirement of 2^j. For wavelet decomposition levels of j = 6 or 10, this is a low 2^6 = 64 or 2^10 = 1024 units of memory. Computation of the wavelet coefficients through Mallat’s pyramid algorithm requires only the elementary operations of addition, subtraction, and shifting. All algorithmic computation can be easily implemented in either software or hardware.

DoS attack detection is possible through analysis of network time-series. We presented a new DoS detection algorithm that uses both statistical and wavelet analyses to provide better performance than a purely statistical approach. Higher true detection and lower false positive rates are seen. Detection delay can be reduced or adjusted at the expense of detection efficiency. The algorithm is flexible, could be used throughout the network, and requires minimal system resources. Our evaluation of the DoS detection approach finds that it performs better than competing techniques. Use of realistic network traffic ensures confidence in the test results. We also presented an approach to testing DoS detection methods that allows a system to be tested using operational data, without inconveniencing the users of the operational network. We feel that this approach could be generalized to provide test data for many classes of common network attacks. Wider acceptance of this approach would help researchers create network monitoring solutions with lower false positive rates.

Acknowledgements

Participation of Dr. Brooks and Mr. Carl is supported by the Office of Naval Research under Award No. N00014-01-1-0859. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the Office of Naval Research (ONR). Dr. Rai is supported in part by NSF grants CCR 0073429 and CCR 0310916.

Appendix A.

The following procedure was used to overlay a DoS attack at a specific location.

1. From the live test data, choose at random two time-series, a and b. Let a_total_length and b_total_length be the total number of data samples in a and b, respectively.

2. Define an integer range of length b_length, where b_length is a random integer between 5 and 15% of a_total_length. The range’s lower bound, b_start, is randomly chosen between 5 and 20% of b_total_length. This integer range, [b_start, b_start + b_length], spans b_length consecutive samples of time-series b starting at b_start. Using this range, define a time-series b_sub, with data samples from time-series b, as:

   b_sub(k) = b(k + b_start) for 0 ≤ k ≤ b_length, and 0 otherwise   (13)

3. Choose randomly an integer, a_start, between 15 and 75% of a_total_length.

4. Calculate the average, a_mean, over the integer range [a_start, a_start + b_length] of time-series a as:

   a_mean = (1 / b_length) Σ_{k = a_start}^{a_start + b_length} a(k)   (14)

5. Define a constant scale > 0. Now, form time-series a′ as:

   a′(k) = [b_sub(k − a_start) + scale · a_mean] · u(k − a_start) + a(k)   (15)

   where

   u(k) = 1 for 0 ≤ k ≤ b_length, and 0 otherwise   (16)

In the context of DoS attack modeling, time-series b_sub(k) represents the additional background traffic generated by a DoS attack. a_start is the DoS attack start time, b_length is the length of the attack, and scale reflects the packet count increase due to a DoS flooding attack. a′(k) is the resultant time-series containing a modeled DoS attack superimposed on realistic background traffic dynamics. No effort was made to model possible packet loss due to congestion. Fig. 3 shows a live time-series a′(k) containing a model DoS attack of scale 7.

Appendix B.

An analysis report was generated for each test iteration of time-series s (Table 8). Rows two through six reflect the operating parameters from set P. n_alerts^s(P) is the number of DoS events identified during detection analysis. n_expected^s is the expected number of DoS attacks: if the time-series contains a DoS event, n_expected^s equals one, and zero otherwise. During construction of the test data set, each time-series s was identified as containing a single DoS attack or not. The number of true and false detections is the basis for determining the detection rate. For time-series s, under operating parameters P, the number of true detects n_true^s(P) is:

n_true^s(P) = 1 if n_alerts^s(P) ≥ n_expected^s, and 0 otherwise   (17)

Table 8 – Packet time series report

Parameter       Description                         Value
File            Time-series filename                foo_e.csv
I               Collection time interval            1 s
e               CUSUM estimated average memory      0.988
a               CUSUM local averaging memory        0.97
ce              CUSUM correction factor             0.13
WDL             Wavelet support (2^WDL)             4096
T               Alert threshold                     15,000
n_alerts^s(P)   Number of DoS alerts                6
n_expected^s    Expected number of DoS events       1
n_false^s(P)    Number of false detects             5
n_true^s(P)     Number of correct detects           1


n_true^s(P) ∈ {0, 1}, as each time-series was constructed to have at most a single DoS attack event. The number of false detects, n_false^s(P), is given by:

n_false^s(P) = max(n_alerts^s(P) − n_true^s(P), 0)   (18)

The empirical detection rate is obtained from the entire data set. For the ns-2 data set, S_ns-2, the total numbers of true and false detects are given by:

N_true^{ns-2}(P) = Σ_{s ∈ S_ns-2} n_true^s(P)   (19)

N_false^{ns-2}(P) = Σ_{s ∈ S_ns-2} n_false^s(P)   (20)

The average true detection and false positive percentages, p_true^{ns-2}(P) and p_false^{ns-2}(P), respectively, are determined from the total number of detection alerts as:

p_true^{ns-2}(P) = [N_true^{ns-2}(P) / Σ_{s ∈ S_ns-2} n_alerts^s(P)] × 100   (21)

p_false^{ns-2}(P) = [N_false^{ns-2}(P) / Σ_{s ∈ S_ns-2} n_alerts^s(P)] × 100   (22)

The live test data sets, S_LIVE4, S_LIVE7, and S_LIVE10, have a larger sample population. The empirical average detection percentage was derived across smaller sets S_{i,j} ⊂ S_i, where i ∈ {LIVE4, LIVE7, LIVE10} and j = 0, …, 75. Each set S_{i,j} contains 70 time-series randomly chosen from the parent test set S_i. Let g_{i,j,k} represent the kth time-series of set S_{i,j}, where k = 1, …, 70. The total numbers of true and false detects within set S_{i,j} are given by:

N_true^{S_i,j}(P) = Σ_{k=1}^{70} n_true^{g_i,j,k}(P)   (23)

N_false^{S_i,j}(P) = Σ_{k=1}^{70} n_false^{g_i,j,k}(P)   (24)

The true detection and false positive percentages for set S_{i,j} are:

p_true^{S_i,j}(P) = N_true^{S_i,j}(P) / Σ_{k=1}^{70} n_alerts^{g_i,j,k}(P)   (25)

p_false^{S_i,j}(P) = N_false^{S_i,j}(P) / Σ_{k=1}^{70} n_alerts^{g_i,j,k}(P)   (26)

The average detection percentages of the data set S_i, i ∈ {LIVE4, LIVE7, LIVE10}, are given by:

p_true^{S_i}(P) = (1/75) Σ_{j=0}^{75} p_true^{S_i,j}(P)   (27)

p_false^{S_i}(P) = (1/75) Σ_{j=0}^{75} p_false^{S_i,j}(P)   (28)

The standard errors of p_true^{S_i}(P) and p_false^{S_i}(P), respectively, are given as:

e_true^{S_i}(P) = (1.96/√75) √[ Σ_{j=0}^{75} (p_true^{S_i,j}(P) − p_true^{S_i}(P))² / (75 − 1) ]   (29)

e_false^{S_i}(P) = (1.96/√75) √[ Σ_{j=0}^{75} (p_false^{S_i,j}(P) − p_false^{S_i}(P))² / (75 − 1) ]   (30)

e_true^{S_i}(P) and e_false^{S_i}(P) represent the intervals over which the average detection percentages are reported with 95% confidence.
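Eqs. (27)–(30) amount to averaging the subsample percentages and attaching a 95% confidence half-width. A sketch, with hypothetical subsample values:

```python
import math

def average_and_error(subsample_pcts):
    """Average detection percentage over the subsamples S_{i,j}
    (Eqs. 27-28) and its 95% confidence half-width (Eqs. 29-30)."""
    n = len(subsample_pcts)
    mean = sum(subsample_pcts) / n
    var = sum((p - mean) ** 2 for p in subsample_pcts) / (n - 1)   # sample variance
    half_width = 1.96 * math.sqrt(var) / math.sqrt(n)              # 95% confidence
    return mean, half_width
```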

references

Abry P, Baraniuk R, Flandrin P, Riedi R, Veitch D. Multiscale nature of network traffic. IEEE Signal Processing May 2002;19(3):28–46. Allen W, Marin G. The LoSS technique for detecting new denial of service attacks. In: SoutheastCon; 2004. Anderson T, Roscoe T, Wetherall D. Preventing internet denial-of-service with capabilities. Computer Communications Review January 2004;34:39. Antoniadis A, Gijbels I. Detecting abrupt changes by wavelet methods. Technical Report, Laboratoire LMC-IMAG. France: Universite Joseph Fourier; 1997. Barford P, Kline J, Plonka D, Ron A. A signal analysis of network traffic anomalies. In: Proceedings of ACM SIGCOMM Internet measurement workshop, Marseilles, France, November 2002. Basseville M, Nikiforov IV. Detection of abrupt changes: theory and applications. Englewood Cliffs, NJ: Prentice-Hall; 1993. Blazek RB, Kim H, Rozovskii B, Tartakovsky A. A novel approach to detection of ‘denial-of-service’ attacks via adaptive sequential and batch-sequential change-point detection methods. In: Proceedings of the IEEE systems, man, and cybernetics information assurance workshop, West Point, NY, USA: IEEE Systems, Man, and Cybernetics Society, June 2001. Brooks RR. Disruptive security technologies. Boca Raton, FL: CRC Press; 2005. Bukkapatnam STS, et al. Neighborhood method and its coupling with the wavelet method for nonlinear signal separation of contaminated chaotic time-series data. Signal Processing 2002;82(10):1351–74. Bukkapatnam STS, Nichols JM, Seaver M, Trickey ST, Hunter M. A wavelet-based, distortion energy approach to structural health monitoring. Structural Health Monitoring Journal 2003. Bukkapatnam STS. Compact nonlinear signal representation in machine tool operations. In: ASME design engineering and technology conference, DETC-VIB 8068, September 12–15. Las Vegas, NV; 1999. Bukkapatnam STS, Kumara SRT, Lakhtakia A. Local eigenfunctions-based suboptimal wavelet packet representation of contaminated chaotic signals. IMA Journal of Applied Mathematics 1999;63(2):149–60. Cao J, Cleveland W, Lin D, Sun D. Internet traffic tends toward Poisson and independent as the load increases. Available from: http://cm.bell-labs.com/stat/doc/lrd2poisson.pdf. Carl G, Brooks RR, Rai S, Kesidis G. DoS/DDoS attack detection techniques. IEEE Internet Computing January/February 2006;10(1):82–9. Dittrich D. DDoS – is there really a threat? In: Usenix security symposium. Available from: http://staff.washington.edu/dittrich/talks/sec2000/; 2000. Feinstein L, et al. Statistical approaches to DDoS attack detection and response. In: DARPA information survivability conference and exposition proceedings, vol. 1; 2003. p. 303–14. Floyd S, Paxson V. Difficulties in simulating the internet. IEEE/ACM Transactions on Networking August 2001;9(4):392–403.

computers & security 25 (2006) 600–615


Glenn Carl is a Ph.D. student in electrical engineering at The Pennsylvania State University. His research interests include intrusion and anomaly detection, network security testing, and large-scale simulation of inter-domain routing protocols. His email address is [email protected].

Richard R. Brooks is an Associate Professor in the Holcombe Department of Electrical and Computer Engineering at Clemson University. He was PI of the Mobile Ubiquitous Security Environment (MUSE) Project, sponsored by ONR (PM: Frank Deckelman) as a Critical Infrastructure Protection University Research Initiative (CIP/URI); MUSE included a game-theoretic analysis of DDoS attacks. His book Disruptive Security Technologies is published by CRC Press. He was PI of the Reactive Sensor Networks (RSN) Project sponsored by DARPA IXO (PM: Sri Kumar). As head of the Distributed Systems Department at the Penn State Applied Research Laboratory, he received a DURIP award supporting the study of networked systems interacting with the real world. He was co-PI of a NIST project defining security standards for networked building control systems and of a DARPA ISO program coordinating air campaign command and control. Dr. Brooks' research concentrates on network security, adaptive infrastructure, and strategic reasoning. His Ph.D. dissertation received an exemplary achievement certificate from the Louisiana State University graduate school. He did graduate study in Operations Research at the Conservatoire National des Arts et Metiers in Paris, France, and holds a B.A. in Mathematical Sciences from The Johns Hopkins University. He has a broad professional background with computer systems and networks, including many years as technical director of Radio Free Europe's computer network. His consulting clients include the French stock exchange authority and the World Bank. Dr. Brooks is a senior member of the IEEE.

Suresh Rai is a Professor in the Department of Electrical and Computer Engineering at Louisiana State University, Baton Rouge, Louisiana. Dr. Rai has taught and researched in the areas of network traffic engineering, ATM, reliability engineering, fault diagnosis, neural-net-based logic testing, and parallel and distributed processing. He is a co-author of the book Wave Shaping and Digital Circuits and of the tutorial texts Distributed Computing Network Reliability and Advances in Distributed System Reliability, the last two published by IEEE Computer Society Press. He has guest edited a special issue of IEEE Transactions on Reliability on the reliability of parallel and distributed computing networks. He was an Associate Editor of IEEE Transactions on Reliability from 1990 to 2004 and currently serves on the editorial board of the International Journal of Performability Engineering. Dr. Rai received the best paper award at the 1998 IEEE International Performance, Computing and Communications Conference (S. Rai and Y. C. Oh, "Analyzing packetized voice and video traffic in an ATM multiplexer," Feb. 16–18, Tempe, Arizona). Dr. Rai is a senior member of the IEEE.