A Deep Long Short-Term Memory based classifier for Wireless Intrusion Detection System


Available online at www.sciencedirect.com

ScienceDirect ICT Express xxx (xxxx) xxx www.elsevier.com/locate/icte

Sydney Mambwe Kasongo, Yanxia Sun ∗
Department of Electrical & Electronic Engineering Science, University of Johannesburg, South Africa
Received 19 May 2019; accepted 14 August 2019
Available online xxxx

Abstract
Wireless networks have evolved over the years and they have become some of the most prominent communication media. These networks generally transmit large volumes of information at any given time. This has engendered a number of security threats and privacy concerns. This paper presents a Deep Long Short-Term Memory (DLSTM) based classifier for wireless intrusion detection system (IDS). Using the NSL-KDD dataset, we compare the DLSTM IDS to existing methods such as Deep Feed Forward Neural Networks, Support Vector Machines, k-Nearest Neighbors, Random Forests and Naïve Bayes. The experimental results suggest that the DLSTM IDS outperformed existing approaches.
© 2019 The Korean Institute of Communications and Information Sciences (KICS). Publishing services by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Machine learning; Deep learning; Intrusion detection; Wireless networks

∗ Corresponding author.
E-mail addresses: [email protected] (S.M. Kasongo), [email protected] (Y. Sun).
Peer review under responsibility of The Korean Institute of Communications and Information Sciences (KICS).
https://doi.org/10.1016/j.icte.2019.08.004
2405-9595/© 2019 The Korean Institute of Communications and Information Sciences (KICS). Publishing services by Elsevier B.V.
Please cite this article as: S.M. Kasongo and Y. Sun, A Deep Long Short-Term Memory based classifier for Wireless Intrusion Detection System, ICT Express (2019), https://doi.org/10.1016/j.icte.2019.08.004.

1. Introduction
The recent advances in Information and Communication Technology (ICT), the Internet of Things (IoT) and hand-held devices have resulted in an increased use of wireless networks. These networks generally process large volumes of information on an ongoing basis. This has generated a number of privacy concerns and has exposed wireless networks to various security threats. In order to secure these networks, Intrusion Detection Systems (IDS) were introduced [1]. At the highest level, IDSs are categorized as host based IDS (HIDS) and network based IDS (NIDS) [2]. Moreover, both HIDS and NIDS are classified into anomaly based IDS, signature-based IDS and hybrid-based IDS. An anomaly based IDS scans the network and flags unusual behaviors, whereas a signature-based IDS relies on predefined patterns to flag an intrusion [3]. A hybrid-based IDS is a combination of the anomaly based and signature-based approaches. The complexity of modern wireless networks evolves by the day due to the consistent increase in the number of end users, which in turn increases the network traffic and the likelihood of intrusions. It is therefore imperative to design IDSs that are robust, effective and accurate.

Several studies using machine learning (ML) techniques have been conducted in a bid to develop high performing IDSs. These approaches include Support Vector Machines (SVM) [4], K-Nearest-Neighbor (KNN) [5], Naive Bayes (NB) [6], Random Forest (RF) [2] and Artificial Neural Networks (ANNs) [7]. In this research, we focus on ANNs and more specifically on Deep Learning (DL) applied to IDS. DL was first introduced by Professor Hinton [8] and it is considered an advanced sub-field of ML. The proposed DL method for Wireless IDS is based on Deep Long Short-Term Memory (DLSTM) Recurrent Neural Networks (RNNs) combined with a filter based feature selection algorithm that uses information gain (IG) [9]. The DLSTM approach is compared to the following methods: Feed Forward Deep Neural Networks (FFDNNs), ANN, SVM, KNN, NB and RF. The results indicate that the DLSTM RNN model performs better than the other existing algorithms.

The remainder of this paper is organized as follows: Section 2 gives an account of ML and DL techniques for intrusion detection systems. In Section 3, an overview of the LSTM RNN model is provided and the DLSTM RNN IDS

architecture is presented. Section 4 provides the experimental setup and discusses the results, and Section 5 provides the conclusion.

2. Related work
In [10], the authors presented an intelligent network attack detection method based on LSTM-RNNs. The architecture of the neural network model included an input layer, one mean pooling layer and a logistic regression layer at the output. In order to train and test the model, the NSL-KDD intrusion detection dataset was used. The metrics used to evaluate the performance of the model were the Detection Rate (DR), the false alarm rate (FAR) and the accuracy. The LSTM-RNN model yielded an accuracy of 97.52% on training data, a DR of 98.85% and a FAR of 8.75%. These results suggested that the proposed model outperformed other models such as SVM, KNN and Bayesian classifiers.

In [11], a deep learning approach to intrusion detection based on RNNs was proposed. The RNN-IDS presented in this work was a simple RNN with an input layer, one hidden layer and an output layer. The RNN-IDS was trained and tested using the NSL-KDD dataset. The evaluation metric used for performance assessment was the accuracy of detection. The model was compared to the following classifiers: Random Forests, Multilayer Perceptron, Support Vector Machines, Naïve Bayes and J48. For the binary classification, the RNN-IDS achieved an accuracy of 99.81% on the NSL-KDD training data and 83.28% on the NSL-KDD test data. For the multiclass classification, the RNN-IDS achieved an accuracy of 99.53% on the NSL-KDD training data and 81.29% on the NSL-KDD test data. These results were superior to those of peer models.

In [12], an IDS using LSTM RNNs with Gradient Descent Optimization was developed. The performance metrics used to evaluate the classifier were the precision, the detection rate, the accuracy and the false alarm rate (FAR). The LSTM based IDS was then compared to other IDSs using the following classifiers: RNN with Hessian-Free optimization, LSTM RNN using stochastic gradient descent (SGD) and Feed Forward Neural Networks. The results demonstrated that LSTM RNNs using the Nadam gradient descent optimizer outperformed other IDS models by yielding a detection rate of 98.95% on training data, a precision of 97.69%, a FAR of 9.98% and an accuracy of 97.54%.

In [9], a DL method using feed forward deep neural networks (FFDNN) in conjunction with a filter based feature selection algorithm using information gain (IG) was presented. In this research, various experiments were conducted using FFDNN with IG on the NSL-KDD intrusion detection dataset. The FFDNN-IG was compared to the following models: SVM, KNN, NB, Random Forest (RF) and Decision Trees (DT). The results suggested that for both the binary and the multiclass classification setups, FFDNN-IG outperformed other models. Moreover, the results demonstrated that the depth and the number of neurons in the network influence the model's accuracy.

Fig. 1. LSTM structure.

3. DLSTM RNN wireless intrusion detection system
3.1. Background on LSTM RNN
In comparison to traditional feed forward neural networks, recurrent neural networks (RNNs) contain a directional loop that is able to memorize the previous states and apply them to the current output [11,12]. RNNs suffer from the vanishing gradient problem and, in order to solve this issue, Long Short-Term Memory (LSTM) units were introduced [13]. The structure of the LSTM unit used in this research is depicted in Fig. 1. The relationship between inputs and outputs in Fig. 1 is given by the following expressions at time t and t − 1:

\[
\begin{cases}
f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \\
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \\
C'_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \\
C_t = f_t * C_{t-1} + i_t * C'_t \\
z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z) \\
h_t = z_t * \tanh(C_t)
\end{cases}
\tag{1}
\]

\[
\tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \tag{2}
\]

\[
\sigma(x) = \frac{1}{1 + e^{-x}} \tag{3}
\]

where C is the cell state, and σ (the sigmoid) and tanh are the activation functions. The input vector is represented by x_t and the output is given by h_t. W and b are the weight and bias parameters, respectively. f_t is the forget gate, whose role is to filter out unneeded information. i_t and C'_t inject new information into the cell state. z_t outputs the relevant information. There is no general standard for defining whether a network of LSTMs is deep or not. For the purpose of our research, we consider that any network with more than two LSTM layers is a deep LSTM (DLSTM).
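To make Eq. (1) concrete, the following is a minimal sketch of a single LSTM step written in plain Python. It is only an illustration of the gate equations: the helper names (`affine`, `lstm_step`, `zero_params`) and the dictionary layout of the parameters are assumptions made for this example, and it is not the implementation used in the experiments, which relies on Keras [16].

```python
import math


def sigmoid(x):
    """Logistic sigmoid, Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Return W.v + b for a matrix W (list of rows), a vector v and a bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step following Eq. (1).

    params maps the gate names 'f', 'i', 'c', 'z' to (W, b) pairs
    (a hypothetical layout chosen for this sketch).
    """
    hx = h_prev + x_t  # list concatenation [h_{t-1}, x_t]

    def gate(name, activation):
        W, b = params[name]
        return [activation(a) for a in affine(W, hx, b)]

    f_t = gate("f", sigmoid)        # forget gate
    i_t = gate("i", sigmoid)        # input gate
    c_cand = gate("c", math.tanh)   # candidate cell state C'_t
    z_t = gate("z", sigmoid)        # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    h_t = [z * math.tanh(c) for z, c in zip(z_t, c_t)]
    return h_t, c_t


def zero_params(n_hidden, n_input):
    """All-zero weights and biases, handy for checking the equations by hand."""
    width = n_hidden + n_input
    return {
        name: ([[0.0] * width for _ in range(n_hidden)], [0.0] * n_hidden)
        for name in ("f", "i", "c", "z")
    }


if __name__ == "__main__":
    h, c = lstm_step([0.3, -0.2], [0.0], [1.0], zero_params(1, 2))
    print(h, c)
```

With all-zero parameters every sigmoid gate evaluates to 0.5 and the candidate state to 0, so the new cell state is simply half of the previous one, which gives a quick hand check of the equations.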

3.2. DLSTM RNN IDS architecture
The overall architecture of the DLSTM RNN IDS proposed in this research is presented in Fig. 2. There are four major steps: data separation, data transformation, model training and model testing. In the first step, the main dataset is divided into a training set and an evaluation set. The test set is an independent dataset.

Fig. 2. DLSTM RNN IDS architecture.

Table 1
Dataset breakdown.
Dataset name       Normal   DoS     Probe   R2L    U2R   Total
KDDTrain + Full    67343    45927   11656   995    52    125973
KDDTrain + 75      50494    34478   8717    749    42    94480
KDDEvaluation      16849    11449   2939    246    10    31493
KDDTest + Full     9711     7458    2754    2421   200   22544

In the second step, the features are transformed using a normalization process and filtered using IG. IG is derived from information theory and it is able to discover nonlinear relationships between two random variables within a dataset [9]. In information theory, the entropy, or measure of uncertainty, of a variable X, H(X), is given by the following expression:

\[
H(X) = -\sum_{x \in X} P(x) \log_2 P(x) \tag{4}
\]

Furthermore, the conditional entropy of the two random variables X and Y is calculated using the equation in (5):

\[
H(X|Y) = -\sum_{y \in Y} P(y) \sum_{x \in X} P(x|y) \log_2 P(x|y) \tag{5}
\]

where P is the probability. From the expressions in (4) and (5), IG is defined as:

\[
IG(X|Y) = H(X) - H(X|Y) \tag{6}
\]

Therefore, a feature Y has a stronger correlation to feature X than to feature C if IG(X|Y) > IG(C|Y).

During the third step, the DLSTM RNN model is trained using the training set and, in the fourth step, the trained DLSTM RNN model is evaluated and tested using the evaluation set and the test set. The structure of the DLSTM RNN models used in this research is depicted in Fig. 3, where there is an input layer, a DLSTM Unit that has n LSTM layers with n > 2 and a Dense Feed Forward Layer (DFFL) that is made of N artificial neurons (AN). The activation functions applied to the DFFL during training are the Rectified Linear Unit (ReLU), f(x) = max(0, x), and the Sigmoid [9,14]. The ReLU and the Sigmoid act as threshold gates for each neuron in the DFFL; they ensure that the model caters for nonlinear effects and they facilitate the gradient descent during the model's training. Additionally, the Softmax function [15] in Eq. (7) is applied before generating the model's output, where v is the vector of T logits (values) generated by the DFFL.

Fig. 3. DLSTM RNN topology.

\[
S(v)_i = \frac{e^{v_i}}{\sum_{j=1}^{T} e^{v_j}} \tag{7}
\]
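As a numerical illustration of Eq. (7), the sketch below computes the Softmax of a list of logits in plain Python. Subtracting the largest logit before exponentiating is a common numerical-stability choice that leaves the result of Eq. (7) unchanged; it is an addition made for this example, and the function names `softmax` and `predict_class` are hypothetical.

```python
import math


def softmax(logits):
    """Softmax of Eq. (7), computed with the usual max-shift for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Index of the logit with the highest Softmax probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1, -1.0, 0.5]  # five made-up logits, one per class
    print(softmax(logits), predict_class(logits))
```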

The Softmax activation function computes the probabilities associated with the logits. The sum of all the generated probabilities amounts to 1 and the logit with the highest probability determines the predicted class.

3.3. Dataset and data processing
The benchmark dataset used in this research is the NSL-Knowledge Discovery and Data mining (NSL-KDD) dataset [10–12]. The NSL-KDD has 41 features, of which three are non-numeric (protocol type, service and flag) and 38 are numeric. Moreover, the NSL-KDD has one class label whose values are grouped into four attack categories, Denial of Service (DoS), User to Root (U2R), Remote to User (R2L) and Probe, plus the Normal class. The distribution of the types of attacks is shown in Table 1. Furthermore, the NSL-KDD full training set, called KDDTrain + Full, is segmented into two partitions: the KDDTrain + 75, which represents 75% of the full set, and the KDDEvaluation, which is 25% of the full set. The KDDEvaluation set is used for model validation purposes. The KDDTest + Full is independent from the main training set. Since LSTM RNN models only accept numeric values, the non-numeric features are encoded using the Python Keras Library [16]. Algorithm 1 provides all the steps that govern the DLSTM RNN IDS.
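The data preparation described above can be sketched in a few lines of plain Python. The text does not state which encoding scheme was applied through Keras, nor whether the 75/25 partition was shuffled, so the integer encoding, the shuffling and the seed below are assumptions made for illustration, and the helper names are hypothetical.

```python
import random


def fit_category_index(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def encode_records(records, columns):
    """Return copies of the records with the given string columns replaced by integer codes."""
    encoded = [dict(r) for r in records]
    for col in columns:
        index = fit_category_index(r[col] for r in records)
        for r in encoded:
            r[col] = index[r[col]]
    return encoded


def split_train_eval(records, train_fraction=0.75, seed=0):
    """Shuffle with a fixed seed, then split into training and evaluation partitions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Toy records; only 'protocol_type' is one of the NSL-KDD non-numeric feature names.
    toy = [{"protocol_type": p, "duration": i}
           for i, p in enumerate(["tcp", "udp", "icmp", "tcp"])]
    train, evaluation = split_train_eval(encode_records(toy, ["protocol_type"]))
    print(train, evaluation)
```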


Table 2
IG ranked features.

Algorithm 1 DLSTM RNN IDS algorithm
STEP 1 Split the main data set into a Training set (75%) and an Evaluation set (25%).
STEP 2 Extract and transform features. Encode non-numerical features using Keras. Normalize all features from the input feature vector X(x_1, ..., x_n) using:
\[
x_{norm_i} = (b - a) \, \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}, \quad \text{where } b = 1 \text{ and } a = 0.
\]
STEP 3 Reduce the normalized input feature vector using the IG formula:
\[
IG_i(x_{norm_i} | C) = H(x_{norm_i}) - H(x_{norm_i} | C)
\]
STEP 4 Set the initial DLSTM RNN model's topology and hyperparameters.
STEP 5 Train the chosen DLSTM RNN model on the Training set (75%).
STEP 6 Validate the chosen DLSTM RNN model on the Evaluation set (25%).
STEP 7 Test the chosen DLSTM RNN model on the Test set.
STEP 8 Repeat STEP 4 to STEP 7 until the desired results are reached.
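STEP 2 and STEP 3 of Algorithm 1 can be sketched as follows in plain Python. The normalization uses a = 0 and b = 1 as in the algorithm, and the conditional entropy uses the standard definition (a class-probability-weighted average of per-class entropies). The text does not say how continuous features are handled when estimating probabilities, so the equal-width binning, the `n_bins` parameter and all function names are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def entropy(values):
    """Shannon entropy H(X) in bits, estimated from value frequencies."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(values, labels):
    """H(X | C): per-class entropies weighted by the class probabilities P(c)."""
    groups = defaultdict(list)
    for v, c in zip(values, labels):
        groups[c].append(v)
    n = len(values)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(values, labels):
    """IG(X | C) = H(X) - H(X | C)."""
    return entropy(values) - conditional_entropy(values, labels)


def discretize(column, n_bins=10):
    """Equal-width binning of values in [0, 1] (an assumption of this sketch)."""
    return [min(int(x * n_bins), n_bins - 1) for x in column]


def rank_features(columns, labels, n_bins=10):
    """Return feature names sorted by decreasing information gain."""
    scores = {
        name: information_gain(discretize(min_max_normalize(col), n_bins), labels)
        for name, col in columns.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    toy_columns = {"informative": [1, 2, 9, 10], "noise": [5, 1, 5, 1]}
    print(rank_features(toy_columns, labels=[0, 0, 1, 1]))
```

Keeping the top-ranked names from `rank_features` would then give a reduced feature vector in the spirit of STEP 3.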

4. Experiments and discussions
4.1. Performance metrics
The performance indicators used for classification problems are based on the following four possibilities:
• True positive (TP): attacks that are successfully classified as intrusions.
• True Negative (TN): normal activities are correctly classified as normal.
• False positive (FP): normal activity that is wrongly labeled as intrusive by the IDS.
• False Negative (FN): intrusive activities that are classified as normal.
The accuracy (AC), the Precision and the Recall are derived from the above conditions as follows:

\[
AC = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}
\]

\[
Precision = \frac{TP}{TP + FP} \tag{9}
\]

\[
Recall = \frac{TP}{TP + FN} \tag{10}
\]

The F1-Score is a measure that takes into consideration both the Precision and the Recall in order to validate the accuracy. It is the harmonic mean of the Recall and the Precision and it is expressed as follows:

\[
F1Score = 2 \, \frac{Precision \cdot Recall}{Precision + Recall} \tag{11}
\]
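Eqs. (8) to (11) translate directly into code. The sketch below takes the four counts as inputs; returning 0.0 when a denominator is zero is a convention chosen for this example rather than something stated in the text, and the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-Score from confusion counts, Eqs. (8)-(11)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators (assumed convention: report 0.0).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up counts, used only to exercise the formulas.
    print(classification_metrics(tp=90, tn=80, fp=10, fn=20))
```

For the made-up counts above, the accuracy is 170/200 = 0.85 and the precision is 90/100 = 0.9.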

4.2. Hardware and software systems
The experiments conducted in this research were carried out using the following two Python based libraries running on a Windows 8.1 64-bit operating system: Keras [16], which is built on TensorFlow [17], and Scikit-Learn [18]. These libraries are extensively used for deep learning and data science research. In terms of hardware, our experiments were implemented on an ASUS laptop with an Intel Core i3-3217U CPU @ 1.80 GHz and 4.00 GB of RAM.

In order to compare and evaluate objectively the performance of the DLSTM RNN IDS proposed in this research, we set up different categories of experiments based on the multiclass classification scheme. Class 0 is normal, Class 1 represents R2L, Class 2 is U2R, Class 3 represents Probe, Class 5 is DoS. Moreover, all the experiments are based on a reduced feature vector made of 18 features that are listed in Table 2 using their respective numbers in the NSL-KDD dataset. This vector was generated in STEP 3 of Algorithm 1.

Feature numbers (Table 2): f5, f4, f6, f3, f30, f29, f33, f34, f38, f39, f35, f12, f23, f25, f26, f32, f36, f31

In the first category, we performed the experiments using the following traditional machine learning methods in Scikit-Learn [18]: SVM, KNN, NB, RF and ANN. For the SVM classifier, the following parameters were used: random_state = 0, the multiclass option was set to ovr (one versus rest), the penalty parameter C was 1.1 and the tolerance stopping criterion was tol = 1e−5. For the KNN classifier, the number of neighbors, n_neighbors, was set to 11. For the NB classifier, the multinomial NB was used. In the case of RF, the number of trees in the forest, n_estimators, was set to 100 and the maximum depth of the trees was max_depth = 3. For the simple ANN, we used the following parameters: solver = 'sgd' (stochastic gradient descent), L2 regularization term alpha = 1e−5, a learning rate of 0.05, a random state of 1 and 100 neurons in the hidden layer. The results of the first phase are depicted in Table 3, whereby RF is the best performing model with a validation accuracy (Val. AC) of 99.84%, an F1-Score of 99.83% and a test accuracy (Test AC) of 85.44%.

Table 3
Performance of traditional ML models.
ML classifier   Val. AC   F1-Score   Test AC
SVM             95.50%    95.16%     79.12%
KNN             99.37%    99.36%     71.94%
NB              88.75%    89.02%     75.48%
RF              99.84%    99.83%     85.44%
ANN             99.46%    99.45%     85.31%
In the second phase, we implemented FFDNNs [9]. The results presented in Table 4 suggest that the best performing model has 150 hidden nodes (HN) spread throughout three hidden layers (HL), a learning rate (LR) of 0.02, a validation accuracy (Val. AC) of 99.47%, an F1-Score of 99.56% and a test accuracy (Test AC) of 86.32%. Since the best test accuracy among the traditional models in Table 3 is 85.44% (RF), we establish at this point that deep learning based IDS outperform traditional ML based IDS.

Table 4
Performance of FFDNNs models.
HN    HL   LR     Val. AC   F1-Score   Test AC
30    3    0.01   99.28%    99.26%     84.95%
40    3    0.01   99.27%    99.25%     84.98%
60    3    0.02   99.43%    99.42%     86.26%
80    3    0.01   99.46%    99.44%     85.62%
80    3    0.05   99.53%    99.52%     86.17%
150   3    0.02   99.47%    99.56%     86.32%

Table 5
Performance of DLSTM RNN models.
HU    HL   DFFL     Val. AC   F1-Score   Test AC
30    3    R5       99.23%    99.32%     85.34%
30    3    R10–5    98.26%    99.50%     86.34%
75    3    R25–5    99.28%    99.42%     85.84%
90    3    R30–5    99.44%    99.62%     85.45%
105   3    R35–5    99.32%    99.49%     85.51%
90    3    S30–5    99.20%    99.41%     85.34%
90    3    S29–5    99.51%    99.43%     86.99%

In the third category of our experiments, we evaluated the performance of various DLSTM RNN models. The most important aspects of these experiments were the depth of the LSTM layers, the number of LSTM hidden units (HU) and the structure of the Dense Feed Forward Layer (DFFL). The DFFL part of all the DLSTM RNN models has 5 neurons in the last layer because the experiments are based on a multiclass classification scheme with five classes. Moreover, the Softmax function is applied to the last layer. In addition to the logit layer, the DFFL can have additional layers for better generalization. The notation Rn or Sn signifies whether the extra layer has the ReLU or the Sigmoid activation function applied to it, where n represents the number of neurons.

The results presented in Table 5 demonstrate that the best performing DLSTM RNN model has 90 LSTM HU distributed over three hidden layers (HL), a validation accuracy of 99.51%, an F1-Score of 99.43% and a test accuracy of 86.99%. Furthermore, its DFFL structure has 29 sigmoid neurons (S29) in the first layer and 5 Softmax neurons in the last layer. In all our experiments, the major indicator of a model's efficiency is its performance on test data because this data has not been previously seen by the models. Based on the findings in our research, the DLSTM RNNs, with a test accuracy of 86.99%, outperformed the FFDNNs, with 86.32%.

We also compared the convergence of the DLSTM RNN model versus the FFDNN model by using the KDDTrain + 75 (DLSTM Train, FFDNN Train) and KDDEvaluation (DLSTM Validation, FFDNN Validation) datasets. As shown in Fig. 4, the DLSTM RNN converges faster than the FFDNN, just before reaching 15 epochs. Furthermore, the DLSTM RNN IDS presented in our work surpasses the performance of the RNN-IDS presented in [11], which had a test accuracy of 64.67%. Moreover, our proposed model outperformed the LSTM-RNN IDS in [10], which had an accuracy of 97.52% on training data, whereas the DLSTM RNN IDS achieved 99.51% on validation data.

Fig. 4. DLSTM RNN and FFDNN convergence comparison.

5. Conclusion
In this research, we designed an IDS based on DLSTM RNNs. The model uses multiple layers of LSTM units coupled to a DFFL with the objective of detecting wireless network intrusions efficiently. In order to assess the performance of the proposed system, the NSL-KDD dataset was used. Additionally, we applied a feature selection algorithm based on information gain in order to reduce the feature vector. The overall accuracy on validation data was 99.51%, the F1-Score was 99.43% and the accuracy on test data was 86.99%. In comparison to traditional ML approaches as well as feed forward DL methods, the system proposed in this paper yielded an increased performance. In future work, we intend to study the performance on individual classes of attacks in the NSL-KDD dataset using the DLSTM RNN model. Moreover, we aim to apply the DLSTM RNN model to the UNSW-NB15 wireless intrusion detection dataset in order to investigate further its performance over other existing methods.

Declaration of competing interest
The authors declare that there is no conflict of interest in this paper.

Acknowledgments
This research is partially supported by the South African National Research Foundation (Nos: 112108, 112142); South African National Research Foundation Incentive Grant (No. 95687); Eskom Tertiary Education Support Programme Grant; Research grant from URC of University of Johannesburg.

References
[1] R. Mitchell, I.R. Chen, A survey of intrusion detection in wireless network applications, Comput. Commun. 42 (3) (2014) 1–23.

[2] R. R. Samrin, D. Vasumathi, Review on anomaly based network intrusion detection system, in: IEEE Int. Conf. Electr. Electron. Commun. Comput. Optimization Tech. 2017, pp. 141–147.
[3] W. Bul'ajoul, A. James, S. Shaikh, A new architecture for network intrusion detection and prevention, IEEE Access 7 (2019) 18558–18573.
[4] A. Dastanpour, S. Ibrahim, R. Mashinchi, A. Selamat, Comparison of genetic algorithm optimization on artificial neural network and support vector machine in intrusion detection system, in: Proc. IEEE Conf. Open Syst. 2014, pp. 72–77.
[5] B. Xu, C. Shuyu, Z. Hancui, W. Tianshu, Incremental k-NN SVM method in intrusion detection, in: 8th IEEE Int. Conf. Softw. Eng. Service Sci. 2017, pp. 712–717.
[6] B. Selvakumar, K. Muneeswaran, Firefly algorithm based feature selection for network intrusion detection, Comput. Security 81 (2019) 148–155.
[7] I. Manzoor, K. Neeraj, A feature reduced intrusion detection system using ANN classifier, Expert Syst. Appl. 88 (2017) 249–257.
[8] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436–444.
[9] S.M. Kasongo, Y. Sun, A deep learning method with filter based feature engineering for wireless intrusion detection system, IEEE Access 7 (2019) 38597–38607.
[10] Y. Fu, F. Lou, F. Meng, et al., An intelligent network attack detection method based on rnn, in: 2018 IEEE Third Int. Conf. on Data Science in Cyberspace, 2018, pp. 483–489.
[11] C. Yin, Y. Zhu, J. Fei, X. He, A deep learning approach for intrusion detection using recurrent neural networks, IEEE Access 5 (2017) 21954–21961.
[12] J. Kim, H. Kim, An effective intrusion detection classifier using long short-term memory with gradient descent optimization, in: IEEE Int. Conf. on Platform Technology and Service, 2017, pp. 1–6.
[13] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Netw. 61 (2015) 85–117.
[14] X. Jiang, Y. Pang, et al., Deep neural networks with elastic rectified linear units for object recognition, Neurocomputing 275 (2018) 1132–1139.
[15] A.A. Mohammed, V. Umaashankar, Effectiveness of hierarchical softmax in large scale classification tasks, in: IEEE Int. Conf. on Advances in Computing, Communications and Informatics, 2018, pp. 1090–1094.
[16] Keras: The Python Deep Learning Library [Online]. Available: https://keras.io/.
[17] TensorFlow: An end-to-end open source machine learning platform [Online]. Available: https://www.tensorflow.org/.
[18] Scikit-Learn: Machine Learning in Python [Online]. Available: https://scikit-learn.org/stable/.