Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 155 (2019) 378–385
www.elsevier.com/locate/procedia
The 14th International Conference on Future Networks and Communications (FNC)
August 19-21, 2019, Halifax, Canada
How much training data is enough to move a ML-based classifier to a different network?
Ali Safari Khatouni∗, Nur Zincir-Heywood
Dalhousie University, Halifax, Canada
Abstract
Analyzing and understanding network traffic is a crucial requirement for different network and security monitoring tools. The evolution of Internet services and protocols has caused traditional traffic analysis and classification approaches to be ineffective on traffic related to social media, streaming audio and video services. Key causes include: (i) the rise in the usage of dynamic port numbers by different applications, which makes port number based classification inaccurate, and (ii) the increase in encrypted traffic, which makes the payload opaque and therefore causes deep packet inspection to fall short. This research aims to study how machine learning based network traffic analysis and classification can work in the presence of encrypted services. In this work, we implement and evaluate a decision tree based machine learning classifier for encrypted social media, video, and audio traffic identification without using IP addresses, port numbers, application header fields, and payload. The extensive evaluations not only show high accuracy in classifying the aforementioned services but also investigate how much more training is necessary when such a classifier is moved to a new network in terms of location, time, and traffic volume.
© 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the Conference Program Chairs.
Keywords: Encrypted traffic classification; Machine Learning; Feature selection; Robust traffic analysis
1. Introduction
A significant body of research has been carried out in the area of monitoring, analyzing and classifying network traffic. However, one of the biggest challenges of these works is the analysis and classification of encrypted traffic. Some applications, such as VoIP, encrypt the packet payload and implement many methods to bypass firewalls or proxies [1, 2]. Recent studies [3, 4] show that encrypted applications using HTTPS protocols are increasing rapidly. The task of classification is made more challenging due to the convergence of web based services. HTTP and HTTPS
∗ Corresponding author. Tel.: +1-902-494-2652; fax: +1-902-494-2652.
E-mail address: [email protected]
1877-0509 © 2019 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2019.08.053
Ali Safari Khatouni et al. / Procedia Computer Science 155 (2019) 378–385
Table 1: The summary of related works. Columns: Research work; Encrypted Traffic; Service Types (Video Stream, Audio Stream, Web, P2P, VoIP, Others); Mobile/Web; DNS; Application Header; Port Based; Year.

Research work      Mobile (M) / Web (W)   Year
Current research   W                      2018
[7]                M                      2018
[8]                M                      2018
[9]                W                      2017
[10, 11]           W                      2017
[12]               W                      2017
[13]               W                      2016
[14]               M/W                    2016
[15]               M                      2016
[16]               W                      2016
[17]               M                      2016
[18]               W                      2015
[19]               M                      2015
[20]               M                      2015
[2]                M                      2011
[21]               W                      2009
[22]               W                      2009
[23]               W                      2006
no longer carry just web pages but include a complex mix of video and images, making it difficult for a traffic classification system to accurately carry out its work. Encrypted traffic makes analysis and classification a non-trivial task because the content of the packet payload (and under certain tunneling conditions even the packet header) is encrypted and therefore unavailable for analysis to determine the type of application.

Machine Learning (ML) has shown promise in different domains, where the algorithm can model and learn the underlying behavior based on training data. This can then be leveraged to carry out real-time analytics. We aim to leverage ML-based traffic analysis in order to accurately classify application traffic in the presence of encryption. Specifically, we study the performance of our proposed system in terms of Precision, Recall, and F1 Score when classifying encrypted social media, audio, and video services under different network locations, times, and traffic volume conditions. To explore these research questions, we studied a large set of traffic flow statistics (features), including temporal and spatial features extracted by the Tranalyzer2 [5] flow-based traffic exporter. In our previous work [6], the Tranalyzer2 feature set showed higher accuracy than three other state-of-the-art traffic analyzers, i.e., Tstat, Argus, and Silk. We then use those traffic flow features to train a Decision Tree to classify the aforementioned encrypted traffic on one network and test it on four different networks in terms of location, time, and traffic volume.

The remainder of this paper is organized as follows. Section 2 discusses related works. Section 3 details the methodology in terms of the datasets, features and the ML algorithm used. Section 4 presents the evaluations and results. Finally, Section 5 draws conclusions and discusses the future research directions.

2. Related work
In this section, we provide an overview of the recent research studies related to encrypted traffic detection and classification. There is a wide range of systematic reviews focusing on traffic classification approaches that use ML-based techniques. Researchers have carried out initial works to study the issues and challenges with applying ML for traffic classification of specific applications. Previous surveys [24, 25, 26, 27] report a set of trends in the usage of ML-based approaches for traffic analysis in general and encrypted traffic classification in particular.

The use of encrypted traffic is growing fast [3, 4]. There are several approaches that avoid decryption of traffic to preserve users' privacy and security. Trevisan et al. [10, 11] focus on the temporal relationships between traffic flows to classify web traffic. In [21, 12, 13, 28, 15, 17, 19], the authors aim to classify a specific application that generates encrypted traffic. For instance, Branch et al. [21] proposed a solution to distinguish Skype traffic from non-Skype traffic, whereas Dong et al. [12] proposed a set of statistical features to classify video traffic. Both approaches no longer work accurately given the evolution of VoIP and video applications. Xu et al. [20] proposed the automatic generation of mobile application signatures from traffic traces, but they only focus on HTTP traffic. Gonzalez et al. [14] illustrate the feasibility of user profiling on HTTPS by using transport fingerprinting despite hurdles such as caching and dynamic content for different device types. They indicate that HTTPS makes profiling more difficult but still possible by means of customized algorithms. Husak et al. [29] present a mechanism to estimate the User-Agent of a client in HTTPS communication in the passive
Fig. 1: Global view of the proposed system.
Fig. 2: A simple example of merging the output of binary classifiers.
monitoring scenario. Alshammari et al. present an ML-based approach to identify two popular encrypted applications, namely SSH and VoIP, without using the IP addresses, source/destination ports, and payload information [2]. They further analyzed the generalization of such an approach in [18]. Brissaud et al. [8] present a passive method to build a model using the sizes of objects loaded in the HTTPS (HTTP/1.1+TLS1.2) service as a signature. Shbair et al. [16] propose a framework to identify HTTPS services based on application layer features such as SNI. Aceto et al. [7] present a multi-classification approach to classify mobile apps. Lotfollahi et al. [9] exploit a deep learning based approach for either traffic type classification or application identification.

Our goal differs from previous works in several aspects. We focus on encrypted web traffic (HTTP/1.1+TLS and HTTP/2+TLS), and we do not rely on specific header fields (e.g., SNI) that can be easily modified. Last but not least, we evaluate the performance of the proposed system on traffic from different networks in terms of location, time and traffic volume - none of which are introduced during training. This makes it a general solution for a wide range of applications.

3. Methodology
In this section, we describe our methodology in designing an automatic encrypted traffic classifier.

3.1. Proposed System
Fig. 1 presents the four main components of the proposed system: (A) Data collection; (B) Data pre-processing; (C) Classifier training; and (D) Classifier testing. The data collection component generates and/or captures the traffic
that is of interest. This is then passed on to the data pre-processing component. Tranalyzer2 (version 0.8.1) is used in the pre-processing phase, where the traffic is converted to flows and statistical features are extracted. A flow is uniquely identified by the 5-tuple (ClientIP; ClientPort; ServerIP; ServerPort; Protocol). We consider bidirectional TCP flows in this work. Tranalyzer2 extracts 180 statistical features, of which we use only 101. The main criterion in the selection step is to choose metrics such that the dependency on the type and timing of a monitored network is minimized. Therefore, we do not consider features specific to a device (e.g., MAC addresses, IP addresses), client related properties (e.g., client-side TTL), absolute timestamps (e.g., connection start time), port numbers, TLS information, and DNS query information. We believe this approach enables us to minimize the potential bias of the classifier and obtain a well-generalized model. For the sake of brevity, we do not list all extracted feature names and their descriptions; details can be found on the Tranalyzer2 website¹. It should be noted that a few categorical features need to be encoded. The last step of this component is data normalization, which makes the proposed system less sensitive to outliers.

Once the data is labeled using metadata collected during the data collection phase, it is passed on to the classifier training component. In this component, the labeled dataset is divided into two disjoint sets - training and test datasets. Training data is used to train a binary Decision Tree classifier for each type of service (e.g., video vs. others). Then, the output of each binary classifier is used by a second-layer classifier to label all service types. The one-vs-rest [30] strategy provides important information about the main characteristics of the class of interest.
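The pre-processing and first-layer training steps described above can be illustrated with a minimal sketch. It assumes NumPy and scikit-learn; the column names, the synthetic flow records, and the choice of min-max scaling are illustrative assumptions (the text does not name the normalization method), and real Tranalyzer2 output uses its own feature names.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical column names for illustration; real Tranalyzer2 output differs.
COLUMNS = ["srcIP", "dstIP", "srcPort", "dstPort", "timeFirst",
           "duration", "numPktsSnt", "numBytesSnt", "avgPktSize"]
# Network-specific fields that the selection step leaves out.
EXCLUDED = {"srcIP", "dstIP", "srcPort", "dstPort", "timeFirst"}


def select_features(columns, excluded):
    """Return the indices and names of the columns that are kept."""
    kept = [(i, name) for i, name in enumerate(columns) if name not in excluded]
    return [i for i, _ in kept], [name for _, name in kept]


def train_one_vs_rest(X, y, services, seed=0):
    """Train one binary Decision Tree per service (service vs. others)."""
    models = {}
    for service in services:
        target = (y == service).astype(int)  # 1 = in-class, 0 = out-class
        models[service] = DecisionTreeClassifier(random_state=seed).fit(X, target)
    return models


# Synthetic stand-in for labeled flow records (numeric placeholders only).
rng = np.random.default_rng(0)
labels = rng.choice(["video", "audio", "others"], size=300)
raw = rng.random((300, len(COLUMNS)))
raw[labels == "video", COLUMNS.index("numBytesSnt")] += 2.0
raw[labels == "audio", COLUMNS.index("duration")] += 2.0

kept_idx, kept_names = select_features(COLUMNS, EXCLUDED)
X = MinMaxScaler().fit_transform(raw[:, kept_idx])  # scale each feature to [0, 1]
models = train_one_vs_rest(X, labels, ["video", "audio"])
print(kept_names, sorted(models))
```

Keeping the exclusion list explicit makes it easy to audit which network-dependent fields never reach the classifier.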
Subsequently, we use the ensemble classifier to create a multi-label classifier. Finally, the accuracy of the whole system is evaluated with a test dataset on which the system was not trained. Based on the training component, we identify and illustrate the most important features that enable us to model and understand the encrypted applications. This component consists of four main steps: i) the binary classifier is applied for each service type, which yields a trained model for each class; ii) the result of each binary classifier is evaluated, using the Confusion Matrix and F1 Score to assess the correctly and incorrectly classified flows; iii) the output of all binary classifiers for all service types is directed to a multi-label classifier to assign a class to each traffic flow; and iv) the accuracy of the multi-label classifier is evaluated. As a result, we can decide to retrain the model with a larger dataset or different feature sets based on given criteria, for example whether the accuracy is below or above a given threshold.

Figure 2 shows a hypothetical dataset with two types of services of interest, namely C1 and C2, and background (others). It presents a simple approach to ensemble the output of two binary classifiers. The outputs of the binary classifiers are used as inputs for the multi-label classifier, which finally assigns the class label C1, C2, or background.

It is crucial to choose the algorithm most suited for the binary classifier. To this end, we use a large set of popular ML-based classifiers to obtain the best-fitted model for our goal. In [6], we identified that the Decision Tree classifier is the most suitable classifier (in terms of computational cost and accuracy) for classifying the applications of interest. In component D, we test the whole model by using a test dataset to evaluate the trained classifiers. The dashed red line in Fig. 1 shows the test workflow.
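The merging of binary classifier outputs described for Fig. 2 can be sketched as a small two-layer model. This is a minimal illustration under stated assumptions: the second-layer model is taken to be another Decision Tree (the text does not specify it), the binary outputs are in-class probabilities, and the two-feature synthetic flows for C1, C2, and background are invented for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_binary_layer(X, y, classes, seed=0):
    """First layer: one binary Decision Tree per class of interest."""
    return {
        c: DecisionTreeClassifier(max_depth=4, random_state=seed).fit(
            X, (y == c).astype(int))
        for c in classes
    }


def binary_outputs(models, X):
    """Stack each binary classifier's in-class probability as one column."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models.values()])


# Synthetic two-feature flows for C1, C2, and background (illustrative only).
rng = np.random.default_rng(1)
centers = {"C1": (3.0, 0.0), "C2": (0.0, 3.0), "background": (0.0, 0.0)}
X = np.vstack([rng.normal(c, 0.5, size=(200, 2)) for c in centers.values()])
y = np.repeat(list(centers), 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
# Disjoint halves of the training data feed the first and second layers.
X_l1, X_l2, y_l1, y_l2 = train_test_split(
    X_train, y_train, test_size=0.5, random_state=0, stratify=y_train)

binary_models = fit_binary_layer(X_l1, y_l1, ["C1", "C2"])
second_layer = DecisionTreeClassifier(max_depth=3, random_state=0)
second_layer.fit(binary_outputs(binary_models, X_l2), y_l2)

predicted = second_layer.predict(binary_outputs(binary_models, X_test))
accuracy = float((predicted == y_test).mean())
print(f"accuracy on held-out flows: {accuracy:.2f}")
```

Training the second layer on flows the binary classifiers have not seen is one way to keep it from learning over-confident first-layer outputs; it is a design choice of this sketch rather than something stated in the text.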
To this end, we use the confusion matrix to present the accuracy of the classifier. The confusion matrix provides the number of flows classified as True In-class (TI), True Out-class (TO), False In-class (FI), and False Out-class (FO). The higher the TI and TO are, the more accurate the classifier is. Moreover, we consider other performance metrics, such as Precision, Recall, and F1 Score, to evaluate the proposed solution. i) Precision is defined as $\frac{TI}{TI+FI}$. It shows the ability of the classifier not to label a negative sample as positive. ii) Recall is defined as $\frac{TI}{TI+FO}$. It shows the ability of the classifier to identify all the positive samples. iii) F1 Score is the harmonic mean of Precision and Recall: $2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$. In this paper, we use Python 3 and the scikit-learn [31] ML library.

3.2. Datasets
In this section, we present the datasets employed in this paper.

3.2.1. NIMS2018 Dataset
The NIMS2018 dataset is a collection of datasets captured at different locations which use different networking technologies. It was collected in 2018 and consists of 10 different types of labeled services. Table 2 shows the number

¹ https://www.tranalyzer.com
Fig. 3: Distribution of the TI and TO for all classes using Tranalyzer2 feature set. (a) TI for imbalanced training. (b) TI for balanced training. (c) TO for imbalanced training. (d) TO for balanced training.
of flows for each service type in the NIMS2018 dataset. More information on this dataset can be found in [6]. To support the reproducibility of the research, the NIMS2018 dataset will be made publicly available.

3.2.2. UNB2015 Dataset
The UNB2015 dataset was collected in 2015 on a university campus. It consists of labeled traffic for a number of encrypted applications. Table 2 shows the number of flows for each service type in the UNB2015 dataset. This is a publicly available dataset, and more details on it can be found in [32].

Table 2: Overview of the NIMS2018 and UNB2015 datasets.

NIMS2018 dataset
Service Type     Number of flows   Web service
Mail             20938             outlook.live, mail.google, ...
Weather          30471             theweathernetwork, accuweather
Bank             11569             rbcroyalbank, td
Web browsing     21569             calendar.google, wikipedia, ...
Chat             18628             hangout, messenger, ...
Video            36582             twitch, YouTube, ...
Online shop      40857             amazon, kijiji
Background       58251             ssh, browser connection, ...
News             55198             BBC, CNN, ...
Social network   73155             linkedin, pinterest, ...
Audio            20447             soundcloud, radio-canada-online, ...

UNB2015 dataset
Service Type     Number of flows   Web service
Web Browsing     5000              Firefox and Chrome
Mail             929               SMTPS, POP3S and IMAPS
Chat             1375              ICQ, AIM, Skype, Facebook and Hangouts
Video            640               Vimeo and YouTube
File Transfer    2119              Skype, FTP over SSH (SFTP) and FTP over SSL (FTPS)
VoIP             2737              Facebook, Skype and Hangouts voice calls
P2P              1851              uTorrent
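Before turning to the results, the Precision, Recall, and F1 Score definitions from Section 3.1 can be made concrete with a short worked example. The counts below are invented for illustration and do not come from either dataset; the function names are likewise only for this sketch.

```python
def precision(ti, fi):
    """Precision = TI / (TI + FI); defined as 0.0 when nothing is labeled in-class."""
    return ti / (ti + fi) if (ti + fi) else 0.0


def recall(ti, fo):
    """Recall = TI / (TI + FO); defined as 0.0 when there are no in-class flows."""
    return ti / (ti + fo) if (ti + fo) else 0.0


def f1(p, r):
    """F1 Score = harmonic mean of Precision and Recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical confusion-matrix counts for one binary (video vs. others) classifier.
counts = {"TI": 80, "TO": 890, "FI": 20, "FO": 10}

p = precision(counts["TI"], counts["FI"])
r = recall(counts["TI"], counts["FO"])
score = f1(p, r)
print(f"Precision={p:.3f} Recall={r:.3f} F1={score:.3f}")
# prints: Precision=0.800 Recall=0.889 F1=0.842
```

Note that TO does not enter any of the three metrics, which is why they remain informative when the out-class (others) dominates the dataset.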
4. Evaluations and Results
In this section, we present the results of our evaluations. To this end, the NIMS2018 and UNB2015 datasets are used to test the best trained model of the Decision Tree classifier. The results illustrate the normalized confusion matrix distribution for 10-fold cross-validation, which makes the performance metrics less sensitive to the partitioning (training/testing) of the dataset.

First, we conducted experiments to compare the different models when carrying out a binary classification, since the proposed solution uses this approach in its first layer. Table 3 shows the results for different algorithms, i.e., the overall accuracy, training time, and testing time. In these evaluations, we use the whole NIMS2018 dataset. Results indicate that the Decision Tree obtains the second-best accuracy with fairly low computational cost: in Table 3, its training time is roughly seven times lower than that of the Random Forest classifier (1.41 s vs. 9.89 s). Tree-based classifiers also yield human-interpretable models, which allows us to investigate the properties of the services analyzed. As a result, we consider Decision Trees as the binary classifier for the rest of this work.

Table 3: Performance of the different binary classifiers for Video traffic type using NIMS2018 dataset.

Algorithm                                         Accuracy   Training time [s]   Testing time [s]
Random forest                                     0.82       9.89                1.49
Decision tree                                     0.81       1.41                0.08
Complement Naive Bayes                            0.79       0.03                0.02
Multinomial Naive Bayes                           0.79       0.02                0.02
kNN                                               0.71       1.27                45.11
Bernoulli Naive Bayes                             0.68       0.09                0.11
Linear SVM                                        0.68       6.37                0.04
Classifier using Ridge Regression                 0.67       0.11                0.04
NearestCentroid                                   0.64       0.06                0.07
SVM                                               0.62       292.61              263.07
Passive-Aggressive                                0.53       0.12                0.02
Perceptron                                        0.48       0.1                 0.02
Linear Models with Stochastic Gradient Descent    0.46       0.13                0.03
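A comparison of the kind summarized in Table 3 can be sketched with scikit-learn's `cross_validate`, which reports fit and score times alongside the test scores. This is only an illustrative sketch: the data come from `make_classification` rather than from NIMS2018, only three of the listed algorithms are included, and the hyperparameters are assumptions, so the numbers it prints are not comparable to Table 3.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary problem standing in for "video vs. others" flows.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)

candidates = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Bernoulli Naive Bayes": BernoulliNB(),
}

results = {}
for name, model in candidates.items():
    # 10-fold cross-validation; fit_time and score_time are per-fold seconds.
    cv = cross_validate(model, X, y, cv=10, scoring="accuracy")
    results[name] = {
        "accuracy": float(cv["test_score"].mean()),
        "train_s": float(cv["fit_time"].sum()),
        "test_s": float(cv["score_time"].sum()),
    }

# Report the candidates from most to least accurate.
for name, r in sorted(results.items(), key=lambda kv: -kv[1]["accuracy"]):
    print(f"{name:<22} acc={r['accuracy']:.2f} "
          f"train={r['train_s']:.3f}s test={r['test_s']:.3f}s")
```

Summing the per-fold times gives a rough cost figure per algorithm; wall-clock results depend on the machine and on the scikit-learn version.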
We present results related to the impact of a balanced training dataset on the proposed binary classification methodology. We use 10-fold cross-validation on the NIMS2018 dataset (Home Network). In each iteration, we consider two
Fig. 4: Average F1 Score for all service types with 4 different feature sets and Falsely (in-class - FI) classified flows. (a) Average F1 Score for all service types. (b) Falsely classified flows for video binary classifier.
scenarios: (i) Retain an imbalanced training dataset; and (ii) Retain a balanced training dataset. Figure 3 shows the distribution of the TI and TO rates for imbalanced and balanced training datasets: the y-axis shows the percentage of the population that is correctly classified for each class, and the x-axis shows the class label. As Figures 3a and 3b show, balancing significantly improves the TI with a slight decrease in the TO.

Fig. 4a shows the average F1 Score for all service types with 4 different feature sets (using Tranalyzer2) on the NIMS2018 dataset (Home Network). The average value is calculated using 10-fold cross-validation. As expected, the blue line, which corresponds to the feature set that includes IP addresses, port numbers, and absolute timestamps, shows the highest F1 Score. Removing port numbers and IP addresses lowers the F1 Score by around 2%. The delta time does not show a considerable impact on the F1 Score. However, we need to keep in mind that including IP addresses, port numbers, and absolute time creates two limitations in classification. First, it may cause overfitting, which could be due to the client contacting different servers based on geographical locations and differences in user behaviours. Second, these are categorical features that may require retraining on new networks [18]. On the other hand, Web and Chat traffic have the highest prediction rate. This could be because these are the simplest services with the most deterministic traffic behavior, owing to the lower number of objects and advertisements embedded in these services compared to the other two services (video and audio) studied in this paper. These results show that removing IP addresses, port numbers, and absolute times of the flow degrades the accuracy by 5%. However, it makes the initial feature set as independent as possible from the network on which the training data was captured.

4.1. Performance Evaluation
Here, we present further results in evaluating the performance of the proposed solution. These tests are carried out on the remaining NIMS2018 and UNB2015 datasets. We focus on the classification performance for network traces different from the ones on which the system was trained. This kind of analysis enables us to evaluate the robustness of the classifier. We expand the evaluation to illustrate the amount of data from the new dataset needed to obtain reasonable performance. In this experiment, we train the model using 50% of the NIMS2018 Home Network dataset and test against the UNB2015 and the remaining NIMS2018 datasets, considering the binary classifier for the video traffic type. We repeat the evaluations by retraining the model using a small part of the unknown datasets to explore the amount of data needed in training to obtain an acceptable performance under testing conditions. Figure 5 shows the Precision, Recall, and F1 Score for the video binary classifier (y-axis) and the percentage of data used for training (x-axis) in 5 different networks. The x-axis presents 0 to 50 percent of the UNB2015/other NIMS2018 datasets used for training. The x-axis for the NIMS2018 Home Network shows 50% plus the value shown in the plot, e.g., 10 on the plot means that 60% of the NIMS2018 Home Network is used for training and 40% for the test. Thus, x=0 shows the scenario where the model is only trained on 50% of the NIMS2018 Home Network. Figure 5 indicates
Fig. 5: Evaluations under different scenarios with different portions of the dataset used for testing. (a) Precision. (b) Recall. (c) F1 Score.
low performance without training the model with the other datasets. Interestingly, all performance metrics increase rapidly even when just 5% of the UNB2015/other NIMS2018 datasets is used in training.

Furthermore, it is critical to understand misclassified flows in order to identify the features that could help us to classify them correctly. Fig. 4b illustrates the percentage of flows from other applications that are falsely classified as video - referred to as FI. This test was carried out with a binary video classifier using the Tranalyzer2 feature set (NIMS2018, Home Network). The majority of FI belongs to traffic flows of weather, social network, online shopping, and news apps. Indeed, the similarity of these flows (causing FI) mainly comes from the advertisements and trackers embedded in these kinds of web pages. The advertisements typically come from a set of main advertisement companies and are composed of photos / images. Future research is necessary to dig deeper into this type of traffic in order to differentiate it from regular video traffic.

5. Conclusions and Future Work
Encrypted traffic poses a challenge for traditional traffic classification approaches including Deep Packet Inspection. ML techniques have been proposed as a means of classifying encrypted network traffic. The success of such techniques is based on the premise that all applications have repeatable temporal and spatial characteristics associated with the network traffic they generate. This premise becomes viable by finding such characteristics in network traffic flows of the encrypted applications of interest. In this research, we investigate various aspects of designing a robust and broadly applicable ML-based classifier to detect the underlying applications when network traffic is encrypted. We focus specifically on identifying encrypted Social Media, Audio and Video services/applications.
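The retraining experiment of Section 4.1 (train on half of one network, then add a growing share of a new network's flows and test on the rest) can be sketched as follows. Everything in this sketch is an assumption made for illustration: the two synthetic "networks" follow different labeling rules and differ in a latency-like third feature, the test set is fixed to the second half of the new network, and the resulting scores say nothing about the NIMS2018 or UNB2015 results.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.tree import DecisionTreeClassifier


def make_network(rng, n, rule_feature, rtt_mean):
    """Synthetic binary flows; column 2 is a latency-like, network-specific feature."""
    X = rng.normal(size=(n, 3))
    X[:, 2] = rng.normal(rtt_mean, 0.3, size=n)
    y = (X[:, rule_feature] > 0).astype(int)  # 1 = video, 0 = others
    return X, y


def retraining_curve(home, new, fractions, seed=0):
    """F1 Score on held-out new-network flows for each share added to training."""
    X_home, y_home = home
    X_new, y_new = new
    half = len(X_new) // 2
    X_pool, y_pool = X_new[:half], y_new[:half]   # candidates for retraining
    X_test, y_test = X_new[half:], y_new[half:]   # fixed test set
    scores = {}
    for frac in fractions:
        k = int(round(frac * len(X_new)))         # flows taken from the new network
        X_train = np.vstack([X_home, X_pool[:k]])
        y_train = np.concatenate([y_home, y_pool[:k]])
        model = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
        scores[frac] = float(f1_score(y_test, model.predict(X_test)))
    return scores


rng = np.random.default_rng(0)
X_h, y_h = make_network(rng, 2000, rule_feature=0, rtt_mean=0.0)
X_n, y_n = make_network(rng, 2000, rule_feature=1, rtt_mean=5.0)

# Train on 50% of the "home" network, then add 0% to 50% of the new network.
scores = retraining_curve((X_h[:1000], y_h[:1000]), (X_n, y_n),
                          fractions=[0.0, 0.05, 0.10, 0.25, 0.50])
for frac, score in scores.items():
    print(f"{int(frac * 100):>2}% of new network in training -> F1 = {score:.2f}")
```

Holding the test set fixed keeps the scores comparable across fractions; testing on whatever remains after each split, as the x-axis description suggests, would be an equally reasonable variant.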
To this end, we evaluate a Decision Tree-based classifier using Tranalyzer2 flow features on five different publicly available datasets from four different organizations. Our results show that the performance of an already trained classifier drops when it is deployed to a new network (in terms of location, time and volume). To minimize such a drop in performance, we need around an additional five percent of training data from the new network. This enables the retrained model to generalize well under the new location, time and traffic volume. Moreover, among the three classes of interest, video seems to be the most challenging one, since many services carry such application traffic for tracking and advertisement purposes. The limited set of encrypted traffic types and services needs to be expanded in future studies. Moreover, future research needs to investigate the performance of commercially available traffic analysis products. Additionally, collecting more data in different networks under different infrastructures and designing different ensemble classifiers would be natural future research directions to follow.

Acknowledgements
This research is supported by the Mitacs and Solana Networks funding program. The research is conducted as part of the Dalhousie NIMS Lab at: https://projects.cs.dal.ca/projectx/.

References
[1] P. Matthews, J. Rosenberg, D. Wing, R. Mahy, Session Traversal Utilities for NAT (STUN), no. 5389 in Request for Comments, RFC Editor, 2008.
[2] R. Alshammari, A. N. Zincir-Heywood, Can encrypted traffic be identified without port numbers, ip addresses and payload inspection?, Computer Networks 55 (6) (2011) 1326–1350.
[3] D. Naylor, A. Finamore, I. Leontiadis, Y. Grunenberger, M. Mellia, M. Munafò, K. Papagiannaki, P. Steenkiste, The cost of the "s" in https, in: Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, CoNEXT '14, ACM, New York, NY, USA, 2014, pp. 133–140.
[4] A. S. Khatouni, M. Trevisan, L. Regano, A. Viticchié, Privacy issues of ISPs in the modern web, in: 2017 8th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2017, pp. 588–594.
[5] S. Burschka, B. Dupasquier, Tranalyzer: Versatile high performance network traffic analyser, in: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 2016, pp. 1–8.
[6] A. Safari Khatouni, N. Zincir-Heywood, Integrating machine learning with off-the-shelf traffic flow features for http/https traffic classification, in: 2019 The 24th Symposium on Computers and Communications (ISCC), 2019.
[7] G. Aceto, D. Ciuonzo, A. Montieri, A. Pescapé, Multi-classification approaches for classifying mobile app traffic, Journal of Network and Computer Applications 103 (2018) 131–145.
[8] P.-O. Brissaud, J. Francois, I. Chrisment, T. Cholez, O. Bettan, Passive monitoring of https service use, in: CNSM'18 - 14th International Conference on Network and Service Management, Rome, Italy, 2018, p. 7.
[9] M. Lotfollahi, R. S. H. Zade, M. J. Siavoshani, M. Saberian, Deep packet: A novel approach for encrypted traffic classification using deep learning, CoRR abs/1709.02656.
[10] M. Trevisan, I. Drago, M. Mellia, H. H. Song, M. Baldi, What: A big data approach for accounting of modern web services, in: 2016 IEEE International Conference on Big Data (Big Data), 2016, pp. 2740–2745.
[11] M. Trevisan, I. Drago, M. Mellia, H. H. Song, M. Baldi, AWESoME: Big Data for Automatic Web Service Management in SDN, IEEE Transactions on Network and Service Management PP (99) (2017) 1.
[12] Y. ning Dong, J. jie Zhao, J. Jin, Novel feature selection and classification of internet video traffic based on a hierarchical scheme, Computer Networks 119 (2017) 102–111.
[13] J. J. Davis, E. Foo, Automated feature engineering for http tunnel detection, Comput. Secur. 59 (C) (2016) 166–185.
[14] R. Gonzalez, C. Soriente, N. Laoutaris, User Profiling in the Time of HTTPS, in: Proceedings of the 2016 Internet Measurement Conference, IMC '16, ACM, New York, NY, USA, 2016, pp. 373–379.
[15] Y. Fu, H. Xiong, X. Lu, J. Yang, C. Chen, Service Usage Classification with Encrypted Internet Traffic in Mobile Messaging Apps, IEEE Transactions on Mobile Computing 15 (11) (2016) 2851–2864.
[16] W. M. Shbair, T. Cholez, J. Francois, I. Chrisment, A multi-level framework to identify https services, in: NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, 2016, pp. 240–248.
[17] V. F. Taylor, R. Spolaor, M. Conti, I. Martinovic, AppScanner: Automatic Fingerprinting of Smartphone Apps from Encrypted Network Traffic, in: 2016 IEEE European Symposium on Security and Privacy (EuroS&P), 2016, pp. 439–454.
[18] R. Alshammari, A. N. Zincir-Heywood, How robust can a machine learning approach be for classifying encrypted voip?, Journal of Network and Systems Management 23 (4) (2015) 830–869.
[19] Q. Wang, A. Yahyavi, B. Kemme, W. He, I know what you did on your smartphone: Inferring app usage over encrypted data traffic, in: 2015 IEEE Conference on Communications and Network Security (CNS), 2015, pp. 433–441.
[20] Q. Xu, Y. Liao, S. Miskovic, Z. M. Mao, M. Baldi, A. Nucci, T. Andrews, Automatic generation of mobile app signatures from traffic observations, in: 2015 IEEE Conference on Computer Communications (INFOCOM), 2015, pp. 1481–1489.
[21] P. A. Branch, A. Heyde, G. J. Armitage, Rapid identification of skype traffic flows, in: Proceedings of the 18th International Workshop on Network and Operating Systems Support for Digital Audio and Video, NOSSDAV '09, ACM, New York, NY, USA, 2009, pp. 91–96.
[22] W. Li, M. Canini, A. W. Moore, R. Bolla, Efficient application identification and the temporal and spatial stability of classification schema, Computer Networks 53 (6) (2009) 790–809, Traffic Classification and Its Applications to Modern Networks.
[23] L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule, K. Salamatian, Traffic classification on the fly, SIGCOMM Comput. Commun. Rev. 36 (2) (2006) 23–26.
[24] F. Pacheco, E. Exposito, M. Gineste, C. Baudoin, J. Aguilar, Towards the deployment of machine learning solutions in network traffic classification: A systematic survey, IEEE Communications Surveys & Tutorials (2018) 1–1.
[25] T. T. T. Nguyen, G. Armitage, A survey of techniques for internet traffic classification using machine learning, IEEE Communications Surveys & Tutorials 10 (4) (2008) 56–76.
[26] N. Namdev, S. Agrawal, S. Silkari, Recent advancement in machine learning based internet traffic classification, Procedia Computer Science 60 (2015) 784–791, Knowledge-Based and Intelligent Information & Engineering Systems 19th Annual Conference, KES-2015, Singapore, September 2015 Proceedings.
[27] P. Velan, M. Čermák, P. Čeleda, M. Drašar, A survey of methods for encrypted traffic classification and analysis, Netw. 25 (5) (2015) 355–374.
[28] J. Datta, N. Kataria, N. Hubballi, Network traffic classification in encrypted environment: A case study of google hangout, in: 2015 Twenty First National Conference on Communications (NCC), 2015, pp. 1–6.
[29] M. Husák, M. Čermák, T. Jirsík, P. Čeleda, Network-based https client identification using ssl/tls fingerprinting, in: 2015 10th International Conference on Availability, Reliability and Security, 2015, pp. 389–396.
[30] F. Haddadi, A. N. Zincir-Heywood, Benchmarking the effect of flow exporters and protocol filters on botnet traffic classification, IEEE Systems Journal 10 (4) (2016) 1390–1401.
[31] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.
[32] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, A. A. Ghorbani, Characterization of encrypted and vpn traffic using time-related features, in: ICISSP, 2016.