Online Detection of Botnets on Network Flows Using Stream Mining
Botnets infect large numbers of devices and coordinate them to carry out malicious actions, which makes them a growing threat to Internet security. One way to address this threat is to identify botnets accurately and then apply the necessary countermeasures. Many machine learning (ML) approaches to botnet detection have been proposed over the years. Nonetheless, the algorithms commonly employed cannot adapt to new data without significant effort. An ML research area known as stream mining may offer a solution. Stream mining algorithms are specifically tailored to learn incrementally from new instances without consuming significant memory or time. This work proposes an approach that uses the Very Fast Decision Tree, a stream mining classification algorithm that can learn incrementally when needed, to identify botnets by observing network flows. Evaluating the approach on multiple scenarios with different botnets, we achieved high performance metrics in the majority of scenarios while using only a small number of labelled instances.
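To make the incremental-learning idea concrete, the following is a minimal sketch of the Hoeffding bound test that the Very Fast Decision Tree relies on to decide when a leaf has seen enough instances to split. It is not the implementation evaluated in this work: the function names, the confidence parameter `delta`, the tie threshold, and the numeric values are assumptions chosen for illustration, and information gain is assumed as the split criterion.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent instances."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n_classes, delta, n, tie_threshold=0.05):
    """Decide whether a leaf can be split after seeing n instances.

    Assumes information gain as the split criterion, whose range is
    log2(n_classes). tie_threshold is an illustrative value used to
    break ties between near-identical attributes.
    """
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    # Split when the best attribute is reliably better than the runner-up,
    # or when the bound is so tight that the two are effectively tied.
    return (best_gain - second_gain) > eps or eps < tie_threshold


# Hypothetical two-class leaf (botnet vs. normal flows), delta = 1e-7.
print(should_split(0.30, 0.10, n_classes=2, delta=1e-7, n=1000))
print(should_split(0.30, 0.25, n_classes=2, delta=1e-7, n=200))
```

Because the bound shrinks as `n` grows, a leaf only needs to keep sufficient statistics for the instances it has seen rather than the instances themselves, which is what keeps memory and time per instance low.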