Methods of Intelligent System Event Analysis for Multistep Cyber-Attack Detection: Using Machine Learning Methods

Igor V. Kotenko; Котенко Игорь Витальевич; D. A. Levshun; Левшун Диана Альбертовна

doi:10.14357/20718594230301

Authors: Kotenko I.V.¹, Levshun D.A.¹
Affiliations:
1. St. Petersburg Federal Research Center of the Russian Academy of Sciences
Issue: No 3 (2023)
Pages: 3-15
Section: Knowledge Representation
URL: https://bakhtiniada.ru/2071-8594/article/view/270269
DOI: https://doi.org/10.14357/20718594230301
ID: 270269

Abstract

This study presents a classification and comparative analysis of methods of intelligent system event analysis for detecting multistep cyber-attacks. Such an attack is a sequence of interrelated steps that an attacker takes in pursuit of a specific intrusion goal. The paper analyzes approaches to multistep cyber-attack detection that apply machine learning to system events, including supervised, unsupervised, and semi-supervised learning. The approaches are compared according to the following criteria: the method of extracting knowledge about system event and attack scenarios, the method of scenario knowledge representation, the method of security event analysis, the security problem being solved, and the data set used. The paper summarizes the main advantages and disadvantages of learning-based approaches to multistep cyber-attack detection and outlines possible directions for further research in this area.

Keywords

intelligent systems, knowledge bases, cybersecurity, multistep attack, security events, incident management

About the authors

Igor V. Kotenko

St. Petersburg Federal Research Center of the Russian Academy of Sciences

Author for correspondence.
Email: ivkote@comsec.spb.ru

Doctor of Technical Sciences, Professor. Chief Researcher, Head of the Laboratory of Computer Security Problems

Russian Federation, St. Petersburg

D. A. Levshun

St. Petersburg Federal Research Center of the Russian Academy of Sciences

Email: gaifulina@comsec.spb.ru

Junior Researcher, Laboratory of Computer Security Problems

Russian Federation, St. Petersburg
