Abstract:
Anomaly detection over HTTP traffic plays a vital role in many domains and has attracted much attention in recent years. This article proposes an efficient machine learning approach to detecting anomalous HTTP traffic that addresses problems of existing methods, such as data redundancy and high training complexity. The approach draws on natural language processing (NLP) techniques: it uses the Word2vec algorithm to bridge the semantic gap and applies a Term Frequency-Inverse Document Frequency (TF-IDF) weighted mapping of HTTP traffic to construct a low-dimensional paragraph vector representation, which reduces training complexity. We then employ two boosting algorithms, Light Gradient Boosting Machine (LightGBM) and Categorical Boosting (CatBoost), to build an efficient and accurate anomaly detection model. The proposed method is tested on several artificial data sets, including HTTP DATASET CSIC 2010, UNSW-NB15, and Malicious-URLs. Experimental results show that both boosting algorithms achieve high detection accuracy, a high true positive rate, and a low false positive rate. Compared with other anomaly detection methods, the proposed algorithms require relatively short running time and low CPU memory consumption.
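The TF-IDF weighted paragraph vector described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the `tokenize` helper and its delimiter set, the smoothed IDF formula, the sample requests, and the tiny hand-written `embeddings` table, which stands in for vectors that a trained Word2vec model would supply. The resulting fixed-length vectors are what a boosting classifier such as LightGBM or CatBoost would be trained on; that stage is not shown.

```python
import math
import re
from collections import Counter


def tokenize(request):
    """Hypothetical tokenizer: lowercase, then split on common URL delimiters."""
    return [t for t in re.split(r"[/?&=\s]+", request.lower()) if t]


def idf_table(docs):
    """Smoothed inverse document frequency for each token (an assumed variant)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each token once per document
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def paragraph_vector(doc, idf, embeddings, dim):
    """TF-IDF weighted average of the word vectors of the tokens in doc."""
    tf = Counter(doc)
    vec = [0.0] * dim
    total = 0.0
    for tok, count in tf.items():
        if tok not in embeddings:
            continue  # tokens without a word vector are skipped
        weight = (count / len(doc)) * idf.get(tok, 1.0)
        for i, x in enumerate(embeddings[tok]):
            vec[i] += weight * x
        total += weight
    return [v / total for v in vec] if total else vec


# Toy 2-dimensional word vectors; a real pipeline would take these from Word2vec.
embeddings = {
    "get": [1.0, 0.0],
    "index.jsp": [0.0, 1.0],
    "login.jsp": [1.0, 1.0],
    "id": [0.5, 0.5],
    "user": [0.2, 0.8],
}

requests = [
    "GET /index.jsp?id=1",
    "GET /index.jsp?id=2",
    "GET /login.jsp?user=admin' or '1'='1",
]
docs = [tokenize(r) for r in requests]
idf = idf_table(docs)
vectors = [paragraph_vector(d, idf, embeddings, dim=2) for d in docs]
```

Each request maps to a vector of the same small dimension regardless of its length, which is the property the abstract relies on to reduce training complexity.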
Source: IEEE ACCESS
ISSN: 2169-3536
Year: 2020
Volume: 8
Page: 141787-141798
JCR@2020: 3.367
JCR@2023: 3.400
ESI Discipline: ENGINEERING
ESI HC Threshold: 132
JCR Journal Grade: 2
CAS Journal Grade: 2
WoS CC Cited Count: 22
SCOPUS Cited Count: 33
ESI Highly Cited Papers on the List: 0
30 Days PV: 5