The Weighted Word2vec Paragraph Vectors for Anomaly Detection Over HTTP Traffic

author：

Li, Jieling (Li, Jieling.) ^[1] | Zhang, Hao (Zhang, Hao.) ^[2] (Scholars：张浩) | Wei, Zhiqiang (Wei, Zhiqiang.) ^[3]

Indexed by：

EI Scopus SCIE

Abstract：

Anomaly detection over HTTP traffic plays a vital role in many domains and has attracted much attention in recent years. This article proposes an efficient machine learning approach to detecting anomalous HTTP traffic that addresses problems of existing methods, such as data redundancy and high training complexity. The algorithm draws on natural language processing (NLP) technology: it uses the Word2vec algorithm to deal with the semantic gap, and it applies a Term Frequency-Inverse Document Frequency (TF-IDF) weighted mapping of HTTP traffic to construct a low-dimensional paragraph vector representation that reduces training complexity. We then employ the boosting algorithms Light Gradient Boosting Machine (LightGBM) and Categorical Boosting (CatBoost) to build an efficient and accurate anomaly detection model. The proposed method is tested on artificial data sets such as HTTP DATASET CSIC 2010, UNSW-NB15, and Malicious-URLs. Experimental results show that both boosting algorithms achieve high detection accuracy, a high true positive rate, and a low false positive rate. Compared with other anomaly detection methods, the proposed algorithms require relatively short running time and low CPU memory consumption.
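
The TF-IDF weighted paragraph vector described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The tokenizer, the smoothed IDF formula, and the toy two-dimensional `embeddings` table (standing in for vectors that a trained Word2vec model would supply) are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(request):
    """Split an HTTP request string into lowercase alphanumeric tokens.
    Hypothetical scheme; the paper's tokenization may differ."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", request.lower()) if t]


def idf_table(docs):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1 (one common variant, assumed here)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def paragraph_vector(tokens, idf, embeddings, dim):
    """Return the TF-IDF weighted average of the word vectors of one request.
    Tokens without an embedding or an IDF entry are skipped."""
    tf = Counter(tokens)
    vec = [0.0] * dim
    total = 0.0
    for tok, count in tf.items():
        if tok not in embeddings or tok not in idf:
            continue
        weight = (count / len(tokens)) * idf[tok]
        total += weight
        for i, x in enumerate(embeddings[tok]):
            vec[i] += weight * x
    return [v / total for v in vec] if total > 0 else vec


# Toy requests: two benign, one containing an injection-like "OR" token.
requests = [
    "GET /index.jsp?id=2",
    "GET /index.jsp?id=7",
    "GET /index.jsp?id=1 OR 1=1",
]
docs = [tokenize(r) for r in requests]
idf = idf_table(docs)

# Toy 2-d embeddings (assumption): a real pipeline would train Word2vec instead.
embeddings = {
    "get": [1.0, 0.0],
    "index": [1.0, 0.0],
    "jsp": [1.0, 0.0],
    "id": [1.0, 0.0],
    "or": [0.0, 1.0],
}
vectors = [paragraph_vector(d, idf, embeddings, 2) for d in docs]
```

In a full pipeline, fixed-length vectors like these would serve as the feature rows passed to a gradient boosting classifier such as LightGBM or CatBoost. That step is omitted here because it depends on third-party libraries.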

Keyword：

Anomaly detection; Boosting; CatBoost; Classification algorithms; Feature extraction; LightGBM; Prediction algorithms; TF-IDF; Training; Word2vec

Affiliation：

[ 1 ] [Zhang, Hao]Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350116, Peoples R China
[ 2 ] [Zhang, Hao]Fuzhou Univ, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350116, Peoples R China

Reprint Address：

张浩
[Zhang, Hao]Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350116, Peoples R China

Email：

zhanghao@fzu.edu.cn

Version：

The Weighted Word2vec Paragraph Vectors for Anomaly Detection over HTTP Traffic
2020，IEEE Access

Related Keywords：

Fault Diagnosis in Power Line Inspection Using Normalized Multihierarchy Embedding Matching
2023，IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Anomaly detection of control rod drive mechanism using long short-term memory-based autoencoder and extreme gradient boosting
2022，NUCLEAR SCIENCE AND TECHNIQUES
Self-Supervised Multi-Scale Cropping and Simple Masked Attentive Predicting for Lung CT-Scan Anomaly Detection
2023，IEEE Transactions on Medical Imaging
Detection of Low-Frequency and Multi-Stage Attacks in Industrial Internet of Things
2020，IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

Source ：

IEEE ACCESS

ISSN： 2169-3536

Year： 2020

Volume： 8

Page： 141787-141798

3.367 (JCR@2020)

3.400 (JCR@2023)

ESI Discipline： ENGINEERING;

ESI HC Threshold：132

JCR Journal Grade：2

CAS Journal Grade：2

Cited Count：

WoS CC Cited Count： 22

SCOPUS Cited Count： 33

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 25

Affiliated Colleges：

College of Computer and Data Science / College of Software (data not explicitly attributed to a unit within this college)
