A Landslide Susceptibility Assessment Method Integrating Training Sample Optimization and Machine Learning; [训练样本采样优化与机器学习结合的滑坡易发性评价方法]

Author：

Weng, M. ^[1] | Xiao, G. ^[2]

Indexed by：

Scopus

Abstract：

[Objectives] The quality of training samples strongly affects model performance and prediction accuracy. In regions with limited sample data, the small number of samples and their uneven spatial distribution may prevent a model from effectively learning the features of disaster-inducing factors. This increases the risk of overfitting and ultimately reduces prediction accuracy. It is therefore crucial to collect and optimize training samples according to regional characteristics.

[Methods] To address this issue, this study proposes a sampling optimization method for training samples. The method combines the Prototype Sampling (PBS) approach for selecting landslide-positive samples with an unsupervised clustering model for training sample selection. This yields a screened and expanded positive sample dataset and an objectively extracted negative sample dataset, which together form the optimized training dataset (SO). Random Forest (RF) and Support Vector Machine (SVM) models, which are well suited to small sample data, were then used to construct landslide susceptibility evaluation models. Comparative experiments were conducted on Raw Data (RD), a dataset with only Data Augmentation (DA), and the optimized dataset (SO). Model prediction performance was assessed using metrics such as the Area Under the Curve (AUC). In addition, the frequency ratio method was applied to optimize the landslide susceptibility zoning results. Finally, a case study was conducted in Putian City, where landslide sample data are relatively scarce, to verify the effectiveness and generalization capability of the proposed sampling optimization method.

[Results] Models trained on the SO dataset achieved AUC improvements of 10.69% and 18.23% compared with those trained on the RD and DA datasets, respectively, demonstrating a significant enhancement in predictive performance. This suggests that selecting and expanding positive samples while objectively extracting negative samples can improve model accuracy and mitigate overfitting during training. Furthermore, the frequency ratio analysis showed that the SO-RF model achieved higher frequency ratios than the SO-SVM model in the extremely high and high susceptibility zones, indicating that SO-RF is more suitable for evaluating landslide susceptibility in regions with limited landslide sample data, such as Putian City.

[Conclusions] The proposed training sample optimization approach, combined with machine learning evaluation methods, demonstrates high applicability and accuracy. The findings of this study therefore provide valuable insights into machine learning-based sampling strategies for landslide susceptibility assessment. © 2025 Science Press. All rights reserved.
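The frequency ratio used above to compare susceptibility zonings can be illustrated with a minimal Python sketch. It assumes the commonly used definition (the share of landslides falling in a zone divided by the share of the study area occupied by that zone); the abstract does not give the authors' exact formulation, and the function name `frequency_ratio`, the zone labels, and the toy counts below are hypothetical, not data from the study.

```python
from collections import Counter


def frequency_ratio(zone_of_cell, zone_of_landslide):
    """Return {zone: FR}, where
    FR = (share of landslides in the zone) / (share of map cells in the zone).

    Assumed (common) definition; values above 1 mean landslides are
    over-represented in that zone relative to its area.
    """
    if not zone_of_cell or not zone_of_landslide:
        raise ValueError("both inputs must be non-empty")
    cell_counts = Counter(zone_of_cell)
    slide_counts = Counter(zone_of_landslide)
    n_cells = len(zone_of_cell)
    n_slides = len(zone_of_landslide)
    ratios = {}
    for zone, cells in cell_counts.items():
        area_share = cells / n_cells
        slide_share = slide_counts.get(zone, 0) / n_slides
        ratios[zone] = slide_share / area_share
    return ratios


# Hypothetical toy zoning: 100 map cells and 10 recorded landslides.
cells = ["very_high"] * 10 + ["high"] * 20 + ["moderate"] * 30 + ["low"] * 40
slides = ["very_high"] * 5 + ["high"] * 3 + ["moderate"] * 1 + ["low"] * 1

fr = frequency_ratio(cells, slides)
for zone in ("very_high", "high", "moderate", "low"):
    print(f"{zone}: {fr[zone]:.2f}")
```

Under this definition, a zoning that concentrates more of the recorded landslides into the smallest high-susceptibility zones produces larger ratios there, which is the sense in which two models' zonings can be compared.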

Keyword：

landslide positive sample augmentation; Putian; random forest; Support Vector Machine; susceptibility; training sample sampling; unsupervised clustering

Affiliations：

[ 1 ] [Weng M.] The Academy of Digital China (Fujian), Fuzhou University, Fuzhou, 350108, China
[ 2 ] [Weng M.] Key Laboratory of Spatial Data Mining and Information Sharing of Ministry of Education, Fuzhou University, Fuzhou, 350108, China
[ 3 ] [Xiao G.] The Academy of Digital China (Fujian), Fuzhou University, Fuzhou, 350108, China
[ 4 ] [Xiao G.] Key Laboratory of Spatial Data Mining and Information Sharing of Ministry of Education, Fuzhou University, Fuzhou, 350108, China

Related Articles：

Dynamic evaluation of landslide susceptibility based on machine learning and InSAR technology; [基于机器学习和 InSAR 技术的滑坡易发性动态评价]
2025，Journal of Natural Disasters
Dynamic Risk Assessment of Landslide Hazard for Large-Scale Photovoltaic Power Plants under Extreme Rainfall Conditions
2023，WATER
A Landslide Susceptibility Assessment Method Integrating Training Sample Optimization and Machine Learning
2025，Journal of Geo-Information Science
Application of Multi-System Combination Precise Point Positioning in Landslide Monitoring
2021，APPLIED SCIENCES-BASEL
A case study of the Tangjiashan landslide dam-break
2015，JOURNAL OF HYDRODYNAMICS

Source ：

Journal of Geo-Information Science

ISSN： 1560-8999

Year： 2025

Issue： 5

Volume： 27

Page： 1113-1128
