Abstract:
A knowledge graph of landscape plants can support the selection of greening tree species by accounting for regional adaptability and for ornamental and ecological factors. Extracting entities and relationships from plant description text is a key step in constructing such a knowledge graph. Until now, there has been no publicly available annotated data set for the plant domain. In this paper, a conceptual architecture of landscape plants was defined and briefly described, and a landscape plant corpus was constructed. Existing language models such as word2vec, ELMo, and BERT each have limitations, e.g., difficulty handling polysemous words, weak context fusion, or low computational efficiency. We proposed a named entity recognition model, ALBERT-BiGRU-CRF, and a relationship extraction model, ALBERT-BiGRU-Attention, both built on the ALBERT (A Lite Bidirectional Encoder Representation from Transformers) pre-trained language model. In the ALBERT-BiGRU-CRF model, ALBERT was used to extract text features, the BiGRU layer was used to learn deep semantic features between sentences, and the CRF layer was used to calculate the probability distribution of the annotation sequence to determine the entities contained in the description text. The ALBERT-BiGRU-Attention model was based on the results of the named entity recognition model; in it, an attention mechanism was used to increase the weight of keywords to determine the relationship between entities. The proposed models have the following advantages: (1) they can effectively identify and extract entities and relationships from landscape plant knowledge; (2) they can represent the semantic and sentence-level characteristics of characters with good accuracy. The validity of the method was verified on the landscape plant corpus constructed in this paper and compared with other models.
Our quantitative evaluation shows that: (1) the F1 score of the ALBERT-BiGRU-CRF model was 0.9517, indicating good performance on the named entity recognition task and effective identification of 23 main entity types; (2) in comparative experiments and analysis of the relationship extraction results, the F1 score of the ALBERT-BiGRU-Attention model was 0.9161, indicating that it performed well in relationship extraction for landscape plants; (3) on 6 representative examples selected to further evaluate extraction performance, the method identified the knowledge triples of common single-relation and multi-relation texts well. Therefore, entity and relationship extraction based on the ALBERT model can effectively improve recognition and extraction results. It can be applied to plant description text, providing a method for the automatic construction of a landscape plant knowledge graph. © 2021, Science Press. All rights reserved.
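The role of the CRF layer described in the abstract can be illustrated with a minimal sketch of Viterbi decoding, the standard way to pick the highest-scoring tag sequence from per-token scores plus tag-to-tag transition scores. This is not the authors' implementation: the function name `viterbi_decode`, the tag set, and all scores below are hypothetical, and the per-token scores stand in for what an ALBERT-BiGRU encoder might produce.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[i][tag] is the (hypothetical) encoder score of `tag` at token i.
    transitions[(prev, cur)] is the score of moving from `prev` to `cur`;
    missing pairs default to 0.0.
    """
    if not emissions:
        return []

    # best[i][tag]: best total score of any path ending in `tag` at token i
    best = [{t: emissions[0][t] for t in tags}]
    back: List[Dict[str, str]] = []  # back-pointers for tokens 1..n-1

    for i in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # choose the previous tag that maximises path score into `cur`
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[i][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # follow back-pointers from the best final tag
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative (made-up) scores for a three-token span with BIO tags.
TAGS = ["O", "B-PLANT", "I-PLANT"]
TRANSITIONS = {
    ("B-PLANT", "I-PLANT"): 1.0,   # continuing an entity is encouraged
    ("I-PLANT", "I-PLANT"): 0.5,
    ("O", "I-PLANT"): -10.0,       # an entity cannot start with I-
}
EMISSIONS = [
    {"O": 0.1, "B-PLANT": 2.0, "I-PLANT": 0.5},
    {"O": 1.0, "B-PLANT": 0.2, "I-PLANT": 0.9},
    {"O": 2.0, "B-PLANT": 0.1, "I-PLANT": 0.3},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
# → ['B-PLANT', 'I-PLANT', 'O']
```

In this toy input, picking the best tag per token independently would label the second token O, whereas decoding over the whole sequence keeps it inside the entity because of the transition scores. That sequence-level view is what a CRF layer adds on top of a token-level encoder.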
Source:
Journal of Geo-Information Science
ISSN: 1560-8999
CN: 11-5809/P
Year: 2021
Issue: 7
Volume: 23
Page: 1208-1220
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 5
ESI Highly Cited Papers on the List: 0