TY - GEN
T1 - Hierarchical Lexicon Embedding Architecture for Chinese Named Entity Recognition
AU - Hu, Jiahao
AU - Ouyang, Yuanxin
AU - Li, Chen
AU - Wang, Chuanrui
AU - Rong, Wenge
AU - Xiong, Zhang
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
AB - Named entity recognition (NER) is one of the most fundamental tasks in a variety of natural language applications. Because Chinese lacks word delimiters, the Chinese NER task suffers from a shortage of word boundary information. Incorporating word information has recently proven an effective way to alleviate this problem. However, integrating word information into a character-based model effectively and efficiently remains a challenge. In this work, we propose a hierarchical lexicon embedding architecture for the Chinese NER task. The words matched in the input sentence are divided into two categories, main words and auxiliary words, to help the model better capture useful information. In addition, because the modification lies mainly in the embedding layer, it can be easily combined with different sequence modeling architectures. Experiments on four Chinese NER datasets show our method's promising potential.
KW - Boundary information
KW - Chinese named entity recognition
KW - Lexicon
UR - https://www.scopus.com/pages/publications/85115678640
U2 - 10.1007/978-3-030-86383-8_28
DO - 10.1007/978-3-030-86383-8_28
M3 - Conference contribution
AN - SCOPUS:85115678640
SN - 9783030863821
T3 - Lecture Notes in Computer Science
SP - 345
EP - 356
BT - Artificial Neural Networks and Machine Learning – ICANN 2021 - 30th International Conference on Artificial Neural Networks, Proceedings
A2 - Farkaš, Igor
A2 - Masulli, Paolo
A2 - Otte, Sebastian
A2 - Wermter, Stefan
PB - Springer Science and Business Media Deutschland GmbH
T2 - 30th International Conference on Artificial Neural Networks, ICANN 2021
Y2 - 14 September 2021 through 17 September 2021
ER -