Chinese personal name recognition using N-gram model and rules

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Chinese personal name recognition plays an important role in Chinese word segmentation, but it is difficult to decide whether a sequence of characters is a name because of the complexity of the task. This paper presents a new algorithm based on an N-gram model and recognition rules to address this problem. To increase efficiency and accuracy, we also build several dictionaries, such as a surname dictionary and a person-name dictionary. Experiments on different corpora show that the improved tokenizer using this algorithm performs stably and improves word segmentation accuracy by more than 10 percent over the original tokenizer. On average, the improved tokenizer's recall rate and accuracy rate are both over 92%.
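
The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the general idea it names: a surname dictionary proposes candidate names, a character bigram model with add-one smoothing scores the characters that follow, and a simple rule (given names of one or two characters, accepted above a score threshold) makes the decision. Every name, count, and threshold here (`SURNAMES`, `GIVEN_CHAR_COUNTS`, `GIVEN_BIGRAM_COUNTS`, `find_names`, `threshold=-3.0`) is a hypothetical assumption for illustration and is not taken from the paper.

```python
import math

# Hypothetical toy surname dictionary (the paper's dictionaries are not in the abstract).
SURNAMES = {"王", "李", "张", "欧阳"}

# Hypothetical counts of characters and character pairs seen in given names.
GIVEN_CHAR_COUNTS = {"伟": 50, "芳": 40, "娜": 30, "明": 45, "小": 20}
GIVEN_BIGRAM_COUNTS = {("小", "明"): 12, ("伟", "明"): 3}

TOTAL = sum(GIVEN_CHAR_COUNTS.values())
VOCAB = len(GIVEN_CHAR_COUNTS) + 1  # +1 reserves mass for unseen characters


def given_name_log_prob(chars):
    """Bigram log-probability of a given-name string, with add-one smoothing."""
    # The first character is scored with a smoothed unigram estimate.
    logp = math.log((GIVEN_CHAR_COUNTS.get(chars[0], 0) + 1) / (TOTAL + VOCAB))
    # Each later character is scored conditioned on its predecessor.
    for prev, cur in zip(chars, chars[1:]):
        num = GIVEN_BIGRAM_COUNTS.get((prev, cur), 0) + 1
        den = GIVEN_CHAR_COUNTS.get(prev, 0) + VOCAB
        logp += math.log(num / den)
    return logp


def find_names(text, threshold=-3.0):
    """Return candidate personal names found in text (assumed rule set)."""
    names = []
    i = 0
    while i < len(text):
        # Rule: a name must start with a dictionary surname; prefer 2-character surnames.
        surname = next(
            (text[i:i + n] for n in (2, 1) if text[i:i + n] in SURNAMES), None
        )
        if surname is None:
            i += 1
            continue
        start = i + len(surname)
        best = None
        # Rule: the given name is one or two characters long.
        for n in (2, 1):
            given = text[start:start + n]
            if len(given) < n:
                continue
            score = given_name_log_prob(given) / n  # per-character average
            if score >= threshold and (best is None or score > best[1]):
                best = (given, score)
        if best is not None:
            names.append(surname + best[0])
            i = start + len(best[0])
        else:
            i += 1
    return names
```

With these toy counts, `find_names("王小明去北京")` picks out `王小明`, while `find_names("李去北京")` returns an empty list because the characters after the surname score below the threshold. A real system would estimate the counts from a large name corpus and tune the threshold on held-out data.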

Original language: English
Title of host publication: Proceedings - 2012 7th International Conference on Computing and Convergence Technology (ICCIT, ICEI and ICACT), ICCCT 2012
Pages: 450-453
Number of pages: 4
State: Published - 2012
Event: 2012 7th International Conference on Computing and Convergence Technology (ICCIT, ICEI and ICACT), ICCCT 2012 - Seoul, Korea, Republic of
Duration: 3 Dec 2012 to 5 Dec 2012

Publication series

Name: Proceedings - 2012 7th International Conference on Computing and Convergence Technology (ICCIT, ICEI and ICACT), ICCCT 2012

Conference

Conference: 2012 7th International Conference on Computing and Convergence Technology (ICCIT, ICEI and ICACT), ICCCT 2012
Country/Territory: Korea, Republic of
City: Seoul
Period: 3/12/12 to 5/12/12

Keywords

  • Chinese personal name recognition
  • N-gram model
  • recognition rules
