
Syntax encoding with application in authorship attribution

  • Beijing University of Chemical Technology
  • National Research Council of Canada
  • University of Ottawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We propose a novel strategy to encode the syntax parse tree of a sentence into a learnable distributed representation. The proposed syntax encoding scheme is provably information-lossless. Specifically, an embedding vector is constructed for each word in the sentence, encoding the path in the syntax tree corresponding to the word. The one-to-one correspondence between these “syntax-embedding” vectors and the words (hence their embedding vectors) in the sentence makes it easy to integrate such a representation with all word-level NLP models. We empirically show the benefits of the syntax embeddings in the Authorship Attribution domain, where our approach improves upon the prior art and achieves new performance records on five benchmark data sets.
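
To make the idea of a per-word tree path concrete, here is a minimal sketch in Python. It is not the authors' implementation: the nested-tuple tree format, the `word_paths` helper, and the toy sentence are all assumptions made for illustration. It only collects the constituent labels from the root down to each word; it does not reproduce the learnable embedding or the information-lossless property described in the abstract.

```python
# Hypothetical tree format: a node is (label, [children]); a leaf is a plain word string.
def word_paths(tree, prefix=()):
    """Return a list of (word, path) pairs, where path is the tuple of
    constituent labels on the way from the root to that word."""
    if isinstance(tree, str):
        # Reached a word: pair it with the labels collected so far.
        return [(tree, prefix)]
    label, children = tree
    pairs = []
    for child in children:
        pairs.extend(word_paths(child, prefix + (label,)))
    return pairs


# Toy parse of "the cat sat" (illustrative only).
toy_tree = ("S", [
    ("NP", [("DT", ["the"]), ("NN", ["cat"])]),
    ("VP", [("VBD", ["sat"])]),
])

paths = word_paths(toy_tree)
for word, path in paths:
    print(word, "/".join(path))
```

Because the function yields exactly one path per word, in sentence order, the paths line up one-to-one with the words, which is the property the abstract relies on when combining syntax embeddings with word-level models.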

Original language: English
Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii
Publisher: Association for Computational Linguistics
Pages: 2742-2753
Number of pages: 12
ISBN (electronic): 9781948087841
Publication status: Published - 2018
Event: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium
Duration: 31 Oct 2018 to 4 Nov 2018

Publication series

Name: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

Conference

Conference: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Country/Territory: Belgium
City: Brussels
Period: 31/10/18 to 4/11/18
