
Incrementally learning the hierarchical softmax function for neural language models

  • Hong Kong University of Science and Technology
  • Beihang University

Research output: Contribution to conference › Paper › peer-review

Abstract

Neural network language models (NNLMs) have attracted considerable attention recently. In this paper, we present a training method that can incrementally train the hierarchical softmax function for NNLMs. We split the cost function to model the old and update corpora separately, and factorize the objective function for the hierarchical softmax. We then provide a new stochastic-gradient-based method to update all the word vectors and parameters by comparing the old tree, generated from the old corpus, with the new tree, generated from the combined (old and update) corpus. Theoretical analysis shows that the mean square error of the parameter vectors can be bounded by a function of the number of changed words related to the parameter node. Experimental results show that incremental training saves substantial time: the smaller the update corpus, the faster the update training, with speedups of up to 30 times. We also use word similarity/relatedness tasks and a dependency parsing task as benchmarks to evaluate the correctness of the updated word vectors.
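The tree comparison mentioned in the abstract can be illustrated with a small sketch. It assumes, as in common word2vec-style implementations, that the hierarchical softmax tree is a Huffman tree built from word counts; the abstract itself does not say how the tree is constructed. The function names `huffman_codes` and `changed_words` are hypothetical, and the code only identifies words whose root-to-leaf path differs between the old tree and the combined-corpus tree. It is not the paper's update rule.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(freqs):
    """Return {word: binary code} for a Huffman tree built from word counts.

    Assumption: the hierarchical softmax tree is a Huffman tree, so a word's
    code is its root-to-leaf path of left (0) and right (1) branches.
    """
    tiebreak = count()  # unique index so equal counts never compare the dicts
    heap = [(n, next(tiebreak), {w: ""}) for w, n in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        n0, _, left = heapq.heappop(heap)   # least frequent subtree
        n1, _, right = heapq.heappop(heap)  # second least frequent subtree
        merged = {w: "0" + c for w, c in left.items()}
        merged.update({w: "1" + c for w, c in right.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]


def changed_words(old_corpus, update_corpus):
    """Words whose tree path differs between the old and the combined corpus."""
    old_counts = Counter(old_corpus)
    old_codes = huffman_codes(old_counts)
    new_codes = huffman_codes(old_counts + Counter(update_corpus))
    # Words unseen in the old corpus have no old code and count as changed.
    return {w for w, c in new_codes.items() if old_codes.get(w) != c}


old = "a a a a b b c".split()
print(changed_words(old, ["a"]))      # update leaves the tree shape intact
print(changed_words(old, ["c"] * 6))  # update reorders the frequency ranking
```

In this toy setting, a small update that preserves the frequency ranking leaves every path unchanged, while an update that reorders the ranking changes the paths of several words. This mirrors the abstract's statement that the error bound depends on the number of changed words related to a parameter node.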

Original language: English
Pages: 3267-3273
Number of pages: 7
State: Published - 2017
Event: 31st AAAI Conference on Artificial Intelligence, AAAI 2017 - San Francisco, United States
Duration: 4 Feb 2017 to 10 Feb 2017

Conference

Conference: 31st AAAI Conference on Artificial Intelligence, AAAI 2017
Country/Territory: United States
City: San Francisco
Period: 4/02/17 to 10/02/17
