
Summarizing source code through heterogeneous feature fusion and extraction

  • Juncai Guo
  • Jin Liu*
  • Xiao Liu
  • Li Li
  • *Corresponding author for this work
  • Wuhan University
  • Deakin University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Code summarization, which seeks to automatically produce a succinct natural-language description of the functionality of source code, plays an essential role in software maintenance. Many approaches have been proposed that first encode the source code based on its Abstract Syntax Tree (AST) and then decode it into a textual summary. However, most existing works interpret the AST-based syntax structure as a homogeneous graph, without distinguishing the different relations between graph nodes (e.g., the parent–child and sibling relations) in a heterogeneous way. To mitigate this issue, this paper proposes HETCOS, which extracts the syntactic and sequential features of source code by exploring its inherent heterogeneity for code summarization. Specifically, we first build a Heterogeneous Code Graph (HCG) that fuses the syntax structure and code sequence, with eight types of edges/relations designed between graph nodes. Moreover, we present a heterogeneous graph neural network for capturing the diverse relations in the HCG. The encoded HCG is then fed into a Transformer decoder, followed by a multi-head attention-based copying mechanism to support high-quality summary generation. Extensive experiments on the major Java and Python datasets demonstrate the superiority of our approach over sixteen state-of-the-art baselines. To promote reproducibility studies, we make the implementation of HETCOS publicly accessible at https://github.com/GJCEXP/HETCOS.
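To make the idea of a graph with typed relations concrete, the following is a minimal sketch, not the HETCOS implementation. It parses Python source with the standard `ast` module and records only the two relation kinds the abstract names as examples (parent–child and sibling). The function name `build_typed_code_graph`, the four relation labels, and the choice to split each relation into two directions are all assumptions made for illustration. The paper's full set of eight edge types and its code-sequence edges are not listed in this record, so they are not reproduced here.

```python
import ast
from collections import defaultdict


def build_typed_code_graph(source: str):
    """Build a small graph over AST nodes with typed edges.

    Returns (nodes, edges): nodes[i] is the AST node type name of node i,
    and edges maps a relation label to a list of (src, dst) index pairs.
    The relation labels are illustrative assumptions, not the paper's.
    """
    tree = ast.parse(source)
    nodes = []                 # node index -> AST node type name
    edges = defaultdict(list)  # relation label -> list of (src, dst)

    def visit(node) -> int:
        # Pre-order numbering: a node gets its index before its children.
        idx = len(nodes)
        nodes.append(type(node).__name__)
        child_ids = [visit(child) for child in ast.iter_child_nodes(node)]
        # Parent-child relation, stored once per direction.
        for c in child_ids:
            edges["parent_to_child"].append((idx, c))
            edges["child_to_parent"].append((c, idx))
        # Sibling relation between adjacent children, once per direction.
        for left, right in zip(child_ids, child_ids[1:]):
            edges["next_sibling"].append((left, right))
            edges["prev_sibling"].append((right, left))
        return idx

    visit(tree)
    return nodes, dict(edges)


if __name__ == "__main__":
    nodes, edges = build_typed_code_graph("def add(a, b):\n    return a + b\n")
    for relation, pairs in edges.items():
        print(relation, len(pairs))
```

Keeping one edge list per relation label is what lets a heterogeneous graph neural network apply different parameters to each relation, whereas a homogeneous graph would merge all four lists into one.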

Original language: English
Article number: 102058
Journal: Information Fusion
Volume: 103
DOI
Publication status: Published - Mar 2024
