AECodec: High Fidelity Neural Audio Codec Based On Speech And Electroglottograph

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

With the rapid development of communication devices and voice interaction, speech is playing an increasingly important role. Given the growing demand for voice transmission, efficiently transmitting high-quality speech signals within limited bandwidth has become a challenge. This paper introduces a high-fidelity speech codec that uses both speech signals and electroglottograph signals for speech compression. The codec compresses and decompresses speech, significantly reducing the bitrate while maintaining high fidelity. It employs a streaming encoder-decoder architecture and is trained end to end. In addition to the speech signal, the electroglottograph signal is used as an auxiliary input, leveraging its ability to capture vocal fold movement characteristics such as closure degree, closure speed, and cycle duration, thus enhancing the model's feature extraction capability. Moreover, the encoder-decoder structure integrates a Transformer Encoder module with residual connections, further improving the model's ability to process time series data. To validate the effectiveness of this approach, we conducted extensive objective evaluations and experimental studies across various bandwidths, which show that our approach outperforms the baseline methods.

Original language: English
Title of host publication: Proceedings - 2024 3rd International Conference on Artificial Intelligence, Human-Computer Interaction and Robotics, AIHCIR 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 111-115
Number of pages: 5
ISBN (Electronic): 9798331534035
DOI
Publication status: Published - 2024
Event: 3rd International Conference on Artificial Intelligence, Human-Computer Interaction and Robotics, AIHCIR 2024 - Hong Kong, China
Duration: 15 Nov 2024 to 17 Nov 2024

Publication series

Name: Proceedings - 2024 3rd International Conference on Artificial Intelligence, Human-Computer Interaction and Robotics, AIHCIR 2024

Conference

Conference: 3rd International Conference on Artificial Intelligence, Human-Computer Interaction and Robotics, AIHCIR 2024
Country/Territory: China
City: Hong Kong
Period: 15/11/24 to 17/11/24
