Tongue Model-Driven Method Based on Fully Connected Neural Network

  • Shaochuan Zhang
  • Fengji Li
  • Li Wang
  • Jie Zhou
  • Haijun Niu*
  • *Corresponding author for this work
  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Ultrasound imaging can capture the tongue's contour and has therefore received significant attention in speech visualization. There is increasing interest in using tongue motion data from ultrasound imaging to drive 3D tongue models. However, traditional driving methods do not use all of the contour information of the tongue; they typically drive the tongue model with a limited number of contour points. These approaches often lead to pathological shapes that deviate from natural speech articulation. To address this issue, we propose a method that drives the tongue model using the entire tongue contour captured from ultrasound images. First, the complete tongue contour is extracted from the ultrasound images. Next, a mapping model is developed to establish the relationship between the ultrasound tongue contour and the model control parameters. Finally, the accuracy of the driven tongue model is evaluated in two ways: root mean squared error is used to evaluate the reconstructed model control parameters, and a curve similarity index is used to assess the resemblance between the ultrasound tongue contour and the midsagittal shape of the model. The results show that the reconstruction error of the control parameters is within 3%, and that the average contour curve similarity between the tongue model of each phoneme and the ultrasound tongue contour is approximately 95%. These findings indicate that driving tongue models with the entire tongue contour is feasible: the method generates 3D tongue models that match the 2D ultrasound images and avoids pathological shapes in the driven tongue model.
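
The mapping and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the network architecture, the contour representation, or how the error is normalized, so the layer sizes, the ReLU activation, the (x, y) point encoding of the contour, the random weights, and the all-zero reference parameters below are all assumptions chosen only to show the data flow from a full contour to a set of control parameters, followed by a root mean squared error check.

```python
import math
import random


def rmse(predicted, target):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(predicted))


def dense(x, weights, bias, relu):
    """One fully connected layer: y = W x + b, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out


def contour_to_params(contour_xy, layers):
    """Map a whole contour (list of (x, y) points) to control parameters."""
    # Flatten every contour point into one input vector (assumed encoding).
    x = [coord for point in contour_xy for coord in point]
    for i, (weights, bias) in enumerate(layers):
        # ReLU on hidden layers, linear output layer (assumed design).
        x = dense(x, weights, bias, relu=i < len(layers) - 1)
    return x


def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


# Hypothetical sizes: 40 contour points, 16 hidden units, 6 control parameters.
N_POINTS, N_HIDDEN, N_PARAMS = 40, 16, 6
rng = random.Random(0)
layers = [random_layer(2 * N_POINTS, N_HIDDEN, rng),
          random_layer(N_HIDDEN, N_PARAMS, rng)]

# Synthetic arch-shaped contour standing in for an extracted ultrasound contour.
contour = [(i / (N_POINTS - 1), math.sin(math.pi * i / (N_POINTS - 1)))
           for i in range(N_POINTS)]

predicted = contour_to_params(contour, layers)
reference = [0.0] * N_PARAMS  # placeholder ground-truth control parameters
print(len(predicted), rmse(predicted, reference))
```

In practice the weights would be trained on pairs of contours and known control parameters, and the error would be computed against held-out reference parameters rather than the placeholder vector used here.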

Original language: English
Title of host publication: 2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024
Editors: Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 121-125
Number of pages: 5
ISBN (Electronic): 9798331516826
State: Published - 2024
Event: 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024 - Beijing, China
Duration: 7 Nov 2024 to 10 Nov 2024

Publication series

Name: 2024 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024

Conference

Conference: 14th International Symposium on Chinese Spoken Language Processing, ISCSLP 2024
Country/Territory: China
City: Beijing
Period: 7/11/24 to 10/11/24

Keywords

  • speech visualization
  • tongue contour extraction
  • tongue model
  • tongue model control
  • ultrasound imaging
