Three-dimensional visualization of tongue articulation based on Mandarin phonemes

  • Shaochuan Zhang
  • Haijun Niu
  • Fengji Li
  • Li Wang
  • Yihuan Zhang
  • Ping He
  • Jie Zhou
  • Zhenchang Wang*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Objective: To explore a 3D visualization method for tongue articulation based on ultrasound images and to construct a corresponding visualization database. Methods: A high-fidelity statistical tongue model was constructed from MRI data and parameterized with six independent, physiologically interpretable control parameters to capture tongue-shape variation. To provide speech-specific data for model fitting, mid-sagittal ultrasound images were collected for each phoneme in the corpus, and tongue contours were manually annotated. A fully connected neural network was then trained to map the ultrasound-derived tongue contours to the model's control parameters. The estimated parameters were further refined through manual adjustment to obtain 3D tongue shapes that accurately matched the observed contours. Finally, model-fitting accuracy was quantitatively evaluated, and statistical analyses were conducted to examine tongue-shape differences among easily confusable phonemes. Results: For the majority of phonemes, the similarity between the 3D model's mid-sagittal contour and the ultrasound-derived tongue contour exceeded 90%, and the average root mean square error (RMSE) was reduced by approximately 28% compared with conventional tongue models, thereby enabling the detection of subtle articulatory distinctions among phonemes. Conclusion: The constructed 3D tongue articulation visualization database for Mandarin phonemes provides a valuable tool for speech rehabilitation in individuals with hearing impairment and for visualization-based instruction in second-language learning, demonstrating strong potential for dissemination and application.
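
The abstract reports fitting accuracy as an RMSE between the model's mid-sagittal contour and the ultrasound-derived contour, but it does not say how point correspondences were established. The following is a minimal sketch, assuming both contours have already been resampled to the same number of corresponding 2D points; the function names `contour_rmse` and `relative_reduction` are hypothetical and the toy coordinates are invented for illustration, not data from the study.

```python
import numpy as np


def contour_rmse(model_contour, ultrasound_contour):
    """RMSE over point-wise Euclidean distances between two contours.

    Assumption: both inputs are (N, 2) arrays whose i-th points correspond.
    """
    a = np.asarray(model_contour, dtype=float)
    b = np.asarray(ultrasound_contour, dtype=float)
    if a.shape != b.shape:
        raise ValueError("contours must have the same shape")
    squared_dists = np.sum((a - b) ** 2, axis=1)  # squared distance per point
    return float(np.sqrt(np.mean(squared_dists)))


def relative_reduction(baseline_rmse, new_rmse):
    """Fractional RMSE reduction relative to a baseline model."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Toy contours (made-up values): the second is the first shifted by (3, 4),
# so every point is exactly 5 units from its counterpart.
ultrasound = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
model = ultrasound + np.array([3.0, 4.0])
print(contour_rmse(model, ultrasound))  # 5.0
```

Under this definition, a reduction of roughly 28% would correspond to `relative_reduction(baseline, new)` being about 0.28 when averaged over phonemes; the averaging scheme used in the paper is not specified in the abstract.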

Translated title of the contribution: 基于汉语音素的三维可视化发音舌
Original language: English
Pages (from-to): 55-60
Number of pages: 6
Journal: Chinese Journal of Otorhinolaryngology Head and Neck Surgery
Volume: 61
Issue number: 1
DOIs
State: Published - 7 Jan 2026

Keywords

  • 3D visualization
  • Articulatory control parameters
  • Contour extraction
  • Statistical shape modeling
  • Tongue modeling
  • Ultrasound imaging
