Distribution-valued data graphical model estimation based on M-LDQ feature embedding

  • Qiying Wu
  • Huiwen Wang
  • Shan Lu*

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Understanding and modeling distribution-valued data, an important form of symbolic data, has garnered significant attention in statistics because of its effectiveness in handling large datasets. Conventional statistical inference methods are not directly applicable to distribution-valued data, which has prompted extensive research aimed at addressing this challenge. However, graphical models, which are powerful tools in applied statistics, have not yet been fully developed for distribution-valued data. To fill this gap, this study proposes a novel nonparametric graphical model estimation method for distribution-valued data. The proposed method first removes the inherent constraints of distributions, capturing both position information (as a scalar) and shape information (as a function). We then propose an aggregation method, based on a conditional independence test, to integrate the position and shape information for graphical model estimation. Numerical simulations show that our method outperforms competing methods. Furthermore, we apply our method to construct the network of stocks that constitute the SSE 50 Index, using daily distribution-valued data of five-minute returns. The empirical results reveal sector-specific relationships as well as cross-sector influences, highlighting how the interconnections between stocks from different sectors evolve over time.
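
As a rough illustration of the position/shape split described in the abstract, the sketch below turns one sample from a distribution into a scalar position and an unconstrained function. It is not the paper's M-LDQ embedding, whose definition is not given on this page. The function name `position_and_shape`, the use of the median as the position, the log quantile density as the shape, and the grid settings are all assumptions made for illustration.

```python
import numpy as np

def position_and_shape(sample, grid_size=51):
    """Split one empirical distribution into a scalar position and a shape function.

    Illustrative assumptions (not taken from the paper):
    position = sample median; shape = log quantile density on a grid in (0, 1).
    """
    # Interior probability grid; the extreme tails are skipped because
    # empirical quantiles are unstable there.
    t = np.linspace(0.05, 0.95, grid_size)
    q = np.quantile(sample, t)                  # empirical quantile function Q(t)
    position = float(np.quantile(sample, 0.5))  # scalar location summary
    dq = np.gradient(q, t)                      # quantile density Q'(t), finite differences
    # Q'(t) is positive, so its log is free of the positivity constraint.
    shape = np.log(np.maximum(dq, 1e-12))
    return position, t, shape

# Usage: shifting the data moves the position but leaves the shape unchanged.
rng = np.random.default_rng(0)
returns = rng.normal(size=2000)
pos_a, _, shape_a = position_and_shape(returns)
pos_b, _, shape_b = position_and_shape(returns + 3.0)
```

Under these assumptions, a pure location shift changes only the scalar component, which is the sense in which position and shape are separated here.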

Original language: English
Journal: Journal of Applied Statistics
State: Accepted/In press - 2025

Keywords

  • Distribution-valued data
  • conditional independence test
  • feature embedding
  • graphical models
  • kernel method
