Abstract
High-dimensional sparse clustering of compositional data is of great practical importance, as exemplified by the analysis of high-throughput gene expression profiles. In this paper, we develop a compositional clustering framework based on convex clustering, a convex relaxation of hierarchical clustering that incorporates a fused penalty term on the cluster prototypes. To deal explicitly with high dimensionality and sparsity, we propose Compositional Convex Clustering with Sparse Group Lasso (CCC-SGL). The isometric logratio (ilr) transformation is first applied to map compositions from the simplex to standard Euclidean space. Then, a group lasso penalty and a lasso penalty are imposed on the cluster centers, which select informative features and promote within-feature sparsity. The proposed convex clustering formulation is solved efficiently using proximal gradient descent within the Alternating Direction Method of Multipliers (ADMM) framework. Simulation studies are carried out to evaluate the performance of the proposed methodology, and a real microbiome sequencing data set is analyzed.
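As a rough illustration of two building blocks named in the abstract, the sketch below shows an ilr transform and the proximal operator of a combined lasso and group lasso penalty. It is not the authors' implementation: the function names, the choice of a Helmert-type orthonormal basis for the ilr, the requirement of strictly positive parts, and the treatment of the input vector as a single group are all assumptions made here for illustration.

```python
import numpy as np

def helmert_basis(d):
    """(d-1) x d matrix with orthonormal rows, each orthogonal to the ones vector.
    One common basis choice for ilr; the paper's basis is not stated here."""
    V = np.zeros((d - 1, d))
    for i in range(1, d):
        V[i - 1, :i] = 1.0 / i
        V[i - 1, i] = -1.0
        V[i - 1] *= np.sqrt(i / (i + 1.0))  # normalize the row to unit length
    return V

def ilr(X):
    """ilr-transform the rows of X (n x d, strictly positive parts) to n x (d-1).
    Zeros must be replaced beforehand (assumption: handled by the caller)."""
    X = np.asarray(X, dtype=float)
    log_x = np.log(X)
    clr = log_x - log_x.mean(axis=1, keepdims=True)  # centered logratio
    return clr @ helmert_basis(X.shape[1]).T

def prox_sparse_group_lasso(v, step, lam_lasso, lam_group):
    """Prox of step * (lam_lasso * ||v||_1 + lam_group * ||v||_2) for one group v:
    elementwise soft-thresholding followed by group-wise shrinkage."""
    v = np.asarray(v, dtype=float)
    soft = np.sign(v) * np.maximum(np.abs(v) - step * lam_lasso, 0.0)
    norm = np.linalg.norm(soft)
    if norm == 0.0:
        return soft
    return max(0.0, 1.0 - step * lam_group / norm) * soft

# Hypothetical usage: three compositions with four parts each.
X = np.array([[0.1, 0.2, 0.3, 0.4],
              [0.25, 0.25, 0.25, 0.25],
              [0.7, 0.1, 0.1, 0.1]])
Z = ilr(X)                                   # shape (3, 3), Euclidean coordinates
shrunk = prox_sparse_group_lasso(Z[:, 0], step=1.0, lam_lasso=0.1, lam_group=0.1)
```

In a proximal gradient or ADMM loop, an operator of this kind would be applied to each group of cluster-center coordinates after a gradient step; how groups are defined and how this step is coupled with the fused penalty are details of the paper that this sketch does not attempt to reproduce.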
| Original language | English |
|---|---|
| Pages (from-to) | 23-36 |
| Number of pages | 14 |
| Journal | Neurocomputing |
| Volume | 425 |
| State | Published - 15 Feb 2021 |
Keywords
- ADMM
- Compositional data
- Convex clustering
- Sparse-group-lasso
Title
Convex clustering method for compositional data via sparse group lasso