TY - JOUR
T1 - HodgeRankWeight
T2 - An Integration Algorithm for Feature Ranking Based on Weight Quantization
AU - Meng, Chaolu
AU - Shi, Yunyun
AU - Zou, Quan
AU - Liu, Ruijun
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - The identification of protein sequences depends on the effective selection of an optimized set of features. Traditional algorithms prioritize global feature importance, often overshadowing the significance of local metrics. Addressing this imbalance, we introduce an innovative algorithm that fuses feature ranking with an advanced weight quantization technique. This algorithm unfolds in two pivotal stages: initially, it generates a weighted directed graph based on normal distribution metrics; subsequently, it employs the HodgeRank algorithm to amalgamate these rankings. Specifically, the algorithm evaluates feature score normality by employing z-scores for skewness and kurtosis, resulting in a graph that quantitatively reflects both local and global feature contributions. HodgeRank then translates this graph into a Laplacian matrix, enabling the calculation of a comprehensive scoring function for each feature. We refine the initial rankings by incorporating weights during the integration phase, capturing a holistic view of feature significance. The proposed method, termed HodgeRankWeight, showcases superior performance, achieving accuracy rates of 87.02%, 92.84%, and 74.51% across different datasets. In head-to-head comparisons, HodgeRankWeight outstripped existing models, achieving an overall accuracy of 82.6923% and setting a new benchmark for precision in protein sequence identification. We also offer a complimentary web server for related research.
KW - HodgeRank
KW - Protein sequence recognition
KW - feature processing
KW - ranking integration
UR - https://www.scopus.com/pages/publications/105017134588
U2 - 10.1109/TCBBIO.2024.3524677
DO - 10.1109/TCBBIO.2024.3524677
M3 - Article
AN - SCOPUS:105017134588
SN - 1545-5963
VL - 22
SP - 528
EP - 536
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 2
ER -