TY - GEN
T1 - Two-tier Graph Contextual Embedding for Cross-device User Matching
AU - Huang, Hongren
AU - Guo, Shu
AU - Li, Chen
AU - Sheng, Jiawei
AU - Wang, Lihong
AU - Li, Jianxin
AU - Liu, Jing
AU - Zhong, Shenghai
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/10/30
Y1 - 2021/10/30
AB - The cross-device user matching task is to identify the behavior logs (i.e., behavior sequences) on multiple devices that belong to one real person. Because these sequences are anonymous and long-term, most previous methods of learning behavior embeddings cannot effectively capture two important features in the sequences, namely high-order connections and long-range dependencies. To this end, we propose a novel framework called Two-tier Graph Contextual Embedding (TGCE) to address both problems simultaneously. In the first tier, we construct behavior evolutionary graphs (BEGs) for behavior sequences and design an order-preserving neighbor aggregation network to collectively model transitions of behaviors with their neighbors. As repeated behaviors can be grouped into single nodes, our model jointly considers the neighboring environments around behaviors, and behavior embeddings can be enriched. In the second tier, we build scaled shortcut graphs (SSGs) by refining BEGs with random walk-based edge addition, and then apply a position-aware graph attention network to the SSGs to facilitate fast information propagation. As distant graph nodes can be directly connected by shortcut edges, we can further capture long-range dependencies. By stacking the two graph tiers, our approach obtains graph contextual embeddings for behaviors to further improve user matching. Experimental results on the benchmark dataset show that our model outperforms various baselines in the user matching task. Our code is released at https://github.com/13061051/TGCE_2021.
KW - behavior embedding
KW - graph neural network
KW - user matching
UR - https://www.scopus.com/pages/publications/85119192704
U2 - 10.1145/3459637.3482308
DO - 10.1145/3459637.3482308
M3 - Conference contribution
AN - SCOPUS:85119192704
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 730
EP - 739
BT - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Y2 - 1 November 2021 through 5 November 2021
ER -