
Gradient Inversion Attack in Federated Learning: Exposing Text Data through Discrete Optimization

  • Ying Gao*
  • Yuxin Xie
  • Huanghao Deng
  • Zukun Zhu
  • *Corresponding author for this work
  • Zhongguancun Laboratory
  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Federated learning has emerged as a potential way around the bottleneck posed by the near exhaustion of public text data for training large language models. It is sometimes claimed that exchanging gradients, rather than raw data, allows text that includes private information to be used. Although recent studies demonstrate that data can be reconstructed from gradients, the threat to text data has seemed relatively small because text is sensitive to even a few token errors. We propose a novel attack method, FET, which shows that it is possible to Fully Expose Text data from gradients. Unlike previous methods that optimize continuous embedding vectors, we directly search for a text sequence whose gradients match the known gradients. First, we infer the total number of tokens and the unique tokens in the target text data from the gradients of the embedding layer. Then we develop a discrete optimization algorithm that combines global and local search strategies: it explores the solution space globally and then precisely refines the solution it obtains. We also find that the gradients of the fully connected layer are dominant and provide sufficient guidance for the optimization process. Our experiments show a significant improvement in attack performance, with an average increase in exact match rate of 39% for TinyBERT6, 20% for BERTbase and 15% for BERTlarge across three datasets. These findings highlight serious privacy risks for text data and suggest that using smaller models is not an effective privacy-preserving strategy.
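
The first step described in the abstract, reading the token count and the set of unique tokens off the embedding-layer gradients, can be illustrated with a small sketch. This is not the authors' FET implementation, whose exact procedure the abstract does not give. It only shows the general property such a step can rely on: in an embedding lookup, a row of the gradient is nonzero only if that token (or position) was actually used. The toy model, the function names `toy_embedding_gradients` and `infer_tokens_and_length`, and the random upstream gradient are all assumptions made for illustration, and the sketch covers a single sequence only.

```python
import numpy as np

def toy_embedding_gradients(token_ids, vocab_size, max_len, dim, seed=0):
    """Gradients of a toy embedding layer h_t = E_tok[token_t] + E_pos[t].

    Assumption: the gradient flowing back from the rest of the network
    (dL/dh_t) is replaced by random nonzero vectors, since only the
    sparsity pattern matters for this illustration.
    """
    rng = np.random.default_rng(seed)
    token_ids = np.asarray(token_ids)
    n = len(token_ids)
    upstream = rng.normal(size=(n, dim))       # stand-in for dL/dh_t

    grad_tok = np.zeros((vocab_size, dim))
    grad_pos = np.zeros((max_len, dim))
    # A token row accumulates the upstream gradient of every position
    # where that token occurs; unused rows stay exactly zero.
    np.add.at(grad_tok, token_ids, upstream)
    # Position t is used once per sequence, so rows 0..n-1 are nonzero.
    grad_pos[:n] = upstream
    return grad_tok, grad_pos

def infer_tokens_and_length(grad_tok, grad_pos, tol=1e-12):
    """Recover the unique token ids and the sequence length from gradients."""
    token_rows = np.linalg.norm(grad_tok, axis=1) > tol
    pos_rows = np.linalg.norm(grad_pos, axis=1) > tol
    return np.flatnonzero(token_rows).tolist(), int(pos_rows.sum())

if __name__ == "__main__":
    secret = [5, 2, 9, 2, 7]                   # hypothetical private token ids
    g_tok, g_pos = toy_embedding_gradients(secret, vocab_size=12, max_len=8, dim=4)
    print(infer_tokens_and_length(g_tok, g_pos))   # ([2, 5, 7, 9], 5)
```

The recovered set gives which tokens occur but not their order or multiplicity, which is why a search over sequences (the discrete optimization step in the abstract) is still needed afterwards.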

Original language: English
Title of host publication: Main Conference
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics (ACL)
Pages: 2582-2591
Number of pages: 10
ISBN (Electronic): 9798891761964
Publication status: Published - 2025
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 to 24 Jan 2025

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
ISSN (Print): 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 to 24/01/25
