
T4Di: A Hybrid TTT-Transformer Backbone for Scalable and Efficient Diffusion Model

  • Xirui Wu
  • Haixia Pan
  • Ruijun Liu*
  • Biao Dong
  • Ying Zheng
  • Huolong Ye

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Diffusion models have achieved significant progress in image generation, with backbone architectures evolving from U-Net to Transformers. However, the quadratic complexity of Transformer-based diffusion models limits their scalability and efficiency, and this limitation becomes more prominent with increasing resolution. Linear complexity models such as Mamba partially address this issue but struggle with spatial continuity when applied to two-dimensional image data. To tackle these challenges, we propose T4Di, a hybrid backbone architecture combining the efficiency of Test-Time Training (TTT) with the global modeling capability of Transformers. By introducing multidirectional scanning and lightweight local feature enhancement modules, T4Di adapts TTT to 2D image signals, improving spatial continuity and local coherence. Moreover, we explore adaptive block composition, adjusting the ratio between Transformer and TTT components to achieve a favorable balance between generation quality and computational cost. We evaluate T4Di on both unconditional and class-conditional image generation tasks across CIFAR-10, CelebA, and ImageNet benchmarks. Experimental results demonstrate that T4Di consistently outperforms existing diffusion models in terms of both generation quality and computational efficiency, establishing it as a scalable and effective solution for image synthesis.
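
The abstract does not say which scan directions T4Di uses or how the per-direction outputs are merged. As a rough illustration of what multidirectional scanning over a 2D token grid generally involves, the sketch below (not taken from the paper) enumerates four assumed orderings of a flattened grid and shows how a sequence can be mapped back to its original layout. The function names `scan_orders`, `apply_scan`, and `undo_scan` are hypothetical, and the choice of four directions is an assumption.

```python
def scan_orders(height, width):
    """Return four orderings of the flat token indices of a height x width grid.

    Assumed directions (not specified in the abstract): row-major,
    column-major, and the reverse of each.
    """
    row_major = [r * width + c for r in range(height) for c in range(width)]
    col_major = [r * width + c for c in range(width) for r in range(height)]
    return {
        "row": row_major,
        "row_reversed": row_major[::-1],
        "col": col_major,
        "col_reversed": col_major[::-1],
    }


def apply_scan(tokens, order):
    """Reorder a row-major token list into the given scan order."""
    return [tokens[i] for i in order]


def undo_scan(sequence, order):
    """Map a scanned sequence back to row-major layout."""
    restored = [None] * len(order)
    for position, index in enumerate(order):
        restored[index] = sequence[position]
    return restored


# A 2 x 3 grid of placeholder tokens, stored row-major.
tokens = ["a", "b", "c", "d", "e", "f"]
orders = scan_orders(2, 3)
column_sequence = apply_scan(tokens, orders["col"])
round_trip = undo_scan(column_sequence, orders["col"])
```

In a design of this kind, a 1D sequence layer would process each ordering separately, so that tokens adjacent vertically in the image are also adjacent in at least one sequence; how the results are then combined is a separate design choice that this sketch leaves open.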

Original language: English
Title of host publication: Advanced Intelligent Computing Technology and Applications - 21st International Conference, ICIC 2025, Proceedings
Editors: De-Shuang Huang, Qinhu Zhang, Chuanlei Zhang, Wei Chen
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 162-173
Number of pages: 12
ISBN (Print): 9789819698110
DOI
Publication status: Published - 2025
Event: 21st International Conference on Intelligent Computing, ICIC 2025 - Ningbo, China
Duration: 26 Jul 2025 to 29 Jul 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15859 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Intelligent Computing, ICIC 2025
Country/Territory: China
City: Ningbo
Period: 26/07/25 to 29/07/25
