TY - GEN
T1 - InteractGAN
T2 - 28th ACM International Conference on Multimedia, MM 2020
AU - Gao, Chen
AU - Liu, Si
AU - Zhu, Defa
AU - Liu, Quan
AU - Cao, Jie
AU - He, Haoqian
AU - He, Ran
AU - Yan, Shuicheng
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/10/12
Y1 - 2020/10/12
N2 - Compared with the widely studied Human-Object Interaction DETection (HOI-DET), no effort has been devoted to its inverse problem, i.e., generating an HOI scene image according to a given relationship triplet, to the best of our knowledge. We term this new task "Human-Object Interaction Image Generation" (HOI-IG). HOI-IG is a research-worthy task with great application prospects, such as online shopping, film production and interactive entertainment. In this work, we introduce InteractGAN to solve this challenging task. Our method is composed of two stages: (1) manipulating the posture of a given human image conditioned on a predicate, and (2) merging the transformed human image and the object image into one realistic scene image while satisfying their expected relative position and size ratio. Besides, to address the large spatial misalignment issue caused by fusing the content of two images into a reasonable spatial layout, we propose a Relation-based Spatial Transformer Network (RSTN) to adaptively process the images conditioned on their interaction. Extensive experiments on two challenging datasets demonstrate the effectiveness and superiority of our approach. We advocate for the image generation community to draw more attention to the new Human-Object Interaction Image Generation problem. To facilitate future research, our project will be released at: http://colalab.org/projects/InteractGAN.
AB - Compared with the widely studied Human-Object Interaction DETection (HOI-DET), no effort has been devoted to its inverse problem, i.e., generating an HOI scene image according to a given relationship triplet, to the best of our knowledge. We term this new task "Human-Object Interaction Image Generation" (HOI-IG). HOI-IG is a research-worthy task with great application prospects, such as online shopping, film production and interactive entertainment. In this work, we introduce InteractGAN to solve this challenging task. Our method is composed of two stages: (1) manipulating the posture of a given human image conditioned on a predicate, and (2) merging the transformed human image and the object image into one realistic scene image while satisfying their expected relative position and size ratio. Besides, to address the large spatial misalignment issue caused by fusing the content of two images into a reasonable spatial layout, we propose a Relation-based Spatial Transformer Network (RSTN) to adaptively process the images conditioned on their interaction. Extensive experiments on two challenging datasets demonstrate the effectiveness and superiority of our approach. We advocate for the image generation community to draw more attention to the new Human-Object Interaction Image Generation problem. To facilitate future research, our project will be released at: http://colalab.org/projects/InteractGAN.
KW - HOI-IG
KW - InteractGAN
KW - relation-based image merging
KW - relation-based transformation
UR - https://www.scopus.com/pages/publications/85102812039
U2 - 10.1145/3394171.3413854
DO - 10.1145/3394171.3413854
M3 - Conference contribution
AN - SCOPUS:85102812039
T3 - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
SP - 165
EP - 173
BT - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
Y2 - 12 October 2020 through 16 October 2020
ER -