Absolute Story: Visual Storytelling with Consistent Subject and Style

  • Lipeng Wang
  • Hongxing Fan
  • Zehuan Huang
  • Lu Sheng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Visual generative models have recently made significant progress in single-image synthesis, but serialized storyline generation still faces two challenges: fine-grained consistency across multiple subjects and style coordination across frames. To address these, we propose Absolute Story, a framework that enhances subject consistency and style coherence through fine-grained feature alignment and context-aware generation. The framework consists of three core components: (1) Related Subject Selection, which uses a vision-language model to match textual descriptions to reference images and construct subject-focused masks; (2) Storyline ReferenceNet, which integrates a Plot Fusion Module to encode fine-grained visual features from the reference storyline, ensuring spatiotemporal consistency of subject and style; and (3) a Story Consistency Attention Block, which achieves context-consistent generation by leveraging fine-grained and subject-focused features. Experiments show that our framework outperforms existing advanced methods on key metrics (FID ↓ by up to 7.48%, CLIP-T ↑ by up to 3.04%). Visualization results indicate that our approach leads in subject consistency and style coherence.

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Proceedings
Editors: Josef Kittler, Hongkai Xiong, Weiyao Lin, Jian Yang, Xilin Chen, Jiwen Lu, Jingyi Yu, Weishi Zheng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 388-402
Number of pages: 15
ISBN (Print): 9789819555666
State: Published - 2026
Event: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 - Shanghai, China
Duration: 15 Oct 2025 to 18 Oct 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16276 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025
Country/Territory: China
City: Shanghai
Period: 15/10/25 to 18/10/25

Keywords

  • Image synthesis
  • Style coherent
  • Subject consistency
  • Visual storytelling
