Understanding text corpora with multiple facets

  • Lei Shi*
  • Furu Wei
  • Shixia Liu
  • Li Tan
  • Xiaoxiao Lian
  • Michelle X. Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Text visualization is becoming an increasingly important research topic as the need to understand massive-scale textual information proves imperative for many people and businesses. However, it is still very challenging to design effective visual metaphors to represent large corpora of text because text is unstructured and high-dimensional. In this paper, we propose a data model that can be used to represent most text corpora. This data model contains four basic types of facets: time, category, content (unstructured), and structured facets. To understand a corpus described by this data model, we develop a hybrid visualization that combines the trend graph with tag clouds. We encode the four types of data facets with four separate visual dimensions. To help people discover evolutionary and correlation patterns, we also develop several visual interaction methods that allow people to interactively analyze text by one or more facets. Finally, we present two case studies to demonstrate the effectiveness of our solution in supporting multi-faceted visual analysis of text corpora.
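
The four facet types named in the abstract can be pictured as fields on a document record. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the `Document` class, the `trend_counts` and `top_terms` helpers, the yearly binning, and the whitespace tokenization are all hypothetical choices made for illustration. It shows how per-category counts over time could feed a trend graph and how frequent words in one slice could feed a tag cloud.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Document:
    """One corpus record with the four facet types (field names are assumptions)."""
    time: date                                      # time facet
    category: str                                   # category facet
    content: str                                    # unstructured content facet
    structured: dict = field(default_factory=dict)  # structured facet as key/value pairs


def trend_counts(docs):
    """Count documents per (year, category); these counts could drive a trend graph."""
    counts = Counter()
    for doc in docs:
        counts[(doc.time.year, doc.category)] += 1
    return counts


def top_terms(docs, year, category, k=3):
    """Return the k most frequent words in one (year, category) slice,
    as candidate terms for a tag cloud. Tokenization is a plain whitespace split."""
    words = Counter()
    for doc in docs:
        if doc.time.year == year and doc.category == category:
            words.update(doc.content.lower().split())
    return [word for word, _ in words.most_common(k)]
```

Calling `trend_counts` on a list of `Document` objects yields one count per year and category, and `top_terms` then selects the words to display for any single slice of that grid.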

Original language: English
Title of host publication: VAST 10 - IEEE Conference on Visual Analytics Science and Technology 2010, Proceedings
Pages: 99-105
Number of pages: 7
State: Published - 2010
Externally published: Yes
Event: 1st IEEE Conference on Visual Analytics Science and Technology, VAST 10 - Salt Lake City, UT, United States
Duration: 24 Oct 2010 → 29 Oct 2010

Publication series

Name: VAST 10 - IEEE Conference on Visual Analytics Science and Technology 2010, Proceedings

Conference

Conference: 1st IEEE Conference on Visual Analytics Science and Technology, VAST 10
Country/Territory: United States
City: Salt Lake City, UT
Period: 24/10/10 → 29/10/10

Keywords

  • Multi-facet data visualization
  • Text visualization
