Parallel Tag Clouds to Explore Faceted Text Corpora

Contributors

Christopher Collins, Fernanda B. Viégas, Martin Wattenberg

Abstract

Do court cases differ from place to place? What kind of picture do we get by looking at a country’s collection of law cases? We introduce Parallel Tag Clouds: a new way to visualize differences amongst facets of very large metadata-rich text corpora. We have pointed Parallel Tag Clouds at a collection of over 600,000 US Circuit Court decisions spanning a period of 50 years and have discovered regional as well as linguistic differences between courts. The visualization technique combines graphical elements from parallel coordinates and traditional tag clouds to provide rich overviews of a document collection while acting as an entry point for exploration of individual texts. We augment basic parallel tag clouds with a details-in-context display and an option to visualize changes over a second facet of the data, such as time. We also address text mining challenges such as selecting the best words to visualize, and how to do so in reasonable time periods to maintain interactivity.

Publications

  • C. Collins, F. B. Viégas, and M. Wattenberg, “Parallel Tag Clouds to Explore and Analyze Faceted Text Corpora,” in Proc. of the IEEE Symp. on Visual Analytics Science and Technology (VAST), 2009.
    @InProceedings{COL2009b,
      key =     {COL2009b},
      author =   {Christopher Collins and Fernanda B. Vi\'egas and Martin Wattenberg},
      title =   {Parallel Tag Clouds to Explore and Analyze Faceted Text Corpora},
      booktitle =   {Proc. of the IEEE Symp. on Visual Analytics Science and Technology (VAST)},
      year =   2009,
      pages = {91--98},
      doi = {10.1109/VAST.2009.5333443}
    }

Media

Slides from the presentation at IEEE VAST 2009.

Acknowledgements

This research was conducted at the Visual Communications Lab at IBM Research.
