Optimizing Hierarchical Visualizations with the Minimum Description Length Principle

Contributors

Rafael Veras, Christopher Collins

Abstract

In this paper, we propose a new approach for adjusting the level of abstraction of hierarchical visualizations as a function of display size and dataset. Using the Minimum Description Length (MDL) principle, we efficiently select tree cuts that strike a good balance between clutter and information. We present MDL formulae for selecting tree cuts tailored to treemap and sunburst diagrams, and discuss how the approach can be extended to other types of multilevel visualizations. In addition, we demonstrate how such tree cuts can enhance drill-down interaction in hierarchical visualizations by quickly exposing important outliers. The paper features applications of the proposed technique to treemaps of the Directory Mozilla (DMOZ) dataset (over 500,000 nodes) and to the DocuBurst text visualization tool (over 100,000 nodes).

We validate the approach using the feature congestion measure of clutter on views of a subset of the current DMOZ web directory. The results show that MDL views achieve a near-constant clutter level across display resolutions. We also present the results of a crowdsourced user study in which participants were asked to find targets in views of DMOZ generated by our approach and by a set of baseline aggregation methods. The results suggest that, in some conditions, participants are able to locate targets (in particular, outliers) faster using the proposed approach.
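
The tree cut selection described in the abstract can be illustrated with a minimal sketch. This is not the treemap or sunburst formulae from the paper, and it ignores display size. It assumes a generic MDL tree cut formulation in the style of Li and Abe's tree cut model: each node in the cut costs half of log2(N) bits as a parameter, and observations under an aggregated node are assumed to be spread uniformly over its leaves. The names `Node`, `summarize`, `node_cost`, `best_cut`, and `select_tree_cut`, and the toy data, are hypothetical and do not come from treecut.js.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    count: int = 0  # observations attached to a leaf (ignored for inner nodes)
    children: List["Node"] = field(default_factory=list)


def summarize(node: Node) -> Tuple[int, int]:
    """Return (total count, number of leaves) in the subtree rooted at node."""
    if not node.children:
        return node.count, 1
    parts = [summarize(child) for child in node.children]
    return sum(p[0] for p in parts), sum(p[1] for p in parts)


def node_cost(node: Node, n_total: int) -> float:
    """Bits needed when the whole subtree is shown as one aggregate node."""
    freq, n_leaves = summarize(node)
    param_bits = 0.5 * math.log2(n_total)  # assumed cost of one cut member
    if freq == 0:
        return param_bits  # no observations to encode
    # Assumption: observations are uniform over the leaves under this node.
    prob = freq / (n_total * n_leaves)
    return param_bits - freq * math.log2(prob)


def best_cut(node: Node, n_total: int) -> Tuple[float, List[Node]]:
    """Return the cheaper of: the node itself, or its children's best cuts."""
    own = node_cost(node, n_total)
    if not node.children:
        return own, [node]
    expanded_cost, expanded_cut = 0.0, []
    for child in node.children:
        cost, cut = best_cut(child, n_total)
        expanded_cost += cost
        expanded_cut.extend(cut)
    if own <= expanded_cost:  # ties favour the more abstract view
        return own, [node]
    return expanded_cost, expanded_cut


def select_tree_cut(root: Node) -> List[str]:
    """Names of the nodes in the minimum-description-length cut."""
    n_total, _ = summarize(root)
    if n_total == 0:
        return [root.name]
    return [n.name for n in best_cut(root, n_total)[1]]


# Toy hierarchy with made-up counts.
animal = Node("ANIMAL", children=[
    Node("BIRD", children=[
        Node("swallow", 0), Node("crow", 2), Node("eagle", 2), Node("bird", 4),
    ]),
    Node("INSECT", children=[
        Node("bug", 2), Node("bee", 0), Node("insect", 0),
    ]),
])
print(select_tree_cut(animal))  # ['BIRD', 'INSECT']
```

Because the cost in this sketch is a sum over the members of the cut, comparing each node against the best cuts of its children in a single bottom-up pass is enough to find the cheapest cut overall. A leaf holding most of the observations among empty siblings makes the expanded cut cheaper, which is one way an outlier can end up exposed rather than aggregated.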

Source Code

https://github.com/rafaveguim/treecut.js

VIS 16 Slides

http://vialab.science.uoit.ca/treecut/slides/

Publications

  • R. Veras and C. Collins, “Optimizing Hierarchical Visualizations with the Minimum Description Length Principle,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, iss. 1, pp. 631-640, 2017.
    @Article{ver2017,
    Author = {Rafael Veras and Christopher Collins},
    Journal = {IEEE Transactions on Visualization and Computer Graphics},
    Title = {Optimizing Hierarchical Visualizations with the Minimum Description Length Principle},
    Year = {2017},
    Volume = {23},
    Number = {1},
    Pages = {631--640},
    Keywords = {Hierarchy data, data aggregation, multiscale visualization, tree cut, antichain},
    DOI = {10.1109/TVCG.2016.2598591},
    ISSN = {1077-2626},
    Month = jan,
    }
  • R. Veras and C. Collins, “Prioritizing Nodes in Hierarchical Visualizations with the Tree Cut Model,” Proc. of IEEE Conf. on Information Visualization (InfoVis), 2014.
    @poster{Ver2014b,
    author = {Rafael Veras and Christopher Collins},
    title = {Prioritizing Nodes in Hierarchical Visualizations with the Tree Cut Model},
    venue = {Proc. of IEEE Conf. on Information Visualization (InfoVis)},
    address = {Paris, France},
    series = {Poster},
    year = 2014
    }

© Copyright vialab | Dr. Christopher Collins, Canada Research Chair in Linguistic Information Visualization