Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections

Contributors

Mennatallah El-Assady, Rebecca Kehlbeck, Christopher Collins, Daniel Keim, and Oliver Deussen

Abstract

We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users’ decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. Two user studies confirm that our visual knowledge externalization and learning process improves topic model quality.
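
As a rough illustration of how an interaction in the concept space could change semantic relations in a vector space model, the sketch below pulls one word vector toward the centroid of a concept and compares cosine similarity before and after. This is not the authors' implementation: the toy embeddings, the linear-interpolation update, and the names `cosine`, `centroid`, and `pull_toward` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def pull_toward(vector, target, strength):
    """Linear interpolation (hypothetical update rule).

    strength=0 leaves the vector unchanged; strength=1 moves it onto target.
    """
    return [(1 - strength) * a + strength * b for a, b in zip(vector, target)]


# Toy 3-dimensional embeddings with made-up values (assumption).
embeddings = {
    "goal": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "pitch": [0.1, 0.9, 0.2],  # ambiguous word the user wants in "sports"
}

# A concept is represented here simply as the centroid of its member words.
sports = centroid([embeddings["goal"], embeddings["match"]])

before = cosine(embeddings["pitch"], sports)
# Simulated user interaction: drag "pitch" halfway toward the concept.
embeddings["pitch"] = pull_toward(embeddings["pitch"], sports, strength=0.5)
after = cosine(embeddings["pitch"], sports)

print(f"similarity before: {before:.3f}, after: {after:.3f}")
```

Any topic model that reads word similarities from the adjusted vectors would then see "pitch" as closer to the sports concept, which is the kind of indirect effect on topic modeling the abstract describes.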

Publications

  • M. El-Assady, R. Kehlbeck, C. Collins, D. Keim, and O. Deussen, “Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections,” IEEE Transactions on Visualization and Computer Graphics (Proc. IEEE VAST), 2019.
    @Article{ela2019a,
        author =    {Mennatallah El-Assady and Rebecca Kehlbeck and Christopher Collins and Daniel Keim and Oliver Deussen},
        title =    {Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections},
        journal =   {IEEE Transactions on Visualization and Computer Graphics (Proc. IEEE VAST)},
        year = 2019
    }

| © Copyright vialab | Dr. Christopher Collins, Canada Research Chair in Linguistic Information Visualization |