Learn, Generate, Rank, Explain: A Case Study of Visual Explanation by Generative Machine Learning


Chris Kim, Xiao Lin, Christopher Collins, Graham W. Taylor, and Mohamed R. Amer.


While the computer vision problem of searching for activities in videos is usually addressed by using discriminative models, their decisions tend to be opaque and difficult for people to understand. We propose a case study of a novel machine learning approach for generative searching and ranking of motion capture activities with visual explanation. Instead of directly ranking videos in the database given a text query, our approach uses a variant of Generative Adversarial Networks (GANs) to generate exemplars based on the query and uses them to search for the activity of interest in a large database. Our model is able to achieve comparable results to its discriminative counterpart, while being able to dynamically generate visual explanations. In addition to our searching and ranking method, we present an explanation interface that enables the user to successfully explore the model’s explanations and its confidence by revealing query-based, model-generated motion capture clips that contributed to the model’s decision. Finally, we conducted a user study with 44 participants to show that by using our model and interface, participants benefit from a deeper understanding of the model’s conceptualization of the search query. We discovered that the XAI system yielded a comparable level of efficiency, accuracy, and user-machine synchronization as its black-box counterpart, if the user exhibited a high level of trust for AI explanation.
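The generate-then-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the trained GAN is replaced by a hypothetical `toy_generator` that returns hand-written feature vectors, clips are assumed to be small feature vectors, and cosine similarity is an assumed scoring function. The point is only to show how generated exemplars can both rank database clips and identify which exemplar explains each ranking decision.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_exemplars(
    exemplars: List[Vector],
    database: Dict[str, Vector],
    top_k: int = 3,
) -> List[Tuple[str, float, int]]:
    """Rank clips by their best similarity to any generated exemplar.

    Returns (clip_id, score, exemplar_index) tuples; the exemplar index
    identifies which generated example best "explains" the match.
    """
    results = []
    for clip_id, features in database.items():
        scores = [cosine(features, exemplar) for exemplar in exemplars]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((clip_id, scores[best], best))
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:top_k]


def toy_generator(query: str) -> List[List[float]]:
    """Hypothetical stand-in for a text-conditioned generative model."""
    prototypes = {"walk": [1.0, 0.0, 0.0], "jump": [0.0, 1.0, 0.0]}
    base = prototypes[query]
    # Two exemplars per query: the prototype and a slightly shifted variant.
    return [base, [value + 0.1 for value in base]]


# Assumed toy database of clip features (illustrative values only).
database = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.0, 1.0, 0.1],
    "clip_c": [0.1, 0.0, 1.0],
}

ranking = rank_by_exemplars(toy_generator("walk"), database, top_k=2)
for clip_id, score, exemplar_index in ranking:
    print(f"{clip_id}: score={score:.3f}, explained by exemplar {exemplar_index}")
```

In this sketch the exemplar index returned with each result plays the role of the visual explanation: an interface could show the user the generated clip that contributed most to a given ranking decision.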


  • C. Kim, X. Lin, C. Collins, G. W. Taylor, and M. R. Amer, “Learn, Generate, Rank, Explain: A Case Study of Visual Explanation by Generative Machine Learning,” ACM Trans. Interact. Intell. Syst., vol. 11, iss. 3–4, 2021.

    author = {Kim, Chris and Lin, Xiao and Collins, Christopher and Taylor, Graham W. and Amer, Mohamed R.},
    title = {Learn, Generate, Rank, Explain: A Case Study of Visual Explanation by Generative Machine Learning},
    year = {2021},
    issue_date = {December 2021},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {11},
    number = {3–4},
    issn = {2160-6455},
    url = {https://doi.org/10.1145/3465407},
    doi = {10.1145/3465407},
    journal = {ACM Trans. Interact. Intell. Syst.},
    month = aug,
    articleno = {23},
    numpages = {34},
    keywords = {model-generated explanation, trust and reliance, Explainable artificial intelligence, user study}


Our featured Medium story for this research paper can be found here.


Presentation from IUI 2022


Case Study Participant Interface

The web-based study interface is available for public access at http://gr.ckprototype.com/.

© Copyright vialab | Dr. Christopher Collins, Canada Research Chair in Linguistic Information Visualization