Self-Supervised Learning of Face Representations for Video Face Clustering (FG 2019)

Vivek Sharma, Makarand Tapaswi, M. Saquib Sarfraz, and Rainer Stiefelhagen

IEEE International Conference on Automatic Face and Gesture Recognition, FG 2019

Table of Contents

  1. Abstract
  2. Citation
  3. Requirements and Dependencies
  4. Installation
  5. Training and Testing SSiam

Abstract

Analyzing the story behind TV series and movies often requires understanding who the characters are and what they are doing. With improving deep face models, this may seem like a solved problem. However, as face detectors get better, clustering/identification needs to be revisited to address increasing diversity in facial appearance. In this paper, we address video face clustering using unsupervised methods. Our emphasis is on distilling the essential information, identity, from the representations obtained using deep pre-trained face networks. We propose a self-supervised Siamese network that can be trained without the need for video/track based supervision, and thus can also be applied to image collections. We evaluate our proposed method on three video face clustering datasets. The experiments show that our methods outperform current state-of-the-art methods on all datasets. Video face clustering still lacks a common benchmark, as current works are often evaluated with different metrics and/or different sets of face tracks. For more details and evaluation results, please check out our paper.

Figure: SSiam architecture.

Citation

If you find the code and datasets useful in your research, please cite:

@inproceedings{ssiam,
    author    = {Sharma, Vivek and Tapaswi, Makarand and Sarfraz, M. Saquib and Stiefelhagen, Rainer}, 
    title     = {Self-Supervised Learning of Face Representations for Video Face Clustering}, 
    booktitle = {IEEE International Conference on Automatic Face and Gesture Recognition},
    year      = {2019}
}

Requirements and Dependencies

  • MATLAB (tested with MATLAB R2018b on Ubuntu 16.04)
  • CUDA & cuDNN (tested with CUDA 8.0 and cuDNN 5.1); a quick sanity check is sketched below
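
To verify the GPU toolchain from within MATLAB before running anything, a quick check might look like the following (a sketch, assuming the Parallel Computing Toolbox is available; this is not part of the repository's code):

>> g = gpuDevice();   % query the default GPU device
>> fprintf('GPU: %s, CUDA toolkit %g, driver %g\n', g.Name, g.ToolkitVersion, g.DriverVersion)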

Installation

Please install MatConvNet in a path of your own choosing; you then need to change the corresponding path variable path2MatconNet in demo.m accordingly, as sketched below.
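
As a minimal sketch (the install location is an assumption; path2MatconNet is the variable referenced above), the relevant lines in demo.m might look like:

path2MatconNet = '/path/to/matconvnet';                   % point this at your MatConvNet install
run(fullfile(path2MatconNet, 'matlab', 'vl_setupnn.m'));  % standard MatConvNet setup script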

Training and Testing SSiam

To train and test SSiam from scratch, first download the training datasets:

$ mkdir -p experiments/input_data && cd experiments/input_data
$ wget https://cvhci.anthropomatik.kit.edu/~datasets-publisher/published_datasets/face_and_body/BBT_Buffy_ACCIO_VGG_face_reps/VGG2.tar.gz
$ tar -xzf VGG2.tar.gz 
$ cd ../..

This script will train and test the SSiam method on the BBT-0101 dataset:

>> demo()

Note that we train and test our code in single-GPU mode only. Running demo.m will reproduce the results in our paper for BBT-0101, for both track-level and frame-level representations.
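
If your machine has multiple GPUs, one way to pin the run to a single device is to select it in MATLAB before calling demo (a sketch using the standard gpuDevice call; the device index 1 is an assumption):

>> gpuDevice(1);   % select the first GPU (hypothetical choice)
>> demo()          % demo then runs on the selected device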