Week 4 #

Topic: The Self-Distillation Family in Self-Supervised Learning

Keynote Speakers: Lifan Lin, Yue Wu, Shengjie Niu

Time: Jul 13, 19:30 - 21:30

Venue: Lecture Hall 3, 302 (SUSTech)

Online Link: TencentMeeting

Compendium #

I. Introduction: Concept and Mechanism of Self-Distillation

  • Introduce the basic concept and structure of self-distillation. Discuss its mechanism and its relation to knowledge distillation.
  • Discuss the basic idea of contrastive learning and its pretext task (SimCLR).
  • The learning objective: a representation mapping that is invariant to transformations (augmentations). Alignment and uniformity are the two key requirements (see the sketch after this list).
  • Collapse (the trivial solution): an optimal but unwanted outcome. Basic concepts for preventing collapse.
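
To make alignment and uniformity concrete, here is a minimal PyTorch sketch of the two objectives from Wang & Isola's analysis (not code from any of the referenced papers). The embeddings `x` and `y` are assumed to be L2-normalized outputs for two augmented views of the same batch; a collapsed encoder minimizes alignment trivially but scores as badly as possible on uniformity.

```python
# Minimal sketch of the alignment and uniformity objectives, assuming x and y are
# L2-normalized embeddings of two augmentations of the same batch, shape (N, d).
import torch
import torch.nn.functional as F

def alignment_loss(x, y, alpha=2):
    # Alignment: matched pairs should be close on the unit hypersphere.
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniformity_loss(x, t=2):
    # Uniformity: embeddings should spread out over the hypersphere; a constant
    # (collapsed) encoder gives perfect alignment but the worst uniformity.
    sq_dists = torch.pdist(x, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()

# Toy usage with random "embeddings":
x = F.normalize(torch.randn(128, 64), dim=1)
y = F.normalize(x + 0.1 * torch.randn(128, 64), dim=1)
print(alignment_loss(x, y).item(), uniformity_loss(x).item())
```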

II. Contrastive Learning without Negative Samples

  • Introduce BYOL, a self-distillation model that learns without negative pairs (a training-step sketch follows this list).
  • Connections between BYOL and related work.
  • Provide some intuition for why BYOL performs well and avoids collapse.
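
As a concrete illustration, below is a minimal, hedged sketch of one BYOL-style training step in PyTorch (not DeepMind's implementation). `online_encoder`, `online_predictor`, `target_encoder`, and `optimizer` are assumed user-defined objects; `v1` and `v2` are two augmented views of the same image batch.

```python
# Sketch of a BYOL-style update: an online network with a predictor chases a
# slowly-moving target network; no negative pairs are used.
import torch
import torch.nn.functional as F

def byol_loss(p, z):
    # Negative cosine similarity; the target branch z receives no gradient.
    p = F.normalize(p, dim=1)
    z = F.normalize(z.detach(), dim=1)
    return 2 - 2 * (p * z).sum(dim=1).mean()

@torch.no_grad()
def ema_update(target, online, tau=0.996):
    # Target parameters are an exponential moving average of the online ones.
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.mul_(tau).add_(o_param, alpha=1 - tau)

def training_step(online_encoder, online_predictor, target_encoder, v1, v2, optimizer):
    # Symmetrized loss: each view is passed through both branches.
    p1 = online_predictor(online_encoder(v1))
    p2 = online_predictor(online_encoder(v2))
    with torch.no_grad():
        z1, z2 = target_encoder(v1), target_encoder(v2)
    loss = byol_loss(p1, z2) + byol_loss(p2, z1)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    ema_update(target_encoder, online_encoder)
    return loss.item()
```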

III. Examination of BYOL-like Models: The Road to Preventing Collapse

  • Methodology: study the components of the model through ablations to understand whether each is necessary for preventing collapse.
  • Hypothesis: the training procedure is implicitly solving an underlying optimization problem. The optimization is EM-like and therefore searches effectively for a good representation (see the sketch after this list).
  • Validate the hypothesis via experiments.
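
The following is a minimal sketch of the SimSiam-style objective examined in this part, written from our reading of the paper rather than the authors' code. `encoder` (backbone plus projection MLP) and `predictor` are assumed user-defined modules; the stop-gradient on the target branch is the component the ablations single out as essential.

```python
# Sketch of the SimSiam objective: no negatives, no momentum encoder; collapse is
# avoided by the predictor plus stop-gradient on the target branch.
import torch
import torch.nn.functional as F

def D(p, z):
    # Negative cosine similarity with stop-gradient on z (the target view).
    return -F.cosine_similarity(p, z.detach(), dim=1).mean()

def simsiam_loss(encoder, predictor, v1, v2):
    z1, z2 = encoder(v1), encoder(v2)
    p1, p2 = predictor(z1), predictor(z2)
    # Symmetrized loss; under the EM-like reading, the detached z acts as a latent
    # target estimated in one step and fitted by the predictor in the other.
    return 0.5 * D(p1, z2) + 0.5 * D(p2, z1)
```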

IV. Combining Transformers with Self-Distillation

  • Discuss the difficulties faced by the ViT (Vision Transformer), specifically the image tokenizer.
  • The tokenizer learns richer semantic information when trained with self-distillation.
  • Discuss the trade-off between alignment and uniformity in ViTs.
  • Improvement: introduce MIM (Masked Image Modelling, closely analogous to masked language modelling) into self-distillation to create more effective pretext tasks (a loss sketch follows this list).
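
Below is a minimal sketch of the DINO-style distillation loss that iBOT reuses on masked patch tokens; it is our paraphrase of the papers, not reference code. `student_logits` and `teacher_logits` are assumed to be (batch, K) outputs of the two networks, and `center` a running mean of teacher outputs used to prevent collapse.

```python
# Sketch of DINO-style self-distillation: cross-entropy between the student's
# output and a centered, sharpened teacher output. iBOT applies the same loss to
# masked patch tokens, turning the teacher into an online tokenizer for MIM.
import torch
import torch.nn.functional as F

def dino_loss(student_logits, teacher_logits, center, tau_s=0.1, tau_t=0.04):
    # Teacher: centering plus low-temperature sharpening, no gradient.
    t = F.softmax((teacher_logits - center) / tau_t, dim=-1).detach()
    # Student: log-softmax at a higher temperature.
    s = F.log_softmax(student_logits / tau_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

@torch.no_grad()
def update_center(center, teacher_logits, momentum=0.9):
    # Running mean of teacher outputs, used for centering.
    return momentum * center + (1 - momentum) * teacher_logits.mean(dim=0)
```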

Material #

I. Slides for the Intro to Self-Supervised Learning from Shengjie Niu.

II. Slides 1 and 2 for Self-Distillation Family from Yue Wu and Lifan Lin.

References #

  1. Balestriero, R. et al. A Cookbook of Self-Supervised Learning

  2. A Simple Framework for Contrastive Learning of Visual Representations (SimCLR)

  3. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (BYOL)

  4. Exploring Simple Siamese Representation Learning (SimSiam)

  5. Emerging Properties in Self-Supervised Vision Transformers (DINO)

  6. iBOT: Image BERT Pre-Training with Online Tokenizer

  7. Blog: BYOL: Contrastive Representation-Learning without Contrastive Losses (drscotthawley.github.io)

  8. Blog: Understanding self-supervised and contrastive learning with "Bootstrap Your Own Latent" (BYOL) (Generally Intelligent)