Week 4 #

Topic: The Self-Distillation Family in Self-Supervised Learning

Keynote Speakers: Lifan Lin, Yue Wu, Shengjie Niu

Time: Jul 13, 19:30 - 21:30

Venue: Lecture Hall 3, 302 (SUSTech)

Online Link: TencentMeeting

Compendium #

I. Introduction: Concept and Mechanism of Self-Distillation

  • Introduce the basic concept and structure of self-distillation. Discuss its mechanism and its relation to knowledge distillation.
  • Discuss the basic idea of contrastive learning and its pretext task (SimCLR).
  • The learning objective: a representation mapping that is invariant to transformations (augmentations). Alignment and uniformity are the two key requirements (see the sketch after this list).
  • Collapse (the trivial solution): an optimal but unwanted outcome. Basic concepts for preventing collapse.
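
To make alignment and uniformity concrete, here is a minimal PyTorch sketch of the two objectives from Wang & Isola's analysis (not code from any of the referenced papers). The embeddings `x` and `y` are assumed to be L2-normalized outputs for two augmented views of the same batch; a collapsed encoder minimizes alignment trivially but scores as badly as possible on uniformity.

```python
# Minimal sketch of the alignment and uniformity objectives, assuming x and y are
# L2-normalized embeddings of two augmentations of the same batch, shape (N, d).
import torch
import torch.nn.functional as F

def alignment_loss(x, y, alpha=2):
    # Alignment: matched pairs should be close on the unit hypersphere.
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniformity_loss(x, t=2):
    # Uniformity: embeddings should spread out over the hypersphere; a constant
    # (collapsed) encoder gives perfect alignment but the worst uniformity.
    sq_dists = torch.pdist(x, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()

# Toy usage with random "embeddings":
x = F.normalize(torch.randn(128, 64), dim=1)
y = F.normalize(x + 0.1 * torch.randn(128, 64), dim=1)
print(alignment_loss(x, y).item(), uniformity_loss(x).item())
```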

II. Contrastive Learning without Negative Samples

  • Introduce BYOL, a self-distillation model that learns without negative pairs (a training-step sketch follows this list).
  • Connections between BYOL and related work.
  • Provide some intuition for why BYOL performs well and avoids collapse.
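
As a concrete illustration, below is a minimal, hedged sketch of one BYOL-style training step in PyTorch (not DeepMind's implementation). `online_encoder`, `online_predictor`, `target_encoder`, and `optimizer` are assumed user-defined objects; `v1` and `v2` are two augmented views of the same image batch.

```python
# Sketch of a BYOL-style update: an online network with a predictor chases a
# slowly-moving target network; no negative pairs are used.
import torch
import torch.nn.functional as F

def byol_loss(p, z):
    # Negative cosine similarity; the target branch z receives no gradient.
    p = F.normalize(p, dim=1)
    z = F.normalize(z.detach(), dim=1)
    return 2 - 2 * (p * z).sum(dim=1).mean()

@torch.no_grad()
def ema_update(target, online, tau=0.996):
    # Target parameters are an exponential moving average of the online ones.
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.mul_(tau).add_(o_param, alpha=1 - tau)

def training_step(online_encoder, online_predictor, target_encoder, v1, v2, optimizer):
    # Symmetrized loss: each view is passed through both branches.
    p1 = online_predictor(online_encoder(v1))
    p2 = online_predictor(online_encoder(v2))
    with torch.no_grad():
        z1, z2 = target_encoder(v1), target_encoder(v2)
    loss = byol_loss(p1, z2) + byol_loss(p2, z1)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    ema_update(target_encoder, online_encoder)
    return loss.item()
```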

III. Examination of BYOL-like Models: The Road to Preventing Collapse

  • Methodology: study the components of the model through ablations to understand whether each is necessary for preventing collapse.
  • Hypothesis: the training procedure is implicitly solving an underlying optimization problem. The optimization is EM-like and therefore searches effectively for a good representation (see the sketch after this list).
  • Validate the hypothesis via experiments.
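
The following is a minimal sketch of the SimSiam-style objective examined in this part, written from our reading of the paper rather than the authors' code. `encoder` (backbone plus projection MLP) and `predictor` are assumed user-defined modules; the stop-gradient on the target branch is the component the ablations single out as essential.

```python
# Sketch of the SimSiam objective: no negatives, no momentum encoder; collapse is
# avoided by the predictor plus stop-gradient on the target branch.
import torch
import torch.nn.functional as F

def D(p, z):
    # Negative cosine similarity with stop-gradient on z (the target view).
    return -F.cosine_similarity(p, z.detach(), dim=1).mean()

def simsiam_loss(encoder, predictor, v1, v2):
    z1, z2 = encoder(v1), encoder(v2)
    p1, p2 = predictor(z1), predictor(z2)
    # Symmetrized loss; under the EM-like reading, the detached z acts as a latent
    # target estimated in one step and fitted by the predictor in the other.
    return 0.5 * D(p1, z2) + 0.5 * D(p2, z1)
```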

IV. Combining Transformers with Self-Distillation

  • Discuss the difficulties faced by the ViT (Vision Transformer), specifically the image tokenizer.
  • The tokenizer learns richer semantic information when trained with self-distillation.
  • Discuss the trade-off between alignment and uniformity in ViTs.
  • Improvement: introduce MIM (Masked Image Modelling, closely analogous to masked language modelling) into self-distillation to create more effective pretext tasks (a loss sketch follows this list).
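
Below is a minimal sketch of the DINO-style distillation loss that iBOT reuses on masked patch tokens; it is our paraphrase of the papers, not reference code. `student_logits` and `teacher_logits` are assumed to be (batch, K) outputs of the two networks, and `center` a running mean of teacher outputs used to prevent collapse.

```python
# Sketch of DINO-style self-distillation: cross-entropy between the student's
# output and a centered, sharpened teacher output. iBOT applies the same loss to
# masked patch tokens, turning the teacher into an online tokenizer for MIM.
import torch
import torch.nn.functional as F

def dino_loss(student_logits, teacher_logits, center, tau_s=0.1, tau_t=0.04):
    # Teacher: centering plus low-temperature sharpening, no gradient.
    t = F.softmax((teacher_logits - center) / tau_t, dim=-1).detach()
    # Student: log-softmax at a higher temperature.
    s = F.log_softmax(student_logits / tau_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

@torch.no_grad()
def update_center(center, teacher_logits, momentum=0.9):
    # Running mean of teacher outputs, used for centering.
    return momentum * center + (1 - momentum) * teacher_logits.mean(dim=0)
```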

Material #

I. Slides for the Intro to Self-Supervised Learning from Shengjie Niu.

II. Slides 1 and 2 for Self-Distillation Family from Yue Wu and Lifan Lin.

References #

  1. Balestriero, R. et al. A Cookbook of Self-Supervised Learning

  2. A Simple Framework for Contrastive Learning of Visual Representations (SimCLR)

  3. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (BYOL)

  4. Exploring Simple Siamese Representation Learning (SimSiam)

  5. Emerging Properties in Self-Supervised Vision Transformers (DINO)

  6. iBOT: Image BERT Pre-Training with Online Tokenizer

  7. Blog: BYOL: Contrastive Representation-Learning without Contrastive Losses (drscotthawley.github.io)

  8. Blog: Understanding self-supervised and contrastive learning with "Bootstrap Your Own Latent" (BYOL) (Generally Intelligent)