Self-supervised Pre-training with Masked Image Modeling

Date:

Seminar recording

Abstract: The consistent success of BERT-like models in language processing has led to attempts to adapt the masked modeling task to other data domains and to create a universal framework for pre-training without manual annotations. In this seminar, we will give an overview of relatively recent (2021-2022) approaches to masked image modeling (MIM). We will start with a brief history of self-supervised methods for images and then discuss different masking strategies, image-processing pipelines, and which targets are suitable for masked modeling. We will cover BEiT, SimMIM, MAE, UM-MAE, and MaskFeat; a minimal code sketch of the task appears below.
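To make the task concrete, here is a minimal PyTorch sketch of the MIM setup: split an image into patches, randomly mask a fraction of them, and regress the raw pixels of the masked patches. The tiny encoder, the 75% mask ratio, the shared mask token, and the raw-pixel target are illustrative assumptions (roughly SimMIM-flavored), not the exact recipe of any of the papers above.

```python
# Minimal masked-image-modeling sketch (illustrative, not a paper's exact recipe).
import torch
import torch.nn as nn


def patchify(imgs, patch_size=16):
    """(B, C, H, W) -> (B, num_patches, patch_size*patch_size*C)."""
    B, C, H, W = imgs.shape
    h, w = H // patch_size, W // patch_size
    x = imgs.reshape(B, C, h, patch_size, w, patch_size)
    return x.permute(0, 2, 4, 3, 5, 1).reshape(B, h * w, patch_size * patch_size * C)


def random_masking(patches, mask_ratio=0.75):
    """Randomly choose a subset of patches to mask, independently per image."""
    B, N, _ = patches.shape
    num_masked = int(N * mask_ratio)
    ids = torch.rand(B, N).argsort(dim=1)          # a random permutation of patches
    mask = torch.zeros(B, N, dtype=torch.bool)
    mask.scatter_(1, ids[:, :num_masked], True)    # True = masked
    return mask


class TinyMIM(nn.Module):
    """Linear patch embedding + transformer encoder + linear pixel-prediction head.
    Positional embeddings are omitted for brevity."""

    def __init__(self, patch_dim=16 * 16 * 3, dim=192, depth=4, heads=3):
        super().__init__()
        self.embed = nn.Linear(patch_dim, dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, patch_dim)      # predict raw pixels of each patch

    def forward(self, patches, mask):
        x = self.embed(patches)
        # Replace masked patch embeddings with a shared learnable token
        # (SimMIM-style; MAE instead drops masked patches from the encoder).
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        return self.head(self.encoder(x))


imgs = torch.randn(2, 3, 224, 224)                 # dummy batch
patches = patchify(imgs)
mask = random_masking(patches)
pred = TinyMIM()(patches, mask)
loss = ((pred - patches) ** 2)[mask].mean()        # loss on masked patches only
print(loss.item())
```

The masking strategy (random vs. block-wise), what enters the encoder, and the prediction target (pixels, visual tokens, or HOG features) are exactly the design axes along which BEiT, SimMIM, MAE, UM-MAE, and MaskFeat differ.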

Papers

  • BEiT: BERT Pre-Training of Image Transformers (https://arxiv.org/abs/2106.08254)
  • SimMIM: A Simple Framework for Masked Image Modeling (https://arxiv.org/abs/2111.09886)
  • Masked Autoencoders Are Scalable Vision Learners (MAE, https://arxiv.org/abs/2111.06377)
  • Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality (UM-MAE, https://arxiv.org/abs/2205.10063)
  • Masked Feature Prediction for Self-Supervised Visual Pre-Training (MaskFeat, https://arxiv.org/abs/2112.09133)