Questions for "GroupViT: Semantic Segmentation Emerges from Text Supervision"
-----------------------------------------------------------------------------

Please send your answers to: bechtolj@cs.uni-freiburg.de by 13:15 on 29.07.2022
Please be concise in your answers.

1. Name the benefits of training jointly on image-text data.
2. Describe the hierarchical grouping in your own words.
3. Which contrastive losses do they use and what's the intuition?