Seminar on Current Works in Computer Vision
Prof. Thomas Brox
Visual representations are an important part of human intelligence. The goal of Computer Vision is to imitate the flexibility and robustness of the human visual system. Research has made significant progress in recent years, particularly due to deep learning, and there are strong solutions for most standard visual tasks. Almost all research in Computer Vision has shifted to deep-learning-based methods, which is why most innovation nowadays comes from machine learning.
In this seminar we will have a mix of papers centered on world models, i.e., models that inform agents about the state and dynamics of their surroundings and themselves. Each paper is assigned to one person, who investigates the paper and its background in more detail and gives a presentation. The presentation is followed by a discussion with all participants about the merits and limitations of the respective paper. You will learn to read and understand contemporary research papers, to give a good oral presentation, to ask questions, and to openly discuss a research problem.
This seminar will be held in person only. It is meant for students who are interested in research and not just the ECTS credits. It is not a lazy-life seminar.
Material:
Seminar organization
Giving a good presentation
Proper scientific behavior
Slides of the introductory lecture
Powerpoint template for your presentation (optional)
Papers:
Date | Paper | Questions | Presenting student | Slides | Advisor |
S1 | Dreamer v4 | ||||
S2 | V-JEPA | ||||
S3 | GigaFlow | ||||
S4 | Counterfactual reasoning | ||||
S5 | Generalization with video diffusion | ||||
S6 | Self-forcing for video diffusion model training | ||||
S7 | Scale and compositional generalization | ||||
S8 | Diffusion Policy Adaptation | ||||
S9 | Body pose conditioned video prediction | ||||