Seminar on Current Works in Computer Vision

Prof. Thomas Brox

Visual representations are an important part of human intelligence. The goal of Computer Vision is to imitate the flexibility and robustness of the human visual system. Research has made significant progress in recent years, particularly due to deep learning, and there are strong solutions for most standard visual tasks. Almost all research in Computer Vision has shifted to deep-learning-based methods, which is why most innovation nowadays comes from machine learning.
In this seminar we will cover a mix of papers centered around world models, i.e., models that inform agents about the state and dynamics of their surroundings and themselves. For each paper, one participant investigates the paper and its background in more detail and gives a presentation. The presentation is followed by a discussion with all participants about the merits and limitations of the respective paper. You will learn to read and understand contemporary research papers, to give a good oral presentation, to ask questions, and to openly discuss a research problem.

This seminar takes place in person only. It is meant for students who are interested in research and not just the ECTS credits. It is not a lazy-life seminar.

Seminar: (2 SWS)	Wednesday, 14:00, building 52, room 02-17
Contact person:	Karim Farid
Beginning:	If you want to participate, register in HisInOne for the course, attend the introduction meeting on October 15 at 14:00, and send an email with your name and your paper priorities (S1-S9, favorite paper first) to Karim Farid before October 21.
Recommended semester:	6 (Bachelor), any (Master)
Requirements:	Background in computer vision
Remarks:	The language in this course is English.
	There is a strongly related Blockseminar on Deep Learning offered by apl. Prof. Olaf Ronneberger from Google DeepMind. The introduction meeting will be held jointly for both seminars. Students who did not attend the introduction meeting on Oct. 15 cannot participate in the seminar.
	For all students who attended the introduction meeting and submitted their paper preferences, seats will be assigned centrally in HisInOne based on the provided priorities. For all students with an assigned seat, we will assign topics by preference. We want to avoid people grabbing a topic and then dropping out during the semester. Thus, please take a brief look at all available papers to make an informed decision before you commit. The listed papers are not yet sorted by date of presentation.
	If you do not attend the meeting (or do not send a paper preference) but choose this seminar together with only other overbooked seminars in HisInOne, you may end up without a seminar place this semester. Students who only need to attend (failed SL from a previous semester) need not send a paper preference; just reply with "SL only".
	All participants must read all papers and answer a few questions. The questions will be available in the 'Questions' column of the table below at least one week before the corresponding presentation. The answers must be sent to the advisor of the paper before the paper is presented. All participants must attend all sessions.

Material:

Seminar organization
Giving a good presentation
Proper scientific behavior

Slides of the introductory lecture
Powerpoint template for your presentation (optional)

Papers:

Date	Paper	Questions	Presenting student	Slides	Advisor

26.11	V-JEPA	questions	Karl Erik Bode	slides	Artur Jesslen
03.12	Self-forcing for video diffusion model training	questions	Niklas Pant		Johannes Dienert
10.12	Dreamer v4	questions	Amal Abed		Karim Farid
17.12	Counterfactual reasoning	questions	Henrik Günther		Simon Schrodi
07.01	Generalization with video diffusion	questions	Mahmoud Hafez		Leonard Sommer
14.01	GigaFlow	questions	Nitishkumar Solpure		Sudhanshu Mittal
21.01	Scale and compositional generalization	questions	Lasse Rennig Kurz		Elias Kempf
28.01	Diffusion Policy Adaptation	questions	Kushal Savla		Silvio Galesso
04.02	Body pose conditioned video prediction	questions	Maksim Velikanov		Jelena Bratulic