Motion Estimation and Optical Flow

We continually work on improving the quality of optical flow estimation and other motion estimation methods, such as point tracking and scene flow estimation. As optical flow is the cornerstone of all video analysis, we believe that even the smallest improvement has large effects on the overall performance of video-related methods. In the past we made several important contributions to the field, the latest and most revolutionary one being a convolutional network that can predict high-accuracy optical flow in near real time.

Thomas Brox, A. Bruhn, N. Papenberg, J. Weickert
European Conference on Computer Vision (ECCV), Springer, LNCS, Vol.3024: 25-36, May 2004

This work largely reshaped the field of optical flow estimation. It contributes a theoretical justification for earlier warping approaches, an effective numerical scheme, the use of gradient constancy, and the use of the total variation norm in optical flow estimation. It is the most cited paper from our group.
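For readers unfamiliar with the model, the ingredients listed above can be summarized in a single energy functional. The following is a sketch in the usual notation of the variational literature: I is the image sequence, x = (x, y, t) a space-time position, w = (u, v, 1) the flow to the next frame, and γ, α, and ε are weighting and regularization parameters. Refer to the paper for the exact formulation and the numerical scheme.

```latex
E(u, v) = \int_\Omega \Psi\!\left( |I(\mathbf{x} + \mathbf{w}) - I(\mathbf{x})|^2
          + \gamma\, |\nabla I(\mathbf{x} + \mathbf{w}) - \nabla I(\mathbf{x})|^2 \right) d\mathbf{x}
        + \alpha \int_\Omega \Psi\!\left( |\nabla u|^2 + |\nabla v|^2 \right) d\mathbf{x},
\qquad \Psi(s^2) = \sqrt{s^2 + \epsilon^2}
```

The first term under the first integral expresses brightness constancy and the second gradient constancy. With the robust function Ψ, the second integral is a regularized total variation norm of the flow field.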

Thomas Brox, J. Malik
IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3):500-513, 2011

We addressed a general problem in optical flow estimation, that of large motion, by integrating combinatorial optimization ideas into a variational model.

N. Sundaram, Thomas Brox, K. Keutzer
European Conference on Computer Vision (ECCV), Springer, LNCS, Sept. 2010

High-quality optical flow makes it possible to build point trackers that track far more points than previous trackers, offer higher accuracy and, thanks to large displacement optical flow, can follow even fast-moving body limbs.
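The basic idea of tracking points with dense optical flow can be illustrated with a minimal sketch. This is not the tracker from the paper: the function names, the nearest-neighbour lookup, and the threshold `tol` are assumptions made for illustration. A point is moved along the forward flow from frame to frame, and the track is stopped when the point leaves the image or when forward and backward flow disagree, which typically indicates an occlusion.

```python
import math


def sample_flow(flow, x, y):
    """Nearest-neighbour lookup; flow[row][col] holds a (u, v) vector."""
    col, row = int(round(x)), int(round(y))
    if 0 <= row < len(flow) and 0 <= col < len(flow[0]):
        return flow[row][col]
    return None  # the position lies outside the image


def track_point(point, forward_flows, backward_flows, tol=0.5):
    """Follow one point through a sequence of flow fields.

    forward_flows[t] maps frame t to t+1, backward_flows[t] maps t+1 to t.
    tol is an illustrative consistency threshold in pixels (an assumption).
    """
    x, y = point
    track = [(x, y)]
    for fwd, bwd in zip(forward_flows, backward_flows):
        w = sample_flow(fwd, x, y)
        if w is None:
            break
        nx, ny = x + w[0], y + w[1]
        wb = sample_flow(bwd, nx, ny)
        if wb is None:
            break  # the point moved out of the image
        # Forward-backward check: the two vectors should cancel out.
        if math.hypot(w[0] + wb[0], w[1] + wb[1]) > tol:
            break  # inconsistent flow, e.g. at an occlusion
        x, y = nx, ny
        track.append((x, y))
    return track
```

A real tracker would interpolate the flow at subpixel positions and start new tracks in regions that are no longer covered.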

A. Wedel, Thomas Brox, T. Vaudrey, C. Rabe, U. Franke, D. Cremers
International Journal of Computer Vision, 95(1):29-51, 2011

We also worked on stereo videos, where a combination of optical flow and disparity estimation makes it possible to estimate 3D motion vectors.
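To see why flow plus disparity yields 3D motion, consider the geometry for a single pixel. The sketch below assumes a rectified pinhole stereo rig with equal focal length in x and y. All parameter names (`focal`, `baseline`, `cx`, `cy`) are hypothetical, and the code only illustrates the triangulation step, not the estimation method of the paper.

```python
def backproject(x, y, disparity, focal, baseline, cx, cy):
    """Triangulate a pixel with known disparity into a 3D point (X, Y, Z)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal * baseline / disparity          # depth from disparity
    return ((x - cx) * z / focal, (y - cy) * z / focal, z)


def scene_flow(x, y, d0, u, v, d1, focal, baseline, cx, cy):
    """3D motion of one pixel.

    (x, y) and d0 are position and disparity at time t, (u, v) is the optical
    flow, and d1 is the disparity at the displaced position at time t+1.
    """
    p0 = backproject(x, y, d0, focal, baseline, cx, cy)
    p1 = backproject(x + u, y + v, d1, focal, baseline, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))
```

The optical flow determines where the point is seen in the next frame, and the change in disparity determines how its depth has changed.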

Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, P. Häusser, C. Hazırbaş, V. Golkov, P. Smagt, D. Cremers, Thomas Brox
IEEE International Conference on Computer Vision (ICCV), Dec 2015.

The latest finding is that optical flow estimation can be formulated as a learning problem. A convolutional network is trained end-to-end to predict the optical flow field for two input images. The network achieves competitive accuracy on the Sintel and KITTI datasets at frame rates of 5 to 10 fps.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

This improved version of our earlier FlowNet yields state-of-the-art accuracy while being orders of magnitude faster than competing optical flow methods. FlowNet 2.0 places special emphasis on the synthesized data used for training and stacks multiple networks to refine the optical flow computed at earlier stages.

Demo video of FlowNet

FlowNet video

Demo videos on large displacement optical flow

color code  

The following demo videos show the optical flow computed on several sequences. The color code for interpreting the direction of the flow vectors is shown on the left. For instance, green means a pixel is moving to the left. Videos are shown in the original resolution and frame rate. Video compression introduces quantization artifacts even in the high-quality material; the uncompressed results are as smooth as in the still image.
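A color code of this kind can be sketched in a few lines. The mapping below is an assumption for illustration and not necessarily the exact code used for the videos: direction is mapped to hue, magnitude to saturation, and the hue offset is chosen only so that leftward motion comes out green, as stated above. The function name and the `max_magnitude` parameter are hypothetical.

```python
import colorsys
import math


def flow_to_rgb(u, v, max_magnitude):
    """Map one flow vector to an RGB triple with components in [0, 1].

    Hue encodes direction and saturation encodes magnitude, so zero motion
    is white. The offset of 1/6 is an assumption that makes leftward
    motion (u < 0, v = 0) appear green.
    """
    if max_magnitude <= 0:
        raise ValueError("max_magnitude must be positive")
    angle = math.atan2(v, u)                       # direction in (-pi, pi]
    hue = (angle / (2.0 * math.pi) - 1.0 / 6.0) % 1.0
    saturation = min(math.hypot(u, v) / max_magnitude, 1.0)
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)
```

Applying the function to every pixel of a flow field, with `max_magnitude` set to the largest displacement in the sequence, gives a color image in which motion boundaries show up as color edges.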

Shot from Miss Marple: A pocket full of rye
Low quality video (200kB)  High quality video (4MB)  Slow motion (4MB)   Input images (1MB)

Another shot from Miss Marple: A pocket full of rye
Low quality video (500kB)  High quality video (13MB)  Slow motion (13MB)   Input images (5MB)

Tennis sequence
Low quality video (1MB)  High quality video (36MB)  Slow motion (36MB)   Input images (173MB)

Monkey sequence
Low quality video (1.6MB)  High quality video (last part only, 12MB)  Slow motion (last part only, 12MB)   Input images (21MB)