Universität Freiburg - Institut für Informatik
Lehrstuhl für Mustererkennung und Bildverarbeitung (LMB)

Segmentation of Image Sequences for Object Oriented Coding


Author: Sven Siggelkow


Introduction

Widespread coding techniques such as MPEG-1 and MPEG-2 operate on rectangular blocks, which produces clearly visible artefacts (e.g. blocking) at high compression ratios. Second-generation coding techniques such as MPEG-4 therefore work on the basis of objects: the image is partitioned into objects instead of blocks. The segmentation takes the human visual system into account, which reduces visual artefacts, and it also supports object-based data access. The temporal stability of the segmentation is of major importance, both for efficient predictive coding of objects and for object tracking combined with content-dependent quality. For example, the speaker in a scene might be transmitted in good quality while the background is coded with higher loss.


Segmentation

The segmentation relies on centroid linkage region growing [1] and DRF edge detection [2] in order to combine good global stability with high local correctness. It has been developed with regard to
  • the characteristics of the human visual system,
  • the temporal stability needed for object-based data access and for predictive coding, and
  • the subjective image partition.
To take advantage of the characteristics of the human visual system, the segmentation uses not only the luminance information but also the chrominance information, which is often neglected in the literature [3] (see Figure 1).

Figure 1: Segmentation with and without chrominance information
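
The effect of including chrominance can be illustrated with a minimal sketch of raster-scan centroid linkage region growing. It is not the implementation used for the results on this page: the function name, the threshold, the chroma_weight parameter and the weighted Euclidean distance are assumptions made for illustration, and the DRF edge detection step is left out entirely.

```python
from math import sqrt

def centroid_linkage(image, threshold=12.0, chroma_weight=1.0):
    """Label a 2D grid of (Y, U, V) tuples by raster-scan centroid linkage.

    Each pixel is compared with the running mean (centroid) of the regions
    of its upper and left neighbours. It joins the closest region if the
    distance is within `threshold`, otherwise it starts a new region.
    Threshold, weighting and distance measure are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    sums = []    # per-region running sums [Y, U, V]
    counts = []  # per-region pixel counts

    def distance(pixel, region):
        n = counts[region]
        dy = pixel[0] - sums[region][0] / n
        du = pixel[1] - sums[region][1] / n
        dv = pixel[2] - sums[region][2] / n
        # chroma_weight = 0 reduces this to a luminance-only comparison
        return sqrt(dy * dy + chroma_weight * (du * du + dv * dv))

    for r in range(height):
        for c in range(width):
            pixel = image[r][c]
            candidates = set()
            if r > 0:
                candidates.add(labels[r - 1][c])
            if c > 0:
                candidates.add(labels[r][c - 1])
            best = min(candidates, key=lambda k: distance(pixel, k), default=None)
            if best is None or distance(pixel, best) > threshold:
                best = len(counts)          # open a new region
                sums.append([0.0, 0.0, 0.0])
                counts.append(0)
            for i in range(3):
                sums[best][i] += pixel[i]   # update the region centroid
            counts[best] += 1
            labels[r][c] = best
    return labels

# Two areas with equal luminance but different chrominance.
left, right = (100, 100, 100), (100, 160, 100)
demo = [[left, left, right, right], [left, left, right, right]]
with_chroma = centroid_linkage(demo, chroma_weight=1.0)
without_chroma = centroid_linkage(demo, chroma_weight=0.0)
```

In this toy input the two areas differ only in chrominance, so the luminance-only run cannot separate them, while the run that includes chrominance yields two regions.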

The second of the requirements listed above is difficult to achieve when each image is segmented separately. The processing of an image therefore takes the segmentation of the preceding image into account, provided that no scene change has been detected. The segmentation works in two different modes: intraframe segmentation and interframe segmentation. In intraframe mode the processing is purely 2D, whereas in interframe mode the segmentation is also based on the segmentation of the preceding image. Compared to other existing approaches, the temporal stability is improved by also including motion information in the segmentation process. Results can be seen in Figure 2 or as MPEG video: with or without boundary adaptation. The best solution appears to lie somewhere in between.

Figure 2: Temporal Segmentation: Images 1, 2, and 15 (with and without boundary adaptation)
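
The switch between the two modes can be sketched as follows. This is a simplified illustration under several assumptions: the scene change test (mean absolute luminance difference against a fixed threshold), the function names and all threshold values are hypothetical, only luminance is used, and the motion information mentioned above is ignored.

```python
def mean_abs_diff(prev_frame, frame):
    """Mean absolute luminance difference between two frames (2D lists)."""
    total, count = 0.0, 0
    for prev_row, row in zip(prev_frame, frame):
        for a, b in zip(prev_row, row):
            total += abs(a - b)
            count += 1
    return total / count

def choose_mode(prev_frame, frame, cut_threshold=40.0):
    """Return "intra" for the first frame or after an assumed scene change."""
    if prev_frame is None or mean_abs_diff(prev_frame, frame) > cut_threshold:
        return "intra"
    return "inter"

def propagate_labels(prev_labels, prev_frame, frame, threshold=10.0):
    """Keep a pixel's previous label while it still fits the region mean.

    Pixels that no longer fit are marked -1 so that a later step
    (not shown here) can re-segment them.
    """
    sums, counts = {}, {}
    for label_row, row in zip(prev_labels, prev_frame):
        for label, value in zip(label_row, row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    means = {k: sums[k] / counts[k] for k in sums}
    return [[label if abs(value - means[label]) <= threshold else -1
             for label, value in zip(label_row, row)]
            for label_row, row in zip(prev_labels, frame)]

prev_frame = [[10, 10, 200, 200], [10, 10, 200, 200]]
prev_labels = [[0, 0, 1, 1], [0, 0, 1, 1]]
frame = [[10, 10, 10, 200], [10, 10, 200, 200]]   # the boundary moved by one pixel
mode = choose_mode(prev_frame, frame)
initial = propagate_labels(prev_labels, prev_frame, frame)
```

Pixels whose labels survive the propagation keep their region identity across frames, which is the property that predictive coding and object tracking depend on.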

In order to obtain a segmentation that conforms to a subjective image partition, regions are merged hierarchically. This goal is difficult to reach since no information on the image semantics is available. Nevertheless, motion information has already proven useful for semantic segmentation [4, 5]. In our contribution, chrominance information is used as a second semantic feature; for example, it can be regarded as a property of the object material.

A three-layered hierarchy is therefore built. The first layer represents the segmentation described above. In the second layer, regions of the first layer are merged if they have similar chrominance and motion. The highest layer corresponds to a semantic segmentation, in which regions of the second layer are merged if they have similar motion (see Figure 3).

Figure 3: Three Segmentation Layers
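
The three layers can be sketched with a small union-find merge over adjacent regions. The sketch makes several assumptions that are not taken from the original work: each first-layer region is described by one mean chrominance pair and one motion vector, similarity is a Euclidean distance against fixed tolerances, and the features are not re-averaged after a merge. All names and values are hypothetical.

```python
from math import hypot

def merge_regions(n, adjacency, similar):
    """Merge adjacent regions for which similar(a, b) holds (union-find).

    Returns, for each region, the smallest region index in its group.
    """
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for a, b in adjacency:
        if similar(a, b):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)
    return [find(i) for i in range(n)]

def build_hierarchy(chroma, motion, adjacency, chroma_tol=10.0, motion_tol=1.0):
    """Return group ids of every first-layer region in layers 1, 2 and 3."""
    def close(features, a, b, tol):
        return hypot(features[a][0] - features[b][0],
                     features[a][1] - features[b][1]) <= tol

    n = len(chroma)
    layer1 = list(range(n))
    # Layer 2: similar chrominance AND similar motion.
    layer2 = merge_regions(n, adjacency,
                           lambda a, b: close(chroma, a, b, chroma_tol)
                           and close(motion, a, b, motion_tol))
    # Layer 3: similar motion only, so every layer-2 group stays together.
    layer3 = merge_regions(n, adjacency,
                           lambda a, b: close(motion, a, b, motion_tol))
    return layer1, layer2, layer3

# Regions: 0 = face, 1 = shirt (both moving), 2 and 3 = parts of a static sky.
chroma = [(110, 150), (90, 120), (140, 100), (142, 101)]
motion = [(2, 0), (2, 0), (0, 0), (0, 0)]
adjacency = [(0, 1), (1, 2), (2, 3)]
layer1, layer2, layer3 = build_hierarchy(chroma, motion, adjacency)
```

In this toy example the two sky parts merge in the second layer, while face and shirt only merge in the third layer, where motion alone decides.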

This hierarchy has also been temporally stabilized. When the hierarchy is built for the current image, an attempt is first made to reconstruct the relations of the hierarchy of the preceding image. The reconstruction takes into account how long a particular constellation has persisted in the past.

The sequence can then be coded on the basis of these hierarchy levels. The layers are suitable both for efficient prediction of objects already known from preceding images and for supporting object-based data access, up to the selection of semantic objects.


Object Oriented Coding of Chrominance Information

The second segmentation layer has been used successfully for efficient coding of chrominance information. As the human eye is less sensitive to chrominance than to luminance, chrominance can be coded with high loss, whereas a more accurate technique should be used for luminance. Compression factors of 1000 were reached for QCIF sources (compared to full chrominance information), assuming that object contours and positions have to be coded for the greyscale image anyway. As can be seen in Figure 4, the colours become less brilliant but chrominance edges stay steep; only detailed information is lost, e.g. in the sky in the background.

Figure 4: Original image and image coded with only 22 different chrominance pairs (instead of 25344)
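
The idea of one chrominance pair per region can be sketched as below. The function name and the use of the plain region mean are assumptions for illustration, not the coder used for Figure 4. The figure 25344 corresponds to the 176 x 144 pixels of a QCIF image; the ratio 25344 / 22 = 1152 computed at the end is only a count of chrominance pairs, shown as a plausibility check against the compression factor of about 1000, not as the way that factor was measured.

```python
def code_chrominance(labels, chroma):
    """Replace each pixel's (U, V) pair by the mean pair of its region.

    labels: 2D list of region ids, chroma: 2D list of (U, V) tuples.
    Returns the per-region table and the reconstructed chrominance image.
    """
    sums = {}
    for label_row, chroma_row in zip(labels, chroma):
        for label, (u, v) in zip(label_row, chroma_row):
            s = sums.setdefault(label, [0.0, 0.0, 0])
            s[0] += u
            s[1] += v
            s[2] += 1
    table = {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}
    coded = [[table[label] for label in row] for row in labels]
    return table, coded

labels = [[0, 0, 1], [0, 1, 1]]
chroma = [[(100, 120), (102, 122), (60, 200)],
          [(104, 124), (62, 202), (64, 204)]]
table, coded = code_chrominance(labels, chroma)

QCIF_PIXELS = 176 * 144          # pixels in a QCIF image
pair_ratio = QCIF_PIXELS / 22    # pairs in the original vs. pairs in Figure 4
```

Because only one pair per region has to be transmitted, the chrominance cost depends on the number of regions and not on the image size, provided the region contours are coded for the luminance image anyway.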


References

[1] R. M. Haralick and L. G. Shapiro: Image segmentation techniques, Computer Vision, Graphics, and Image Processing, vol. 29, pp. 100-132, 1985

[2] J. Shen and S. Castan: Further results on DRF method for edge detection, in 9th ICPR, Rome, 1988

[3] P. Salembier, L. Torres, F. Meyer, and Ch. Gu: Region-based video coding using mathematical morphology, Proceedings of the IEEE, vol. 83, no. 6, pp. 843-857, June 1995

[4] M. Hötter: Objektorientierte Analyse-Synthese-Codierung basierend auf dem Modell bewegter, zweidimensionaler Objekte, PhD thesis, 1992, published as VDI-Fortschrittbericht (Reihe 10, Nr. 217), VDI-Verlag

[5] F. Fechter: Konturgesteuerte Bildmischung durch Bewegungssegmentierung, Fernseh- und Kino-Technik, vol. 49, no. 11, pp. 651-660, November 1995


If you have any questions, feel free to contact me.