Visual Information Retrieval
Visual information retrieval is a new subject of research in information technology. Its purpose is to retrieve from a database the images or image sequences that are relevant to a query. It is an extension of traditional information retrieval designed to include visual media.
The variety of knowledge required in visual information retrieval is large. Different research fields, which have evolved separately, provide valuable contributions to this new research subject. The most important of these fields are information retrieval, visual data modeling and representation, image/video analysis and processing, pattern recognition and computer vision, multimedia database organization, multidimensional indexing, psychological modeling of user behavior, man-machine interaction, and data visualization.
New-generation visual information retrieval systems support full retrieval by visual content. Access to visual information is not only performed at a conceptual level, using keywords as in the textual domain, but also at a perceptual level, using objective measurements of visual content and appropriate similarity models. The contents include:
In typical content-based image retrieval systems (Figure 1), the visual
contents of the images in the database are extracted and described by
multi-dimensional feature vectors. The
feature vectors of the images in the database form a feature database. To
retrieve images, users provide the retrieval system with example images or
sketched figures. The system then converts these examples into its internal
representation of feature vectors. The similarities or distances between the
feature vectors of the query example or sketch and those of the images in the
database are then calculated, and retrieval is performed with the aid of an
indexing scheme. The indexing scheme provides an efficient way to search the
image database. Recent retrieval systems have incorporated users' relevance
feedback to modify the retrieval process in order to generate perceptually and
semantically more meaningful retrieval results.
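The query-by-example pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature (a coarse intensity histogram), the Euclidean distance, and all function names are hypothetical choices made for the example, and the linear scan over the feature database stands in for a real indexing scheme.

```python
from math import sqrt

def gray_histogram(image, bins=4, max_value=256):
    """Feature vector: normalized intensity histogram of a 2D list of
    integer pixel values in [0, max_value). (Illustrative feature only.)"""
    hist = [0.0] * bins
    count = 0
    for row in image:
        for value in row:
            hist[value * bins // max_value] += 1.0
            count += 1
    return [h / count for h in hist]

def euclidean(a, b):
    """Distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_feature_database(images):
    """Extract one feature vector per database image (done offline)."""
    return {name: gray_histogram(img) for name, img in images.items()}

def retrieve(query_image, feature_db, top_k=3):
    """Convert the query to a feature vector and rank database images by
    distance. A linear scan is used here instead of an index."""
    q = gray_histogram(query_image)
    scored = [(name, euclidean(q, vec)) for name, vec in feature_db.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]

# Tiny hypothetical database of 2x2 gray-level images.
images = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[200, 220], [240, 250]],
    "mixed": [[10, 250], [20, 240]],
}
feature_db = build_feature_database(images)
results = retrieve([[5, 15], [25, 60]], feature_db, top_k=2)
```

For the dark query image, `results` lists "dark" first and "mixed" second. Relevance feedback, as mentioned above, would adjust the query vector or the distance function between successive calls to `retrieve`.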
Following are the problems to be solved in a CBIR system:
We have developed a demo image retrieval system, SIMBA (Search IMages By Appearance). Our approach is based on invariant features, i.e. features that do not vary if the image is transformed by an element of some transformation group (we consider translation and rotation here). Schulz-Mirbach introduced an algorithm for the construction of invariant features [Schulz-Mirbach:1995] that is well suited to this task because of its robustness to slight topological deformations and even to independent motion of objects within the image. Its major advantage is that it does not require the extraction of objects (segmentation) or distinct points (key-points) from the image, but can be applied directly to the original image data.
However, in order to improve the algorithm's robustness in an image retrieval application, especially for supporting partial matches, we had to modify it so that more local information is preserved in the final features. We therefore constructed feature histograms [Siggelkow, Burkhardt:1998], which are very similar to the well-known color histograms but use features drawn from a local neighborhood of each pixel instead of only the color value of that pixel. In this way we also incorporate textural information.
Recently the method was further enhanced by replacing the time-consuming exact calculation of the features with a fast estimation. The extracted features therefore contain a small error, which, however, can itself be well estimated [Siggelkow, Schael:1999].
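One way to obtain a feature that is invariant under a transformation group is to average a kernel function over all transformed versions of the image. The sketch below illustrates this idea and is not the SIMBA implementation. It rests on simplifying assumptions: square images, cyclic (wrap-around) translations, rotations restricted to multiples of 90 degrees, and a hypothetical kernel that multiplies two neighboring pixels.

```python
def rotate90(image):
    """Rotate a square 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def shift(image, dy, dx):
    """Cyclically translate an image by dy rows and dx columns."""
    n, m = len(image), len(image[0])
    return [[image[(y - dy) % n][(x - dx) % m] for x in range(m)]
            for y in range(n)]

def kernel(image):
    """Hypothetical kernel: product of the top-left pixel and its right
    neighbor. Any function of a few pixels could be used instead."""
    return image[0][0] * image[0][1]

def invariant_feature(image):
    """Average the kernel over all cyclic translations and the four
    90-degree rotations of a square image (a finite group average)."""
    n, m = len(image), len(image[0])
    total = 0.0
    count = 0
    rotated = image
    for _ in range(4):                 # 0, 90, 180, 270 degrees
        for dy in range(n):
            for dx in range(m):
                total += kernel(shift(rotated, dy, dx))
                count += 1
        rotated = rotate90(rotated)
    return total / count

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
f_original = invariant_feature(img)
f_rotated = invariant_feature(rotate90(img))
f_shifted = invariant_feature(shift(img, 1, 2))
```

Because the average runs over the whole (finite) group, the three values agree: transforming the input only reorders the terms of the sum. No segmentation or key-point detection is involved; the kernel is applied directly to the pixel data.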
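The difference between a color histogram and a feature histogram can be made concrete with a small sketch. The local feature used here (the mean of the square roots of the products of a pixel with each of its four neighbors), the cyclic boundary handling, the bin count, and histogram intersection as the similarity measure are all assumptions chosen for illustration, not the features used in SIMBA.

```python
from math import sqrt

NEIGHBORS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def histogram(values, bins=4):
    """Normalized histogram of values in [0, 1]."""
    hist = [0.0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1.0
    return [h / len(values) for h in hist]

def color_histogram(image, bins=4):
    """Histogram of the raw pixel values only."""
    return histogram([v for row in image for v in row], bins)

def local_feature(image, y, x):
    """Hypothetical local feature: combines the pixel at (y, x) with its
    four direct neighbors (cyclic boundary)."""
    n, m = len(image), len(image[0])
    return sum(sqrt(image[y][x] * image[(y + dy) % n][(x + dx) % m])
               for dy, dx in NEIGHBORS) / len(NEIGHBORS)

def feature_histogram(image, bins=4):
    """Histogram of one local feature value per pixel."""
    n, m = len(image), len(image[0])
    return histogram([local_feature(image, y, x)
                      for y in range(n) for x in range(m)], bins)

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two 4x4 images with the same number of black (0.0) and white (1.0)
# pixels but different texture.
checkerboard = [[float((y + x) % 2) for x in range(4)] for y in range(4)]
half_split = [[1.0] * 4, [1.0] * 4, [0.0] * 4, [0.0] * 4]

color_similarity = intersection(color_histogram(checkerboard),
                                color_histogram(half_split))
feature_similarity = intersection(feature_histogram(checkerboard),
                                  feature_histogram(half_split))
```

The two test images have identical color histograms, so `color_similarity` is 1.0, while `feature_similarity` is 0.5 because the local features respond to how the pixels are arranged. Since a histogram discards only the position of each local feature, partial matches can still contribute to the similarity.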
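As an illustration of trading exact calculation for estimation, the sketch below approximates a group-averaged feature by evaluating the kernel at randomly sampled positions and directions, and reports a standard error alongside the estimate. That the estimation is done by random sampling is an assumption made for this example; the kernel, the four-direction neighborhood, and the function names are likewise hypothetical.

```python
import random
from math import sqrt

OFFSETS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def kernel_value(image, y, x, dy, dx):
    """Hypothetical kernel: pixel at (y, x) times its neighbor in
    direction (dy, dx), with cyclic boundary."""
    n, m = len(image), len(image[0])
    return image[y][x] * image[(y + dy) % n][(x + dx) % m]

def exact_feature(image):
    """Exact average over every position and all four directions."""
    n, m = len(image), len(image[0])
    values = [kernel_value(image, y, x, dy, dx)
              for y in range(n) for x in range(m) for dy, dx in OFFSETS]
    return sum(values) / len(values)

def estimated_feature(image, samples, rng):
    """Estimate the same average from random samples. Returns the
    estimate and its standard error (sample std / sqrt(samples))."""
    n, m = len(image), len(image[0])
    total = 0.0
    total_sq = 0.0
    for _ in range(samples):
        dy, dx = rng.choice(OFFSETS)
        v = kernel_value(image, rng.randrange(n), rng.randrange(m), dy, dx)
        total += v
        total_sq += v * v
    mean = total / samples
    variance = max(total_sq / samples - mean * mean, 0.0)
    return mean, sqrt(variance / samples)

image = [[0.0, 0.5, 1.0],
         [0.25, 0.75, 0.5],
         [1.0, 0.0, 0.25]]
rng = random.Random(0)  # fixed seed so the run is repeatable
exact = exact_feature(image)
estimate, std_error = estimated_feature(image, 20000, rng)
```

The cost of `estimated_feature` depends on the number of samples rather than on the image size, and `std_error` shrinks with the square root of the sample count, which gives a direct handle on the error introduced by the estimation.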
References
Links of related projects