Seminar: Algorithms for Cluster Analysis

SS 10

In SS 10, the Chair of Pattern Recognition and Image Processing offers a seminar titled "Algorithms for Cluster Analysis".

Motivation and Aim:

Organizing data into meaningful groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification divides animals into a system of ranked taxa: domain, kingdom, phylum, class, etc.
Cluster analysis is the formal study of methods and algorithms for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning).
Clustering aims to find structure in data and is therefore exploratory in nature. One of the most popular clustering algorithms, K-means, was first published in 1955. Although K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since, it is still widely used. This speaks to the difficulty of designing a general-purpose clustering algorithm and to the ill-posed nature of the clustering problem.
The aim of this seminar is to provide a brief overview of clustering, summarize well-known clustering methods and test the performance of known algorithms on provided data sets.
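To illustrate the kind of algorithm the seminar deals with, the following is a minimal sketch of the K-means procedure mentioned above (alternating an assignment step and a centroid update step). It is not part of the seminar material: the function names, the toy data and the random initialization from the data points are assumptions made for this example only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption for this sketch: initial centroids are k distinct data points.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed its cluster, so the procedure has converged
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Note that no class labels are passed to the function: the grouping is derived from the distances between the points alone, which is what distinguishes clustering from supervised classification. The result generally depends on the initialization and on the choice of k.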

Workflow:

*	During the preliminary meeting ("Vorbesprechung"), topics will be bindingly assigned and presentation dates will be fixed.
*	An introductory talk on cluster analysis algorithms will be given in the following week.
*	Each participant gives an oral presentation of 45 minutes on their topic.
*	Each participant implements the algorithms of their topic and tests them on the provided data sets. Results will be discussed and interpreted.

Key information:

Time:	Wed. 16:15-17:45
Room:	Building 106, multimedia room (SR 00-007)
Preliminary meeting:	Wed. 21.04.2010, 16:15, Building 106, SR 00-007
Participants:	Students of computer science, mathematics, physics or microsystem technology
Language:	Talks can be given in German or English
Organisation:	Maja Temerinac-Ott (temerina at informatik.uni-freiburg.de)
Registration:	Preregistration in the LSF system or by email. Binding registration at the preliminary meeting (Wed. 21.04.2010, 16:15, 106-00-007)

Topic	Student	Supervisor

Clustering ensembles		Matthias Schlachter
Self Organizing Maps		Robert Bensch
Kernel K-means clustering		Thorsten Schmidt
K-means with different metrics		Wan Nural Jawahir
Multiview Clustering or Fuzzy c-means clustering		Nikos Canterakis
Semi-supervised clustering		Qing Wang
Hierarchical Clustering		Henrik Skibbe
Spectral Clustering (Normalized Cuts)		Lingyu Ma
Mean Shift Clustering		Margret Keuper
EM-based mixture model clustering		Maja Temerinac-Ott

Seminar Wiki

To Seminar Wiki

Data

Literature:

[1]	A.K. Jain, "Data Clustering: 50 Years Beyond K-Means", Pattern Recognition Letters (In press, Published online), 2009.
[2]	J. Shi and J. Malik, "Normalized cuts and image segmentation", IEEE PAMI, Vol. 22, No. 8 (2000) pp. 888-905.
[3]	A.W. Moore, "Very fast EM-based mixture model clustering using multiresolution kd-trees", NIPS, (1998) pp. 543-549.
[4]	S. Eschrich, J. Ke, L.O. Hall and D.B. Goldgof, "Fast accurate fuzzy clustering through data reduction", IEEE TFS, Vol. 11, No. 2 (2003), pp. 262-270.
[5]	T. Kohonen, "Kohonen network", Scholarpedia (2007).
[6]	A. Banerjee, S. Merugu, I. Dhillon and J. Ghosh. 2004. "Clustering with Bregman divergences", Journal of Machine Learning Research, 234-245.
[7]	H. Kashima, J. Hu, B. Ray and M. Singh 2008 (Dec.). "K-means clustering of proportional data using L1 distance", ICPR.
[8]	B. Schölkopf, A. Smola and K.-R. Müller. 1998. "Nonlinear component analysis as a kernel eigenvalue problem", Neural Computation, 10(5), 1299-1319.
[9]	F. Wang, J. Wang, C. Zhang and H. C. Shen. "Semi-Supervised Classification Using Semi-Linear Neighborhood Propagation", Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 160-167. New York University, New York, New York, USA, June 17-22, 2006.
[10]	A. Fred and A.K. Jain. 2002. "Data clustering using evidence accumulation", In: Proceedings of the International Conference on Pattern Recognition (ICPR).
[11]	R. Bekkermann, R. El-Yaniv and A. McCallum. 2005. "Multi-way distributional clustering via pairwise interactions", In: Proceedings of the 22nd International Conference on Machine Learning (ICML), pp. 41-48. New York, NY, USA: ACM.
[12]	Y. Cheng, "Mean Shift, Mode Seeking, and Clustering", IEEE PAMI, Vol. 17, No. 8 (1995) pp. 790-799.

Search engines for literature:

Albert-Ludwigs-Universität Freiburg, Chair of Pattern Recognition and Image Processing, Maja Temerinac-Ott (temerina at informatik.uni-freiburg.de)

Last updated on 08.02.2010, 15:00