Centroid averaging algorithm for a clustering ensemble
Tatarnikov V.V., Pestunov I.A., Berikov V.B.


Sobolev Institute of Mathematics SB RAS, Novosibirsk, Russia,
Institute of Computational Technologies SB RAS, Novosibirsk, Russia,
Novosibirsk State University, Novosibirsk, Russia

The full text of the article is in Russian.


This paper considers an ensemble (collective) approach to cluster analysis and proposes a centroid averaging algorithm. The algorithm constructs a consensus partition of a dataset into clusters from a set of partitions built with any centroid-based algorithm. We discuss the results of applying the proposed algorithm to simulated data and to the segmentation of hyperspectral images with noise channels. We also give some details of a multithreaded implementation that improves the algorithm's performance.
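The abstract does not spell out the averaging step, so the following is only a minimal sketch of one way such a scheme could look, not the authors' implementation. It assumes K-means as the base centroid-based algorithm, matches the centroids of each run to those of the first run by brute-force search over permutations (reasonable only for small k), averages the matched centroids coordinate-wise, and assigns every point to its nearest averaged centroid. All function names and parameters are hypothetical. The thread pool only illustrates that the base partitions are independent of one another; in CPython, threads will not speed up pure-Python arithmetic.

```python
import itertools
import random
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def kmeans(data, k, seed, iters=50):
    """Plain Lloyd's K-means returning k centroids (stand-in for any centroid-based method)."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in data:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster becomes empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def align(reference, centroids):
    """Reorder `centroids` to best match `reference` (assumed matching rule, brute force)."""
    k = len(reference)
    best = min(
        itertools.permutations(range(k)),
        key=lambda perm: sum(sq_dist(reference[i], centroids[perm[i]]) for i in range(k)),
    )
    return [centroids[j] for j in best]


def centroid_averaging(data, k, n_runs=8, workers=4):
    """Consensus centroids and labels from n_runs aligned K-means runs."""
    # Base partitions are independent, so they can be built concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(lambda s: kmeans(data, k, s), range(n_runs)))
    reference = runs[0]
    aligned = [reference] + [align(reference, r) for r in runs[1:]]
    consensus = [mean([run[i] for run in aligned]) for i in range(k)]
    labels = [nearest(p, consensus) for p in data]
    return consensus, labels


if __name__ == "__main__":
    gen = random.Random(0)
    blobs = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    data = [(cx + gen.gauss(0, 0.5), cy + gen.gauss(0, 0.5))
            for cx, cy in blobs for _ in range(40)]
    consensus, labels = centroid_averaging(data, k=3)
    print(consensus)
```

Because only k centroids per run are stored and compared, a scheme of this kind avoids building an N x N co-association matrix, which matters for images with many pixels. For larger k, the brute-force matching would have to be replaced by an assignment solver.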

Keywords: clustering ensemble, K-means, centroid, hyperspectral image analysis.

Tatarnikov VV, Pestunov IA, Berikov VB. Centroid averaging algorithm for a clustering ensemble. Computer Optics 2017; 41(5): 712-718. DOI: 10.18287/2412-6179-2017-41-5-712-718.


