Centroid Update Approach to K-Means Clustering
BORLEA, I.-D., PRECUP, R.-E., DRAGAN, F., BORLEA, A.-B.
Author keywords
clustering algorithms, clustering methods, data analysis, data mining, machine learning algorithms
About this article
Date of Publication: 2017-11-30
Volume 17, Issue 4, Year 2017, On page(s): 3 - 10
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2017.04001
Web of Science Accession Number: 000417674300001
SCOPUS ID: 85035816652
Abstract
The volume and complexity of the data generated every day have increased exponentially in recent years. To process this data faster, hardware capabilities have evolved, new versions of algorithms have been created, and existing algorithms have been improved and even optimized. This paper presents an improved clustering approach, based on the classical k-means algorithm, referred to as the centroid update approach. Formulated as an algorithm and included in the k-means algorithm, the new centroid update approach reduces the number of iterations needed to perform a clustering process, which in turn reduces the time needed to process a dataset.
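The abstract does not state the centroid update rule itself, so the following is only a minimal sketch of the classical k-means baseline that the paper builds on, not the authors' method. It shows where the iteration count that the paper aims to reduce comes from: each pass performs an assignment step and a centroid update step, and the loop stops once assignments no longer change. The function name `kmeans`, the naive random seeding, and the toy dataset are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Classical (Lloyd-style) k-means on a list of equal-length tuples.

    Illustrative baseline only; returns (centroids, labels, iterations).
    """
    rng = random.Random(seed)
    # Assumption: naive seeding that picks k data points at random.
    centroids = rng.sample(points, k)
    labels = [-1] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Converged when no point changed cluster in this pass.
        if new_labels == labels:
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels, max_iter


# Toy dataset (hypothetical): two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels, iterations = kmeans(points, k=2)
print(labels, iterations)
```

The returned `iterations` value is the quantity of interest here: any modification of the update step that reaches the same stable assignment in fewer passes would lower it, which is the kind of saving the abstract describes.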
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.