An Automatic Data Clustering Approach Based on Memetic Group Search Optimization
Abstract
Cluster analysis, one of the most primitive tasks in pattern organization, is a hard problem in exploratory data analysis. Many clustering algorithms are easily trapped in local minima because they lack good global search operators. This work presents a memetic Swarm Intelligence (SI) algorithm, called MGSO, based on Group Search Optimization and K-Means, which tries to find the best number of clusters and the best distribution of the data among those clusters simultaneously. When tested on nine real-world problems, MGSO proved able to find good global solutions in comparison with other SI algorithms and Evolutionary Algorithms from the literature.
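The abstract does not say how a candidate solution is encoded or how K-Means is applied, so the following is only a minimal sketch under stated assumptions, not the MGSO implementation. It assumes an encoding that is common in automatic clustering: a flat vector holding one activation value per candidate cluster followed by the candidate centroids, with a cluster counted as active when its activation exceeds a threshold. It then applies a single K-Means iteration to the active centroids as the memetic local search step. The names `decode`, `assign`, `kmeans_step`, `sse`, `k_max`, and `threshold` are hypothetical.

```python
import numpy as np

def decode(solution, k_max, dim, threshold=0.5):
    """Split a flat vector into activation flags and a (k_max, dim) centroid array.

    Assumed encoding: the first k_max entries are activation values, the
    remaining k_max * dim entries are centroid coordinates.
    """
    activations = solution[:k_max]
    centroids = solution[k_max:].reshape(k_max, dim)
    active = activations > threshold
    # Keep at least two clusters active so the partition is never trivial.
    if active.sum() < 2:
        active = np.zeros(k_max, dtype=bool)
        active[np.argsort(activations)[-2:]] = True
    return active, centroids

def assign(X, centroids, active):
    """Label each point with the index of its nearest active centroid."""
    idx = np.flatnonzero(active)
    dists = np.linalg.norm(X[:, None, :] - centroids[idx][None, :, :], axis=2)
    return idx[np.argmin(dists, axis=1)]

def kmeans_step(X, centroids, active):
    """One K-Means iteration: move each active centroid to the mean of its points."""
    labels = assign(X, centroids, active)
    refined = centroids.copy()
    for k in np.flatnonzero(active):
        members = X[labels == k]
        if len(members) > 0:  # an empty cluster keeps its current centroid
            refined[k] = members.mean(axis=0)
    return refined

def sse(X, centroids, active):
    """Within-cluster sum of squared errors for the active centroids."""
    labels = assign(X, centroids, active)
    return float(((X - centroids[labels]) ** 2).sum())
```

In a memetic scheme of this kind, the global search would propose new `solution` vectors, which changes both the number of active clusters and the centroid positions, and `kmeans_step` would then refine the centroids locally. A single K-Means iteration never increases the within-cluster sum of squared errors, so the refinement cannot worsen a candidate under that measure.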