Clustering for Data-driven Unraveling Artificial Neural Networks

Abstract

This work investigates how to define Neural Network (NN) architectures through a data-driven approach that uses clustering to create sub-labels, both to facilitate the learning process and to discover the number of neurons needed in each layer. We also increase the depth of the model so that samples are represented better the deeper they flow through the network. We hypothesize that the clustering process identifies sub-regions of the feature space in which samples belonging to the same cluster are strongly similar. We validated this hypothesis on seven benchmark datasets using 10-fold cross-validation repeated three times. The proposed model improved performance, and never decreased it, with statistical significance (p-value $< 0.05$) when compared with a Multi-Layer Perceptron with a single hidden layer and approximately the same number of parameters as the architectures found by our approach.

Keywords: Neural networks, Data-driven architecture, Sub-labels, Clustering, Representation learning
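As an illustration of the clustering idea described above, the following Python sketch derives sub-labels per class with KMeans and selects the number of clusters via the Calinski-Harabasz index, then uses the total number of sub-labels to suggest a hidden-layer width. This is a minimal sketch under assumptions: the specific clustering algorithm, the k-selection range, and the wine dataset are illustrative choices, not the authors' exact procedure.

```python
# Sketch (assumed procedure, not the authors' exact method): create sub-labels
# per class with KMeans, choose k by the Calinski-Harabasz index, and size a
# hidden layer from the number of discovered sub-labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score
from sklearn.datasets import load_wine


def sub_labels_for_class(X_class, k_range=range(2, 6)):
    """Cluster one class's samples and return the best (k, labels) by CH score."""
    best_k, best_score = 1, -np.inf
    best_labels = np.zeros(len(X_class), dtype=int)
    for k in k_range:
        if k >= len(X_class):
            break
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_class)
        score = calinski_harabasz_score(X_class, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels


X, y = load_wine(return_X_y=True)  # stand-in benchmark dataset
total_sub_labels = 0
for c in np.unique(y):
    k, _ = sub_labels_for_class(X[y == c])
    total_sub_labels += k

# One possible data-driven sizing rule: one hidden neuron per sub-label.
print("suggested hidden-layer width:", total_sub_labels)
```

In this reading, the sub-labels refine the original classes into cluster-specific targets, and the cluster count replaces a manually chosen layer width; how the sub-labels are consumed during training would follow the procedure described in the full paper.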

Published
20/10/2020
FARIAS, Felipe; LUDERMIR, Teresa; BASTOS-FILHO, Carmelo. Clustering for Data-driven Unraveling Artificial Neural Networks. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 17., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 567-578. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2020.12160.