Meta-learning for neural network parameter optimization
Abstract
The optimization of Artificial Neural Networks (ANNs) is an important task for the successful use of these models in real-world applications. The solutions commonly adopted for this task are expensive, involving trial-and-error procedures or expert knowledge that is not always available. In this work, we investigated the use of meta-learning for the optimization of ANNs. Meta-learning is a research field that aims to automatically acquire knowledge relating features of learning problems to the performance of learning algorithms. Meta-learning techniques were originally proposed and evaluated for the algorithm selection problem and later for the optimization of parameters of Support Vector Machines. However, meta-learning can be adopted as a more general strategy for optimizing ANN parameters, which motivates new efforts in this research direction. In the current work, we performed a case study using meta-learning to choose the number of hidden nodes for MLP networks, an important parameter for achieving good network performance. We generated a base of meta-examples associated with 93 regression problems. Each meta-example was generated from a regression problem and stored: 16 features describing the problem (e.g., number of attributes and correlation among the problem’s attributes) and the best number of nodes for that problem, empirically chosen from a range of possible values. This set of meta-examples was given as input to a meta-learner, which was then able to predict the best number of nodes for new problems based on their features. The experiments performed in this case study revealed satisfactory results.
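The setup described above can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper’s actual method or data: the toy meta-examples and the two meta-features shown stand in for the real base of 93 problems and 16 features, and a simple k-nearest-neighbours meta-regressor stands in for whatever meta-learner the study used.

```python
# Hypothetical sketch of a meta-learning base for choosing hidden-layer size.
# Each meta-example pairs a meta-feature vector describing a regression
# problem with the best number of hidden nodes found empirically for it.

import math

# (meta-features, best number of hidden nodes) -- toy values for illustration;
# the two features mimic "number of attributes" and "mean attribute correlation".
meta_examples = [
    ([4.0, 0.20], 2),
    ([8.0, 0.55], 6),
    ([16.0, 0.10], 10),
    ([32.0, 0.40], 14),
]

def predict_nodes(features, k=3):
    """k-NN meta-learner: average the best node counts of the k
    meta-examples whose meta-features are closest to `features`."""
    dists = sorted(
        (math.dist(features, f), nodes) for f, nodes in meta_examples
    )
    neighbours = [nodes for _, nodes in dists[:k]]
    return round(sum(neighbours) / len(neighbours))

# Suggested hidden-layer size for a new problem's meta-features:
print(predict_nodes([10.0, 0.30]))  # averages the 3 nearest meta-examples -> 6
```

In practice the prediction would seed (or replace) the trial-and-error search over candidate node counts, so only problems near the boundary of the meta-knowledge need further tuning.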
