Towards Robust Cluster-Based Hyperparameter Optimization
Abstract
Hyperparameter optimization is a fundamental step in machine learning pipelines because it can influence the predictive performance of the resulting models. However, the configuration selected by classical hyperparameter optimization, which minimizes an objective function, may not be robust to overfitting. This work proposes CHyper, a novel clustering-based approach to hyperparameter selection. CHyper derives a candidate cluster of close or similar hyperparameter configurations that yield low prediction errors on the validation dataset. Hyperparameters chosen from such a cluster are likely to produce models that generalize the inherent behavior of the data. CHyper was evaluated with two clustering techniques, k-means and spectral clustering, in the context of time series prediction of annual fertilizer consumption. As a complement to minimizing an objective function, cluster-based hyperparameter selection proved robust to the negative effects of overfitting and helped lower the generalization error.
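The idea of preferring a neighbourhood of good configurations over a single isolated minimum can be illustrated with a minimal sketch. It is not the CHyper implementation: the abstract does not specify the procedure, so the steps below are assumptions (min-max scaling, a small pure-Python k-means with deterministic farthest-point initialisation, choosing the cluster with the lowest mean validation error, and returning the member closest to its centroid). The function names `kmeans` and `select_by_cluster` and the toy data are hypothetical.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: _dist(p, centroids[j])) for p in points]
        # Recompute centroids; an empty cluster keeps its previous centroid.
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new.append(_mean(members) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return labels, centroids


def select_by_cluster(configs, val_errors, k=3):
    """Return the configuration nearest the centroid of the lowest-error cluster.

    Assumption: "low prediction error" is measured as the mean validation
    error of the cluster members.
    """
    dims = len(configs[0])
    lo = [min(c[i] for c in configs) for i in range(dims)]
    hi = [max(c[i] for c in configs) for i in range(dims)]
    # Min-max scale each hyperparameter so no dimension dominates the distance.
    scaled = [
        tuple((c[i] - lo[i]) / ((hi[i] - lo[i]) or 1.0) for i in range(dims))
        for c in configs
    ]
    labels, centroids = kmeans(scaled, k)

    def cluster_error(j):
        errs = [e for e, lab in zip(val_errors, labels) if lab == j]
        return sum(errs) / len(errs)

    best = min(set(labels), key=cluster_error)
    members = [i for i, lab in enumerate(labels) if lab == best]
    chosen = min(members, key=lambda i: _dist(scaled[i], centroids[best]))
    return configs[chosen]


# Toy data (hypothetical): one hyperparameter, nine candidate values.
configs = [(1,), (2,), (3,), (6,), (7,), (8,), (11,), (12,), (13,)]
val_errors = [0.30, 0.28, 0.33, 0.12, 0.10, 0.13, 0.60, 0.04, 0.55]

argmin_choice = configs[val_errors.index(min(val_errors))]
cluster_choice = select_by_cluster(configs, val_errors, k=3)
print("objective minimum:", argmin_choice)
print("cluster-based choice:", cluster_choice)
```

In this toy data, the lowest validation error sits at an isolated value whose neighbours perform poorly, whereas the cluster-based choice comes from a region where all nearby configurations perform well. A spectral clustering variant would swap out only the `kmeans` step.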