HubISC: a new hubness-based algorithm for image data stream classification

  • Mateus C. de Lima, Federal University of Uberlândia
  • Elaine R. Faria, Federal University of Uberlândia
  • Maria Camila N. Barioni, Federal University of Uberlândia

Abstract


Image data stream classification presents several challenges. One challenge inherent in image data is its high dimensionality, which gives rise to the curse of dimensionality. Studies in other contexts have effectively exploited hubness, a property inherent in high-dimensional data. This article introduces a new algorithm for image data stream classification that incorporates hubness. Experimental results show that the algorithm offers a good trade-off between efficacy and the percentage of labeled instances required, compared with commonly used algorithms for image data stream classification.
Keywords: Image data stream classification, Hubness, High dimensionality
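
For readers unfamiliar with the term, hubness is usually quantified through the k-occurrence of a point: the number of other points that list it among their k nearest neighbors. In high-dimensional data a few points (hubs) tend to collect very large counts. The sketch below computes that count on toy data. It is not the HubISC algorithm; the function name `k_occurrence`, the Euclidean distance, and the 2-D points are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def k_occurrence(points, k):
    """Return, for each point, how many other points have it among their k nearest neighbors."""
    counts = Counter()
    for i, p in enumerate(points):
        # Euclidean distance from p to every other point, paired with that point's index.
        # Ties are broken by the lower index because tuples sort element by element.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            counts[j] += 1
    return [counts[j] for j in range(len(points))]


# Hypothetical toy data: four points surrounding a central one.
points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
scores = k_occurrence(points, k=1)
print(scores)  # [4, 1, 0, 0, 0]
```

The central point is the nearest neighbor of all four outer points, so it behaves like a hub, while most outer points never appear in any neighbor list. The counts always sum to the number of points times k.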

Published
2022-09-19
DE LIMA, Mateus C.; FARIA, Elaine R.; BARIONI, Maria Camila N. HubISC: a new hubness-based algorithm for image data stream classification. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 37., 2022, Búzios. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 138-150. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2022.224318.