Plant Classification Using Weighted k-NN Variants
Abstract
Automatic plant species identification is a difficult challenge in the field of botanical taxonomy. Many works have proposed automatic plant species recognition systems based on machine learning methods. One of the most popular algorithms for plant classification is k-Nearest Neighbors (k-NN), given its simplicity and robustness. In this work, we evaluate the performance of two improved weighted k-NN algorithms on the plant classification task. The experimental evaluation includes three real-world data sets obtained from different image processing and feature extraction processes. A statistical hypothesis test is also employed to perform an overall evaluation of the selected models.
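To make the idea of a weighted k-NN vote concrete, the following is a minimal sketch and not the implementation evaluated in this work. The abstract does not name the two variants, so the sketch assumes the classic distance-weighted rule of Dudani (1976), in which the nearest neighbor receives weight 1 and the k-th neighbor receives weight 0. The function name `weighted_knn_predict` and the toy points are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3):
    """Classify `query` with a distance-weighted k-NN vote (Dudani-style weights)."""
    # Sort training points by Euclidean distance to the query; keep the k closest.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    d_near, d_far = neighbors[0][0], neighbors[-1][0]

    votes = defaultdict(float)
    for d, label in neighbors:
        # Nearest neighbor gets weight 1 and the k-th neighbor gets weight 0.
        # If all k distances are equal, every neighbor gets weight 1.
        w = 1.0 if d_far == d_near else (d_far - d) / (d_far - d_near)
        votes[label] += w
    return max(votes, key=votes.get)


# Hypothetical toy data: one "A" point close to the query, two "B" points farther away.
X = [(0.0, 0.0), (2.0, 0.0), (2.1, 0.0)]
y = ["A", "B", "B"]
prediction = weighted_knn_predict(X, y, (0.5, 0.0), k=3)
```

With these toy points, an unweighted majority vote over the three neighbors would favor "B" (two votes to one), whereas the distance weights let the single close "A" neighbor dominate. This is the kind of behavior that weighted variants are meant to capture.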
