Iris-CV: Classifying Iris Flowers Is Not as Easy as You Thought
Abstract
The iris flower dataset is a ubiquitous benchmark in the machine learning literature. With its 150 instances, four continuous features, and three balanced classes, one of which is linearly separable from the others, iris is generally considered an easy problem. Hence, researchers usually rely on other datasets when they need more challenging benchmarks. A similar situation arises with computer vision datasets such as MNIST and ImageNet, which have been widely explored. State-of-the-art models essentially solve these problems, motivating the search for more challenging tasks. This paper therefore introduces a new computer vision toy dataset featuring iris flowers. The pictures were taken by users of a nature photography application, so they include noisy background information. Additionally, certain desirable properties are not guaranteed, such as a single, similarly sized object at the center of each picture, which makes the task more challenging. Our benchmark results show that the dataset can be challenging for traditional machine learning algorithms without any pre-processing steps, while state-of-the-art deep learning architectures achieve around 82% accuracy. Some effort will therefore be necessary to drive this accuracy closer to what has been accomplished for MNIST and ImageNet.
Keywords:
Computer vision, Dataset, Machine learning
Published
29/11/2021
How to Cite
ROCHA FILHO, Itamar de Paiva et al. Iris-CV: Classifying Iris Flowers Is Not as Easy as You Thought. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 10., 2021, Online. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. ISSN 2643-6264.