Data Complexity and Instance Hardness Measures in Transferability Evaluation

Derick W. B. Rangel; Alfredo A. A. Exposito De Queiroz; Ana C. Lorena

Derick W. B. Rangel ITA
Alfredo A. A. Exposito De Queiroz ITA
Ana C. Lorena ITA

Abstract

Advances in Machine Learning (ML) have driven the development of diverse pre-trained models. Through transfer learning, knowledge from these models can be transferred to other domains to solve challenging problems for which training data are scarce. However, many pre-trained models are now available, with different architectures and possibly different source domains, which makes it difficult to choose a model for a given transfer task. The hypothesis investigated in this paper is that the data complexity of the embedded target dataset, that is, the target data as represented in the feature space of the pre-trained model, can inform the choice of model for a new target application. Specifically, a representation that separates the classes well should be a better option for knowledge transfer than one in which the classes overlap more. We experimentally demonstrate that specific data complexity and instance hardness measures can effectively evaluate the transferability of different pre-trained models to new target datasets, enabling candidate models to be ranked.
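To make the hypothesis concrete, the following minimal sketch scores two candidate representations of the same labeled target set and ranks them by class overlap. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not name the measures evaluated, so k-Disagreeing Neighbors (kDN), a commonly used instance hardness measure, stands in here. The function names `kdn` and `rank_models`, the model names `model_a` and `model_b`, and the synthetic Gaussian embeddings are all hypothetical; in practice the embeddings would come from passing the target data through each pre-trained model.

```python
import numpy as np


def kdn(X, y, k=5):
    """k-Disagreeing Neighbors: for each instance, the fraction of its k
    nearest neighbors (Euclidean) whose label differs from its own."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # an instance is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]
    return np.mean(y[nn] != y[:, None], axis=1)


def rank_models(embeddings, y, k=5):
    """Rank representations by mean kDN, lowest (least class overlap) first.

    `embeddings` maps a model name to the target set embedded by that model.
    """
    scores = {name: float(kdn(X, y, k).mean()) for name, X in embeddings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Synthetic stand-ins for the embeddings two pre-trained models might produce
# (assumption: real embeddings would replace these arrays).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
separated = rng.normal(size=(200, 8)) + 6.0 * y[:, None]    # classes far apart
overlapping = rng.normal(size=(200, 8)) + 0.5 * y[:, None]  # classes overlap

ranking = rank_models({"model_a": separated, "model_b": overlapping}, y)
for name, score in ranking:
    print(f"{name}: mean kDN = {score:.3f}")
```

A lower mean kDN indicates that instances are mostly surrounded by neighbors of their own class, so under the hypothesis above the first entry in `ranking` would be the preferred candidate for transfer. Any other data complexity or instance hardness measure could be substituted for `kdn` without changing the ranking logic.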