Property-based Testing for Machine Learning Models

  • Vinicius H. S. Durelli UFSJ
  • Ricardo Monteiro UFSJ
  • Rafael S. Durelli UFLA
  • Andre T. Endo UFSCar
  • Fabiano C. Ferrari UFSCar
  • Simone R. S. Souza USP


There has been growing interest in machine learning due to its potential to address a myriad of problems that would otherwise be difficult to solve. Consequently, the adoption of machine learning-based programs has become mainstream. Owing to this widespread adoption, it is imperative to develop automated approaches to assess the quality of machine learning-based solutions. Although significant research has been devoted to creating automated test input generation methods for machine learning programs, some promising approaches to test data generation have received limited attention. This paper introduces a property-driven approach to test data generation that trains an interpretable model, specifically a decision tree, to predict the behavior of the model under test. The tree-like structure of the resulting interpretable model provides insights into the behavior of the model under test. These insights are then transformed into executable properties, which drive the generation of test data. A primary advantage of property-based testing is its capacity to generate a vast number of inputs from a single property, thereby offering a more rigorous evaluation of machine learning models. The results of our experiment suggest that our property-driven approach has the potential to generate test data that examine models more thoroughly than more widely used methods for evaluating the performance and generalizability of machine learning models.
Keywords: Software testing, property-based testing, machine learning
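
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model under test, the feature names, the intervals, and the helper names (`TreePathProperty`, `counterexamples`) are all hypothetical, and the two tree paths are written by hand rather than learned. In practice the surrogate tree would be fitted to the model's predictions, and a property-based testing library would likely replace the hand-rolled random generator used here. Each root-to-leaf path is read as a property: any input inside the region implied by the path's splits should receive the class predicted at the leaf.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TreePathProperty:
    """One root-to-leaf path of a surrogate decision tree, read as a property."""
    bounds: dict   # feature name -> (low, high) interval implied by the path's splits
    expected: int  # class predicted at the leaf


def model_under_test(x: dict) -> int:
    # Hypothetical black-box classifier standing in for the real model under test.
    approved = (x["income"] > 50_000 and x["age"] >= 25) or x["income"] > 120_000
    return 1 if approved else 0


def counterexamples(prop, model, trials=1000, seed=0):
    """Sample inputs inside the property's region; return those the model mislabels."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        x = {name: rng.uniform(lo, hi) for name, (lo, hi) in prop.bounds.items()}
        if model(x) != prop.expected:
            failures.append(x)
    return failures


# Two hand-written paths, assumed to come from a shallow surrogate tree that
# split only on income (an illustrative assumption, not a learned result).
low_income = TreePathProperty({"age": (18, 90), "income": (0, 50_000)}, expected=0)
high_income = TreePathProperty({"age": (18, 90), "income": (50_001, 200_000)}, expected=1)

if __name__ == "__main__":
    for name, prop in [("low_income", low_income), ("high_income", high_income)]:
        found = counterexamples(prop, model_under_test)
        print(f"{name}: {len(found)} counterexample(s) in 1000 generated inputs")
```

In this toy setting the first property holds for every generated input, while the second exposes inputs (applicants under 25 with mid-range income) where the model departs from what the shallow tree suggests. This is the kind of behavior a single property can probe with many generated inputs.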


