WSI2ML – An Open-Source Whole Slide Image Annotation Software for Machine Learning Applications

  • Luan V. C. Martins (USP)
  • Adriana Passos Bueno (CIPE/A.C. Camargo Cancer Center)
  • Alexandre Defelicibus (CIPE/A.C. Camargo Cancer Center)
  • Rodrigo D. Drummond (CIPE/A.C. Camargo Cancer Center)
  • Renan Valieris (CIPE/A.C. Camargo Cancer Center)
  • Yu-Tao Zhu (China Branch of BRICS Institute of Future Networks)
  • Israel Tojal Da Silva (CIPE/A.C. Camargo Cancer Center)
  • Liang Zhao (USP)


Machine learning (ML) has emerged as a powerful tool for improving the clinical pathology routine; however, developing novel ML research requires a complex multidisciplinary team effort. Software that addresses the challenges of effectively annotating images and building and validating ML models is therefore desirable. In this work, we present WSI2ML, a web-based platform with a friendly interface that makes ML-based computational pathology research easier to perform. Compared to similar tools currently available, the proposed software provides a complete toolset for each stage of the ML workflow. We demonstrate the usefulness and functionality of WSI2ML by analyzing the performance results obtained in a tissue recognition task using a novel gastric cancer research dataset that is currently being developed with the tool. The software, documentation, installation instructions, and related annotation handling library are freely available at

Keywords: bioinformatics, image tagging, whole slide image


MARTINS, Luan V. C.; BUENO, Adriana Passos; DEFELICIBUS, Alexandre; DRUMMOND, Rodrigo D.; VALIERIS, Renan; ZHU, Yu-Tao; DA SILVA, Israel Tojal; ZHAO, Liang. WSI2ML – An Open-Source Whole Slide Image Annotation Software for Machine Learning Applications. In: SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 29., 2023, Ribeirão Preto/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 104–109.