On P&ID Symbol Recognition using Transformers
Abstract
Machine-interpretable graphs of P&IDs are essential to advance automation and digital twins; however, most diagrams remain in vector formats like DWG, limiting automated use. Conversion is further hindered by symbol variability, scarce annotations, and the lack of openly available datasets. This work addresses these challenges by evaluating transformer-based detection of P&ID symbols with DETR, comparing backbones pretrained on ImageNet, COCO, and domain-specific synthetic data. The study analyzes their generalization from synthetic to real diagrams. Evaluations on real-world datasets (OPEN100, IPD) show that domain-specific pretraining consistently improves performance by about 1 mAP point, underscoring its value for robust industrial diagram interpretation.
References
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). End-to-end object detection with transformers. In Computer Vision – ECCV 2020.
Gao, Q., Yang, H., Theisen, M. F., and Schweidtmann, A. M. (2025). Accelerating process synthesis with reinforcement learning. Computers & Chemical Engineering.
Ghadekar, P., Joshi, S., Swain, D., Acharya, B., Pradhan, M. R., and Patro, P. (2021). Automatic digitization of engineering diagrams using intelligent algorithms.
Goldstein, D. P., Balhorn, L. S., Alimin, A. A., and Schweidtmann, A. M. (2025). pyD-EXPI. In The 35th European Symposium on Computer Aided Process Engineering.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Kieffer, S., Dwyer, T., Marriott, K., and Wybrow, M. (2016). HOLA: Human-like orthogonal network layout.
Kim, B. C., Kim, H., Moon, Y., Lee, G., and Mun, D. (2022). End-to-end digitization of image format piping and instrumentation diagrams at an industrially applicable level.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. (2014). Microsoft COCO. In Computer Vision – ECCV 2014.
Paliwal, S., Jain, A., Sharma, M., and Vig, L. (2021). Digitize-PID: Automatic digitization of piping and instrumentation diagrams. In Trends and Applications in Knowledge Discovery and Data Mining. Springer International Publishing.
Rupprecht, S., Hounat, Y., Kumar, M., Lastrucci, G., and Schweidtmann, A. M. (2025). Text2model.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV).
Stürmer, J. M., Graumann, M., and Koch, T. (2023). Demonstrating automated generation of simulation models from engineering diagrams. In 2023 International Conference on Machine Learning and Applications (ICMLA).
Stürmer, J. M., Graumann, M., and Koch, T. (2024). Transforming engineering diagrams: A novel approach for P&ID digitization using transformers.
Toghraei, M. (2019). Piping and Instrumentation Diagram Development. Wiley.
Published
12/11/2025
How to Cite
CUNHA, Leonardo K. T. da; LEITE, Eduardo F.; MORAIS, Lucas Eduardo C.; HARRISON, Alessandra; VALIATI, João F. On P&ID Symbol Recognition using Transformers. In: ESCOLA REGIONAL DE APRENDIZADO DE MÁQUINA E INTELIGÊNCIA ARTIFICIAL DA REGIÃO SUL (ERAMIA-RS), 1., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 444-447. DOI: https://doi.org/10.5753/eramiars.2025.16628.