A Practical TLA+ Library for Designing and Verifying Distributed Systems

  • Diogo Canut Freitas Peixoto UFSC
  • Odorico Machado Mendizabal UFSC

Resumo


Designing distributed systems is complex, presenting challenges like concurrency and fault tolerance that extend beyond understanding business requirements and underlying technologies. Traditional testing often miss unpredictable but critical failures. Alternatively, model checking allows rigorous verification of system properties early in development. This paper aims to combine the rigor of model checking with an easy-to-use approach for the design of distributed systems, where message exchange is ever-present. To address this, we propose a modular TLA+ library for modeling communication primitives over point-to-point and broadcast abstractions. It enables designers to formally describe and verify solutions by providing these primitives as building blocks for communication subsystems. The library includes fault injection for analyzing system behavior under unreliable conditions, such as message loss, duplication, and out-of-order delivery. We formally verified classic atomic broadcast properties for the library’s primitives and demonstrated its practical utility by specifying and verifying the Deferred Update Replication Protocol.
Publicado
27/10/2025
PEIXOTO, Diogo Canut Freitas; MENDIZABAL, Odorico Machado. A Practical TLA+ Library for Designing and Verifying Distributed Systems. In: LATIN-AMERICAN SYMPOSIUM ON DEPENDABLE COMPUTING (LADC), 14. , 2025, Valparaíso/Chile. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025 . p. 183-200.