Characterizing Synchronous Writes in Stable Memory Devices
Distributed algorithms that operate in the fail-recovery model rely on state kept in stable memory to guarantee that operations remain irreversible even in the presence of failures. The performance of these algorithms therefore depends heavily on the performance of stable memory. Current storage technologies have a well-defined performance profile: data is accessed in blocks of hundreds or thousands of bytes, random access to these blocks is expensive, and sequential access is somewhat better. File system implementations hide some of the performance limitations of the underlying storage devices using buffers and caches. However, fail-recovery distributed algorithms bypass some of these techniques and perform synchronous writes so that they can tolerate a failure during the write itself. Assuming the distributed system designer is able to buffer the algorithm’s writes, we ask how buffer size and latency complement each other. In this paper we start to answer this question by characterizing the performance (throughput and latency) of typical stable memory devices using a representative set of current file systems.
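To make the question concrete, the following is a minimal sketch of how the throughput and latency of synchronous writes could be measured for different buffer sizes. It is an illustration under stated assumptions, not the measurement procedure used in this paper: the function name measure_sync_writes, the file name, the buffer sizes, and the write count are all hypothetical, and the sketch assumes that os.fsync is an adequate stand-in for a synchronous write. On some platforms and devices fsync alone may not force data past a volatile device cache.

```python
import os
import tempfile
import time


def measure_sync_writes(path, buffer_size, count):
    """Write `count` buffers of `buffer_size` bytes, syncing after each one.

    Returns (throughput in bytes/second, mean latency in seconds per write).
    Hypothetical helper for illustration; not taken from the paper.
    """
    payload = b"\0" * buffer_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            written = 0
            while written < buffer_size:  # os.write may write fewer bytes
                written += os.write(fd, payload[written:])
            # Assumption: fsync approximates a synchronous write by blocking
            # until the OS reports the data as flushed to the device.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return (buffer_size * count) / elapsed, elapsed / count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "sync_write.dat")
        # Illustrative buffer sizes only; real experiments would sweep more.
        for size in (512, 4096, 65536):
            throughput, latency = measure_sync_writes(target, size, 50)
            print(f"{size:>6} B  {throughput / 1e6:8.2f} MB/s  "
                  f"{latency * 1e3:8.3f} ms/write")
```

Sweeping the buffer size in this way shows how the per-write latency and the aggregate throughput change together on a given device and file system. The numbers depend entirely on the hardware and file system under test, and a temporary directory backed by memory will not reflect the behavior of a real stable memory device.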