Extending Summation Precision for Network Reduction Operations

  • George Michelogiannakis, Lawrence Berkeley National Laboratory
  • Xiaoye S. Li, Lawrence Berkeley National Laboratory
  • David H. Bailey, Lawrence Berkeley National Laboratory
  • John Shalf, Lawrence Berkeley National Laboratory

Abstract


Double-precision summation is at the core of numerous important algorithms, such as Newton-Krylov methods and other operations involving inner products. Its effectiveness is limited by the accumulation of rounding errors, a problem that grows as modern HPC systems and data sets scale. To reduce the impact of precision loss, researchers have proposed increased- and arbitrary-precision libraries that provide reproducible or even bounded error accumulation for large sums, but these libraries do not guarantee an exact result and can increase computation time significantly. We propose big integer (BigInt) expansions of double-precision variables that enable arbitrarily large summations without error and provide exact and reproducible results. This is feasible with performance comparable to that of double-precision floating-point summation through the inclusion of simple and inexpensive logic in modern NICs to accelerate performance on large-scale systems.
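
The idea of expanding doubles into big integers can be illustrated in software. The following is a minimal sketch, not the paper's BigInt format or NIC logic, neither of which is specified in this abstract. It assumes only that every finite IEEE 754 double is an integer multiple of 2^-1074, so each value can be mapped to an exact integer, summed with integer arithmetic, and rounded once at the end. The function names are hypothetical and Python's built-in arbitrary-precision integers stand in for a fixed-width BigInt.

```python
import math

# Assumption for this sketch: 2**-1074 is the smallest positive subnormal
# double, so every finite double is an exact integer multiple of it.
SCALE_BITS = 1074


def double_to_bigint(x: float) -> int:
    """Map a finite double to the exact integer x * 2**SCALE_BITS (hypothetical helper)."""
    if not math.isfinite(x):
        raise ValueError("only finite values can be expanded")
    numerator, denominator = x.as_integer_ratio()  # exact; denominator is a power of two
    return numerator * ((1 << SCALE_BITS) // denominator)


def bigint_to_double(total: int) -> float:
    """Round the exact integer sum back to the nearest double, once."""
    # Python's int / int true division is correctly rounded.
    return total / (1 << SCALE_BITS)


def exact_sum(values) -> float:
    """Sum doubles with no intermediate rounding (hypothetical helper)."""
    total = 0
    for v in values:
        total += double_to_bigint(v)  # integer addition is exact and associative
    return bigint_to_double(total)


if __name__ == "__main__":
    data = [1e100, 1.0, -1e100]
    print(sum(data))                  # naive floating-point sum loses the 1.0
    print(exact_sum(data))            # exact integer accumulation keeps it
    print(exact_sum(reversed(data)))  # same result in any order
```

Because integer addition is associative, the partial sums can be combined in any order, for example across the nodes of a reduction tree, and the final result is the same. That order independence is what makes the result reproducible as well as exact.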
Keywords: Program processors, Hardware, Libraries, Atmospheric modeling, Adders, Sorting, Precision, floating point, double-precision, distributed summation, exact summation, reproducible summation
Published
2013-10-23
MICHELOGIANNAKIS, George; LI, Xiaoye S.; BAILEY, David H.; SHALF, John. Extending Summation Precision for Network Reduction Operations. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 25., 2013, Porto de Galinhas/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2013. p. 41-48.