Pedram, A., Gerstlauer, A., & van de Geijn, R. (2012). On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators. In Proceedings of the 24th International Symposium on Computer Architecture and High Performance Computing (pp. 19-26). Porto Alegre: SBC.