TY - GEN
T1 - Addressing Reproducibility Challenges in HPC with Continuous Integration
AU - Hayot-Sasson, Valérie
AU - Hudson, Nathaniel
AU - Bauer, André
AU - Gonthier, Maxime
AU - Foster, Ian
AU - Chard, Kyle
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/11/15
Y1 - 2025/11/15
AB - The high-performance computing (HPC) community has adopted incentive structures to motivate reproducible research, with major conferences awarding badges to papers that meet reproducibility requirements. Yet, many papers do not meet such requirements. The uniqueness of HPC infrastructure and software, coupled with strict access requirements, may limit opportunities for reproducibility. In the absence of resource access, we believe that regular documented testing, through continuous integration (CI), coupled with complete provenance information, can be used as a substitute. Here, we argue that better HPC-compliant CI solutions will improve reproducibility of applications. We present a survey of reproducibility initiatives and describe the barriers to reproducibility in HPC. To address existing limitations, we present a GitHub Action, CORRECT, that enables secure execution of tests on remote HPC resources. We evaluate CORRECT's usability across three different types of HPC applications, demonstrating the effectiveness of using CORRECT for automating and documenting reproducibility evaluations.
KW - Continuous Integration
KW - High Performance Computing
KW - Provenance
KW - Reproducibility
UR - https://www.scopus.com/pages/publications/105023989202
U2 - 10.1145/3712285.3759874
DO - 10.1145/3712285.3759874
M3 - Contribution to conference proceedings
AN - SCOPUS:105023989202
T3 - Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
SP - 437
EP - 457
BT - Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
PB - Association for Computing Machinery, Inc
T2 - 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Y2 - 16 November 2025 through 21 November 2025
ER -