Build Optimization: A Systematic Literature Review

Henri Aïdasso; Mohammed Sayagh; Francis Bordeleau

doi:10.1145/3757912

École de technologie supérieure

Research output: Contribution to journal › Review Article › peer-review

Abstract

In modern software organizations, Continuous Integration (CI) consists of an automated build process triggered by change submissions and involving compilation, testing, and packaging to enable the continuous deployment of new software versions to end-users. While CI offers various advantages regarding software quality and delivery speed, it introduces challenges addressed by a large body of research. To better understand this literature, so as to help practitioners find solutions for their problems and guide future research, we conduct a systematic review of 97 studies published between 2006 and 2024, summarizing their goals, methodologies, datasets, and metrics. These studies target two main challenges: (1) long build durations and (2) build failures. To address the first, researchers have proposed techniques such as predicting build outcomes and durations, selective build execution, and build acceleration through caching or performance smell repair. On the other hand, build failure root causes have been studied, leading to techniques for predicting build script maintenance needs and automating repairs. Recent work also focuses on flaky build failures caused by environmental issues. Most techniques use machine learning and rely on build metrics, which we classify into five categories. Finally, we identify eight publicly available datasets to support future research on build optimization.

Original language	English
Article number	12
Journal	ACM Computing Surveys
Volume	58
Issue number	1
DOIs	https://doi.org/10.1145/3757912
Publication status	Published - 2 Sept 2025

Keywords

CI
Continuous integration
build
optimization
systematic literature review

Cite this

@article{a41d1206b5e44010baa758567ef3002d,

title = "Build Optimization: A Systematic Literature Review",

abstract = "In modern software organizations, Continuous Integration (CI) consists of an automated build process triggered by change submissions and involving compilation, testing, and packaging to enable the continuous deployment of new software versions to end-users. While CI offers various advantages regarding software quality and delivery speed, it introduces challenges addressed by a large body of research. To better understand this literature, so as to help practitioners find solutions for their problems and guide future research, we conduct a systematic review of 97 studies published between 2006 and 2024, summarizing their goals, methodologies, datasets, and metrics. These studies target two main challenges: (1) long build durations and (2) build failures. To address the first, researchers have proposed techniques such as predicting build outcomes and durations, selective build execution, and build acceleration through caching or performance smell repair. On the other hand, build failure root causes have been studied, leading to techniques for predicting build script maintenance needs and automating repairs. Recent work also focuses on flaky build failures caused by environmental issues. Most techniques use machine learning and rely on build metrics, which we classify into five categories. Finally, we identify eight publicly available datasets to support future research on build optimization.",

keywords = "CI, Continuous integration, build, optimization, systematic literature review",

author = "Henri A{\"i}dasso and Mohammed Sayagh and Francis Bordeleau",

note = "Publisher Copyright: {\textcopyright} 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.",

year = "2025",

month = sep,

day = "2",

doi = "10.1145/3757912",

language = "English",

volume = "58",

journal = "ACM Computing Surveys",

issn = "0360-0300",

publisher = "Association for Computing Machinery (ACM)",

number = "1",

}

TY - JOUR

T1 - Build Optimization

T2 - A Systematic Literature Review

AU - Aïdasso, Henri

AU - Sayagh, Mohammed

AU - Bordeleau, Francis

PY - 2025/9/2

Y1 - 2025/9/2

AB - In modern software organizations, Continuous Integration (CI) consists of an automated build process triggered by change submissions and involving compilation, testing, and packaging to enable the continuous deployment of new software versions to end-users. While CI offers various advantages regarding software quality and delivery speed, it introduces challenges addressed by a large body of research. To better understand this literature, so as to help practitioners find solutions for their problems and guide future research, we conduct a systematic review of 97 studies published between 2006 and 2024, summarizing their goals, methodologies, datasets, and metrics. These studies target two main challenges: (1) long build durations and (2) build failures. To address the first, researchers have proposed techniques such as predicting build outcomes and durations, selective build execution, and build acceleration through caching or performance smell repair. On the other hand, build failure root causes have been studied, leading to techniques for predicting build script maintenance needs and automating repairs. Recent work also focuses on flaky build failures caused by environmental issues. Most techniques use machine learning and rely on build metrics, which we classify into five categories. Finally, we identify eight publicly available datasets to support future research on build optimization.

KW - CI

KW - Continuous integration

KW - build

KW - optimization

KW - systematic literature review

UR - https://www.scopus.com/pages/publications/105020376552

U2 - 10.1145/3757912

DO - 10.1145/3757912

M3 - Review Article

AN - SCOPUS:105020376552

SN - 0360-0300

VL - 58

JO - ACM Computing Surveys

JF - ACM Computing Surveys

IS - 1

M1 - 12

ER -
