TY - GEN
T1 - An Empirical Study on Hugging Face Trends, Topics and Challenges on Stack Overflow
AU - Feki, Hatem
AU - Abdellatif, Manel
AU - Sayagh, Mohammed
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Hugging Face (HF) has emerged as a pivotal platform for the Machine Learning (ML) community, functioning as a central hub where developers collaborate, share models, and exchange datasets. By offering a vast repository of pre-trained models (PTMs), HF has democratized access to advanced ML resources, promoting model reuse and accelerating the development of ML-based systems. Despite its rapid adoption in recent years, there remains a limited understanding of the challenges developers encounter when working with HF in general and PTMs in particular. Understanding these challenges is crucial for guiding future research and developing support strategies for the software engineering community. Consequently, in this study we investigate HF-related posts on Stack Overflow (SO), one of the most popular discussion platforms for developers, to uncover the relevance of the topics, key challenges, and trends in HF-related discussions. This understanding will help future studies and the HF community improve the use of HF by prioritizing the challenges developers face according to their prevalence and complexity. To do so, we apply a topic modeling technique to categorize the topics discussed in HF-related SO posts. We then assess the popularity and difficulty of these topics to gain deeper insight into the specific challenges developers encounter. Our findings reveal an average annual growth rate of 31.3% in the number of HF-related questions on SO from 2019 to 2024. Furthermore, we identify eight major topics, with the usage and understanding of large language models (LLMs) being the most popular, while the distributed computing and resource management of PTMs stands out as the most challenging topic for developers.
KW - Challenges
KW - Hugging Face
KW - Stack Overflow
KW - Topic Modeling
UR - https://www.scopus.com/pages/publications/105016115307
U2 - 10.1109/COMPSAC65507.2025.00163
DO - 10.1109/COMPSAC65507.2025.00163
M3 - Contribution to conference proceedings
AN - SCOPUS:105016115307
T3 - Proceedings - 2025 IEEE 49th Annual Computers, Software, and Applications Conference, COMPSAC 2025
SP - 1297
EP - 1307
BT - Proceedings - 2025 IEEE 49th Annual Computers, Software, and Applications Conference, COMPSAC 2025
A2 - Shahriar, Hossain
A2 - Alam, Kazi Shafiul
A2 - Ohsaki, Hiroyuki
A2 - Cimato, Stelvio
A2 - Capretz, Miriam
A2 - Ahmed, Shamem
A2 - Ahamed, Sheikh Iqbal
A2 - Majumder, AKM Jahangir Alam
A2 - Haque, Munirul
A2 - Yoshihisa, Tomoki
A2 - Cuzzocrea, Alfredo
A2 - Takemoto, Michiharu
A2 - Sakib, Nazmus
A2 - Elsayed, Marwa
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 49th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2025
Y2 - 8 July 2025 through 11 July 2025
ER -