TY - GEN
T1 - Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data
AU - Sharafi, Masoumeh
AU - Ollivier, Emma
AU - Zeeshan, Muhammad Osama
AU - Belharbi, Soufiane
AU - Koerich, Alessandro Lameiras
AU - Pedersoli, Marco
AU - Bacon, Simon
AU - Granger, Eric
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Facial Expression Recognition (FER) from videos is a crucial task in various application areas, such as human-computer interaction and health diagnosis and monitoring (e.g., assessing pain and depression). Beyond the challenges of recognizing subtle emotional or health states, the effectiveness of deep FER models is often hindered by the considerable inter-subject variability in expressions. Source-free (unsupervised) domain adaptation (SFDA) methods may be employed to adapt a pre-trained source model using only unlabeled target domain data, thereby avoiding data privacy, storage, and transmission issues. Typically, SFDA methods adapt to a target domain dataset corresponding to an entire population and assume it includes data from all recognition classes. However, collecting such comprehensive target data can be difficult or even impossible for FER in healthcare applications. In many real-world scenarios, it may be feasible to collect a short neutral control video (which displays only neutral expressions) from target subjects before deployment. These videos can be used to adapt a model to better handle the variability of expressions among subjects. This paper introduces the Disentangled SFDA (DSFDA) method to address the challenge posed by adapting models with missing target expression data. DSFDA leverages data from a neutral target control video for end-to-end generation and adaptation of target data with missing non-neutral data. Our method learns to disentangle features related to expressions and identity while generating the missing non-neutral expression data for the target subject, thereby enhancing model accuracy. Additionally, our self-supervision strategy improves model adaptation by reconstructing target images that maintain the same identity and source expression. 
Experimental results on the challenging BioVid, UNBC-McMaster, and StressID datasets indicate that our DSFDA approach can outperform state-of-the-art adaptation methods. Code: https://github.com/MasoumehSharafi/DSFDA/
UR - https://www.scopus.com/pages/publications/105014513260
U2 - 10.1109/FG61629.2025.11099202
DO - 10.1109/FG61629.2025.11099202
M3 - Contribution to conference proceedings
AN - SCOPUS:105014513260
T3 - 2025 IEEE 19th International Conference on Automatic Face and Gesture Recognition, FG 2025
BT - 2025 IEEE 19th International Conference on Automatic Face and Gesture Recognition, FG 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2025
Y2 - 26 May 2025 through 30 May 2025
ER -