A Realistic Protocol for Evaluation of Weakly Supervised Object Localization

Shakeeb Murtaza; Soufiane Belharbi; Marco Pedersoli; Eric Granger

doi:10.1109/WACV61041.2025.00524


École de technologie supérieure

Research output: Chapter in book, report, conference proceedings › Contribution to conference proceedings › Peer-reviewed

Abstract

Weakly Supervised Object Localization (WSOL) allows training deep learning models for classification and localization (LOC) using only global class-level labels. The absence of bounding box (bbox) supervision during training raises challenges in the literature for hyper-parameter tuning, model selection, and evaluation. WSOL methods rely on a validation set with bbox annotations for model selection, and a test set with bbox annotations for threshold estimation for producing bboxes from localization maps. This approach, however, is not aligned with the WSOL setting as these annotations are typically unavailable in real-world scenarios. Our initial empirical analysis shows a significant decline in LOC performance when model selection and threshold estimation rely solely on class labels and the image itself, respectively, compared to using manual bbox annotations. This highlights the importance of incorporating bbox labels for optimal model performance. In this paper¹, a new WSOL evaluation protocol is proposed that provides LOC information without the need for manual bbox annotations. In particular, we generated noisy pseudo-boxes from a pretrained off-the-shelf region proposal method such as Selective Search, CLIP, and RPN for model selection. These bboxes are also employed to estimate the threshold from LOC maps, circumventing the need for test-set bbox annotations. Our experiments² with several WSOL methods on challenging natural and medical image datasets show that using the proposed pseudo-bboxes for validation facilitates the model selection and threshold estimation, with LOC performance comparable to models selected using GT bboxes on the validation set and threshold estimation on the test set. It also outperforms models selected using class-level labels, and then dynamically thresholded with only LOC maps.

¹ Our scope: this work focuses on WSOL [2], [28]-[30], as opposed to other related tasks such as weakly supervised detection, segmentation, or instance segmentation, which are often mixed in earlier works [11], [18], [40].

² Our code and generated pseudo-bounding boxes can be accessed at github.com/shakeebmurtaza/wsol_model_selection.
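The threshold-estimation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is at the repository named in the abstract). It assumes localization maps are 2-D grids of activations scaled to [0, 1], that one pseudo-box per image is already available, and that a prediction counts as correct when its IoU with the pseudo-box reaches 0.5. The function names `box_from_map`, `iou`, and `select_threshold` are hypothetical, and the tightest-box rule is a simplification chosen for brevity.

```python
def box_from_map(loc_map, threshold):
    """Tightest box (x0, y0, x1, y1) around all cells with activation >= threshold.

    Assumption: a single box around every activated cell; no connected-component step.
    """
    coords = [(x, y)
              for y, row in enumerate(loc_map)
              for x, value in enumerate(row)
              if value >= threshold]
    if not coords:
        return None  # nothing survives this threshold
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a, b):
    """Intersection over union of two boxes with inclusive pixel coordinates."""
    def area(r):
        return (r[2] - r[0] + 1) * (r[3] - r[1] + 1)

    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = inter_w * inter_h
    return inter / (area(a) + area(b) - inter)


def select_threshold(loc_maps, pseudo_boxes, thresholds, iou_min=0.5):
    """Return (threshold, accuracy) maximizing localization accuracy on pseudo-boxes.

    Accuracy is the fraction of images whose predicted box reaches iou_min
    against the (noisy) pseudo-box; no ground-truth boxes are used.
    """
    best_t, best_acc = None, -1.0
    for t in thresholds:
        hits = 0
        for loc_map, pseudo in zip(loc_maps, pseudo_boxes):
            box = box_from_map(loc_map, t)
            if box is not None and iou(box, pseudo) >= iou_min:
                hits += 1
        acc = hits / len(loc_maps)
        if acc > best_acc:  # ties keep the first (lowest-index) threshold
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy example: a 4x4 activated block plus one weak spurious activation.
toy_map = [[0.0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(3, 7):
        toy_map[y][x] = 1.0
toy_map[0][0] = 0.3
best_t, best_acc = select_threshold([toy_map], [(3, 2, 6, 5)], [0.2, 0.5])
```

In the toy example the lower threshold keeps the spurious activation and yields an oversized box, so the higher threshold is selected. Under the same assumptions, the accuracy returned by `select_threshold` could also serve as the score for ranking checkpoints during model selection.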

Original language	English
Title of host publication	Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	5367-5376
Number of pages	10
ISBN (Electronic)	9798331510831
DOIs	https://doi.org/10.1109/WACV61041.2025.00524
Status	Published - 2025
Event	2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Tucson, United States. Duration: 28 Feb 2025 → 4 Mar 2025

Publication series

Name	Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

Conference

Conference	2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Country/Territory	United States
City	Tucson
Period	28/02/25 → 4/03/25

Cite this

Murtaza, S., Belharbi, S., Pedersoli, M., & Granger, E. (2025). A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. In Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025 (p. 5367-5376). (Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WACV61041.2025.00524

Murtaza, Shakeeb ; Belharbi, Soufiane ; Pedersoli, Marco et al. / A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025. Institute of Electrical and Electronics Engineers Inc., 2025. p. 5367-5376 (Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025).

@inproceedings{2f1604a855e246f3a11599d7a9152cdc,

title = "A Realistic Protocol for Evaluation of Weakly Supervised Object Localization",

abstract = "Weakly Supervised Object Localization (WSOL) allows training deep learning models for classification and localization (LOC) using only global class-level labels. The absence of bounding box (bbox) supervision during training raises challenges in the literature for hyper-parameter tuning, model selection, and evaluation. WSOL methods rely on a validation set with bbox annotations for model selection, and a test set with bbox annotations for threshold estimation for producing bboxes from localization maps. This approach, however, is not aligned with the WSOL setting as these annotations are typically unavailable in real-world scenarios. Our initial empirical analysis shows a significant decline in LOC performance when model selection and threshold estimation rely solely on class labels and the image itself, respectively, compared to using manual bbox annotations. This highlights the importance of incorporating bbox labels for optimal model performance. In this paper (scope: this work focuses on WSOL [2], [28]-[30], as opposed to other related tasks such as weakly supervised detection, segmentation, or instance segmentation, which are often mixed in earlier works [11], [18], [40]), a new WSOL evaluation protocol is proposed that provides LOC information without the need for manual bbox annotations. In particular, we generated noisy pseudo-boxes from a pretrained off-the-shelf region proposal method such as Selective Search, CLIP, and RPN for model selection. These bboxes are also employed to estimate the threshold from LOC maps, circumventing the need for test-set bbox annotations. Our experiments (code and generated pseudo-bounding boxes can be accessed at github.com/shakeebmurtaza/wsol\_model\_selection) with several WSOL methods on challenging natural and medical image datasets show that using the proposed pseudo-bboxes for validation facilitates the model selection and threshold estimation, with LOC performance comparable to models selected using GT bboxes on the validation set and threshold estimation on the test set. It also outperforms models selected using class-level labels, and then dynamically thresholded with only LOC maps.",

author = "Shakeeb Murtaza and Soufiane Belharbi and Marco Pedersoli and Eric Granger",

note = "Publisher Copyright: {\textcopyright} 2025 IEEE.; 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 ; Conference date: 28-02-2025 Through 04-03-2025",

year = "2025",

doi = "10.1109/WACV61041.2025.00524",

language = "English",

series = "Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "5367--5376",

booktitle = "Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025",

}

Murtaza, S, Belharbi, S, Pedersoli, M & Granger, E 2025, A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. In Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025. Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025, Institute of Electrical and Electronics Engineers Inc., p. 5367-5376, 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025, Tucson, United States, 28/02/25. https://doi.org/10.1109/WACV61041.2025.00524

A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. / Murtaza, Shakeeb; Belharbi, Soufiane; Pedersoli, Marco et al.
Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025. Institute of Electrical and Electronics Engineers Inc., 2025. p. 5367-5376 (Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025).

TY - GEN

T1 - A Realistic Protocol for Evaluation of Weakly Supervised Object Localization

AU - Murtaza, Shakeeb

AU - Belharbi, Soufiane

AU - Pedersoli, Marco

AU - Granger, Eric

PY - 2025

Y1 - 2025

AB - Weakly Supervised Object Localization (WSOL) allows training deep learning models for classification and localization (LOC) using only global class-level labels. The absence of bounding box (bbox) supervision during training raises challenges in the literature for hyper-parameter tuning, model selection, and evaluation. WSOL methods rely on a validation set with bbox annotations for model selection, and a test set with bbox annotations for threshold estimation for producing bboxes from localization maps. This approach, however, is not aligned with the WSOL setting as these annotations are typically unavailable in real-world scenarios. Our initial empirical analysis shows a significant decline in LOC performance when model selection and threshold estimation rely solely on class labels and the image itself, respectively, compared to using manual bbox annotations. This highlights the importance of incorporating bbox labels for optimal model performance. In this paper (scope: this work focuses on WSOL [2], [28]-[30], as opposed to other related tasks such as weakly supervised detection, segmentation, or instance segmentation, which are often mixed in earlier works [11], [18], [40]), a new WSOL evaluation protocol is proposed that provides LOC information without the need for manual bbox annotations. In particular, we generated noisy pseudo-boxes from a pretrained off-the-shelf region proposal method such as Selective Search, CLIP, and RPN for model selection. These bboxes are also employed to estimate the threshold from LOC maps, circumventing the need for test-set bbox annotations. Our experiments (code and generated pseudo-bounding boxes can be accessed at github.com/shakeebmurtaza/wsol_model_selection) with several WSOL methods on challenging natural and medical image datasets show that using the proposed pseudo-bboxes for validation facilitates the model selection and threshold estimation, with LOC performance comparable to models selected using GT bboxes on the validation set and threshold estimation on the test set. It also outperforms models selected using class-level labels, and then dynamically thresholded with only LOC maps.

UR - https://www.scopus.com/pages/publications/105003631231

U2 - 10.1109/WACV61041.2025.00524

DO - 10.1109/WACV61041.2025.00524

M3 - Contribution to conference proceedings

AN - SCOPUS:105003631231

T3 - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

SP - 5367

EP - 5376

BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

Y2 - 28 February 2025 through 4 March 2025

ER -

Murtaza S, Belharbi S, Pedersoli M, Granger E. A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. In Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025. Institute of Electrical and Electronics Engineers Inc. 2025. p. 5367-5376. (Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025). doi: 10.1109/WACV61041.2025.00524