A Realistic Protocol for Evaluation of Weakly Supervised Object Localization

Research output: Chapter in book, report, or conference proceedings › Conference contribution › Peer-reviewed

Abstract

Weakly Supervised Object Localization (WSOL) allows training deep learning models for classification and localization (LOC) using only global class-level labels. The absence of bounding box (bbox) supervision during training raises challenges in the literature for hyper-parameter tuning, model selection, and evaluation. WSOL methods rely on a validation set with bbox annotations for model selection, and a test set with bbox annotations for threshold estimation for producing bboxes from localization maps. This approach, however, is not aligned with the WSOL setting as these annotations are typically unavailable in real-world scenarios. Our initial empirical analysis shows a significant decline in LOC performance when model selection and threshold estimation rely solely on class labels and the image itself, respectively, compared to using manual bbox annotations. This highlights the importance of incorporating bbox labels for optimal model performance. In this paper (note 1), a new WSOL evaluation protocol is proposed that provides LOC information without the need for manual bbox annotations. In particular, we generated noisy pseudo-boxes from a pretrained off-the-shelf region proposal method such as Selective Search, CLIP, and RPN for model selection. These bboxes are also employed to estimate the threshold from LOC maps, circumventing the need for test-set bbox annotations. Our experiments (note 2) with several WSOL methods on challenging natural and medical image datasets show that using the proposed pseudo-bboxes for validation facilitates the model selection and threshold estimation, with LOC performance comparable to models selected using GT bboxes on the validation set and threshold estimation on the test set. It also outperforms models selected using class-level labels, and then dynamically thresholded with only LOC maps.

Note 1: Our scope. This work focuses on WSOL [2], [28]-[30], as opposed to other related tasks such as weakly supervised detection, segmentation, or instance segmentation, which are often mixed in earlier works [11], [18], [40].

Note 2: Our code and generated pseudo-bounding boxes can be accessed at github.com/shakeebmurtaza/wsol_model_selection.
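The threshold-estimation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`box_from_map`, `iou`, `select_threshold`) are hypothetical, localization maps are assumed to be 2D arrays scaled to [0, 1], each image is assumed to have a single pseudo-box in inclusive pixel coordinates, and the predicted box is taken as the tight box around all above-threshold pixels, which is a simplification.

```python
import numpy as np

def box_from_map(loc_map, thr):
    """Tight box (x0, y0, x1, y1) around all pixels >= thr, or None if empty."""
    ys, xs = np.where(loc_map >= thr)
    if ys.size == 0:
        return None
    return (xs.min(), ys.min(), xs.max(), ys.max())

def iou(a, b):
    """IoU of two boxes given as inclusive pixel coordinates (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0 + 1) * max(0, iy1 - iy0 + 1)
    area = lambda r: (r[2] - r[0] + 1) * (r[3] - r[1] + 1)
    return inter / (area(a) + area(b) - inter)

def select_threshold(loc_maps, pseudo_boxes, thresholds=np.linspace(0.05, 0.95, 19)):
    """Return (threshold, mean IoU) maximizing mean IoU against the pseudo-boxes."""
    best_thr, best_score = None, -1.0
    for thr in thresholds:
        scores = []
        for loc_map, pseudo_box in zip(loc_maps, pseudo_boxes):
            box = box_from_map(loc_map, thr)
            # An empty prediction counts as a miss (IoU of 0).
            scores.append(0.0 if box is None else iou(box, pseudo_box))
        score = float(np.mean(scores))
        if score > best_score:
            best_thr, best_score = float(thr), score
    return best_thr, best_score

# Toy example: a strong 4x4 core surrounded by a weaker ring of activation.
toy_map = np.zeros((8, 8))
toy_map[1:7, 1:7] = 0.32
toy_map[2:6, 2:6] = 1.0
thr, score = select_threshold([toy_map], [(2, 2, 5, 5)])
```

The same mean-IoU score against pseudo-boxes could in principle also serve as a model-selection criterion across checkpoints, in place of a score computed from manual bbox annotations.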

Original language: English
Title of host publication: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5367-5376
Number of pages: 10
ISBN (Electronic): 9798331510831
Publication status: Published - 2025
Event: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Tucson, United States
Duration: 28 Feb 2025 to 4 Mar 2025

Publication series

Name: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

Conference

Conference: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Country/Territory: United States
City: Tucson
Period: 28/02/25 to 4/03/25
