TY - GEN
T1 - Robust video fingerprints using positions of salient regions
AU - Ouali, Chahid
AU - Dumouchel, Pierre
AU - Gupta, Vishwa
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/6/16
Y1 - 2017/6/16
N2 - This paper describes a video fingerprinting system that is highly robust to audio and video transformations. The proposed system adapts a robust audio fingerprint extraction approach to video fingerprinting. The audio fingerprinting system converts the spectrogram into binary images, and then encodes the positions of salient regions selected from each binary image. Visual features are extracted in a similar way from the video images. We propose two visual fingerprint generation methods where fingerprints encode the positions of salient regions of greyscale video images. Salient regions of the first method are selected based on the intensity values of the image, while the second method identifies the regions that represent the highest variations between two successive images. The similarity between two fingerprints is defined as the intersection between their elements. The search algorithm is speeded up by an efficient implementation on a Graphics Processing Unit (GPU). We evaluate the performance of the proposed video system on TRECVID 2009 and 2010 datasets, and we show that this system achieves promising results and outperforms other state-of-the-art video copy detection methods for queries that do not includes geometric transformations. In addition, we show the effectiveness of this system for a challenging audio+video copy detection task.
AB - This paper describes a video fingerprinting system that is highly robust to audio and video transformations. The proposed system adapts a robust audio fingerprint extraction approach to video fingerprinting. The audio fingerprinting system converts the spectrogram into binary images, and then encodes the positions of salient regions selected from each binary image. Visual features are extracted in a similar way from the video images. We propose two visual fingerprint generation methods where fingerprints encode the positions of salient regions of greyscale video images. Salient regions of the first method are selected based on the intensity values of the image, while the second method identifies the regions that represent the highest variations between two successive images. The similarity between two fingerprints is defined as the intersection between their elements. The search algorithm is speeded up by an efficient implementation on a Graphics Processing Unit (GPU). We evaluate the performance of the proposed video system on TRECVID 2009 and 2010 datasets, and we show that this system achieves promising results and outperforms other state-of-the-art video copy detection methods for queries that do not includes geometric transformations. In addition, we show the effectiveness of this system for a challenging audio+video copy detection task.
KW - Content-based copy detection
KW - feature extraction
KW - fingerprinting
KW - video fingerprint
UR - https://www.scopus.com/pages/publications/85023749206
U2 - 10.1109/ICASSP.2017.7952715
DO - 10.1109/ICASSP.2017.7952715
M3 - Contribution to conference proceedings
AN - SCOPUS:85023749206
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 3041
EP - 3045
BT - 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Y2 - 5 March 2017 through 9 March 2017
ER -