TY - GEN
T1 - YouTubeCat
T2 - 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
AU - Wang, Zheshen
AU - Zhao, Ming
AU - Song, Yang
AU - Kumar, Sanjiv
AU - Li, Baoxin
PY - 2010
Y1 - 2010
N2 - Automatic categorization of videos in a Web-scale unconstrained collection such as YouTube is a challenging task. A key issue is how to build an effective training set in the presence of missing, sparse, or noisy labels. We propose to achieve this by first manually creating a small labeled set and then extending it using additional sources such as related videos, searched videos, and text-based webpages. The data from such disparate sources has different properties and labeling quality, and thus fusing them in a coherent fashion is another practical challenge. We propose a fusion framework in which each data source is first combined with the manually labeled set independently. Then, using the hierarchical taxonomy of the categories, a Conditional Random Field (CRF) based fusion strategy is designed. Based on the final fused classifier, category labels are predicted for new videos. Extensive experiments on about 80K videos from the 29 most frequent categories in YouTube show the effectiveness of the proposed method for categorizing large-scale wild Web videos.
UR - http://www.scopus.com/inward/record.url?scp=77955988704&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77955988704&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2010.5540125
DO - 10.1109/CVPR.2010.5540125
M3 - Conference contribution
AN - SCOPUS:77955988704
SN - 9781424469840
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 879
EP - 886
BT - 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
Y2 - 13 June 2010 through 18 June 2010
ER -