In this paper we propose a framework to learn and predict saliency in videos using human eye movements. In our approach, we record the eye gaze of users as they watch videos, and then learn the low-level features of regions that are of visual interest. The learnt classifier is then used to predict salient regions in videos belonging to the same application. So far, predicting saliency in images and videos has been approached from two main perspectives, namely visual attention modeling and spatio-temporal interest point detection. Such approaches are purely vision-based and detect regions having a predefined set of characteristics, such as complex motion or high contrast, for all kinds of videos. However, what is 'interesting' varies from one application to another. By learning features of regions that capture the attention of viewers while watching a video, we aim to distinguish those that are actually salient in the given context from the rest. This is especially useful in an environment where users are interested only in a certain kind of activity, as in the case of surveillance or biomedical applications. In this paper, the proposed framework is implemented using a neural network that learns the low-level features defined in the visual attention modeling literature (Itti's saliency model) based on the interesting regions identified by the eye-gaze movements of viewers. In our experiments with news videos from popular channels, the results show a significant improvement in the identification of relevant salient regions in such videos, when compared with existing approaches.
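
The pipeline described in the abstract (label regions by recorded gaze, learn from low-level features, then predict saliency for new regions) can be illustrated with a minimal sketch. It is not the authors' implementation: the patch grid, the two toy features (stand-ins for Itti-style channels such as contrast and motion), the network size, and the names `label_patches` and `TinyMLP` are all assumptions made for illustration.

```python
import math
import random


def label_patches(fixations, frame_w, frame_h, patch):
    """Mark a patch 1 if any gaze fixation (x, y) lands inside it, else 0."""
    cols, rows = frame_w // patch, frame_h // patch
    labels = [[0] * cols for _ in range(rows)]
    for x, y in fixations:
        c, r = int(x) // patch, int(y) // patch
        if 0 <= r < rows and 0 <= c < cols:
            labels[r][c] = 1
    return labels


class TinyMLP:
    """Hypothetical stand-in network: one tanh hidden layer, sigmoid output."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        z = max(-60.0, min(60.0, z))  # clamp to keep exp() well-behaved
        return h, 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        """Return the estimated probability that a patch is salient."""
        return self._forward(x)[1]

    def train(self, xs, ys, epochs=2000, lr=0.1):
        """Plain per-sample gradient descent on the cross-entropy loss."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h, p = self._forward(x)
                d_out = p - y  # loss gradient w.r.t. the pre-sigmoid output
                for j, hj in enumerate(h):
                    d_hid = d_out * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * d_out * hj
                    self.b1[j] -= lr * d_hid
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d_hid * xi
                self.b2 -= lr * d_out


# Toy data (assumed): per-patch [contrast, motion] features, with labels
# that would come from label_patches() applied to recorded gaze.
features = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.9],
            [0.1, 0.2], [0.2, 0.1], [0.3, 0.2]]
salient = [1, 1, 1, 0, 0, 0]

model = TinyMLP(n_in=2)
model.train(features, salient)
```

After training, `model.predict(feature_vector)` gives a score between 0 and 1 for an unseen patch. In the setting the abstract describes, the labels would come from gaze recorded for one application (for example news video), so the same features could be weighted differently for another.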

Original language: English (US)
Title of host publication: 2009 Workshop on Motion and Video Computing, WMVC '09
State: Published - 2009
Event: 2009 Workshop on Motion and Video Computing, WMVC '09 - Snowbird, UT, United States
Duration: Dec 8 2009 to Dec 9 2009

Publication series

Name: 2009 Workshop on Motion and Video Computing, WMVC '09

Other: 2009 Workshop on Motion and Video Computing, WMVC '09
Country/Territory: United States
City: Snowbird, UT

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Software


