Classification of video shots based on human affect

Kok Meng Ong*, Wataru Kameyama

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)


This study addresses the challenge of analyzing affective video content. The affective content of a video is defined as the intensity and type of emotion that arises in a viewer while watching it. Viewers' emotions were monitored by recording their pupil sizes and gaze points while they watched the video. From these measurements, four features were extracted: cumulative pupil response (CPR), frequency component (FC), modified bivariate contour ellipse area (mBVCEA), and Gini coefficient. Principal component analysis showed that two key features, CPR and FC, account for most of the variance in the data. Using these key features, the affective content was identified and could be used to classify the video shots into their respective scenes. An average classification accuracy of 71.89% was achieved for three basic emotions, with a maximum individual classification accuracy of 89.06%. This work serves as a first step toward automating personalized video content analysis on the basis of human emotion.
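
As an illustration of the principal component analysis step mentioned in the abstract, the following is a minimal sketch of how one might measure how much variance each component captures in a four-feature matrix. It is not the authors' implementation: the function name `explained_variance_ratio`, the column standardization, and the randomly generated placeholder matrix are all assumptions made for this example, and the random numbers do not represent the study's data or results.

```python
import numpy as np


def explained_variance_ratio(features):
    """Return (variance share per principal component, principal axes).

    Hypothetical helper: standardizing the columns before PCA is an
    assumption of this sketch, not a detail given in the abstract.
    """
    X = np.asarray(features, dtype=float)
    # Standardize each column so features on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data; the rows of vt are the principal axes,
    # and singular values come back sorted from largest to smallest.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    variances = s ** 2 / (Z.shape[0] - 1)
    return variances / variances.sum(), vt


# Placeholder matrix: one row per video shot, one column per feature
# (CPR, FC, mBVCEA, Gini). The values are random and purely illustrative.
rng = np.random.default_rng(0)
placeholder = rng.normal(size=(200, 4))

ratios, axes = explained_variance_ratio(placeholder)
# Absolute loadings of each original feature on the first component.
first_component_loadings = np.abs(axes[0])
```

With real measurements in place of the placeholder matrix, `ratios` would indicate how much of the variance the leading components capture, and the loadings would suggest which of the four features dominate them.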

Original language: English
Pages (from-to): 847-856
Number of pages: 10
Journal: Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers
Issue number: 6
Publication status: Published - 2009

Keywords


  • Emotion
  • Gaze
  • Pupil
  • Video content classification

ASJC Scopus subject areas

  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

