Current Developments in Emotion and Gesture Recognition

The world of computing, of television, and of smartphones has changed enormously over the last 40 years, yet the way we interact with these devices has barely changed. We still use technology developed 30 to 40 years ago to interact with our computers, and we still use the same kind of remote control to operate our televisions. The keyboard layout we type on was created for the typewriter well over a century ago, arranged so that the type bars of letters struck in succession would not jam against each other, and today we still use that same layout to interact with our computers. This is an industry that is ripe for change, and gesture recognition is one of the areas where a more natural interaction with our devices is emerging. Over the last 4-5 years, 3D imaging has appeared in many new devices: TVs, smartphones, and PCs are all arriving with 3D gesture recognition. This has led to market predictions that by the year 2070 as many as 1.6 billion devices could be enabled with some form of gesture recognition. In the present-day context of interactive, intelligent computing, efficient human-computer interaction is of utmost importance, and emotion and gesture recognition can be seen as an approach in this direction.

Emotion Recognition

An emotion is any conscious experience characterized by intense mental activity and a certain degree of pleasure or displeasure. Scientific discourse has drifted to other meanings, and there is no consensus on a definition. Emotion is often intertwined with mood, temperament, personality, disposition, and motivation. The study of emotions in human-computer interaction has grown in recent years. With successful classification of emotions, we could get instant feedback from users, gain a better understanding of human behavior while people use information technologies, and thus make systems and user interfaces more empathic and intelligent. Humans express their feelings through several channels: facial expressions, voice, body gestures, movements, and so on. Emotion recognition is the process of identifying human emotion, most typically from facial expressions.

Gesture Recognition

The word gesture can refer to any non-verbal communication that is intended to convey a specific message. In the world of gesture recognition, a gesture is defined as any physical movement, large or small, that can be interpreted by a motion sensor. It may include anything from the pointing of a finger to a roundhouse kick, or from a nod of the head to a pinch or wave of the hand. Gestures can be broad and sweeping or small and contained. Gesture recognition is the ability of a computer to understand gestures and execute commands based on those gestures. Gesture recognition systems find applications in several interesting areas:

  • Developing aids for the hearing impaired;
  • Enabling very young children to interact with computers;
  • Designing techniques for forensic identification;
  • Recognizing sign language;
  • Medically monitoring patients’ emotional states and stress levels;
  • Lie detection;
  • Navigating and manipulating in virtual environments;
  • Distance learning and tele-teaching assistance;
  • Monitoring automobile drivers’ alertness, photojournalism, biometrics, etc.

Existing Technologies for Gesture Recognition

Image-based Gesture Recognition

Image-based systems use a combination of IR, radio ranging, laser, or visible-light cameras to decipher hand gestures made by the user. Microsoft’s Kinect is a good example of IR light pattern analysis for depth perception. Kinect is a motion-sensing device that takes gesture and posture as input, analyzes the data, and produces an interactive response.

Radar Based Technologies

This approach uses a miniaturized radar sensor to recognize touchless gesture interactions. Radar has very high positional accuracy compared to image sensors, so it can detect sub-millimeter motion at high speed. The reflected radar pattern is analyzed at a high rate and the gesture is recognized. Google’s Project Soli is an example of this technology; it has been described as the world’s first radar-based technology of its kind and as a step toward making augmented reality interaction practical.

Muscle, Nerve and Brain Impulse Reading

This field has developed vastly in the past few years, with various technologies using nerve signals, muscle contractions, and brain impulses to measure and decipher arm movements.

Glove-based Physical Sensors

Many varieties of wearable glove-based sensors have been developed to acquire gesture data, usually with sensor suites that detect the actual flex of the fingers for an accurate reading.

Generally speaking, emotion recognition based on facial expression has been investigated in depth. To date, some efforts have been made to build systems capable of recognizing emotions from two modalities (e.g., the combination of facial expressions and speech data, or of facial expressions and gesture). This study focuses on bimodal emotion recognition based on the combination of facial expression and gesture. Considering multiple modalities is helpful when some modality feature values are missing or unreliable. This may occur, for example, when feature detection is made difficult by noisy environmental conditions, when signals are corrupted during transmission, or, in an extreme case such as a person with a neurological disorder or a disability, when the system is unable to record one of the modalities. In real-life naturalistic scenarios, an emotion recognition system must be robust enough to deal with these situations.
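One way to picture this robustness requirement is a fusion step that simply skips any modality that failed to produce an output. The sketch below is only an illustration under stated assumptions: the function name `fuse_available`, the averaging rule, and the example probabilities are all hypothetical and are not taken from any of the surveyed systems.

```python
def fuse_available(outputs):
    """Average class posteriors over the modalities that produced an output.

    `outputs` maps a modality name to a dict of class probabilities,
    or to None when that modality could not be recorded (assumed format).
    """
    available = [p for p in outputs.values() if p is not None]
    if not available:
        return None  # nothing was recorded, so no decision is possible
    labels = set().union(*available)
    fused = {
        label: sum(p.get(label, 0.0) for p in available) / len(available)
        for label in labels
    }
    return max(fused, key=fused.get)


# Hypothetical example: the face channel is missing, gesture still decides.
face_missing = {"face": None, "gesture": {"joy": 0.7, "anger": 0.3}}
both_present = {
    "face": {"joy": 0.2, "anger": 0.8},
    "gesture": {"joy": 0.7, "anger": 0.3},
}
print(fuse_available(face_missing))
print(fuse_available(both_present))
```

With only one modality available the system degrades to a unimodal decision instead of failing outright, which is the behavior the paragraph above asks for.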

This report surveys different aspects of emotion and gesture recognition, and the different tools that are employed to achieve better performance and accuracy.

Literature Survey

Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis (Loic Kessous, Ginevra Castellano, George Caridakis).

This paper presents a study on multimodal automatic emotion recognition during a speech-based interaction. Facial expression, gesture, and acoustic analysis of speech were used to extract features relevant to emotion. Many databases already contain facial and vocal expressions, but gesture is often not included. The main aim of the paper was to build a framework for the interaction of multiple modalities. The corpus used in the study was collected specifically for multimodal emotion recognition, so the results of this work cannot be compared directly with those in the literature. To develop the corpus, dedicated subjects and a recording setup were organized. The setup included two DV cameras (25 fps), one with high resolution to record the participant’s face and the other to record the participant’s body, and a direct-to-disk computer-based system for voice recording. Participants were asked to act eight emotional states: Anger, Despair, Interest, Pleasure, Sadness, Irritation, Joy, and Pride.

First, feature extraction algorithms were applied to the database to extract usable features for unimodal emotion recognition. The Viola-Jones algorithm was used for face detection, and the EyesWeb platform was used for tracking the participants’ hands and body. A system based on a Bayesian classifier was used for the automatic classification of unimodal, bimodal, and multimodal data. The overall performance of unimodal emotion recognition was 48.3% from facial expressions, 67.1% from body gesture, and 57.1% from speech. Misclassification in a single modality is not a major concern, because an error in one modality tends to be attenuated by the other two. After automatic classification of each modality, the modalities were combined using a multimodal approach. Fusion at the feature level (before running the classifier) and at the results level (combining the outputs of the classifiers for each modality) were compared. A common approach based on a Bayesian classifier provided by the Weka software, a free toolbox containing a collection of machine learning algorithms for data mining tasks, was used to compare the results of the unimodal, bimodal, and multimodal systems.

The overall performance of the multimodal emotion recognition system based on feature-level fusion was 78.3%, which clearly stands out as the most successful result when compared with the performance of the unimodal classifiers. The final feature set retains 17 features, of which 5 are from the gesture modality, 9 are from the speech modality, and 2 are from the face modality. The number of features that persists from each modality does not indicate the contribution of that modality. The approach based on decision-level fusion had an overall performance of 74.6%, which is lower than that of the classifier based on feature-level fusion. Bimodal classification was also performed using feature-level fusion. As expected, classifiers based on facial expression and speech data, and on body gesture and speech, outperformed classifiers trained with a single modality. This leads to the conclusion that bimodal and multimodal emotion recognition systems improve recognition over unimodal systems. The automatic system behaves similarly to humans, who use multiple modalities to recognize emotion and process signals.
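The difference between the two fusion strategies can be illustrated with a minimal sketch. It assumes a small hand-written Gaussian naive Bayes classifier as a stand-in for the paper's Bayesian classifier (the paper used Weka, and this is not its implementation), it averages posteriors for the decision-level case (one possible combination rule among several), and every feature value, label, and function name below is invented purely for illustration.

```python
import math
from collections import defaultdict


def fit_gnb(X, y):
    """Fit Gaussian naive Bayes: per-class prior, feature means, variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(c) / n for c in cols]
        # A small floor keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / n + 1e-3
                     for c, m in zip(cols, means)]
        model[label] = (n / len(X), means, variances)
    return model


def posteriors(model, x):
    """Return normalized class posteriors P(label | x)."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = score
    top = max(log_scores.values())
    unnorm = {k: math.exp(s - top) for k, s in log_scores.items()}
    total = sum(unnorm.values())
    return {k: u / total for k, u in unnorm.items()}


def feature_level_fusion(train, labels, query):
    """Concatenate per-modality features, then train a single classifier."""
    X = [sum(rows, []) for rows in zip(*train.values())]
    x = sum((query[name] for name in train), [])
    post = posteriors(fit_gnb(X, labels), x)
    return max(post, key=post.get)


def decision_level_fusion(train, labels, query):
    """Train one classifier per modality, then average their posteriors."""
    fused = defaultdict(float)
    for name, X in train.items():
        for label, p in posteriors(fit_gnb(X, labels), query[name]).items():
            fused[label] += p / len(train)
    return max(fused, key=fused.get)


# Toy data, invented for illustration: one face feature, two gesture features.
train = {
    "face": [[0.1], [0.2], [0.9], [0.8]],
    "gesture": [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 0.9]],
}
labels = ["joy", "joy", "anger", "anger"]
query = {"face": [0.15], "gesture": [0.95, 0.15]}

print(feature_level_fusion(train, labels, query))
print(decision_level_fusion(train, labels, query))
```

In feature-level fusion the classifier sees all modalities at once and can weigh them jointly, whereas decision-level fusion only combines the separate verdicts. On this toy data both agree; the paper's reported gap (78.3% versus 74.6%) comes from its own corpus, not from anything this sketch can reproduce.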

Gesture Recognition: A Survey (S. Mitra and T. Acharya)

This paper provides a survey of different aspects of gesture recognition using tools such as principal component analysis (PCA), FACS, contour models, Gabor filtering, hidden Markov models (HMMs), finite-state machines (FSMs), particle filtering, and artificial neural networks (ANNs). Gesture is one of the most natural forms of interaction; babies gesture even before they learn to talk. A gesture can be static (a specific pose) or dynamic (body and hand movement). Gestures can be broadly categorized as follows:

Hand and arm gestures: these include gesticulation, pantomimes, emblems, and sign languages. The tools discussed for hand gesture recognition are HMMs, the Condensation algorithm, FSMs, and the connectionist approach.

Face and head gestures: these include nodding the head, direction of eye gaze, raising the eyebrows, flaring the nostrils, etc. Face gesture recognition is used in applications such as criminal identification, credit card verification, surveillance, HDTV, and medicine. The tools discussed for face and head gestures are HMMs, PCA, FACS, contour models, facial feature extraction for gesture recognition, Gabor filtering, etc.

Body gestures: these involve full-body motion, as in analyzing the movements of a dancer to generate matching music and graphics, or recognizing human gaits for medical rehabilitation and athletic training. The tools discussed for body gesture recognition are HMMs, particle filtering and the Condensation algorithm, the FSM approach, soft computing, and the connectionist approach.

Gesture recognition plays a vital role in building efficient human-computer interaction. It has wide-ranging applications, from sign language recognition and photojournalism through medical technology to biometrics and virtual reality. Among the different approaches, combining FSMs and HMMs offers a potential way to increase reliability and accuracy. Independently modelling each state of an FSM as an HMM is an interesting approach that can be useful in recognizing complex gestures.
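To make the FSM idea concrete, here is a minimal sketch of a finite-state recognizer for a single dynamic gesture. It is an assumption-laden illustration, not a method from the survey: the gesture (a left-to-right swipe), the three state names, the normalized x-coordinates, and both thresholds are hypothetical. In the combined approach mentioned above, each state's hand-written test would be replaced by an HMM scoring that segment of the motion.

```python
def recognize_swipe_right(xs, min_step=0.02, min_travel=0.3):
    """Return True if the hand x-positions trace a left-to-right swipe.

    States: IDLE (no motion yet), MOVING (steady rightward motion),
    ACCEPT (enough distance covered). All thresholds are illustrative.
    """
    if not xs:
        return False
    state = "IDLE"
    start = None
    prev = xs[0]
    for x in xs[1:]:
        step = x - prev
        if state == "IDLE":
            if step >= min_step:
                # Rightward motion begins: remember where it started.
                state, start = "MOVING", prev
        elif state == "MOVING":
            if step < min_step:
                state = "IDLE"  # motion stalled or reversed: reset
            elif x - start >= min_travel:
                state = "ACCEPT"
                break
        prev = x
    return state == "ACCEPT"


# Hypothetical normalized hand positions (0 = left edge, 1 = right edge).
swipe = [0.10, 0.15, 0.25, 0.35, 0.45, 0.55]
jitter = [0.10, 0.12, 0.11, 0.13, 0.12]
print(recognize_swipe_right(swipe))
print(recognize_swipe_right(jitter))
```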

See Me, Teach Me: Facial Expression and Gesture Recognition for Intelligent Tutoring Systems (A. Sarrafzadeh, S. Alexander, F. Dadgostar, C. Fan and A. Bigdeli).

This paper discusses how intelligent tutoring systems can be extended to incorporate the learner’s affective state in their student model. The objective is the development of Easy with Eve, an Affective Tutoring System (ATS) for mathematics, which detects non-vocal behavior dynamically and uses this information to personalize interactions with students. The system detects student emotions, adapts to students, and displays emotion via a lifelike agent called Eve. Eve is guided by a case-based system that uses data generated by an observational study.

An Intelligent Tutoring System (ITS) is capable of adapting to the knowledge, learning abilities, and needs of each individual student in order to provide individualized instruction. Intelligent tutoring systems offer many advantages over the traditional classroom scenario: they are always available, non-judgmental, and provide tailored feedback. An important factor in the success of human one-to-one tutoring is the tutor’s ability to identify and respond to affective cues. This is where an Affective Tutoring System comes into the picture: it detects non-vocal behavior in real time and uses it to individualize interaction with the student. The basis of an ATS is to analyze facial expressions and gestures to detect the affective state.

For face detection, an approach based on an ANN (Artificial Neural Network) and an approach based on an SVM (Support Vector Machine) were considered; the ANN classifier, with an accuracy of 96.34%, outperformed the SVM classifier. Once the student’s affective state is discovered, the animated agent (Eve) is able to give an appropriate emotional response to the student through its own facial expressions and gestures. The observational study of human tutors involved videoing several tutors, with the aim of learning how human tutors adapt to the affective state of students. The data from this observational study plays a significant role in adapting the tutoring process to student affect. Eve’s responses are driven by a case-based reasoning program that searches the data for scenarios matching the current sequence of interactions and outputs a weighted set of recommended tutoring actions and facial expressions.
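The case-based step can be sketched as a similarity-weighted vote over stored cases. This is only a toy illustration: the case base, the event names, the position-wise similarity measure, and the function names are all hypothetical, and the actual Easy with Eve program and its observational data are not reproduced here.

```python
from collections import defaultdict

# Hypothetical case base: (interaction scenario, tutoring action, facial expression).
CASES = [
    (("wrong_answer", "frustrated"), "give_hint", "concerned"),
    (("wrong_answer", "frustrated"), "encourage", "smile"),
    (("wrong_answer", "neutral"), "give_hint", "neutral"),
    (("right_answer", "happy"), "praise", "smile"),
]


def similarity(a, b):
    """Fraction of positions at which two scenarios agree (assumed measure)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def recommend(scenario, cases=CASES):
    """Return {(action, expression): weight}, with weights summing to 1."""
    weights = defaultdict(float)
    for case_scenario, action, expression in cases:
        weights[(action, expression)] += similarity(scenario, case_scenario)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {k: w / total for k, w in weights.items() if w > 0}


recommendations = recommend(("wrong_answer", "frustrated"))
for (action, expression), weight in sorted(recommendations.items()):
    print(action, expression, round(weight, 2))
```

Cases that match the current scenario more closely contribute more weight, so the output is a ranked set of candidate responses, not a single fixed answer.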

Gesture-Based Affective and Cognitive States Recognition Using Kinect for Effective Feedback during e-Learning (K. Vermun, M. Senapaty).

In this paper, Kinect technology is applied to gestures in order to recognize affective and cognitive states and provide user feedback in a pedagogical context. Kinect is a motion-sensing device capable of automatically calibrating its sensor based on gameplay and the player’s physical environment. Microsoft Kinect offers capabilities such as feature extraction, gesture recognition, and depth perception, which is why it is preferred over regular webcams. Its depth sensor can be used to determine the relative position of certain body parts, especially forward and backward movement. The gesture database proposed in this study is one of a kind in focusing exclusively on sitting postures in a learning context; the gestures are documented for individuals in a sitting posture.
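As a rough sketch of how depth readings can expose forward and backward movement of a seated learner, the function below compares the distance of the head and of the torso from the sensor. It makes several assumptions that are not in the paper: the sensor faces the learner, depths are given in meters, the 8 cm threshold is arbitrary, and no real Kinect SDK call is used (the two depth values are assumed to come from whatever skeleton tracking is available).

```python
def classify_lean(head_depth_m, torso_depth_m, threshold_m=0.08):
    """Classify sitting posture from two depth readings (hypothetical rule).

    A head noticeably closer to the sensor than the torso is read as
    leaning forward; noticeably farther away is read as leaning back.
    """
    offset = torso_depth_m - head_depth_m  # positive: head is closer to sensor
    if offset > threshold_m:
        return "leaning_forward"
    if offset < -threshold_m:
        return "leaning_back"
    return "upright"


# Invented readings for illustration.
print(classify_lean(1.10, 1.25))
print(classify_lean(1.30, 1.25))
print(classify_lean(1.40, 1.25))
```

A regular webcam has no direct equivalent of this measurement, which is one reason a depth sensor is attractive for posture analysis.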

A new tool to support diagnosis of neurological disorders by means of facial expressions (V. Bevilacqua, D. D’Ambruoso, G. Mandolino and M. Suma).

An important element of interpersonal communication in humans is the ability to produce and identify facial expressions of emotion. Processing of facial expressions is supported by distributed neural systems and, as lesion studies show, preferentially by the right hemisphere. According to human and primate studies, cortical areas of the inferior occipital gyrus, fusiform gyrus, and inferior temporal gyrus have been reported as essential for face processing. Orbitofrontal areas, in particular on the right, are involved in the explicit identification of facial emotions. While cognitive dysfunction has been thoroughly evaluated across neuropsychiatric disorders, impairments in emotion recognition have received increasing attention within the past 15 years. In this paper, the authors present an innovative and still experimental tool to support the diagnosis of neurological disorders with the assistance of facial-expression monitoring. The basic idea is to track the face after detecting it with STASM, a C++ software library for finding features in faces. The goal of the tool is to provide an instrument for early diagnosis in the initial stage of the disease, exploiting the flat affect of the patients. The tool aims to recognize the expressions of a face and then classify them into positive and negative attitudes.

Patients with Alzheimer’s disease (AD) may have a deficit in processing some or all facial expressions of emotion. Emotional recognition deficits also occur in bulbar Amyotrophic Lateral Sclerosis (ALS), particularly with emotional facial expressions, and can arise independently of depressive and dementia symptoms or of co-morbidity with depression and dementia. These findings expand the scope of cognitive dysfunction detected in ALS and bolster the view of ALS as a multisystem disorder involving cognitive and motor deficits. Thus, there is a direct relationship between neurological distress and the expressivity of some emotions. For example, depression presents with persistent anger and sadness.

Gesture and emotion: Can basic gestural form features discriminate emotions? (M. Kipp and J. C. Martin).

The question that arises here is how gestures are related to emotion. This paper presents a study of how basic gestural form features (handedness, hand shape, palm orientation, and motion direction) are related to components of emotion. Emotion is a four-part process consisting of physiological arousal, cognitive interpretation, subjective feeling, and motor expression. Facial expression has been extensively considered as the main vehicle for the motor expression component; this paper looks at gesture instead. The target is to build an emotion-to-gesture-component morphological lexicon to complement the meaning-form lexicon in a gesture synthesis approach. The study analyzes the relation between emotion and gestural features on a corpus of filmed theater, using a coding scheme for emotions based on the PAD dimensions (Pleasure (P), Arousal (A), and Dominance (D)) and a coding scheme for gestural features (handedness, hand shape, palm orientation, motion direction). The corpus was produced by actors in filmed theater stagings, which are particularly well suited for such analyses. To represent and code emotions, two main approaches are used: categorical approaches and dimensional approaches. A categorical approach consists of selecting a finite set of discrete labels. Dimensional approaches represent an affective state by its location along one or more continuous axes. The encoding of temporal events like gestures always involves two principal steps: segmentation and categorization.

For gestures, the features of handedness, hand shape, palm orientation, and motion direction were used; on the emotion side, the single emotion dimensions of pleasure (P), arousal (A), and dominance (D) were used. Chi-square (χ²) is computed to find correlated dimension pairs (e.g., pleasure and handedness). To find the magnitude and direction (positive or negative) of correlations between concrete values, the deviations between expected occurrences and actual occurrences were analyzed. To compare occurrences of emotion episodes with gestural events in the data, co-occurrences between encoded emotion segments and encoded gestures were counted, looking at all pairs where the emotion interval temporally contained the gesture event.
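The counting and testing steps described above can be sketched in a few lines. This is a simplified illustration, not the authors' analysis code: the annotation format (start, end, value), the segment values, and the function names are assumptions, and the tiny data set is invented. The chi-square statistic is the standard sum of (observed − expected)² / expected over the contingency table, with expected counts derived from the row and column totals.

```python
from collections import Counter


def cooccurrences(emotions, gestures):
    """Pair each gesture with every emotion segment that temporally contains it."""
    pairs = []
    for g_start, g_end, g_value in gestures:
        for e_start, e_end, e_value in emotions:
            if e_start <= g_start and g_end <= e_end:
                pairs.append((e_value, g_value))
    return pairs


def chi_square(pairs):
    """Return the chi-square statistic and per-cell deviations (observed - expected)."""
    observed = Counter(pairs)
    row_totals = Counter(e for e, _ in pairs)
    col_totals = Counter(g for _, g in pairs)
    n = len(pairs)
    chi2 = 0.0
    deviations = {}
    for e in row_totals:
        for g in col_totals:
            expected = row_totals[e] * col_totals[g] / n
            diff = observed[(e, g)] - expected
            chi2 += diff ** 2 / expected
            deviations[(e, g)] = diff  # sign gives the direction of association
    return chi2, deviations


# Invented annotations: (start time, end time, coded value).
emotions = [(0, 10, "high_pleasure"), (10, 20, "low_pleasure")]
gestures = [
    (1, 2, "both_hands"), (3, 4, "both_hands"), (5, 6, "right_hand"),
    (11, 12, "right_hand"), (13, 14, "right_hand"), (15, 16, "both_hands"),
    (19, 21, "right_hand"),  # runs past the emotion segment, so it is not counted
]

pairs = cooccurrences(emotions, gestures)
chi2, deviations = chi_square(pairs)
print(len(pairs), round(chi2, 3))
print(deviations[("high_pleasure", "both_hands")])
```

A positive deviation means the pair occurs more often than independence would predict, and a negative one means less often. Judging significance would additionally require comparing the statistic against the chi-square distribution for the table's degrees of freedom, which this sketch leaves out.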
