ABSTRACT
Facial expression recognition has been an active research area for the past few decades, and it remains challenging because of high intra-class variation. Most existing approaches perform reasonably well on datasets of images captured under controlled conditions but perform worse on more challenging datasets with greater image variation and partial faces. In recent years, several works have proposed deep learning frameworks for facial expression recognition. Despite the better performance of these works, there is still great room for improvement. In this work, we propose a deep learning approach based on an attentional convolutional network and YOLOv5, which can focus on important parts of the face and achieves significant improvement on several datasets, including FER-2013, CK+, FERG, and JAFFE. We also use a visualization technique that is able to find important face regions for detecting different emotions, based on the user’s output. Through the experimental results, we show that different emotions seem to be sensitive to different parts of the face.