The 10th IEEE International Workshop on Analysis and Modeling of Faces and Gestures (AMFG2021) Call for Papers

 Deadline Extended: Last CFP

2021 CVPR Workshop: 
The 10th IEEE International Workshop on Analysis and Modeling of Faces and Gestures (AMFG2021)

-- A deeper understanding of face, gestures, and higher-level
attributes and semantics to further social analysis and enhance HCI


Call for papers

We have experienced rapid advances in face, gesture, and
cross-modality (e.g., voice and face) technologies, thanks to deep
learning (dating back to AlexNet in 2012) and large-scale, labeled
datasets. Progress in deep learning continues to push renowned public
databases to near saturation, which calls for ever more challenging
image collections to be compiled as databases. In practice, and even
widely in applied research, using off-the-shelf deep learning models
has become the norm, as numerous pre-trained networks are available
for download and are readily deployed on new, unseen data (e.g.,
VGG-Face and ResNet, among others). We have almost grown “spoiled” by
such luxury, which has kept many truths hidden from us. Theoretically,
what makes neural networks more discriminative than ever before is
still unclear; they act as a sort of black box to most practitioners
and researchers alike. More troublesome is the absence of tools to
quantitatively and qualitatively characterize existing deep models,
which could yield greater insight into these all-too-familiar black
boxes. With the frontier moving forward at an unprecedented rate,
challenges such as high variation in illumination, pose, and age now
confront us. State-of-the-art deep learning models often fail when
faced with such challenges, owing to the difficulties in modeling
structured data and visual dynamics.

Alongside the effort spent on conventional face recognition is the
research done on cross-modality learning, such as face and voice,
gestures in imagery, and motion in videos, along with several other
tasks. This line of work has attracted attention from industry and
academic researchers across many domains. Additionally, and in
some cases together with this, there has been a push to advance these
technologies for social media-based applications. Regardless of the
exact domain and purpose, the following capabilities must be
satisfied: face and body tracking (e.g., facial expression analysis,
face detection, gesture recognition), lip reading and voice
understanding, face and body characterization (e.g., behavioral
understanding, emotion recognition), face, body and gesture
characteristic analysis (e.g., gait, age, gender, ethnicity
recognition), group understanding via social cues (e.g., kinship,
non-blood relationships, personality), and visual sentiment analysis
(e.g., temperament, arrangement). Thus, the ability to create
effective models for visual certainty has significant value in both
the scientific community and the commercial market, with
applications that span human-computer interaction, social media
analytics, video indexing, visual surveillance, and internet
vision. Researchers have made significant progress addressing many of
these problems, especially considering the off-the-shelf and
cost-efficient vision hardware products available these days, e.g.,
Intel RealSense, Magic Leap, SHORE, and Affdex. Nonetheless, serious
challenges remain, and they are only amplified under the
unconstrained imaging conditions of data captured by different
sources focused on non-cooperative subjects. It is these latter
challenges that especially grab our interest, as we seek to bring
together cutting-edge techniques and recent advances in deep learning
to solve the challenges in the wild.

This one-day serial workshop (AMFG2021) provides a forum for
researchers to review recent progress in the recognition, analysis,
and modeling of face, body, and gesture, while embracing the most
advanced deep learning systems available for face and gesture
analysis, particularly in unconstrained environments such as social
media and across modalities such as face and voice. The workshop
includes up to 3 keynotes and peer-reviewed papers (oral and
poster). Original high-quality contributions are solicited on the
following topics:

    Novel deep model, deep learning survey, or comparative study for
    face/gesture recognition;

    Deep learning methodology and theory, as applied to social media;

    Data-driven or physics-based generative models for faces, poses,
    and gestures;

    Deep learning for internet-scale soft biometrics and profiling:
    age, gender, ethnicity, personality, kinship, occupation, beauty
    ranking, and fashion classification by facial or body descriptor;

    Deep learning for detection and recognition of faces and bodies
    with large 3D rotation, illumination change, partial occlusion,
    unknown/changing background, and aging (i.e., in the wild);
    especially large 3D rotation robust face and gesture recognition;

    Motion analysis, tracking, and extraction of face and body models
    captured from several non-overlapping views;

    Face, gait, and action recognition in low-quality (e.g., blurred),
    or low-resolution video from fixed or mobile device cameras;

    AutoML for face and gesture analysis;

    Mathematical models and algorithms, sensors and modalities for
    face & body gesture and action representation, analysis, and
    recognition for cross-domain social media;

    Social/psychological based studies that aid in understanding
    computational modeling and building better-automated face and
    gesture systems with interactive features;

    Multimedia learning models involving faces and gestures (e.g.,
    voice, wearable IMUs, and face);

    Social applications involving detection, tracking & recognition of
    face, body, and action;

    Face and gesture analysis for sentiment analysis in a social

    Other applications involving face and gesture analysis in social
    media content.

Submissions are handled via the workshop's CMT website:

Submissions should follow the CVPR2021 guidelines:

    8 pages (+ references)
    Using CVPR LaTeX/Word templates

Tentative Program Outline

    8:30 AM - 8:45 AM Chairs' opening remarks
    8:45 AM - 9:30 AM Invited talk I
    9:30 AM - 10:00 AM Coffee break I
    10:00 AM - 12:30 PM Oral Session I
    12:30 PM - 2:00 PM Lunch break
    2:00 PM - 2:45 PM Invited talk II
    2:45 PM - 3:15 PM Coffee break II
    3:15 PM - 5:00 PM Oral session II
    5:00 PM Best Paper Announcement and Conclusion

Previous AMFG Workshops

The first AMFG was held in conjunction with the 2003 ICCV in Nice,
France. So far, it has been successfully held nine times. The
homepages of previous AMFG workshops are as follows:


Honorary General Chairs

    Rama Chellappa, University of Maryland
    Matthew A. Turk, Toyota Technological Institute at Chicago (TTIC)

General Co-Chairs

    Y. Raymond Fu, Northeastern University, Boston, USA
    Mike Jones, Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA 

Workshop Co-Chairs

    Ming Shao, University of Massachusetts Dartmouth, USA
    Sarah Ostadabbas, Northeastern University, Boston, USA
    Zhengming Ding, Indiana University-Purdue University Indianapolis
    Sheng Li, University of Georgia, Athens, GA, USA
    Joseph P. Robinson, Northeastern University, Boston, USA
    Yu Yin, Northeastern University, Boston, USA


Important Dates (Tentative)

Date: the whole day of June 16th, 17th, or 21st, 2021

Estimated Attendance: ~100 attendees (from academia and industry),
most with basic knowledge of computer vision and pattern recognition
(estimated based on previous AMFG workshops)

Submission Deadline: March 28th, 2021 (extended from March 13th, 2021)

Notification: April 8th, 2021 (moved from March 30th, 2021)

Camera-Ready: April 20th, 2021



Estimated submissions and acceptance rate (estimated based on previous
AMFG workshops):

    Estimated submissions: > 60 submissions
    Estimated acceptance rate: 30%-40%
    Estimated paper breakdown: Oral (40%) Poster (60%)


Best paper award: Like previous AMFG workshops, the best paper award
shall be sponsored by a computer vision-related company.