Call for Papers

Call for participation in the 
ACM Multimedia Deep Video Understanding Grand Challenge
Where: Ottawa, Ontario, Canada (https://www.acmmm2023.org/grand-challenges-2/)
When: Oct. 29 - Nov. 3, 2023
Website: https://sites.google.com/view/dvuchallenge2023/home


Background:
Deep video understanding is a difficult task which requires systems to
develop a deep analysis and understanding of the relationships between
different entities in video, to use known information to reason about
other more hidden information, and to populate a knowledge graph (KG)
representation with all acquired information. To work on this task, a
system should take into consideration all available modalities
(speech, image/video, and in some cases text). The aim of this
challenge series is to push the limits of multimodal extraction,
fusion, and analysis techniques to address the problem of analyzing
long duration videos holistically and extracting useful knowledge to
utilize it in solving different types of queries. The target knowledge
includes both visual and non-visual elements. As video and multimedia
data become increasingly popular and widely used across different
domains and contexts, the research, approaches, and techniques
explored in this Grand Challenge will remain highly relevant in the
coming years.

Challenge Overview:

Interested participants are invited to apply their approaches and
methods on an extended novel Deep Video Understanding (DVU) dataset
being made available by the challenge organizers. The dataset is split
into a development set of 14 movies with Creative Commons licenses
from the 2020-2022 versions of this challenge, and a new set of 5
movies licensed from the KinoLorberEdu platform. The development data
includes: original whole videos, segmented scene shots, image examples
of main characters and locations, movie-level KG representation of the
relationships between main characters, relationships between
characters and key locations, scene-level KG representation of each
scene in a movie (location type, characters, interactions between them,
order of interactions, sentiment of scene, and a short textual
summary), and a global shared ontology of locations, relationships
(family, social, work), interactions and sentiments. The testing
dataset consists of 5 KinoLorber-licensed movies.
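
To make the scene-level annotation concrete, the fields listed above
can be pictured as a small record per scene. The sketch below is only
an illustration: the class names, field names, and sample values are
assumptions made for this example and do not reflect the actual file
format or ontology labels distributed with the dataset.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """One interaction between two characters within a scene."""
    source: str   # character who initiates the interaction
    target: str   # character on the receiving end
    label: str    # hypothetical label, not taken from the DVU ontology
    order: int    # position of the interaction within the scene


@dataclass
class SceneGraph:
    """Hypothetical container for one scene-level KG annotation."""
    scene_id: int
    location_type: str
    sentiment: str
    summary: str
    characters: list[str] = field(default_factory=list)
    interactions: list[Interaction] = field(default_factory=list)

    def ordered_interactions(self) -> list[Interaction]:
        # Return interactions in the order they occur in the scene.
        return sorted(self.interactions, key=lambda i: i.order)


# Invented sample scene, used only to exercise the structure.
scene = SceneGraph(
    scene_id=1,
    location_type="kitchen",
    sentiment="tense",
    summary="Anna confronts Ben about the letter.",
    characters=["Anna", "Ben"],
    interactions=[
        Interaction("Ben", "Anna", "apologizes to", order=2),
        Interaction("Anna", "Ben", "asks", order=1),
    ],
)

for item in scene.ordered_interactions():
    print(item.order, item.source, item.label, item.target)
```

Please refer to the dataset webpage for the real annotation format.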

The organizers will support evaluation and scoring for a mix of main
query types, distributed with the dataset, at the overall movie level
and at the individual scene level. Participants may submit results for
the movie-level queries, the scene-level queries, or both. Within each
category, queries are grouped for more flexible submission options
(please refer to the dataset webpage for more details):

Example question types at the overall movie level:

Multiple-choice question answering on part of the knowledge graph for
selected movies.

Fill in the Graph Space - Given a partial graph, systems will be asked
to fill in the graph space.


Example question types at the individual scene level:

Find next or previous interaction, given two people, a specific scene,
and the interaction between them.

Find a unique scene given a set of interactions and a scene list.

Fill in the Graph Space - Given a partial graph for a scene, systems
will be asked to fill in the graph space.

Match selected scenes to a set of scene descriptions written in
natural language.

Scene sentiment classification.
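
As a rough illustration of the first two scene-level question types
above, the sketch below answers a next/previous interaction query and
a unique-scene query over toy data. Everything in it is an assumption
made for this example: the function names, the triple layout, the
character names, and the interaction labels are invented, and the
matching ignores the direction of an interaction and uses the first
occurrence of a label. The official query and answer formats are
described on the dataset webpage.

```python
# Each scene id maps to an ordered list of (source, target, interaction)
# triples. All names and labels are invented for illustration.
Triple = tuple[str, str, str]

scenes: dict[int, list[Triple]] = {
    1: [("Anna", "Ben", "greets"),
        ("Ben", "Anna", "asks"),
        ("Anna", "Ben", "explains to")],
    2: [("Anna", "Carl", "greets"),
        ("Carl", "Anna", "threatens")],
    3: [("Ben", "Carl", "greets"),
        ("Carl", "Ben", "asks")],
}


def neighbor_interaction(scene_id, person_a, person_b, interaction, step=1):
    """Return the interaction between the two people that follows
    (step=1) or precedes (step=-1) the given one, or None if absent."""
    pair = {person_a, person_b}
    # Keep only the interactions between this pair, in scene order.
    between = [t for t in scenes[scene_id] if {t[0], t[1]} == pair]
    for idx, (_, _, label) in enumerate(between):
        if label == interaction:
            j = idx + step
            return between[j][2] if 0 <= j < len(between) else None
    return None


def unique_scene(interactions, candidate_ids):
    """Return the only candidate scene containing every given
    interaction label, or None if zero or several scenes match."""
    wanted = set(interactions)
    hits = [s for s in candidate_ids
            if wanted <= {t[2] for t in scenes[s]}]
    return hits[0] if len(hits) == 1 else None


print(neighbor_interaction(1, "Anna", "Ben", "asks", step=1))
print(neighbor_interaction(1, "Anna", "Ben", "asks", step=-1))
print(unique_scene(["threatens"], [1, 2, 3]))
```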

A new addition to the 2023 challenge is that systems may also submit
their results against a secondary dataset in which real-world noise
and various types of perturbations and corruptions are introduced (in
the visual and audio channels). This will allow multimodal robustness
to be measured in this context.

IMPORTANT DATES
DVU development data release: Available now (more updates will be added by April 15)
Testing dataset release: April 15, 2023
Testing queries release: June 2, 2023
Paper submission deadline: July 14, 2023
Submissions of solutions to organizers: July 14, 2023
Results released back to participants: July 24, 2023
Notification to authors: July 24, 2023
Camera-ready submission: August 6, 2023
Grand Challenge at ACM Multimedia: TBD


DVU 2023 Grand Challenge Organizers