Weakly Supervised Learning for Image and Video Understanding Call for Papers


Weakly supervised learning (WSL) aims to address fine-level image and
video understanding tasks by learning from coarse-level human
annotations. It is of particular importance in the era of big data
because it can dramatically reduce the human labor needed to annotate
structured visual/multimedia data, and thus enables machines to learn
from much larger-scale data at the same annotation cost as
conventional fully supervised learning methods. More importantly, when
dealing with data from real-world application scenarios, such as
medical imaging data, remote sensing data, and audio-visual data,
fine-level manual annotations are very limited and difficult to
obtain. Under these circumstances, WSL-based learning frameworks,
especially WSL-based multi-modality/multi-task learning frameworks,
would bring great benefits. Unfortunately, designing
effective WSL systems is challenging due to the issues of “semantic
unspecificity” and “instance ambiguity”. The former refers to the
setting where the provided semantic label is at the image level
rather than the instance level, while the latter refers to the
ambiguity in distinguishing an instance from an instance part or an
instance cluster. Principled solutions to these problems are still
under-studied. Nowadays, with the rapid development
of advanced machine learning techniques, such as Graph Convolutional
Networks, Capsule Networks, Transformers, Generative Adversarial
Networks, and Deep Reinforcement Learning models, new opportunities
have emerged for solving the problems in WSL and applying WSL to
richer vision and multimedia tasks. This special issue aims to
promote cutting-edge research in this direction and to offer a timely
collection of works to benefit researchers and practitioners. We
welcome high-quality original submissions addressing both novel
theoretical and practical aspects of WSL, as well as real-world
applications based on WSL approaches.



Topics of interest include, but are not limited to:

- Multi-modality weakly supervised learning theory and framework;

- Multi-task weakly supervised learning theory and framework;

- Robust learning theory and framework;

- Audio-visual learning under weak supervision;

- Weakly supervised spatial/temporal feature learning;

- Self-supervised learning frameworks and applications;

- Graph Convolutional Networks/Graph Neural Networks-based weakly
  supervised learning frameworks;

- Deep Reinforcement Learning for weakly supervised learning;

- Emerging vision and multimedia tasks with limited supervision.



Manuscript submission:      15th August 2021

Preliminary results:        15th November 2021

Revisions due:              1st January 2022

Notification:               15th February 2022

Final manuscripts due:      15th March 2022

Anticipated publication:    Midyear 2022



Papers should be formatted according to the IEEE Transactions on
Multimedia (TMM) guidelines for authors (see:
http://www.signalprocessingsociety.org/tmm/tmm-author-info/). By
submitting/resubmitting your manuscript to these Transactions, you are
acknowledging that you accept the rules established for publication of
manuscripts, including agreement to pay all over-length page charges,
color charges, and any other charges and fees associated with
publication of the manuscript. Manuscripts (both 1-column and 2-column
versions are required) should be submitted electronically through the
online IEEE manuscript submission system at
http://mc.manuscriptcentral.com/tmm-ieee. All submitted papers will go
through the same review process as regular TMM submissions. Referees
will consider originality, significance,
technical soundness, clarity of exposition, and relevance to the
special issue topics above.