CONTEXTUS Workshop: Understanding Multi-Actor Scene Interaction in Context (Call for Papers)

We are pleased to invite submissions to the forthcoming
CONTEXTUS Workshop: Understanding Multi-Actor Scene Interaction in Context,
to be held in conjunction with ECCV 2026 in Malmö, Sweden.

 

Real-world environments are inherently interactive and social. Humans
rarely act in isolation: behavior unfolds through coordination,
negotiation, competition, and collaboration, shaped by shared goals,
roles, objects, and evolving contextual constraints. As embodied AI
systems increasingly operate alongside people (assisting,
collaborating, or communicating), they must move beyond isolated
perception toward understanding interaction dynamics, social
influence, anticipation, and causal structure in situated
multi-human and multi-agent settings.

 

The CONTEXTUS workshop addresses a fundamental next step for computer
vision: transitioning from object- and action-centric recognition
toward socially grounded, context-aware visual intelligence. While
major advances have been achieved in detection, tracking, and action
recognition, it remains challenging to model relational dependencies,
joint intention, temporal coherence, and how actions influence and are
influenced by others within a scene. Context, whether
spatial, temporal, social, or task-level, is central to meaningful
interpretation.

 

The workshop seeks to bring together researchers working at the
intersection of context modeling, multimodal learning, embodied AI,
and social intelligence. We encourage submissions presenting novel
methods, datasets, benchmarks, theoretical insights, interdisciplinary
perspectives, and real-world applications related to socially aware
and context-driven scene understanding.

 

Topics of interest include, but are not limited to:

    Multi-human, human-object, and human-scene interaction understanding and generation,
    Context-aware scene understanding and representation learning,
    Context modeling, contextual memory, and long-term temporal reasoning, 
    Long-tail and open-vocabulary recognition using contextual and relational cues,
    Context-aware prompting of multimodal foundation models,
    Multimodal retrieval-augmented scene understanding, 
    Egocentric and exocentric multimodal perception,
    Human behavior understanding and social signal processing,
    Relational and causal reasoning in videos and multimodal data,
    Knowledge injection, procedural knowledge modeling, and structured human-scene understanding,
    Temporal reasoning, forecasting, and action anticipation,
    Affordance learning and task-oriented scene reasoning,
    Neuro-symbolic learning and structured reasoning approaches,
    Explainability and interpretability in multimodal interaction models,
    Foundation and world models for interaction understanding,
    Embodied AI, human-AI interaction, and behavior-conditioned HCI,
    Domain adaptation, generalization, robustness, and bias mitigation for social scene understanding,
    Datasets and benchmarks for context-aware interaction understanding,
    Applications of the above in robotics, healthcare, education, assistive technologies, and smart environments. 

 

Submission tracks

We welcome submissions to three tracks:

    Novel track: novel work, double-blind peer reviewed.
    Non-archival track: papers published in or accepted at other
      venues, to be presented during the workshop as posters to increase
      visibility and encourage further discussion.
    Challenge track: novel work, double-blind peer reviewed, related
      to the ongoing ChaLearn UDIVA-HHOI Challenge associated with the
      workshop.

 

Important dates

17 July: Novel track submission deadline

17 July: Non-archival track submission deadline

31 July: Challenge track submission deadline

7 August: Notification of acceptance

13 August: Camera ready

8-9 September: CONTEXTUS workshop

 

Invited speakers

Dima Damen (University of Bristol; DeepMind)

Paul Liang (MIT)

Juan Carlos Niebles (Stanford; Salesforce)

 

Submission details and more information

Please check the official CONTEXTUS Workshop website: 
https://lap.chalearn.eu/public/ECCV26-CONTEXTUS

 

Organizers

Cristina Palmero (King’s College London)

Sergio Escalera (Universitat de Barcelona & Computer Vision Center)

Albert Clapés (Universitat de Barcelona)

Xavier Baró (Universitat de Barcelona)

Daniele Berardini (Istituto Italiano di Tecnologia)

Hugo Jair Escalante (The University of Texas at El Paso & INAOE)

Vittorio Murino (Istituto Italiano di Tecnologia & University of Verona)