1st Workshop on Human-Scene Interaction (HSI) Call for Papers

Submission deadline: July 7, 2026
Location: ECCV 2026, Malmö, Sweden
Website: hsi-workshop.com


Overview

We invite submissions to the First Workshop on Human-Scene Interaction
(HSI) at ECCV 2026. This workshop focuses on modelling and generating
human motion and behaviour grounded in the surrounding scene. Our goal
is to bring together research from computer vision, graphics,
robotics, and multimodal learning to advance scene-aware embodied
agents, including (humanoid) robots and virtual avatars.

We welcome both archival and non-archival submissions.

Invited speakers

    Umar Iqbal - NVIDIA DAIR Lab
    Taku Komura - University of Hong Kong
    Zhengyi Luo - NVIDIA GEAR Lab
    Gerard Pons-Moll - University of Tübingen

Topics of interest

Topics include, but are not limited to:

    Scene-conditioned human motion generation
    Human-scene and human-object interaction modelling
    Referring expression understanding and grounding in 3D scenes
    Language understanding and grounded communication for embodied agents
    Vision-language-motion alignment and grounding
    Vision-language-action (VLA) models for embodied agents
    Multimodal learning for motion and interaction
    Datasets, benchmarks, and evaluation for interaction
    Affordance learning and scene understanding
    Physically-based simulation of interaction
    Applications in robotics, animation, AR/VR, and embodied communication
    Technical reports accompanying challenge submissions

Submission guidelines

Archival submissions

    Must present original, unpublished work
    Will undergo peer review
    Accepted papers will be published in the ECCV 2026 Workshop Proceedings
    Papers must follow the ECCV formatting guidelines

Non-archival submissions

    May include previously published work, work under review, or ongoing research
    Intended for presentation only (poster or oral) and will not be included in the proceedings
    Ideal for sharing recent results, demos, or position papers


Submission links: To be announced

Important dates

    July 7, 2026 - Submission deadline
    July 31, 2026 - Notification
    September 2026 - Workshop

Challenge

The workshop also hosts a challenge on scene-aware referential gesture
generation. Given speech, a 3D target coordinate, and a virtual scene,
the goal is to generate full-body referential gestures that correctly
indicate the target object among distractors. For details on the task,
data, evaluation protocol, and baselines, see:

hsi-workshop.com/challenge

Contact

hsi-workshop@googlegroups.com

Organizers

    Jonas Beskow - KTH Royal Institute of Technology, Sweden
    Rishabh Dabral - Max Planck Institute for Informatics, Germany
    Anna Deichler - KTH Royal Institute of Technology, Sweden
    Fethiye Irmak Dogan - University of Cambridge, United Kingdom
    Anindita Ghosh - Max Planck Institute for Informatics, Germany