Program

The program of this half-day workshop includes seven invited talks, a poster session (see below), and the trajectory prediction challenge. The main workshop track will be held in this Zoom room: https://epfl.zoom.us/j/95756947672. Additional rooms will be open during the poster session; see the table below.

All invited talks are available on our YouTube channel (click on a talk title to see the video).

Time CEST (PST) Speaker Talk
16:30 - 16:40 (7:30 - 7:40) Organizers Welcome and Introduction
16:40 - 17:00 (7:40 - 8:00) Gonzalo Ferrer, Skoltech Human Motion Prediction for Social Robot Navigation
This talk focuses on human motion prediction from the point of view of social robot navigation in urban environments. We will show how predicting the state of dynamic agents, i.e., pedestrians, has a profound impact on robot navigation and, the other way around, how robot actions affect motion prediction. We will also discuss the importance of computationally fast solutions for real-world navigation, and the correct management of the risk that arises from the inherent limitations of prediction. Finally, we will mention the current challenges and some promising next directions in motion prediction.
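As a back-of-the-envelope illustration (not from the talk) of the kind of computationally fast prediction with explicit risk handling the abstract alludes to, here is a minimal sketch of a constant-velocity pedestrian predictor whose positional covariance grows with the look-ahead; the function name and all parameter values are illustrative assumptions.

```python
import numpy as np

def predict_constant_velocity(position, velocity, horizon, dt=0.4, noise_std=0.2):
    """Roll a pedestrian state forward under a constant-velocity model.

    Returns per-step predicted means and an isotropic covariance whose
    scale grows with the look-ahead, a crude stand-in for prediction risk.
    noise_std is an assumed, illustrative noise level.
    """
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    means, covs = [], []
    for k in range(1, horizon + 1):
        means.append(position + k * dt * velocity)
        # Uncertainty accumulates with the look-ahead: the further out the
        # prediction, the less the planner should trust it.
        covs.append(np.eye(2) * (noise_std * k * dt) ** 2)
    return np.stack(means), np.stack(covs)

# Example: a pedestrian at the origin walking 1.2 m/s along x, 12 steps ahead.
means, covs = predict_constant_velocity([0.0, 0.0], [1.2, 0.0], horizon=12)
print(means[-1], covs[-1])
```
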
17:00 - 17:20 (8:00 - 8:20) Marco Pavone, Stanford Multimodal Deep Generative Models for Intent Prediction
In this talk I will present a data-driven approach for learning multimodal interaction dynamics between robot-driven and human-driven vehicles based on recent advances in deep generative modeling. I will then discuss how to incorporate such a learned interaction model into a real-time, interaction-aware decision-making framework.
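The abstract does not commit to a particular architecture; the toy sketch below shows how a conditional variational model can emit several distinct futures for one observed past, which is the core of multimodal deep generative prediction. The class name, dimensions, and the simple GRU encoder are all assumptions for illustration, not the talk's model.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Toy conditional VAE: encodes an observed past into a latent z and
    decodes (past, z) into a future trajectory. Sizes are illustrative."""

    def __init__(self, past_len=8, future_len=12, latent_dim=16, hidden=64):
        super().__init__()
        self.future_len = future_len
        self.encoder = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(hidden + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, future_len * 2),
        )

    def forward(self, past, n_samples=5):
        # past: (batch, past_len, 2) observed x/y positions.
        _, h = self.encoder(past)            # h: (1, batch, hidden)
        h = h.squeeze(0)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        futures = []
        for _ in range(n_samples):
            # Each latent sample decodes to a different plausible future.
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            out = self.decoder(torch.cat([h, z], dim=-1))
            futures.append(out.view(-1, self.future_len, 2))
        # (n_samples, batch, future_len, 2): one multimodal set of futures.
        return torch.stack(futures)

model = TrajectoryCVAE()
samples = model(torch.randn(4, 8, 2))
print(samples.shape)  # torch.Size([5, 4, 12, 2])
```
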
17:20 - 17:40 (8:20 - 8:40) Dariu Gavrila, TU Delft Predictive Motion Models for Vulnerable Road Users
Sensors have meanwhile become very good at measuring 3D structure in the context of environment perception for self-driving vehicles. Scene labeling and object detection have also made big strides, mainly due to advances in deep learning. The time has now come to focus on the next frontier: modeling and anticipating the motion of road users. The potential benefits are large, such as earlier and more effective system reactions in dangerous traffic situations. To reap these benefits, however, it is necessary to use sophisticated predictive motion models based on intent-relevant (context) cues.
In this talk, I give an overview of predictive motion models and intent-relevant cues with respect to vulnerable road users (i.e., pedestrians and cyclists). In particular, I discuss the pros and cons of having these models handcrafted by an expert compared to learning them from data. I present results from a recent case study on cyclist path prediction involving a Dynamic Bayesian Network and a Recurrent Neural Network.
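To make the handcrafted, model-based side of that comparison concrete, here is a minimal sketch (not the case study's model) of a switching-mode filter in the spirit of a Dynamic Bayesian Network: a binary latent mode, "continue" vs. "stop", is filtered from observed cyclist speeds. The dynamics, noise level, and transition probabilities are illustrative assumptions.

```python
import numpy as np

def mode_posterior(observed_speeds, p_stay=0.95, decel=1.5, dt=0.1, sigma=0.3):
    """Filter a binary latent mode (index 0 = continue, 1 = stop) from speeds."""
    belief = np.array([0.5, 0.5])
    trans = np.array([[p_stay, 1 - p_stay],
                      [1 - p_stay, p_stay]])
    for prev, cur in zip(observed_speeds, observed_speeds[1:]):
        belief = trans.T @ belief                       # predict step
        # Each mode predicts the next speed: keep it, or decelerate.
        expected = np.array([prev, max(prev - decel * dt, 0.0)])
        lik = np.exp(-0.5 * ((cur - expected) / sigma) ** 2)
        belief = belief * lik                           # update step
        belief /= belief.sum()
    return belief

speeds = [4.0, 3.9, 3.7, 3.4, 3.0, 2.6]  # a cyclist easing off
print(mode_posterior(speeds))            # belief mass shifts toward "stop"
```
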
17:40 - 17:50 (8:40 - 8:50) Coffee break and setting up posters
17:50 - 18:20 (8:50 - 9:20) Virtual poster session
18:20 - 18:40 (9:20 - 9:40) Adrien Gaidon, Toyota Research Institute Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction
Forecasting the next events or actions in videos is a desirable capability for robotics and vision-based applications. In recent years, various models have been developed based on convolution operations for prediction or forecasting, but they lack the ability to reason over spatiotemporal data and infer the relationships of different objects in the scene. In this talk, we will present a framework based on graph convolution to uncover the spatiotemporal relationships in the scene for reasoning about pedestrian intent. We approach the problem of intent prediction from two different perspectives and anticipate the intention-to-cross within both pedestrian-centric and location-centric scenarios. In addition, we introduce a new dataset designed specifically for autonomous-driving scenarios in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction (STIP) dataset. Our experiments on STIP and the standard JAAD benchmark show that our graph modeling framework outperforms the state of the art, predicting the intention-to-cross of pedestrians with an accuracy of 79.10% on STIP and 79.28% on JAAD, up to one second earlier than the actual crossing.
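As a rough illustration of graph convolution over scene objects (a sketch, not the STIP model), the code below aggregates neighbor features over a relation graph and reads out the target pedestrian's node as an intention-to-cross score; the layer sizes and the toy adjacency matrix are assumptions.

```python
import torch
import torch.nn as nn

class SceneGCN(nn.Module):
    """Toy graph-convolution classifier: nodes are scene objects (the target
    pedestrian, other agents, traffic elements), edges encode spatial
    relations, and the pedestrian node is read out for an intent score."""

    def __init__(self, feat_dim=16, hidden=32):
        super().__init__()
        self.gc1 = nn.Linear(feat_dim, hidden)
        self.gc2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        # x: (nodes, feat_dim) object features; adj: (nodes, nodes),
        # row-normalized adjacency with self-loops.
        h = torch.relu(self.gc1(adj @ x))       # aggregate neighbor features
        h = torch.relu(self.gc2(adj @ h))       # second hop of message passing
        return torch.sigmoid(self.head(h[0]))  # node 0 = target pedestrian

n = 5
adj = torch.ones(n, n) / n                      # toy fully connected graph
model = SceneGCN()
print(model(torch.randn(n, 16), adj))           # crossing probability
```
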
18:40 - 19:00 (9:40 - 10:00) Dorsa Sadigh, Stanford When our Human Modeling Assumptions Fail: The effects of risk, conventions, and non-stationarity on long-term human-robot interaction
Building predictive computational models of humans is an important challenge in many robotics applications, including autonomous driving, assistive teleoperation, robotic surgery, and interacting with a service robot in a home over a long period of time. Most current techniques, model-based or data-driven, make strong assumptions about human behavior. Specifically, most techniques assume humans are noisily rational, are capable of belief modeling to some extent, or do not adapt significantly over time. In this talk, we discuss settings where these assumptions fail to hold, and provide techniques and overarching paradigms that can capture human behavior even in complex scenarios. We will first discuss how to model human behavior in scenarios near the ends of the risk spectrum. We will then introduce the idea of conventions, i.e., low-dimensional shared representations that capture the interaction and can change over time. Finally, we discuss settings where humans do not simply follow a stationary model, and present reward learning approaches that can discover these non-stationarities.
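The "noisily rational" assumption that the talk stress-tests is commonly formalized as a Boltzmann action distribution; a minimal sketch, with illustrative reward values:

```python
import numpy as np

def boltzmann_policy(rewards, beta=2.0):
    """Noisy-rational ("Boltzmann") action distribution: the human picks
    action a with probability proportional to exp(beta * R(a)).
    beta -> inf recovers a perfectly rational agent; beta = 0 is uniform."""
    logits = beta * np.asarray(rewards, dtype=float)
    logits -= logits.max()                  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Three candidate maneuvers with assumed rewards: merge, wait, accelerate.
print(boltzmann_policy([1.0, 0.2, -0.5]))
# A strongly risk-seeking or risk-averse human violates this model; the
# talk covers what to do when such assumptions fail.
```
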
19:00 - 19:10 (10:00 - 10:10) Coffee break
19:10 - 19:30 (10:10 - 10:30) Anca Dragan, UC Berkeley Data-driven but Safe Prediction
Protecting against the worst-case human motion is too conservative, while fitting a black-box policy to human data is too brittle when tested out-of-distribution. In this talk, I'll share our perspective that 1) we have to take into account that human motion is intent-driven when doing prediction, but 2) we also have to find ways to flexibly incorporate this into our predictors to avoid underfitting. I'll go over some examples of achieving this "flexibility" from our recent work.
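One common way to make a predictor intent-driven, sketched below with entirely illustrative goals and probabilities, is to hypothesize a small set of goals, roll out one future per goal, and weight each rollout by the goal's probability:

```python
import numpy as np

def intent_conditioned_prediction(pos, goals, goal_probs,
                                  horizon=10, speed=1.4, dt=0.4):
    """Predict a set of futures by conditioning on hypothesized goals and
    weighting each rollout by that goal's probability."""
    pos = np.asarray(pos, dtype=float)
    futures = []
    for g in goals:
        direction = np.asarray(g, dtype=float) - pos
        direction /= max(np.linalg.norm(direction), 1e-9)
        # Straight-line rollout toward the goal at walking speed.
        futures.append(pos + np.outer(np.arange(1, horizon + 1) * speed * dt,
                                      direction))
    return np.stack(futures), np.asarray(goal_probs, dtype=float)

futures, weights = intent_conditioned_prediction(
    pos=[0.0, 0.0],
    goals=[[10.0, 0.0], [0.0, 10.0]],   # e.g., two crosswalk exits
    goal_probs=[0.7, 0.3],
)
print(futures.shape, weights)           # (2, 10, 2) [0.7 0.3]
```
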
19:30 - 19:50 (10:30 - 10:50) Heni Ben Amor, Arizona State University Learning to Predict and Respond for Human-Robot Interaction
Motion prediction plays a critical role in tasks that involve physical interaction between multiple people. In such scenarios, human interaction partners need to constantly make predictions about each other in order to align their actions in time and space. In this talk, I argue that a tight coupling between motion prediction and response generation is vital for fluent and effective human-robot interaction (HRI). Further, I will present Bayesian Interaction Primitives -- a probabilistic framework that enables learning and inference in such coupled representations for HRI scenarios. Bayesian Interaction Primitives encode the mutual dependencies between interaction partners and can be used to (1) predict human motion and sensor values, (2) infer task-relevant latent variables, and (3) generate appropriate robot responses. A critical aspect of this approach is the ability to jointly reason about time and space. I will conclude the talk with a number of application scenarios such as human-robot hugging, collaborative lifting, throwing and catching, and assisted walking with an intelligent prosthesis.
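To give a flavor of such a coupled representation (heavily simplified: the talk's Bayesian Interaction Primitives also filter time/phase, which this sketch omits), the code below stacks human and robot basis weights into one joint Gaussian fitted on demonstrations, then conditions on observed human weights to infer the robot's response. All data and sizes are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_demos, n_w = 50, 8                      # demos, basis weights per partner

# Fake demonstrations: robot weights correlated with human weights.
w_h = rng.normal(size=(n_demos, n_w))
w_r = 0.8 * w_h + 0.1 * rng.normal(size=(n_demos, n_w))
W = np.hstack([w_h, w_r])                 # stacked (human, robot) weights

mu = W.mean(axis=0)
cov = np.cov(W, rowvar=False)             # joint Gaussian over both partners

def infer_robot_weights(observed_human_w):
    """Gaussian conditioning: p(w_robot | w_human)."""
    mu_h, mu_r = mu[:n_w], mu[n_w:]
    S_hh = cov[:n_w, :n_w]
    S_rh = cov[n_w:, :n_w]
    gain = S_rh @ np.linalg.inv(S_hh)
    return mu_r + gain @ (observed_human_w - mu_h)

print(infer_robot_weights(w_h[0])[:3])    # close to w_r[0][:3] by construction
```
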
19:50 - 20:20 (10:50 - 11:20) Parth Kothari, EPFL TrajNet++ prediction challenge
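For reference, trajectory prediction challenges of this kind, TrajNet++ included, report Average and Final Displacement Error (ADE/FDE); a minimal sketch of the metric:

```python
import numpy as np

def ade_fde(pred, gt):
    """Average and Final Displacement Error for one agent.
    pred, gt: (timesteps, 2) predicted and ground-truth positions."""
    dists = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=-1)
    return dists.mean(), dists[-1]

pred = [[0.0, 0.0], [1.0, 0.1], [2.0, 0.3]]
gt = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
print(ade_fde(pred, gt))  # (mean error over steps, error at the final step)
```
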
20:20 - 20:30 (11:20 - 11:30) Organizers Closing remarks

Virtual poster session

Each accepted paper will be presented as a virtual poster in a dedicated room during the 30-minute session.

Authors Paper Room
Ronny Hug, Stefan Becker, Wolfgang Hübner and Michael Arens A Short Note on Analyzing Sequence Complexity in Trajectory Prediction Benchmarks [paper] [poster] n/a
Philipp Kratzer, Niteesh Balachandra Midlagajni and Jim Mainprice Anticipating Human Intention for Full-Body Motion Prediction [paper] [poster] n/a
Aleksey Postnikov, Aleksander Gamayunov and Gonzalo Ferrer HSFM-sigmaNN: Combining a Feedforward Motion Prediction Network and Covariance Prediction [paper] [poster] n/a
Irmak Guzey, Ahmet Ercan Tekden, Evren Samur and Emre Ugur Human Motion Prediction With Graph Neural Networks [paper] [poster] n/a
Mehmet Hakan Kurtoglu, Yunus Seker, Evren Samur and Emre Ugur Predicting Whole Body Motion Trajectories using Conditional Neural Movement Primitives [paper] [poster] n/a
Francesco A. N. Palmieri, Krishna R. Pattipati, Giovanni Fioretti, Francesco Verolla, Giovanni Di Gennaro and Amedeo Buonanno Exploration/Exploitation in Path Planning Using Probability Propagation [paper] [poster] n/a
Eike Rehder Prediction: Where To Go Next [poster] n/a