Robot Intention Projection:

A Mixed-Reality Approach to Human-Robot Collaboration and Communication

The ability to quickly understand each other’s intentions and goals is a critical element of successful collaboration within human teams. Efficient teaming often emerges as a result of explicit or implicit cues that are shared, recognized, and understood by the participants. Such cues act as signals that maintain trust, situational awareness, and mutual understanding among team members. The ability to communicate intentions through implicit and explicit cues is also of critical importance to fluent human-robot collaboration. As highlighted in the Roadmap for U.S. Robotics report, humans must be able to read and recognize robot activities in order to interpret the robot’s understanding [Christensen2009roadmap]. Failure to establish such a shared understanding of the situation may lead to potentially lethal accidents. We introduce an alternative communication paradigm based on the projection of explicit visual cues. In particular, we propose Intention Projection, a context-aware projection method that embeds visual signals within the environment such that they can be intuitively understood and directly read by the human partner.

VIDEOS OF INTENTION PROJECTION

REAL-TIME OBJECT TRACKING

Our system uses vision-based 3D object tracking to estimate the 6-DOF pose of objects in the environment. To this end, we use a model-based tracking algorithm that estimates object poses in real time. The tracker uses polygonal mesh features from a 3D CAD model of the desired object. Tracking is achieved by solving a least-squares optimization problem that minimizes the reprojection error between the projected 3D model and the features identified in the image. Instead of relying on a single low-level hypothesis for pose estimation, we handle multiple low-level hypotheses simultaneously. This enables robust tracking even when visual cues are projected onto the tracked objects. An occlusion-aware computer vision method, combined with Kalman filtering, is used to handle occlusions caused by human partners.
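
As a rough illustration of this pipeline (not the original implementation), the sketch below shows a single tracking update using OpenCV: a least-squares pose fit between CAD-model features and their detected image locations, followed by a constant-velocity Kalman filter that smooths the estimate when the object is partially occluded. The function names, filter dimensions, and noise parameters are illustrative assumptions.

// Illustrative sketch only; assumes an OpenCV-based pipeline.
#include <opencv2/calib3d.hpp>
#include <opencv2/video/tracking.hpp>
#include <vector>

struct PoseEstimate {
    cv::Mat rvec;   // rotation as a Rodrigues vector, in the camera frame
    cv::Mat tvec;   // translation, in the camera frame
};

// Estimate the 6-DOF pose by minimizing the reprojection error between
// 3D CAD-model features and their detected 2D image locations.
bool estimatePose(const std::vector<cv::Point3f>& modelPoints,
                  const std::vector<cv::Point2f>& imagePoints,
                  const cv::Mat& cameraMatrix,
                  const cv::Mat& distCoeffs,
                  PoseEstimate& pose)
{
    if (modelPoints.size() < 4 || modelPoints.size() != imagePoints.size())
        return false;
    // Iterative (Levenberg-Marquardt) least-squares solution.
    return cv::solvePnP(modelPoints, imagePoints, cameraMatrix, distCoeffs,
                        pose.rvec, pose.tvec, false, cv::SOLVEPNP_ITERATIVE);
}

// Constant-velocity Kalman filter on the translation component, used to
// bridge frames in which the object is partially occluded by the human.
cv::KalmanFilter makeTranslationFilter()
{
    cv::KalmanFilter kf(6, 3, 0, CV_32F);        // state: [x y z vx vy vz]
    cv::setIdentity(kf.transitionMatrix);
    kf.transitionMatrix.at<float>(0, 3) = 1.0f;  // x += vx (time step folded in)
    kf.transitionMatrix.at<float>(1, 4) = 1.0f;
    kf.transitionMatrix.at<float>(2, 5) = 1.0f;
    cv::setIdentity(kf.measurementMatrix);       // only [x y z] is measured
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-4));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-2));
    return kf;
}

In a comparable system, a predict/correct cycle of this kind would run once per camera frame, with the multiple low-level hypotheses feeding the measurement step.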

PROJECTION MAPPING SYSTEM

Given the 3D pose, we perform projection mapping to display additional information on top of an object while taking its geometric structure into account. Using a projection device, visual cues are projected into the environment to rapidly communicate important aspects of the task. The pose and shape of objects from the tracker are incorporated into the generation of visual cues, which enables the system to display information only on the objects of interest. Since visualizations are rendered within the reference frame of the projector, the tracked object pose must be transformed from the camera frame to the projector frame. To this end, a projector-camera calibration between the two reference frames is performed. All algorithms are implemented in C++ and run on a single desktop PC. The system can simultaneously track, render, and project onto multiple objects in real time at a frame rate of 20–30 Hz.
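
The frame transformation described above can be sketched as follows, modeling the projector as an inverse camera. The matrix and parameter names (projector intrinsics, camera-to-projector extrinsics) are assumptions for illustration, not the system’s actual interfaces.

// Illustrative sketch only; assumes OpenCV and a calibrated projector model.
#include <opencv2/calib3d.hpp>
#include <vector>

// Object pose in the camera frame:          X_cam  = R_co * X_obj + t_co
// Camera-to-projector extrinsics:           X_proj = R_pc * X_cam + t_pc
void projectOntoObject(const cv::Matx33d& R_co, const cv::Vec3d& t_co,
                       const cv::Matx33d& R_pc, const cv::Vec3d& t_pc,
                       const cv::Mat& projectorIntrinsics,   // 3x3 K of the projector
                       const cv::Mat& projectorDistortion,
                       const std::vector<cv::Point3f>& modelVertices,
                       std::vector<cv::Point2f>& projectorPixels)
{
    // Compose the two rigid transforms: object -> camera -> projector.
    cv::Matx33d R_po = R_pc * R_co;
    cv::Vec3d   t_po = R_pc * t_co + t_pc;

    // Convert the rotation to a Rodrigues vector as expected by projectPoints.
    cv::Mat rvec;
    cv::Rodrigues(R_po, rvec);

    // Compute projector pixel coordinates for the model vertices; feeding these
    // pixels to the projector makes the rendered cue land on the physical object.
    cv::projectPoints(modelVertices, rvec, t_po,
                      projectorIntrinsics, projectorDistortion, projectorPixels);
}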

EXTENSIBLE VISUAL LANGUAGE

We propose an extensible visual language to explicitly convey information to a human collaborator through visual signals. A set of patterns, analogous to parts of speech, forms the vocabulary from which visual messages can be composed. The language includes a reasonable fragment of patterns for human-robot interaction tasks and can be further extended according to the application domain. Because the human visual system processes such signals rapidly, visual messages can be understood with little additional cognitive effort. The basic fragment of visual cues proposed here includes patterns for designating and targeting objects (substantives); indicating positions, relations, and orientations (prepositions); basic movement instructions (verbs); success and failure (affirmation); and hazards and the robot work area. Basic cues can be composed to generate a sequence of instructions or a visual equivalent of a phrase. Cues, in turn, are translated into a visual message by generating appropriate mixed-reality signals.
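
One way such a vocabulary could be represented in code is sketched below; the enum values, struct fields, and object identifiers are hypothetical and serve only to illustrate how cues compose into phrases.

// Illustrative sketch only; not the system's actual API.
#include <string>
#include <vector>

// Parts of speech of the visual language.
enum class CueType {
    Substantive,   // designate / target an object (e.g., highlight its outline)
    Preposition,   // positions, relations, orientations (e.g., arrow to a slot)
    Verb,          // basic movement instruction (e.g., animated push direction)
    Affirmation,   // success / failure feedback
    Hazard,        // danger zones and the robot's current work area
};

// A single visual cue bound to a tracked object or a workspace location.
struct VisualCue {
    CueType     type;
    std::string targetObjectId;   // object the cue is rendered on, if any
    std::string parameters;       // e.g., goal pose, direction, color
};

// A "phrase": an ordered composition of cues that together form one
// instruction, rendered as a sequence of mixed-reality signals.
using VisualPhrase = std::vector<VisualCue>;

// Example phrase: highlight a gear (substantive), then point to the fixture
// it should be placed in (preposition).
VisualPhrase placeGearInstruction()
{
    return {
        {CueType::Substantive, "gear_03",      "highlight"},
        {CueType::Preposition, "fixture_left", "arrow_from:gear_03"},
    };
}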

EXAMPLE SIGNALS AND INTERACTIONS

EXPERIMENTS AND RESULTS

A human-subject experiment was conducted to compare the performance and usability of the proposed system, which projects real-time cues into the workspace, against a conventional method using static printed instructions. The aim of the experiment was to collect objective and subjective measurements from human subjects in order to evaluate the efficiency, effectiveness, and satisfaction of collaborating with a robot teammate. In our experiment, we manipulated a single independent variable, the mode of communication, which took one of three values:

Printed mode: The subjects were provided with a printed set of instructions in the form of written descriptions and corresponding figures. The printed instructions were pasted on a wall adjacent to the workspace and were available to the subject throughout the experiment.

Mobile display mode: The subjects were provided with a tablet device containing instructions in the form of text, figures, animations, and videos. The device could be carried around while executing the task. Instructions were provided just in time via “forward” and “backward” buttons that allowed users to move to the next or previous task.

Projection mode: The subjects were provided with just-in-time instructions by augmenting the work environment with mixed-reality cues via projection mapping.

The objective evaluation, using task completion time and accuracy measurements, corroborated our hypotheses that the mixed-reality system would increase the efficiency and effectiveness of a human-robot team. Participants completed the task faster when following projected visual instructions. Our analysis also confirmed that the visual instructions were intuitive and took approximately the same amount of time for different participants to understand. In their feedback, participants responded favorably to the system and found the projected condition enjoyable.

PAPER AND REFERENCES

Title: Better Teaming Through Visual Cues: How Projecting Imagery in a Workspace Can Improve Human-Robot Collaboration

Cite: If you use this library, please use this citation:
@article{ganesan2018better,
title={Better teaming through visual cues: how projecting imagery in a workspace can improve human-robot collaboration},
author={Ganesan, Ramsundar Kalpagam and Rathore, Yash K and Ross, Heather M and Amor, Heni Ben},
journal={IEEE Robotics \& Automation Magazine},
volume={25},
number={2},
pages={59--71},
year={2018},
publisher={IEEE}
}

Title: Projecting Robot Intentions into Human Environments

Cite: If you use this library, please use this citation:

@inproceedings{andersen2016projecting,
title={Projecting robot intentions into human environments},
author={Andersen, Rasmus S and Madsen, Ole and Moeslund, Thomas B and Amor, Heni Ben},
booktitle={2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)},
pages={294--301},
year={2016},
organization={IEEE}
}