Evaluating human-robot social interactions in a rigorous manner is notoriously
difficult: studies are either conducted in labs with constrained protocols to
allow for robust measurements and a degree of replicability, but at the cost of
ecological validity; or in the wild, which leads to superior experimental
realism, but often with limited replicability and at the expense of rigorous
interaction metrics.
Within the DoRoThy project, we have conceptualised,
designed, implemented and applied a novel interaction paradigm, intended to
elicit rich and varied social interactions while having desirable scientific
properties (replicability, clear metrics). This paradigm focuses on both child-child and child-robot
interactions, and builds on what we call a sandboxed free-play environment.
The free-play sandbox is based on free-play interactions: pairs of children
(4-8 years old in our experiments) are invited to freely draw and interact with items
displayed on an interactive table, without any explicit goal set by the
experimenter. The task is designed so that
children can engage in open-ended and non-directive play, yet it is
sufficiently constrained to be suitable for recording, and allows the
reproduction of social behaviour by an artificial agent in comparable
conditions.
Our interactive table is equipped with 3D cameras recording the faces and postures of the children,
and the quantity and thoroughness of the logged information make it possible
to keep track of every interaction happening around the game.
These advantages, combined with the openness of the proposed task, make
this setup a powerful tool for observing and quantifying a wide range of
social behaviours expressed by children when interacting in a natural
environment.
Using this innovative platform, we have conducted a large-scale data collection campaign
to build a first-of-its-kind dataset of social interactions, called the PInSoRo dataset: 120 children, from 4 to 8 years old,
have been recorded while playing either together or with a robot. 45 hours of 3D video and audio
have been acquired, including close to 2 million frames of faces. Using a new coding scheme,
developed during the project, this dataset is now being annotated by hand with the hundreds of
social micro-episodes that took place between the children and the robot.