Visuomotor Representations of the Peripersonal Space in Humanoid Robots. Active Learning by Reaching and Gazing.

Tuesday, 12 May, 2015

Modern robots are expected to act in unstructured and changing environments, to
work side by side with humans without harming them, and to sustain long-term autonomy.
For these reasons they need to make sense of their surroundings, recognize
objects, and detect and, possibly, identify the people around them. They also need to
autonomously recalibrate their internal parameters to be robust to changes in their
embodiment due to wear, damage, or the use of unmodeled tools.
These same problems are faced and brilliantly solved by biological organisms
and in particular by human beings. We argue that the success of biological systems
relies on three fundamental computational principles: (a) the continual coupling of
perception and behavior; (b) the continual learning of the relationship between motor
actions and their sensory consequences; and (c) the efficient integration of the emerging
multimodal cues. Building on these computational principles, we propose novel approaches
for robot perception that emulate the strategies with which humans interact
with the environment. Proprioceptive and visual cues resulting from these strategies
are integrated following probabilistic principles, both to learn the associations that
underlie the sensorimotor transformations and to create a coherent representation of
the scene.
In particular, we first propose an active, monocular approach to compute the
depth of the observed scene. Mimicking human behavior during fixation, the robot
performs coordinated head and eye movements that are instrumental in revealing
depth information. Meanwhile, proprioceptive cues and visual information are
integrated in a probabilistic framework to obtain a dense depth map. The results
show that this approach yields accurate and robust 3D representations of the
observed scene.
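
As a rough illustration of how proprioceptive and visual cues can be combined probabilistically into a dense depth map, the sketch below assumes a simplified pinhole model in which each small, known eye translation (the proprioceptive cue) produces a measured image shift (the visual cue), and fuses the per-movement depth estimates by inverse-variance weighting. The function names, the pure-translation geometry, and the Gaussian noise model are assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np


def depth_from_parallax(focal_px, baseline_m, shift_px, shift_var_px2):
    """Per-pixel depth and first-order variance from one eye translation.

    Assumed model: Z = f * b / d, so Var[Z] ~= (Z / d)**2 * Var[d].
    """
    shift_px = np.asarray(shift_px, dtype=float)
    depth = focal_px * baseline_m / shift_px
    depth_var = (depth / shift_px) ** 2 * shift_var_px2
    return depth, depth_var


def fuse_depths(depths, variances):
    """Inverse-variance (Gaussian) fusion along the first axis."""
    depths = np.asarray(depths, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances
    fused_var = 1.0 / weights.sum(axis=0)
    fused = fused_var * (weights * depths).sum(axis=0)
    return fused, fused_var


def demo():
    """Hypothetical 2x2 scene at 2 m observed with three eye translations."""
    focal_px = 500.0
    true_depth = np.full((2, 2), 2.0)
    baselines_m = [0.01, 0.02, 0.04]
    estimates = [
        # Noise-free shifts generated from the same assumed model.
        depth_from_parallax(focal_px, b, focal_px * b / true_depth, 0.25)
        for b in baselines_m
    ]
    depths, variances = zip(*estimates)
    fused, fused_var = fuse_depths(depths, variances)
    return fused, fused_var, variances
```

In this sketch, larger movements yield larger image shifts and therefore lower depth variance, so they dominate the fused estimate; the fused variance is always smaller than that of any single movement.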
Then we focus on the learning of saccade control. Saccades are fast eye movements
that allow humans and robots to bring the visual target to the center of the
visual field. They are executed open loop with respect to the vision system, and
consequently they require precise knowledge of the internal model of the oculomotor
system. We propose two novel approaches to goal-directed learning of saccade
control. The first is a linear approximation of direct inverse modeling and
works without any a priori knowledge of the inverse model. The second approach,
called the recurrent architecture, is inspired by the recurrent loops between the cerebellum
and the brainstem. In this model, the brainstem acts as a fixed inverse model
of the plant, while the cerebellum acts as an adaptive element that learns the internal
model of the oculomotor system. In both cases the adaptive element that approximates
the internal model is implemented using a basis function network trained with
the equations of the Kalman filter. This learning algorithm ensures fast convergence
and provides a confidence value for the output response that can be used
to speed up the learning process. The proposed approaches were validated through
experiments performed both in simulation and on an anthropomorphic robotic head,
and were compared with feedback error learning. The results show that the
recurrent architecture outperforms feedback error learning in terms of accuracy
and insensitivity to the choice of the feedback controller.
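
To make the idea of a basis function network trained with the Kalman filter equations more concrete, the following is a minimal sketch under stated assumptions: a one-dimensional input, Gaussian basis functions with fixed centers, and weights treated as a static state (no process noise), in which case the Kalman update coincides with recursive least squares. The class name `KalmanRBF`, the toy target mapping, and all parameter values are hypothetical and are not the thesis implementation.

```python
import numpy as np


class KalmanRBF:
    """Gaussian basis function network with Kalman-filter weight updates."""

    def __init__(self, centers, width, obs_noise=1e-2, prior_var=10.0):
        self.centers = np.asarray(centers, dtype=float)
        self.width = float(width)
        self.obs_noise = float(obs_noise)       # assumed measurement variance
        n = self.centers.size
        self.w = np.zeros(n)                    # weight estimate (state mean)
        self.P = prior_var * np.eye(n)          # weight covariance

    def features(self, x):
        return np.exp(-0.5 * ((x - self.centers) / self.width) ** 2)

    def predict(self, x):
        """Return the output and its predictive variance (a confidence cue)."""
        phi = self.features(x)
        return phi @ self.w, phi @ self.P @ phi + self.obs_noise

    def update(self, x, y):
        """One Kalman measurement update for the observation y = phi(x) . w."""
        phi = self.features(x)
        innovation_var = phi @ self.P @ phi + self.obs_noise
        gain = self.P @ phi / innovation_var
        self.w = self.w + gain * (y - phi @ self.w)
        self.P = self.P - np.outer(gain, phi @ self.P)


def train_demo(seed=0, n_samples=500):
    """Learn a toy, made-up mapping from retinal error to motor command."""
    rng = np.random.default_rng(seed)

    def target(e):
        return 0.8 * e + 0.1 * np.sin(3.0 * e)

    model = KalmanRBF(centers=np.linspace(-1.2, 1.2, 17), width=0.2)
    for e in rng.uniform(-1.0, 1.0, n_samples):
        model.update(e, target(e))
    return model, target
```

The variance returned by `predict` shrinks in regions of the input space that have been visited often, so it can serve as the kind of confidence value mentioned above, for example to select targets where the model is still uncertain.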
Moreover, we show how the combination of looking at and reaching for the same target
leads to an implicit sensorimotor representation of the peripersonal space, that
is, the space around the body. This representation is created incrementally by linking
together correlated signals. Such a map is not learned all at once; instead, learning follows
an order established by the temporal dependencies among the different modalities,
which is imposed by the choice of vision as the master signal. Indeed, visual feedback
is used both to correct gazing movements and to improve eye-arm coordination.
Inspired by these observations, the proposed framework builds and maintains
an implicit sensorimotor map of the environment. We show how this framework allows
the robot to gaze at and reach for target objects. Moreover, the model is updated
online after each movement, and the results show that the robot can adapt to
unexpected changes in its body configuration.
Finally, we have developed a cognitive architecture that integrates in a unified
framework several models inspired by neural circuits of the visual, frontal and posterior
parietal cortices of the brain. The outcome of the integration process is a
system that allows the robot to create its internal model and its representation of the
surrounding space by interacting with the environment directly, through a mutual
adaptation of perception and action. The robot is eventually capable of executing
a set of tasks, such as recognizing, gazing at, and reaching for objects, which can work
separately or cooperate to support more structured and effective behaviors.