
Deferred Restructuring of Experience in Autonomous Machines

Periodic Reporting for period 3 - DREAM (Deferred Restructuring of Experience in Autonomous Machines)

Reporting period: 2017-07-01 to 2018-12-31

DREAM is a robotics project that incorporates sleep and dream-like processes within a cognitive architecture. This enables an individual robot or a group of robots to consolidate their experience into more useful and generic formats, thus improving their future ability to learn and adapt. DREAM relies on evolutionary methods for the discovery, optimization, restructuring and consolidation of knowledge. This new paradigm will make the robot more autonomous in its acquisition, organization and use of knowledge and skills, as long as they remain consistent with pre-established basic motivations. DREAM will enable robots to cope with the complexity of being an information-processing entity in domains that are open-ended in both space and time. It paves the way for a new generation of robots whose existence and purpose go far beyond the mere execution of dull tasks.

Accumulating knowledge over long periods of time requires a consolidation process, so as to avoid being overwhelmed by the abundance of incoming information. Sleep has been shown to be critical for many consolidation processes, such as restructuring representations, maintaining knowledge integration and coherence, improving insight learning, driving abstractions, forming novel levels of description, deleting unwanted information, exploring recombinations of concepts, and stimulating creative thinking (Wagner et al., Nature, 2004). Our targeted scientific breakthrough is to enable robots to gain an open-ended understanding of the world over long periods of time, with alternating periods of experience and sleep. The possible benefits of sleep have so far been neglected in robotics and artificial intelligence.

To achieve higher levels of autonomy and understanding in developmental robotics, we propose a paradigm shift with DREAM, a cognitive architecture that exploits sleep to improve its functioning. We contend that evolutionary methods (Fernando et al., Frontiers in Comp Neuro, 2012; Bellas et al., IEEE-TAMD, 2010) are a unifying principle for creative thinking and knowledge consolidation; these methods form the core of DREAM. Our key insight is that the brain consists of three coupled subsystems that are generated and adapted according to experience through evolutionary means: Models, which make predictions about future states of the environment, notably to understand the results of actions; Policies, which generate actions and behaviors and are related to task-specific perceptual features; and Values, which reward, evaluate and compare policies or models. The long-term vision is to build genuinely situated and embodied agents with beliefs, desires, personalities, and idiosyncrasies, who are as inevitably influenced by their individual developmental trajectories as we are. To reach the proposed adaptive properties, the architecture will rely on alternating between active interaction and passive introspection over past events, i.e. sleep.
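The alternation between active interaction and sleep can be illustrated with a deliberately tiny sketch. It is not the DREAM architecture: the one-parameter policy and model, the linear toy environment, the target value and the (1+1) evolution strategy are all assumptions chosen for brevity, and every name below is hypothetical. The sketch only shows how a Model, a Policy and a Value might be adapted by evolutionary search during a "night" phase that replays stored experience without touching the world.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    policy: float = 0.0        # Policy: a single action parameter (toy assumption)
    model_slope: float = 0.0   # Model: predicts outcome = model_slope * action
    memory: list = field(default_factory=list)  # raw (action, outcome) pairs


def environment(action):
    """Hypothetical world: the outcome is twice the action."""
    return 2.0 * action


def value(outcome, target=4.0):
    """Value: higher is better; rewards outcomes close to an assumed target."""
    return -abs(outcome - target)


def model_error(slope, memory):
    """Mean squared prediction error of a candidate model on replayed memory."""
    return sum((slope * a - o) ** 2 for a, o in memory) / len(memory)


def day(agent, rng, steps=20):
    """Active phase: act in the world with exploration noise, store experience."""
    for _ in range(steps):
        action = agent.policy + rng.uniform(-1.0, 1.0)
        agent.memory.append((action, environment(action)))


def night(agent, rng, generations=50):
    """Sleep phase: no interaction with the world, only replay and simulation."""
    # 1. Evolve the model against stored experience ((1+1) evolution strategy).
    for _ in range(generations):
        candidate = agent.model_slope + rng.gauss(0.0, 0.3)
        if model_error(candidate, agent.memory) <= model_error(agent.model_slope, agent.memory):
            agent.model_slope = candidate
    # 2. Evolve the policy in simulation, scoring candidates with the learned model.
    for _ in range(generations):
        candidate = agent.policy + rng.gauss(0.0, 0.3)
        if value(agent.model_slope * candidate) >= value(agent.model_slope * agent.policy):
            agent.policy = candidate


def run(cycles=5, seed=0):
    """Alternate day and night phases and return the resulting agent."""
    rng = random.Random(seed)
    agent = Agent()
    for _ in range(cycles):
        day(agent, rng)
        night(agent, rng)
    return agent
```

Calling `run()` alternates five day and night phases. In this toy setting the policy is only ever improved during the night, using the model learned from replayed memory rather than new interaction with the environment.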
"DREAM is a fundamental research project that proposes to study approaches of life long learning alternating between active periods in the real world (""daytime"") and analysis or exploration in simulation (""nighttime"") for representational redescription in robotics. The goal is to allow robots to deal with an open-ended environment and continuously adapt their behavior to their environment through a dedicated developmental process. This vision of robotics development and learning is original and requires to lay down new grounds. A first outcome of the project is the definition of a research framrework including the a definition and formalization of the relevant concepts, a methodology and a decomposition of the knowledge acquisition process into developmental waves. The knowledge built by the system can be split into three different categories: reward functions, policies controlling robot behavior and predictive models. A motivational engine to build new value functions and balance all reward functions has been proposed and tested. Different developmental waves are under development to build policies that can transfer to new domains or tasks. They are aimed at identifying objects and learning to recognize and manipulate them before consolidating this knowledge. To simplify their restructuring and consolidation processes, a unifying framework has been proposed for regression algorithms, that are at the core of predictive models. A long term memory management process has been added to the multi-level Darwinian brain architecture, the cognitive architecture used in the project. Beyond the development of a single individual, the impact of social learning has been studied. The project also considers the question of the role of sleep and dream processes in animals and humans. 
The role of replay is under investigation in a neuroscience context and a model of bird song development relying on a nocturnal restructuring process has been proposed to explain observed behaviors.
The proposed methodology defines the framework in which the project contributions take place. It allows Kuhn's 'normal science' to occur, which is necessary to consolidate knowledge and federate researchers. It is therefore expected to help build a scientific community around this topic, but the impact of these first contributions will not be limited to academia. The criteria we have defined measure progress toward adaptive robots. More precisely, they measure (1) generalization ability and (2) learning speed progress. Generalization is the ability to exhibit a given behavior in a large range of conditions. This is expected to help robots leave controlled conditions and face open environments. Up to now, this has been possible only for reactive behaviors involving navigation, as exemplified by vacuum cleaner robots. Behaviors involving object manipulation are more fragile, as they critically depend on object shape and weight. It is possible to program a robot to grasp a particular set of objects, but it remains difficult to program a controller that can grasp any object. Providing robots with the ability to discover how to do so by themselves is therefore a critical step towards the design of robots able to face open environments. The existence of such robots would create new markets for robotics, notably service robotics, factory 4.0 or search and rescue robots.
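As an illustration only, the two criteria named above could be computed along the following lines. The report does not give formulas, so both definitions are assumptions: generalization is taken as the fraction of test conditions in which a behavior succeeds, and learning speed progress as the reduction in the number of episodes needed to reach a performance threshold. All function names are hypothetical.

```python
def generalization_score(successes):
    """Fraction of test conditions in which the behavior succeeded.

    `successes` holds one boolean per test condition (assumed definition).
    """
    if not successes:
        raise ValueError("at least one test condition is required")
    return sum(successes) / len(successes)


def episodes_to_threshold(learning_curve, threshold):
    """1-based index of the first episode whose performance reaches the
    threshold, or None if the threshold is never reached."""
    for episode, performance in enumerate(learning_curve, start=1):
        if performance >= threshold:
            return episode
    return None


def learning_speed_progress(curve_before, curve_after, threshold):
    """Episodes saved on a new task after consolidation (assumed definition).

    A positive value means the threshold was reached sooner afterwards.
    Returns None if either curve never reaches the threshold.
    """
    before = episodes_to_threshold(curve_before, threshold)
    after = episodes_to_threshold(curve_after, threshold)
    if before is None or after is None:
        return None
    return before - after
```

For instance, a grasping behavior that succeeds on three of four unseen objects would score 0.75 under this assumed definition of generalization.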

The cross-fertilization between robotics and neuroscience may have an impact that goes beyond engineering and robotics. We aim to better understand how representation restructuring occurs, in order to provide robots with this ability. Animals and humans already have it, so drawing inspiration from what is known about this process may help us in our quest. Conversely, the ideas we propose to test in a robotics context may help to better understand how animals and humans actually do it. Little is known yet about the neural substrate of this process. The algorithmic principles we are proposing may give some insights to neuroscientists, just as reinforcement learning helped them build new models of reward-based learning.