
Dream-like simulation abilities for automated cars

Periodic Reporting for period 2 - Dreams4Cars (Dream-like simulation abilities for automated cars)

Reporting period: 2018-07-01 to 2019-12-31

A human being learns to drive in tens of hours and, as driving experience accumulates, keeps improving their behavioural-control skills. Autonomous vehicles generally lack such learning abilities and improve only through the intervention of human designers. So far, no autonomous vehicle has come close to matching the human accident rate (and autonomous vehicles will need to do better than that for mass introduction).

Is it possible to develop a robot that discovers behaviours and control without an excessive amount of trial and error, which would be dangerous in the real world and, in a simulated world, computationally expensive and not easily transferable to reality?

Dreams4Cars contributes to achieving this ability. The theoretical framework inspiring Dreams4Cars is the Simulation Hypothesis of Cognition. In Dreams4Cars an artificial driving agent learns by creating and manipulating models of the world. By exploiting learned models, inverse control models can be generated with high efficiency.

The project has four objectives:
1) An agent sensorimotor system with an architecture that is modular, scalable and explainable both at the agent level and at the level of the individual modules. The architecture combines individual predictive models into emergent and verifiable control and behaviours, and can accept logical directives and rules that influence the agent behaviours by biasing action-selection.
2) A process for the synthesis of control and behaviours, which first learns predictive models and then synthetizes inverse models of various hierarchical levels by means of episodic simulations that at each level exploit and manipulate the previous learned/synthetized level.
3) Demonstration of these technologies with the evolution of Codriver agents in three different environments: a research vehicle, a production vehicle and an engineering simulation system. Release of an open simulation environment and of data/methods examples.
4) Qualification of the Codriver abilities for the automotive domain at TRL 6, by passing a set of automotive-grade tests derived from adaptation of the Euro NCAP tests.
The development of the agent sensorimotor system (Objective 1) was guided by a number of theoretical ideas in robotics and cognition. The agent is realised as an algorithmic scaffolding (figure 1) that produces large-scale functions of different types and hosts the learning modules. There are five main loops: 1) parallel action priming, 2) action-selection, which generates emergent adaptive behaviours, 3) a logical module, which biases action-selection so that behaviour is steered according to traffic rules, 4) a loop that learns predictive models, 5) a loop that implements inverse model control.
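
The interplay of loops 1 to 3 can be illustrated with a minimal sketch. It is not the project's implementation: the `Candidate` class, the action names, the salience values and the additive form of the bias are all assumptions made for illustration. Primed actions carry a salience, a rule module adds a bias to some of them, and actions that a lower level marks as infeasible are excluded whatever the bias says.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical label for a primed action
    salience: float  # value assigned by parallel action priming (loop 1)
    feasible: bool   # False if a lower level vetoes the action


def select_action(candidates, rule_bias):
    """Pick the feasible candidate with the highest biased salience.

    rule_bias maps an action name to an additive bias (loop 3, assumed
    additive here); names missing from the mapping receive no bias.
    """
    feasible = [c for c in candidates if c.feasible]
    if not feasible:
        raise ValueError("no feasible action primed")
    return max(feasible, key=lambda c: c.salience + rule_bias.get(c.name, 0.0))


candidates = [
    Candidate("keep_lane", 0.6, True),
    Candidate("overtake_left", 0.8, True),
    Candidate("overtake_right", 0.9, False),  # vetoed by a lower level
]

no_rules = select_action(candidates, {})
with_rules = select_action(candidates, {"overtake_left": -0.5})
print(no_rules.name, with_rules.name)
```

In this toy setting the rule does not replace the selection mechanism; it only shifts the competition between actions that were already primed, and it cannot revive an action that was vetoed.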

Learning of behaviour and control (Objective 2) follows a hierarchy of motor abilities (figure 2). In the ‘wake state’, predictive models of the vehicle dynamics are learned (declarative prediction models may also be learned within loop 1). Offline, in the ‘dream state’, inverse models are trained to synthetize the inverse dynamics in a carefully crafted set of situations (episodes). The process proceeds by levels of increasing competence: predictive control, short-term goal-directed actions, etc.
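
The wake/dream idea can be sketched on a toy problem. Everything below is an assumption made for illustration: a scalar speed state, a hypothetical linear plant `true_vehicle`, and ordinary least squares standing in for whatever learning modules the project actually uses. The forward model is fitted on logged "real" transitions; the inverse model is then fitted only on transitions imagined with the learned forward model.

```python
import random


def fit_two_features(x1, x2, y):
    """Least-squares fit of y ~ a*x1 + b*x2 via the 2x2 normal equations."""
    s11 = sum(p * p for p in x1)
    s12 = sum(p * q for p, q in zip(x1, x2))
    s22 = sum(q * q for q in x2)
    t1 = sum(p * r for p, r in zip(x1, y))
    t2 = sum(q * r for q, r in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


def true_vehicle(v, u):
    """Hypothetical plant, unknown to the agent: next speed for throttle u."""
    return v + 0.1 * (2.0 * u - 0.5 * v)


rng = random.Random(0)

# Wake state: log real transitions and fit a forward model v_next ~ p*v + q*u.
v_log = [rng.uniform(0.0, 30.0) for _ in range(200)]
u_log = [rng.uniform(-1.0, 1.0) for _ in range(200)]
next_log = [true_vehicle(v, u) for v, u in zip(v_log, u_log)]
p, q = fit_two_features(v_log, u_log, next_log)

# Dream state: imagine episodes with the learned model only, then fit an
# inverse model u ~ r*v + s*v_target on the imagined transitions.
v_dream = [rng.uniform(0.0, 30.0) for _ in range(200)]
u_dream = [rng.uniform(-1.0, 1.0) for _ in range(200)]
target_dream = [p * v + q * u for v, u in zip(v_dream, u_dream)]
r, s = fit_two_features(v_dream, target_dream, u_dream)


def inverse_model(v, v_target):
    """Throttle that the learned inverse model expects to reach v_target."""
    return r * v + s * v_target


# Check the synthesised controller on the real plant.
reached = true_vehicle(10.0, inverse_model(10.0, 9.6))
print(round(p, 3), round(q, 3), round(reached, 3))
```

The point of the sketch is that the dream phase never queries `true_vehicle`: the controller is obtained by manipulating the learned model, and the real plant is used only to collect the initial data and to check the result.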

A large number of experiments have been carried out while progressively training the motor abilities of the agent (Objective 3). The agent was trained to drive three different types of vehicles, demonstrating interoperability. In addition, a fourth driving simulation environment, open for research and evaluation, was released with examples of simulation scenarios, data and training procedures (D5.5).

A number of standardized tests have been created by adapting the Euro NCAP test scenarios to the autonomous driving case (Objective 4). These cover basic functionalities such as lane keeping, speed adaptation and obstacle avoidance, as well as more demanding situations such as complex traffic on motorways and in urban scenarios. The progress of the agent abilities against these tests has been monitored throughout the development phase.

A large number of organised activities have been carried out for dissemination and exploitation. A rough count indicates 36 participations in workshops, conferences and talks, 22 scientific papers (a number that will continue to increase after the project ends), 10 communication actions, 6 organised workshops, 18 different liaison activities, 7 exploitation-oriented liaisons and 1 postgraduate Master course in Autonomous Driving Technologies.

Dreams4Cars today holds a portfolio of methods and know-how that is suited for exploitation. The exploitation strategy is to provide support methods for the development of self-driving and driver-assistance functions. In this respect, the methods of Dreams4Cars do not need to be adopted all at once. Instead, they can be adopted progressively (for example, beginning with predictive/robust/adaptive control with learned dynamics), and so be integrated into industrial workflows without disruptive impact.
Scientific results (in particular the “learning via embodied/episodic simulation”) and the open simulation environment are other non-commercial exploitation opportunities for (independent) studies, e.g. related to human-agent interactions in the driving simulator.
The major progress of Dreams4Cars consists of two main achievements: the agent cognitive architecture and the methods for the synthesis of control and behaviour.

The cognitive architecture is an example of applying a number of biological principles that have turned out to produce useful functionalities. It presents several technical and scientific innovations, in particular: the topographic organisation of the motor space, with its fall-backs including explainability and modularity; the way in which traffic rules can be incorporated into the agent via biasing mechanisms; the minimum commitment principle; the minimum intervention principle; and the principle of lower-level veto, which provides a safe sandbox for programmed rules.
The emergence of complex behaviours from principles is also important because it allows for a lean and simple software design, which, in turn, is economical and easy to maintain.

The process for the synthesis of behaviours is the second major contribution. It is based on learning models of the world, which are then manipulated to synthetize actions. It is very efficient, both in the behaviours it produces and in its use of samples. It bootstraps a hierarchical sensorimotor architecture in which each level of learning is enacted (and accelerated) by what was learned at the immediately preceding level.

The project findings contribute to the expected impact of the call by demonstrating (objectives 3 and 4) an increase in the level of system abilities in the automotive application domain, in particular through innovative robust agent architectures and lifelong development methods (objectives 1 and 2). They also address more general impacts: demonstrated deployment of robotics and artificial cognition technologies in a new application domain (transport); a contribution to the competitiveness of Europe's transport sector; and an increase in robotic system abilities through learning methods inspired by human cognitive abilities.
Improving transport and traffic (from less pollution, to safer and more accessible mobility, to maintaining market leadership in the sector) is the main long-term goal for society to which this project has contributed.