CORDIS - EU research results

Integrating reinforcement learning and predictive control for smart home energy management

Periodic Reporting for period 1 - SmartHEM (Integrating reinforcement learning and predictive control for smart home energy management)

Reporting period: 2023-10-10 to 2026-01-09

The energy sector is under urgent global pressure to transition to fully renewable sources. Buildings are responsible for approximately 40% of global energy consumption and 33% of greenhouse gas (GHG) emissions. Improving building energy efficiency therefore plays a key role in achieving the carbon neutrality goal outlined in the European Green Deal. Specifically, smart buildings that interact with renewable energy sources such as solar panels, with the power grid, and with electric vehicles (EVs) are a prevailing concept for reducing both emissions and energy consumption. However, managing these systems is complicated by the variable nature of renewable sources and by stochastic user behaviour.

Consequently, this project aims to address these challenges by developing advanced control and learning frameworks for battery systems and home energy management. The project provides in-depth solutions for three critical aspects of the energy ecosystem: the efficient recycling of retired batteries, the fast charging of operational EV batteries, and the intelligent management of home energy.

First, regarding the end-of-life management of batteries, the project focuses on the pre-treatment phase of battery recycling. A robust model predictive control (MPC) framework has been developed for the fast discharging of retired lithium-ion batteries. The objective is to minimise the discharging time to improve recycling efficiency while strictly satisfying safety constraints, such as temperature limits, despite the uncertainties inherent in retired cells.

Second, to facilitate the adoption of EVs, the project addresses the conflict between charging speed and battery longevity. A reinforcement learning (RL) based strategy has been designed for the fast charging of batteries. This approach considers the battery state of health (SoH) and the risk of lithium plating. By optimising the charging current in a health-aware manner, the algorithm aims to achieve safe and fast charging across the entire battery lifespan.

Third, at the residential level, the project develops a smart home energy management system (HEMS) using deep reinforcement learning. This system manages the power flow between rooftop photovoltaic panels, stationary home batteries, and EVs under uncertain electricity prices and user driving schedules. The developed algorithm aims to minimise the total household electricity cost and the battery degradation cost simultaneously, while strictly maintaining occupant thermal comfort to ensure an economic and sustainable operation of the home energy network.

The primary objective was to integrate RL and predictive control for HEMS. To ensure system reliability and efficiency, we adopted a bottom-up approach, starting with the rigorous control of battery energy storage systems, which are the core components of modern smart homes. Consequently, the initial activities focused on developing advanced algorithms for battery discharging and fast charging. These efforts served as the technical foundation for the subsequent comprehensive HEMS framework. By first mastering the dynamics and safety constraints of batteries at the component level, the project successfully paved the way for the high-level integration of photovoltaic (PV) generation, stationary storage, and EVs.

To address the fundamental challenges in battery handling, we first investigated the fast discharging of retired lithium-ion batteries (LIBs). This work was critical for understanding battery behaviour under extreme conditions and ensuring safety during the recycling pre-treatment phase. We developed a control-oriented thermal-electric model and designed a robust MPC framework. This controller accounts for unknown internal parameters, such as internal resistance and capacity, by modelling them as bounded lumped disturbances within a robust invariant set framework. Simulation results demonstrated that this method minimises the discharging time while strictly adhering to safety constraints, such as cell surface temperature and terminal voltage, thereby establishing a robust control algorithm applicable to variable battery conditions.
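To illustrate how a bounded uncertainty can be turned into a safety guarantee, the following Python sketch picks, at each step, the largest discharge current whose worst-case heating keeps a lumped thermal model at or below the temperature limit. It is a one-step constraint-tightening simplification, not the project's invariant-set MPC. The model structure, the function names (`max_safe_current`, `simulate_discharge`) and every parameter value are assumptions chosen for illustration only.

```python
import math


def max_safe_current(temp, t_amb, t_max, r_worst, c_th, r_th, dt, i_limit):
    """Largest current (A) whose worst-case next-step temperature is <= t_max.

    Assumed lumped model: T_next = T + dt / c_th * (R * I^2 - (T - t_amb) / r_th),
    evaluated with the upper bound r_worst on the unknown resistance R.
    """
    cooling = (temp - t_amb) / r_th                      # heat removed (W)
    heat_budget = (t_max - temp) * c_th / dt + cooling   # admissible Joule heat (W)
    if heat_budget <= 0.0:
        return 0.0
    return min(i_limit, math.sqrt(heat_budget / r_worst))


def simulate_discharge(r_true, r_nom=0.03, r_unc=0.5, capacity_ah=2.5,
                       t_amb=25.0, t_max=45.0, c_th=75.0, r_th=4.0,
                       dt=1.0, i_limit=20.0, max_steps=100_000):
    """Discharge a cell with unknown resistance r_true (hypothetical values).

    The controller only knows the nominal resistance and a relative bound.
    Returns (discharge time in seconds, peak temperature in deg C).
    """
    r_worst = r_nom * (1.0 + r_unc)   # worst case the controller plans for
    soc, temp, peak, steps = 1.0, t_amb, t_amb, 0
    while soc > 0.0 and steps < max_steps:
        current = max_safe_current(temp, t_amb, t_max, r_worst,
                                   c_th, r_th, dt, i_limit)
        # The plant evolves with the true (unknown) resistance.
        temp += dt / c_th * (r_true * current ** 2 - (temp - t_amb) / r_th)
        soc -= current * dt / (capacity_ah * 3600.0)
        peak = max(peak, temp)
        steps += 1
    return steps * dt, peak


if __name__ == "__main__":
    duration, peak = simulate_discharge(r_true=0.04)
    print(f"discharged in {duration:.0f} s, peak temperature {peak:.2f} degC")
```

Because the current is computed against the upper resistance bound, any cell whose true resistance lies inside the assumed range stays at or below the limit in this model, at the price of a somewhat slower discharge for healthier cells.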

Building on the component-level control, the second area of work optimised the charging strategy for EVs, which are essential active loads in a HEMS. We developed a lifelong RL framework that interacts with a high-fidelity electrochemical model to balance charging speed with battery longevity. The activities included designing an SoH-based cut-off voltage map that indirectly mitigates lithium plating. A key achievement was the development of a lifelong optimisation horizon, in which the agent learns to protect the battery over thousands of cycles. This strategy extended the useful battery cycle life by approximately 22.9% while maintaining a charging speed competitive with standard constant-current constant-voltage protocols, ensuring that EVs can be reliably integrated into the home energy network.
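The role of an SoH-based cut-off voltage map can be sketched in a few lines of Python: the map is interpolated at the current state of health, and the charging current requested by the agent is tapered as the terminal voltage approaches that limit. The breakpoints below, the assumption that the cut-off voltage decreases as the battery ages, the taper rule and the names `cutoff_voltage` and `clip_charging_current` are all hypothetical; the project's experimentally derived map is not reproduced here.

```python
from bisect import bisect_right

# Hypothetical (SoH, cut-off voltage) breakpoints, NOT the measured map.
SOH_POINTS = [0.80, 0.90, 1.00]
VCUT_POINTS = [4.05, 4.12, 4.20]


def cutoff_voltage(soh):
    """Piecewise-linear lookup of the cut-off voltage (V) for a given SoH."""
    if soh <= SOH_POINTS[0]:
        return VCUT_POINTS[0]
    if soh >= SOH_POINTS[-1]:
        return VCUT_POINTS[-1]
    j = bisect_right(SOH_POINTS, soh)          # index of the right breakpoint
    x0, x1 = SOH_POINTS[j - 1], SOH_POINTS[j]
    y0, y1 = VCUT_POINTS[j - 1], VCUT_POINTS[j]
    return y0 + (y1 - y0) * (soh - x0) / (x1 - x0)


def clip_charging_current(requested_current, terminal_voltage, soh,
                          taper_gain=50.0):
    """Limit the requested current (A) by the voltage headroom to the map.

    taper_gain (A per V) is an assumed tuning constant: the allowed current
    shrinks linearly to zero as the voltage reaches the SoH-dependent limit.
    """
    headroom = cutoff_voltage(soh) - terminal_voltage
    if headroom <= 0.0:
        return 0.0
    return min(requested_current, taper_gain * headroom)


if __name__ == "__main__":
    for soh in (1.0, 0.9, 0.8):
        allowed = clip_charging_current(10.0, 4.0, soh)
        print(f"SoH {soh:.2f}: limit {cutoff_voltage(soh):.3f} V, "
              f"allowed current {allowed:.2f} A")
```

In this sketch an aged cell at the same terminal voltage is allowed less current than a fresh one, which is one simple way a health-aware limit can be layered on top of a learned charging policy.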

The final stage synthesised these foundational technologies into a comprehensive HEMS. The work integrated PV generation, stationary home batteries, and EVs into a unified control framework utilising deep RL. A major advancement was the explicit incorporation of heterogeneous degradation characteristics, distinguishing between LFP batteries for stationary storage and NMC batteries for EVs. This system specifically addresses the stochastic nature of user driving behaviours, including arrival times and daily driving distances. By employing a Lagrangian relaxation method, the algorithm dynamically adapts penalty multipliers to strictly satisfy occupants' thermal comfort. It simultaneously optimises two conflicting objectives: minimising the household electricity bill and alleviating the degradation cost of the battery systems. Extensive numerical experiments showed that this data-driven approach reduces the total operational cost compared to rule-based benchmarks, demonstrating the successful application of the proposed control and learning methodologies at the system level.
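The idea of adapting penalty multipliers through Lagrangian relaxation can be sketched as a dual-ascent update: the multiplier grows while the average comfort violation exceeds its limit and shrinks towards zero otherwise, and it weights the violation inside the reward seen by the learning agent. The comfort band, the step size, the violation limit and the function names below are assumptions for illustration; in a Lagrangian deep RL setup the mean violation would typically be estimated from sampled rollouts.

```python
def comfort_violation(indoor_temp, low=20.0, high=24.0):
    """Distance (deg C) of the indoor temperature from an assumed comfort band."""
    return max(0.0, low - indoor_temp, indoor_temp - high)


def penalised_reward(energy_cost, degradation_cost, violation, multiplier):
    """Reward with the comfort constraint relaxed into the objective."""
    return -(energy_cost + degradation_cost) - multiplier * violation


def update_multiplier(multiplier, mean_violation, violation_limit=0.1,
                      step_size=0.05):
    """Projected dual ascent: raise the multiplier when the constraint is
    violated on average, lower it otherwise, and never let it go negative."""
    return max(0.0, multiplier + step_size * (mean_violation - violation_limit))


if __name__ == "__main__":
    multiplier = 0.0
    # Hypothetical per-episode mean violations: poor comfort first, then good.
    for mean_violation in [0.8, 0.6, 0.3, 0.05, 0.0, 0.0]:
        multiplier = update_multiplier(multiplier, mean_violation)
        reward = penalised_reward(2.0, 0.5, mean_violation, multiplier)
        print(f"violation {mean_violation:.2f} -> multiplier {multiplier:.4f}, "
              f"reward {reward:.3f}")
```

The design choice here is that the trade-off weight is learned rather than hand-tuned: a fixed penalty would have to be chosen in advance, whereas the multiplier settles wherever the comfort constraint is just met.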
Overview of results

The project successfully delivered advanced control and learning frameworks across the entire battery value chain, encompassing residential usage, EV integration, and end-of-life recycling. We established a rigorous bottom-up methodology, beginning with the safety-critical management of individual battery cells and expanding to the comprehensive management of smart home energy networks.

The first major achievement was the development of a robust model predictive control (MPC) scheme for the fast discharging of retired lithium-ion batteries (LIBs). By modelling internal parameter uncertainties as bounded lumped disturbances within a robust invariant set, we transformed the pre-treatment phase of recycling into an active, safety-guaranteed process. Building upon this component-level expertise, we subsequently designed a lifelong reinforcement learning (RL) strategy for the fast charging of operational EVs. A critical innovation here was the experimental derivation of an SoH-based cut-off voltage map, obtained via constant-current constant-overpotential (CC-COP) experiments. This map was integrated into a high-fidelity simulation environment to enable health-aware charging that balances speed with degradation over the battery's entire lifespan.

At the system level, these technologies were synthesised into a comprehensive HEMS driven by deep RL. This framework coordinates stationary storage and EVs. Unlike theoretical studies that rely on synthetic assumptions, this system was validated using realistic electricity pricing structures and survey-based stochastic driving data. The results demonstrated that the system effectively minimises household electricity bills and battery degradation costs, while simultaneously ensuring user comfort, adapting to user behaviour uncertainties without requiring perfect future knowledge.

Progress beyond the state of the art

Robust control for retired battery discharging

State-of-the-art discharging methods in recycling plants typically rely on fixed resistors or standard constant-current constant-voltage (CC-CV) profiles. These approaches are either inefficiently slow or lack the adaptability to handle the significant parameter inconsistencies found in retired cells, leading to potential safety hazards such as overheating. We advanced this field by developing a robust MPC framework that achieves fast discharging of batteries while strictly respecting temperature constraints, explicitly accounting for the model mismatch between nonlinear battery dynamics and the linearised model. The proposed approach offers a promising solution for the recycling and second-life applications of lithium-ion batteries.

Lifelong health-aware fast charging

Existing fast-charging strategies often focus on single-cycle optimisation or rely on simple physical constraints that do not fully capture long-term degradation mechanisms such as lithium plating. Furthermore, they rarely optimise for the entire battery lifespan. We progressed beyond these limitations by introducing a lifelong RL framework integrated with a high-fidelity electrochemical model. Crucially, we conducted laboratory experiments to derive a novel SoH-based cut-off voltage map using the CC-COP method. By incorporating this experimentally derived constraint, the agent learns to protect the battery over thousands of cycles. This strategy extended the useful cycle life by approximately 22.9% while maintaining a charging speed competitive with standard methods, indicating that longevity can be improved substantially without sacrificing charging speed.

Stochastic home energy management

Many existing HEMS solutions rely on rule-based logic or deterministic optimisation that fails to account for the stochastic nature of user behaviour or the heterogeneous degradation costs of different battery types. We advanced the state of the art by synthesising component-level controls into a unified deep RL framework. The system utilises a Lagrangian soft actor-critic (SAC) agent to manage power flows, explicitly accounting for the distinct degradation characteristics of each battery type and for the randomness of EV arrival times, departure times, and daily driving distances. Validated against realistic grid data and user behaviour surveys, our approach outperforms rule-based benchmarks. It simultaneously minimises electricity bills and battery degradation costs, offering a robust solution that remains effective under the uncertainties of residential energy usage.
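A training environment for such an agent needs to draw a new driving day for each episode. The Python sketch below shows one way this could look: departure and arrival times are sampled from clamped normal distributions, the daily distance from a log-normal distribution, and the battery state of charge on arrival follows from the distance driven. The distributions, their parameters, the vehicle figures and the names `DrivingDay` and `sample_driving_day` are assumptions for illustration and are not taken from the survey data used in the project.

```python
import random
from dataclasses import dataclass


@dataclass
class DrivingDay:
    departure_hour: float
    arrival_hour: float
    distance_km: float
    arrival_soc: float


def clamp(value, low, high):
    return max(low, min(high, value))


def sample_driving_day(rng, battery_kwh=60.0, kwh_per_km=0.17,
                       departure_soc=0.9):
    """Draw one hypothetical driving day for the EV.

    All distribution parameters are illustrative placeholders.
    """
    departure = clamp(rng.gauss(8.0, 1.0), 5.0, 11.0)
    # The EV returns at least one hour after leaving and before midnight.
    arrival = clamp(rng.gauss(18.0, 1.5), departure + 1.0, 23.5)
    # Cap the distance at what the available energy allows.
    max_range_km = departure_soc * battery_kwh / kwh_per_km
    distance = min(rng.lognormvariate(3.4, 0.5), max_range_km)
    arrival_soc = max(0.0, departure_soc - distance * kwh_per_km / battery_kwh)
    return DrivingDay(departure, arrival, distance, arrival_soc)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed for a repeatable illustration
    for _ in range(3):
        print(sample_driving_day(rng))
```

Sampling a fresh scenario per episode exposes the agent to early and late arrivals and to short and long trips, so the learned policy does not depend on knowing the next day's schedule in advance.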

Potential impacts and key needs

Potential impacts

The results of this project offer significant environmental and economic benefits. By extending the operational life of EV batteries through health-aware charging and ensuring the safe, efficient recycling of retired cells, the work directly supports the circular economy and reduces the carbon footprint associated with battery manufacturing and disposal. Economically, the HEMS framework provides households with a measurable reduction in operational costs by intelligently shifting loads and managing storage. For the recycling industry, the robust discharging scheme increases throughput, potentially lowering the cost of critical material recovery. Furthermore, the intelligent integration of EVs and stationary storage facilitates a more stable grid by smoothing out the intermittency of PV generation.

Identification of key needs for further uptake

Moving forward, pilot demonstration and hardware integration represent a crucial next step. Although the charging constraints were experimentally derived and the HEMS was validated using realistic data, these algorithms must now be deployed on physical testbeds and pilot sites. Demonstrating the robust model predictive control on full-scale retired battery packs and the HEMS on occupied smart homes is essential to validate real-time computational feasibility on embedded hardware.

In addition to technical deployment, establishing supportive regulatory frameworks is necessary for the HEMS to reach its full potential. This involves streamlining dynamic pricing structures and regulations concerning vehicle-to-home (V2H) connectivity. Implementing clear policies that incentivise consumers to permit automated control of their devices for grid services will accelerate market uptake.

Alongside regulatory advancements, commercialisation and intellectual property rights (IPR) support are vital as the algorithms progress from technology readiness level (TRL) 4-5 towards higher maturity. Securing IPR for the specific control structures will be critical to attract industrial partners for technology transfer.