We made significant advances on all three scientific objectives.
1. Hybrid (reasoning/learning) methods
Constraint Acquisition: We reduced expert queries by up to 72% using ML guidance (AAAI24), with further reductions via generalisation queries (AAAI25). We also developed preference-learning methods applied in our Manufacturing use case.
Decision-Focused Learning (DFL): We created the first DFL method for planning (ECAI24) and a multi-stage robust optimisation approach that reuses existing deterministic solvers yet matches the solution quality of scenario-based optimisation while being two orders of magnitude faster (KBS24). This was applied to the Energy use case to increase robustness to demand uncertainty.
Learning to Plan: We pioneered graph-learning heuristics (AAAI24, ICAPS24, IJCAI24) that solve 160% more problems than prior learned heuristics and rival or outperform state-of-the-art classical/numeric planners (ICAPS25, NeurIPS24). This was applied to Beluga. We further contributed theoretical and empirical insights into graph learning for planning (AAAI24, ICAPS24, AAAI25).
2. Verification and testing
We developed a neural network (NN) architecture enabling conservative verification with controllable cost and guarantees even out-of-distribution (AAAI25).
We introduced the first correct multiclass verification for decision-tree ensembles (DTE), and a compression method reducing ensemble size with minimal accuracy loss (AAAI24, ICML25).
We designed two multi-step safety-verification algorithms for NN and DTE policies, achieving speed-ups of 2–5 orders of magnitude and showing that DTEs can be 3+ orders of magnitude faster.
We built a policy-safety debugging loop that identifies unsafe runs, fixes unsafe DTE actions, and substantially reduces unsafe behaviour, enabling full verification in multiple domains. It transforms unsafe Beluga policies into provably safe ones.
3. Explainable planning and scheduling
We extended conflict-based explanations to numeric and probabilistic planning and developed faster/approximate methods (ICAPS24, ECAI24, AAAI25), supporting Manufacturing, Beluga, and Flight Diversion.
We built IPEXCO, an explanation-driven interactive planning platform with an LLM conversational interface (XAI25), used in Beluga. We devised CPMpy.tools.explain to provide conflict computation, resolution, and stepwise explanations, and used it in Airbus Manufacturing (CP25).
We advanced explanations of ML policies, including distilling GNNs into C2 logic formulas and extending abductive explanations to NN policies (IJCAI23).
We built demonstrators for all use cases, integrating our research with use-case-specific solutions.
* Airbus Manufacturing: workforce allocation and scheduling via CP/PB/MILP and learned-heuristic guidance; includes disruption generation and conflict-based explanation (CP25 best application paper).
* Airbus Beluga Logistics: deterministic and hybrid planning algorithms, incomplete methods matching competition winners, policy verification/testing, and conflict-based explanations (ECAI25).
* Airbus Flight Diversion: hybrid RL/A* route computation, policy testing, and conflict-based explanations for fuel, time, cost, and safety.
* Optit Waste Collection: scalable 2-stage city-scale CVRP solution with interactive “destroy and repair” exploration, contrastive explanations, and robustness simulation.
* Optit Energy Management: robust Unit Commitment planner using DFL, without needing access to the optimisation problem or the solver state; applied to both Optit and simplified planners.
* SciSports Squad Management: robust squad-planning tool integrating verification and tree-compression techniques, deployed in an end-user environment for safe, transparent use.
We released the TUPLES toolkit, aimed at assisting the design and development of trustworthy DSS. It includes:
* Self-Assessment Tool for trustworthiness evaluation;
* TUPLES Lab Python package with robust simulation environments;
* GitLab repository with code for platforms and demonstrators;
* Contributions to Scikit-Decide, enhancing planning, scheduling, and RL.
We organised a scalability/explainability competition on Beluga, providing generators, simulators, evaluators, baseline solvers, and a new competition platform.
Most technologies advanced to TRL4, some demonstrators reached TRL5–7, and the competition platform reached TRL7–8.