CORDIS - EU research results

Controlling Quantum Experiments with Reinforcement Learning

Periodic Reporting for period 1 - ConQuER (Controlling Quantum Experiments with Reinforcement Learning)

Reporting period: 2020-10-01 to 2022-09-30

Artificially intelligent systems are integrated with many technologies we use on a near-daily basis. Whether in route planning or in recommendation systems for ads or TV shows, the underlying algorithm likely uses some form of "learning from data". These types of learning algorithms will likely become ever more present in future technologies. Particularly relevant for this project is the future of quantum technologies. Quantum computers are the prime example of such a quantum technology: they perform computations in a fundamentally new way that relies on quantum mechanical phenomena. This promises to outperform classical computers in certain tasks, speeding up algorithms and thereby enabling computations that are infeasible on current classical hardware.

Building a quantum computer requires exquisite control over its building blocks: quantum bits. These quantum analogs of classical bits are difficult to control, because the mere act of 'looking at them' (measuring them) destroys the quantum information stored in them. Quantum systems are also very susceptible to noise, which causes them to decohere; once fully incoherent, the quantum mechanical nature of the system is lost. This project investigated the use of artificial intelligence, specifically reinforcement learning (RL), for controlling quantum systems, integrating it directly with experiments in order to preserve the coherence of a quantum system.

In reinforcement learning approaches to controlling quantum systems, an AI agent needs to learn values of control parameters (e.g. a magnetic field). For example, the agent must learn to say: "For the next period, set the magnetic field to value X", or "For the next second, set this voltage to value Y". This project aimed to test a different type of control, in which the agent instead says: "Next, add amount B to the magnetic field", or "Now, keep the magnetic field at this value for the next second".
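A minimal sketch of the difference between the two action parameterisations, with illustrative action names and values (not taken from the project's code):

```python
# Hypothetical discrete action sets for a single control parameter (a field).
ABS_ACTIONS = {0: 0.0, 1: 0.5, 2: 1.0}    # "set the field to X" (absolute)
REL_ACTIONS = {0: -0.1, 1: 0.0, 2: +0.1}  # "add B to the field" (relative)

def apply_absolute(field, action):
    # The new field value ignores the current one entirely.
    return ABS_ACTIONS[action]

def apply_relative(field, action):
    # The new field value is a small adjustment of the current one.
    return field + REL_ACTIONS[action]

# Repeatedly choosing the "increase" action reaches 0.5 in five steps,
# a value the agent can then hold by choosing the "keep" action.
field = 0.0
for _ in range(5):
    field = apply_relative(field, 2)
```

With the relative action set the number of discrete actions stays small, yet repeated choices still let the agent reach any multiple of the step size.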

The objectives for this project were to:

* Introduce a novel reinforcement learning agent and train it on a quantum memory problem.
* Implement a noisy variant of the quantum memory problem and investigate how the agent learns to perform in such a scenario.
* Explicitly integrate a reinforcement learning algorithm with ongoing spin-qubit experiments at the University of Copenhagen.

The conclusion of this project is twofold. On the one hand, replacing the agent with an optimization algorithm (CMA-ES) works just as well. This was initially meant as a benchmark, though it seems this method by itself can still be improved. On the other hand, the integration with experiments is still ongoing and will be feasible in the near future. The integration of an optimization routine directly with the experiment was completed; what remains is to swap CMA-ES for RL.
The goals of this project were divided into three major milestones, of which milestones 1 and 3 have in practice been realized. The second milestone, focusing on noisy environments, was subsumed into milestone 3 (integration with experiment), because the scenario of milestone 3 is inherently noisy. The completion of the full project is on track with the proposed timeframe, and work will continue in collaboration with the University of Copenhagen despite the early termination. Funding from the EU will continue to be acknowledged in subsequent dissemination of the results.

For these milestones:
1) The first major milestone revolved around implementing a reinforcement learning (RL) agent for quantum control, with a focus on a novel set of actions. As a problem to develop this method on, I started the 'quantum cartpole' project (in collaboration with the original MSCA supervisor, Prof. Mark Rudner), in which the RL agent must learn to stabilize a quantum system by pushing it left and right. The benchmark problem, in which the agent selects from pre-determined pushing strengths, has been completed, and we are now testing whether the agent can decide to increase or decrease the push strength. These results show that the newer action set is more difficult to train, but produces more stable outcomes. This work has not yet been published.
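The increase/decrease control described above can be illustrated with a toy classical stand-in for the stabilization task. Everything here is hypothetical: the dynamics are a simple unstable drift with noise, and the policy adjusting the push strength is hand-coded for illustration, whereas in the project the adjustments would be chosen by a trained agent:

```python
import random

def rollout(control=True, steps=200, seed=0):
    # Toy 1-D stand-in for the 'quantum cartpole' task: the state x
    # drifts away from 0, and a push of adjustable strength opposes it.
    rng = random.Random(seed)
    x, push = 0.0, 0.1
    max_abs = 0.0
    for _ in range(steps):
        x = 1.05 * x + 0.01 + rng.gauss(0.0, 0.02)  # unstable drift + noise
        if control:
            # Relative actions: raise the push strength while x escapes,
            # lower it once x is back near the origin.
            push += 0.02 if abs(x) > 0.2 else -0.01
            push = min(max(push, 0.05), 0.9)
            x -= push * x                            # push back toward 0
        max_abs = max(max_abs, abs(x))
    return max_abs
```

With control enabled the excursion stays bounded; without it, the drift runs away, which is the situation the trained agent must prevent.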

2&3) The direct integration of an AI optimizer with experiments at the University of Copenhagen meant integrating with existing controllers. Rather than having an untested RL agent optimize an experiment, we started with an evolutionary strategy (CMA-ES). We have successfully demonstrated that CMA-ES can optimize a quantum point contact experiment in situ, and the next set of experiments will yield the final data for a publication. The algorithm is robust to noise and does not require the evaluation of gradients. This setting is similar to that of an RL agent, turning the original question into: "Can the RL agent outperform CMA-ES by using fewer queries to the experiment?".
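As a rough illustration of this gradient-free, noise-tolerant style of optimization, here is a simplified (mu, lambda) evolution strategy on a noisy stand-in objective. This is not the project's CMA-ES, which additionally adapts a full covariance matrix and step size; all names and values are illustrative:

```python
import random
import statistics

def noisy_readout(params, rng):
    # Hypothetical stand-in for an experimental figure of merit
    # (lower is better), with added measurement noise.
    target = [0.3, -0.7]
    return sum((p - t) ** 2 for p, t in zip(params, target)) + rng.gauss(0, 0.01)

def es_minimize(x0, sigma=0.5, popsize=12, generations=60, seed=1):
    # Simplified (mu, lambda) evolution strategy: sample candidates around
    # the current mean, keep the best half, recombine by averaging.
    # No gradients are evaluated, and noisy readouts are tolerated.
    rng = random.Random(seed)
    mean = list(x0)
    for _ in range(generations):
        pop = [[m + sigma * rng.gauss(0, 1) for m in mean] for _ in range(popsize)]
        pop.sort(key=lambda c: noisy_readout(c, rng))
        elite = pop[: popsize // 2]
        mean = [statistics.fmean(col) for col in zip(*elite)]
        sigma *= 0.95  # fixed decay instead of CMA-ES's step-size adaptation
    return mean
```

Each generation queries the "experiment" popsize times, which is exactly the cost an RL agent would have to beat to answer the question above.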

Overview of the results and their exploitation and dissemination:
* The integration of an optimization routine (CMA-ES) directly with the experiment has been completed. A publication on this is being written but has not yet been submitted. Exploitation will follow in the form of a public open-source code library.
* A reinforcement learning agent can indeed learn to control a quantum memory system. A scientific publication will follow, but results are being disseminated at the public website www.quantumcartpole.com.
* A physics-inspired neural network (for future use as a submodule in a reinforcement learning agent) was developed, and published (DOI: 10.1103/physrevresearch.4.l022032)
* Insights into how neural networks act as classifiers were also gained during this project, and were published (DOI: 10.21468/scipostphys.11.3.073)
* A novel type of reinforcement learning agent neural network was developed and trained on the problem of quantum error correction (DOI: 10.21468/scipostphys.11.1.005)

In addition, the following other events took place:
• Workshop: https://indico.fysik.su.se/event/7771/
• Exhibition: Quantum Games at the CultureNight event in Copenhagen
• Training: UCPH ERC Writing Seminar (October 2021)
• Website: www.quantumcartpole.com
• Conferences: 1) Summer school Toulouse: https://mlqmb.sciencesconf.org/ and 2) CRC Meeting Berlin: https://www.crc183.uni-koeln.de/crc-183-berlin-conference-2022/

The state of the art in RL controllers for quantum systems uses absolute values of control parameters as actions. That is, the agent can choose to "set the magnetic field to value X", where X is drawn from a fixed set of possibilities chosen by the user. Several other options exist, such as agents with continuous control parameters. In most of these scenarios, the agent is more difficult to train because it has more actions to choose from. Instead, this project moved beyond the state of the art by keeping the number of actions low, but making those actions more flexible. Hence, instead of setting absolute values, the agent now has the action "Increase (or decrease) the magnetic field by a small value Y". By learning to repeatedly choose this action, the agent can still change the magnetic field to any multiple of Y.

Having an agent that is more interpretable and more trainable will reduce training cost (which translates directly to monetary costs), and will pave the way towards better control of quantum systems. The latter, in turn, paves the way toward better quantum technologies overall.