Community Research and Development Information Service - CORDIS

Periodic Report Summary 1 - PICSSAR (Development of a new generation of highly scalable and accurate 3D Particle-In-Cell codes)

• Introduction:

The success of the petawatt (PW) laser facilities presently under construction, which aim at producing promising particle and light sources from relativistic laser-plasma interactions, will rely on strong coupling between experiments and large-scale simulations with Particle-In-Cell (PIC) codes. Standard PIC codes currently in use fail to describe these new interaction regimes accurately, partly because the finite difference Maxwell solver used to compute the electromagnetic fields generates strong instabilities when particles move at relativistic velocities (e.g. numerical dispersion, numerical Cerenkov instability). At present, mitigating these instabilities requires very high resolution, which dramatically increases the computation time and prevents realistic 3D modeling. Our project aims at building a new generation of highly accurate PIC codes that will enable realistic 3D simulations of these as yet unexplored interaction regimes.
These PIC codes will use very high order/pseudo-spectral methods to solve Maxwell's equations. Despite their accuracy, such methods have hardly been used so far because the global Fourier transform they rely on scales poorly under MPI: it is limited to tens of thousands of cores, which is not enough to take advantage of the supercomputer architectures required for 3D modeling.
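The accuracy gain from raising the solver order can be illustrated with a short sketch. It assumes a centered, non-staggered first-derivative stencil (a simplification; PIC field solvers commonly use staggered grids) and compares the wavenumber effectively seen by the stencil with the exact one. The function names are hypothetical and are not taken from any PIC code.

```python
import math
import numpy as np

def centered_coeffs(p):
    """Coefficients c_1..c_p of the centered first-derivative stencil of
    order 2p: f'(x_i) ~ (1/dx) * sum_j c_j * (f[i+j] - f[i-j])."""
    fp = math.factorial(p)
    return np.array([
        (-1) ** (j + 1) * fp * fp
        / (j * math.factorial(p - j) * math.factorial(p + j))
        for j in range(1, p + 1)
    ])

def modified_wavenumber(k_dx, p):
    """Return k_mod*dx, the wavenumber effectively applied by the order-2p
    stencil to a plane wave exp(i*k*x). The exact value would be k_dx."""
    j = np.arange(1, p + 1)
    return 2.0 * np.sum(centered_coeffs(p) * np.sin(j * k_dx))

if __name__ == "__main__":
    k_dx = math.pi / 2  # assumed test case: 4 cells per wavelength
    for p in (1, 2, 4, 8, 16):
        err = abs(modified_wavenumber(k_dx, p) - k_dx) / k_dx
        print(f"order {2 * p:2d}: relative wavenumber error = {err:.2e}")
```

The relative error shrinks as the order grows, at the price of a stencil that reaches further from each grid point. In the limit of infinite order the stencil spans the whole grid, which is the pseudo-spectral case.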
To break this barrier, we propose to apply the Cartesian domain decomposition technique currently used with low order field solvers (small stencils) to very high order/pseudo-spectral solvers (large stencils). In this technique, the simulation domain is divided into subdomains, each padded with small guard regions of length nguards, and Maxwell's equations are solved locally on each subdomain using either local convolutions or, when the order of the field solver is very large, local FFTs (instead of the global FFTs of past methods). For order 2 field solvers, this method only requires exchanging a few cells at subdomain boundaries, and massive scaling of these solvers has been demonstrated on up to a million cores.
For higher order field solvers, the fundamental argument legitimating this method is that physical information cannot travel faster than the speed of light. Choosing large enough guard regions should therefore ensure that the spurious signal coming from stencil truncations at subdomain boundaries remains in the guard regions and does not enter the simulation domain after one time step. As long as the required guard regions are not too large, this method will enable the use of pseudo-spectral solvers at large scale.
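The role of the guard regions can be sketched in one dimension. The example below is a minimal illustration under stated assumptions, not the PICSAR implementation: it uses a finite-order centered stencil, copies guard cells from neighbouring subdomains, and applies the stencil with a periodic wrap-around on each padded subdomain to mimic the implicit periodicity of a local FFT. All function and parameter names are hypothetical.

```python
import math
import numpy as np

def centered_coeffs(p):
    """Coefficients of the centered first-derivative stencil of order 2p."""
    fp = math.factorial(p)
    return np.array([
        (-1) ** (j + 1) * fp * fp
        / (j * math.factorial(p - j) * math.factorial(p + j))
        for j in range(1, p + 1)
    ])

def apply_stencil_periodic(f, coeffs, dx):
    """Apply the stencil assuming f is periodic over its own length."""
    df = np.zeros_like(f)
    for j, c in enumerate(coeffs, start=1):
        df += c * (np.roll(f, -j) - np.roll(f, j))
    return df / dx

def derivative_with_subdomains(f, dx, n_sub, n_guards, p):
    """Split f into n_sub subdomains, pad each one with n_guards cells taken
    from its neighbours, differentiate locally, keep only interior cells."""
    n = f.size
    assert n % n_sub == 0, "subdomains must divide the grid evenly"
    size = n // n_sub
    coeffs = centered_coeffs(p)
    out = np.empty_like(f)
    for s in range(n_sub):
        start, stop = s * size, (s + 1) * size
        idx = np.arange(start - n_guards, stop + n_guards) % n
        local = f[idx]  # stands in for the guard-cell exchange
        d_local = apply_stencil_periodic(local, coeffs, dx)
        out[start:stop] = d_local[n_guards:n_guards + size]
    return out

if __name__ == "__main__":
    n, n_sub, p = 256, 4, 8  # assumed sizes; stencil half-width is p
    dx = 2 * math.pi / n
    f = np.sin(3 * dx * np.arange(n))
    ref = apply_stencil_periodic(f, centered_coeffs(p), dx)
    for n_guards in (0, 2, 4, 8):
        loc = derivative_with_subdomains(f, dx, n_sub, n_guards, p)
        print(f"nguards = {n_guards}: max error vs global = "
              f"{np.max(np.abs(loc - ref)):.2e}")
```

When the guard width is at least the stencil half-width, the local result matches the global one on every interior cell. With fewer guard cells the stencil is effectively truncated, and wrap-around values contaminate the cells next to each subdomain boundary.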
However, there are two major challenges that need to be addressed before its use in PIC codes. These two challenges constitute the main scientific objectives of this Marie Curie project and are summarized below:
• Summary description of the project objectives:
Scientific objective 1 - High performance implementation of arbitrary order (pseudo-spectral) solvers: The first challenge is a "local or shared memory implementation" (objective 1.1) of field solvers using FFTs or spatial convolution on each subdomain, which would offer the best scaling. For large order field solvers, we want to reduce the total volume of guard region exchanges between MPI subdomains by significantly increasing the size of the subdomains, so as to benefit from a lower surface/volume ratio. This in turn requires a very good shared memory implementation of the code (field solver and particle routines) as well as good memory locality for efficient cache reuse.
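The surface/volume argument can be made concrete with a small calculation. The sketch assumes cubic subdomains with the same guard width on every face; the guard width of 8 cells is a hypothetical value chosen for illustration.

```python
def guard_fraction(n_cells, n_guards):
    """Ratio of guard cells to interior cells for a cubic subdomain with
    n_cells per side and n_guards guard cells on every face."""
    interior = n_cells ** 3
    padded = (n_cells + 2 * n_guards) ** 3
    return (padded - interior) / interior

if __name__ == "__main__":
    n_guards = 8  # hypothetical guard width for a high order solver
    for n_cells in (16, 32, 64, 128, 256):
        ratio = guard_fraction(n_cells, n_guards)
        print(f"{n_cells:3d}^3 interior cells: guard/interior = {ratio:.3f}")
```

With these assumptions, a subdomain of 16 cells per side carries seven times more guard cells than interior cells, while the ratio falls below one half at 128 cells per side. This is why high order solvers favour fewer, larger MPI subdomains, each handled by many threads.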
Scientific objective 2 - Truncation error characterization: The second challenge is a potential drawback of the technique: the non-locality of the Gibbs oscillations that arise when a signal is sharply truncated. Depending on the guard cell sizes, truncation errors may affect the entire simulation domain, and instabilities may develop over time. This could prevent the use of the new technique in laser-plasma simulations with PIC codes. Truncation errors therefore have to be characterized and controlled before the technique can be used in PIC codes.
• Description of the work performed and main results achieved so far:
The fellow made significant advances on both objectives of this project during the first 24 months:
1. He developed a high performance kernel, PICSAR 3D (first milestone), that was tested on NERSC supercomputers at Berkeley and fully optimized on new computer architectures for both standard solvers (small/moderate MPI subdomains) and arbitrary order solvers (large MPI subdomains). PICSAR 3D has been parallelized with a hybrid MPI/OpenMP programming model and exploits the three levels of parallelism available on modern computers (internode, intranode, vectorization). Field solver and particle routines were efficiently parallelized using this programming model. In addition, the fellow developed an efficient blocking (tiling) technique for particles and grid quantities that allows for optimal cache reuse. Tiling sped up the whole code by a factor of 3 on Intel Ivy Bridge and by a factor of 5 on Intel MIC. It is thus a significant improvement for pseudo-spectral simulations, which require larger than usual MPI subdomains and for which cache reuse is a bottleneck. Thanks to all these optimizations, the pseudo-spectral kernel PICSAR has demonstrated scaling on up to 800,000 cores on the MIRA cluster at Argonne National Laboratory as part of a Director's Discretionary award obtained in 2016. PICSAR has now become a library used by a large user community and has been coupled back to the WARP legacy code to speed up production simulations.

2. The fellow developed an analytical model that predicts the errors coming from stencil truncations at subdomain boundaries when very high order Maxwell solvers are used with the classical Cartesian domain decomposition technique. This model is a significant step towards the use of high order/pseudo-spectral solvers in electromagnetic simulations, as we can now exactly predict the number of guard cells and the order of the Maxwell solver that are required to obtain a given accuracy and inhibit error growth coming from stencil truncations.
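The particle tiling mentioned in item 1 can be illustrated with a minimal 2D sketch. It is not the PICSAR implementation: it assumes nearest-grid-point charge deposition, square tiles that divide the grid evenly, and hypothetical function names. Particles are sorted by tile so that each tile's particles are contiguous in memory and deposit into a small tile-local buffer.

```python
import numpy as np

def deposit_direct(ix, iy, nx, ny):
    """Reference deposition: each particle adds 1 to its own grid cell."""
    rho = np.zeros((nx, ny))
    np.add.at(rho, (ix, iy), 1.0)
    return rho

def deposit_tiled(ix, iy, nx, ny, tile):
    """Same deposition, done tile by tile on particles sorted by tile."""
    assert nx % tile == 0 and ny % tile == 0, "tiles must divide the grid"
    n_tx = nx // tile
    n_tiles = n_tx * (ny // tile)
    tile_id = (ix // tile) + n_tx * (iy // tile)
    order = np.argsort(tile_id, kind="stable")  # group particles by tile
    ix_s, iy_s, tid_s = ix[order], iy[order], tile_id[order]
    # bounds[t]:bounds[t+1] is the slice of sorted particles lying in tile t
    bounds = np.searchsorted(tid_s, np.arange(n_tiles + 1))
    rho = np.zeros((nx, ny))
    for t in range(n_tiles):
        lo, hi = bounds[t], bounds[t + 1]
        if lo == hi:
            continue  # empty tile
        x0, y0 = (t % n_tx) * tile, (t // n_tx) * tile
        local = np.zeros((tile, tile))  # small buffer that fits in cache
        np.add.at(local, (ix_s[lo:hi] - x0, iy_s[lo:hi] - y0), 1.0)
        rho[x0:x0 + tile, y0:y0 + tile] += local
    return rho

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nx = ny = 32
    ix = rng.integers(0, nx, size=10_000)
    iy = rng.integers(0, ny, size=10_000)
    same = np.array_equal(deposit_direct(ix, iy, nx, ny),
                          deposit_tiled(ix, iy, nx, ny, tile=8))
    print("tiled deposition matches direct deposition:", same)
```

In Python the sketch only shows the data layout. The cache benefit reported above comes from applying the same grouping in compiled code, where each tile's particles and grid block stay in cache while they are processed.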

• Expected final results and their potential impact and use:

The result is now a 3D pseudo-spectral PIC code which has the potential to considerably improve the accuracy and efficiency (in FLOPS) of most electromagnetic plasma simulations, and hence to enable 3D simulations that are not otherwise possible with present PIC codes. By reducing the number of space and time steps required for an accurate description of the physics, this code would considerably reduce the time to solution in both the design of devices and the study of fundamental science in the area of plasma and electro-energetic physics, including (but not limited to) laser plasma acceleration and relativistic optics (e.g. filamentation, high harmonic generation and ion acceleration). This would make 2D simulations accessible on small clusters, while 3D simulations would at last become accessible to most groups working in this field. Algorithms such as those proposed here, which eliminate dispersion and minimize heating errors, are especially important for the design of next generation PW laser plasma acceleration experiments, where the orders of magnitude higher beam quality required (in emittance and/or energy spread) calls for a similar increase in numerical accuracy.

Contact

Jean-Christophe Coste (Financial Officer at IRAMIS)
Tel.: +33 169089097
Fax: +33 169082199

Subjects

Life Sciences
Record Number: 193161 / Last updated on: 2017-01-16